Author manuscript; available in PMC: 2018 Mar 12.
Published in final edited form as: Methods Mol Biol. 2017;1613:125–159. doi: 10.1007/978-1-4939-7027-8_7

Extracting the Strongest Signals from Omics Data: Differentially Expressed Pathways and Beyond

Galina Glazko, Yasir Rahmatallah, Boris Zybailov, Frank Emmert-Streib
PMCID: PMC5846121  NIHMSID: NIHMS945361  PMID: 28849561

Abstract

The analysis of gene sets (in the form of functionally related genes or pathways) has become the method of choice for extracting the strongest signals from omics data. The motivation behind using gene sets instead of individual genes is twofold. First, this approach incorporates pre-existing biological knowledge into the analysis and facilitates the interpretation of experimental results. Second, it employs a statistical hypothesis testing framework. Here, we briefly review the main Gene Set Analysis (GSA) approaches for testing differential expression of gene sets and several GSA approaches for testing statistical hypotheses beyond differential expression that allow extracting additional biological information from the data. We distinguish three major types of GSA approaches, testing (1) differential expression (DE), (2) differential variability (DV), and (3) differential co-expression (DC) of gene sets between two phenotypes. We also present a comparative power analysis and Type I error rates for different approaches in each major type of GSA on simulated data. Our evaluation provides a concise guideline for selecting the GSA approaches that perform best under particular experimental settings. The value of the three major types of GSA approaches is illustrated with a real data example. When applied to the same data set, the major types of GSA approaches yield complementary biological information.

Keywords: Omics data, Gene set analysis approaches, Hypotheses testing, Self-contained, Competitive, Differential expression, Differential co-expression, Differential variability

1 Introduction

Biological systems are living proof of Aristotle’s idea that the whole is greater than the sum of its parts. For example, a cell is a product of the synergistic actions of its constituents (genes, proteins, and metabolites, to name just a few). Together with the cellular environment, this synergy defines what we call the cell type (e.g., a stem cell or a dendritic cell). At the level of the cell’s key molecules (nucleic acids and proteins) the idea of synergy also holds true: genes work together in biological pathways and proteins form protein complexes; that is, genes and proteins are organized in functional units that act differently than a single gene or a single protein would. Thus, when an investigator studies omics data, the idea of considering functional units instead of individual components comes naturally to mind. In fact, this idea was first employed for the analysis of gene expression data more than a decade ago [1]. Analyzing microarray data from diabetics vs. healthy controls, Mootha and colleagues [1] did not find a single differentially expressed gene. However, when genes were analyzed at the pathway level using the Gene Set Enrichment Analysis (GSEA) approach, it was found that genes involved in oxidative phosphorylation showed reduced expression in diabetics, although the average decrease per gene was only 20% [1]. There were two reasons behind the success of the pathway analysis approach in this case. First, arranging genes into pathways dramatically reduces the number of hypotheses to test, which leads to an increase in power. Second, in metabolic diseases such as diabetes, changes in gene expression are moderate and can therefore be overlooked by methods that focus on each gene individually. These two reasons explain why pathway analysis has become the method of choice in analyzing omics data in general and expression data in particular.
Nowadays, we also recognize yet another important reason to employ pathway (gene set) analysis for omics data. Gene Set Analysis (GSA) approaches provide the flexibility to test different statistical hypotheses, thus increasing the biological interpretability of experimental results. Here, we briefly review the main GSA approaches for testing differential expression of gene sets and several GSA approaches for testing statistical hypotheses beyond differential expression, which allow extracting additional biological information from the data.

We distinguish three major types of GSA approaches that test statistically and biologically different hypotheses: (1) differential expression (DE), (2) differential variability (DV), and (3) differential co-expression (DC) of gene sets between two phenotypes. All major types of GSA approaches can be univariate (gene-level) or multivariate (accounting for intergene correlations). The chapter is organized as follows. In the first part of Subheading 2, we discuss GSA approaches developed for the identification of differentially expressed pathways that are applicable to the analysis of microarray and RNA-seq data (GSA-DE). The traditional GSA-DE framework aims to identify pathways with significant changes in mean gene expression and is well understood. In the second part of Subheading 2, DV analysis as applied to gene sets (GSA-DV) is considered. The analysis of differential variability is appreciated to some extent for individual genes, where the aim is to find genes with significant changes in expression variance between two phenotypes [2–6]. It was shown that many statistically significant DV genes are relevant to disease development and that DV is an indication of changes in gene regulation [2, 3]. Moreover, it was found that some genes show consistently higher across-sample variability in tumors of different origin than in normal samples [7]. These DV genes can serve as a robust molecular signature for multiple cancer types [7, 8]. Given the evidence that DV genes may play an important role in observed phenotypes, and given the popularity of GSA approaches, one would expect many approaches implementing a GSA-DV test to exist. However, our group was the first to suggest extending the DV analysis to a multivariate GSA-DV case using a multivariate statistical test [9, 10].
In the same publication we further demonstrated that for three different cancer types the GSA-DV approach was able to identify cancer-specific pathways, while pathways identified using conventional GSA-DE approaches were shared between the three cancer types. Thus, the GSA-DV approach provides additional biological information beyond GSA-DE. It should be noted that there are other approaches claiming to perform a GSA-DV test, e.g., DIRAC and EVA [11], but because they compare variability in gene ranks within a pathway between two phenotypes rather than variance estimates, these approaches are outside the scope of this chapter. We discuss two principally different GSA-DV approaches: (1) a nonparametric multivariate GSA-DV approach, the “radial” Kolmogorov-Smirnov (RKS) test [9], and (2) a new gene-level GSA-DV test that we suggest here for the first time. This gene-level GSA-DV approach applies Fisher’s method (FM) [12] to combine P-values from gene-level F-tests for differential variability [3]. It should be noted that currently GSA-DV approaches are applicable only to microarray data, because RNA-seq read counts are most frequently modeled with the Negative Binomial distribution, which has a complex dependence between mean and variance. In the third part of Subheading 2, GSA approaches estimating differential co-expression of gene sets between two phenotypes (GSA-DC) are considered. In a pathway, genes work together, i.e., they form a co-expression network. To find DC pathways, GSA-DC approaches with or without a network inference step can be employed. The most general GSA-DC approach with a network inference step is based on a Gaussian Graphical Model (GGM) [13]. In this approach, the network structure of a pathway is estimated for each phenotype, and the null hypothesis to test is that the network structure is the same across phenotypes [13]. The network inference step per se is challenging because there are many ways of estimating network structure.
For example, the implementation of network inference in the Bioconductor package nethet (which provides two-sample testing in GGMs) includes several options, such as the Graphical Lasso (GL) [14], the Meinshausen-Buhlmann approach [15], and the approach proposed by Schafer and Strimmer based on shrinkage estimation of the covariance matrix [16]. The nethet results for network comparison will therefore vary significantly depending on the algorithm selected for the network inference step. In addition, many approaches for network inference (e.g., GGM) require the assumption of normality, which may or may not be met in real data. This is why we present in this review only GSA-DC approaches that do not require a network inference step. The simplest GSA-DC approach, gene sets co-expression analysis (GSCA) [17], is purely univariate. GSCA calculates the Euclidean distance between two correlation vectors (constructed from diagonal matrices of pairwise correlations for different conditions), and the significance of the difference is estimated using a permutation test. The gene sets net correlations analysis (GSNCA) [18] assesses multivariate changes in the gene co-expression network between two conditions but does not require a network inference step. Net correlation changes are estimated by introducing for each gene a weight factor that characterizes its cross-correlations in the co-expression networks. Weight vectors in both conditions are found as eigenvectors of correlation matrices with zero diagonal elements. GSNCA tests the hypothesis that for a gene set there is no difference in the gene weight vectors between two conditions [18]. The Co-expression Graph Analyzer (CoGA) identifies differentially co-expressed gene sets by statistically testing the equality of the spectral distributions [19]. For each phenotype CoGA constructs a full network from pairwise correlations between gene expressions.
Then the structural properties of the two networks are compared by applying Jensen-Shannon divergence as a distance measure between the graph spectrum distributions [19, 20]. All methods are supplied with the implementation reference if available.

In Subheading 3, we first present a comparative power analysis and Type I error rates for different approaches in each major type of GSA on simulated data. Second, the value of applying the three major types of GSA approaches is illustrated with a real data example, where these approaches provide different biological information from the same data set.

2 Methods

2.1 Gene Set Analysis Approaches for Testing Differential Expression (GSA-DE)

There are many GSA-DE approaches, readily distinguished by the null hypothesis they test. According to Goeman and Buhlmann [21] the formulation can be either self-contained or competitive. Self-contained approaches test whether a gene set is differentially expressed between different conditions, while competitive approaches (e.g., GSEA) compare a gene set against its complement, which contains all genes except the genes in the set [21, 22]. Self-contained approaches can be (1) univariate, in the sense that they use gene-level tests for GSA and combine univariate statistics for individual genes into a single test score [10, 23, 24]; or (2) multivariate, when a multivariate statistic is used to address the null hypothesis. In a real biological setting, moderate [25] and extensive [26] correlations between genes in gene sets are well documented [27], which may result in a decrease in the power of gene-level tests compared to multivariate tests [24, 27–29]. In turn, competitive GSA approaches can be (1) “supervised,” when the class labels are known; or (2) “unsupervised,” when the enrichment score is computed for each gene set and individual sample [30]. For GSA-DE the “supervised” term indicates that the sample classification is known, while the “unsupervised” term indicates that the sample classification is unknown [30]. A number of review articles concerning different aspects of GSA-DE approaches developed for microarray data analysis have been published [21, 23, 31–36].

To summarize, GSA-DE approaches that test intrinsically statistically different null hypotheses developed thus far are: self-contained (univariate, multivariate) and competitive (supervised, unsupervised). Figure 1 illustrates different null hypotheses tested by various GSA-DE approaches together with R packages implementing each test. For the sake of generality, all power and Type I error rate estimates for GSA-DE approaches are presented for simulated RNA-seq counts.

Fig. 1. Schematic overview illustrating the breakup of the GSA-DE methods into different categories based on the null hypotheses they test

Null Hypotheses

Consider two different biological phenotypes, with n1 samples of measurements for the first and n2 samples of the same measurements for the second. Let the two random vectors of X = (X1,…, Xn1) and Y = (Y1,…, Yn2) represent the measurements of p gene expressions (constituting a pathway) in two phenotypes where Xi is the ith p-dimensional sample in one phenotype and Yi is the ith p-dimensional sample in the other phenotype. Let X, Y be independent and identically distributed with the distribution functions Fx, Fy, mean vectors μ̄x and μ̄y, and p × p positive-definite and symmetric covariance matrices Σx and Σy.

H0 for self-contained tests

For multivariate self-contained tests we consider the problem of testing the general hypothesis H0: Fx = Fy against an alternative Fx ≠ Fy, or a restricted hypothesis H0: μ̄x = μ̄y against an alternative μ̄x ≠ μ̄y, depending on the test statistic.

Gene-level GSA approaches test a null hypothesis that the gene set-associated score does not differ between phenotypes. The score can be calculated, for example, as an L2-norm of the moderated t-statistics [37] or as combined P-values [24]. In all cases statistical significance is evaluated by comparing the observed score with the null distribution, obtained by permuting sample labels.

H0 for competitive tests

The Gene Set Enrichment Analysis (GSEA) method [1, 38] is one of the most widely used competitive approaches. As a local test statistic, it uses a signal-to-noise ratio and a weighted Kolmogorov-Smirnov as a global test statistic (enrichment score, normalized to factor out the gene set size dependence) [34, 38]. Assuming a null distribution F0perm induced by permuting sample labels, GSEA evaluates significance of the global test statistic ζkGSEA by estimating nominal P-value from F0perm [34, 38]. Thus, GSEA tests the null hypothesis that the genes in a gene set are randomly associated with the phenotype.
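The weighted Kolmogorov-Smirnov running sum behind the enrichment score can be sketched in a few lines of Python. This is a minimal illustration, not the GSEA implementation: the function name `enrichment_score`, its arguments, and the default `weight=1.0` are assumptions made for the example, and the permutation null and the normalization for gene set size are omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Weighted KS-like running-sum statistic over a ranked gene list.

    ranked_genes: gene identifiers sorted by decreasing local statistic.
    scores: local statistics (e.g., signal-to-noise ratios), same order.
    gene_set: identifiers of the genes that form the set.
    weight: exponent applied to |score| for genes in the set
            (weight=0 gives the unweighted KS running sum).
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("need genes (with nonzero weight) inside and outside the set")

    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        # Step up at genes in the set, step down at genes outside it.
        running += abs(s) ** weight / hit_norm if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # maximum deviation from zero, sign retained
    return best
```

A gene set concentrated at the top of the ranking yields a score near 1, and one concentrated at the bottom yields a score near −1.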

Most competitive GSA approaches are supervised, in the sense that sample labels are known (that is, there are at least two different phenotypes). Recently, the concept of unsupervised GSEA, where an enrichment score is computed for each gene set and individual sample, was introduced [30]. Essentially, unsupervised GSEA transforms a matrix of gene expressions across samples into a matrix of gene set enrichment scores across the same samples. This makes the choice of null hypothesis flexible and context dependent. For example, Barbie et al. [39] use unsupervised competitive GSEA to test the null hypothesis that the Spearman correlation between gene set enrichment scores is zero, while Hänzelmann et al. [30] test the hypothesis that the gene set enrichment score does not differ between two phenotypes.

SELF-CONTAINED GENE-LEVEL TESTS FOR GSA

Gene-level tests for GSA can be easily designed in three steps: (1) select a gene-level score based on a univariate test statistic (e.g., a value of t-test), (2) transform a score (e.g., take an absolute value of t-statistic, or consider its P-value), and (3) summarize gene-level scores into a gene set statistic (e.g., take an average of transformed scores or use combining P-values approach) [10, 23, 24].
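The three steps can be illustrated with a short Python sketch. It assumes a Welch t-statistic as the gene-level score, the absolute value as the transformation, and the mean as the summary, with significance obtained by permuting sample labels; all function names are hypothetical and the choices are only one of the many combinations discussed in [10, 23, 24].

```python
import random
import statistics


def t_statistic(x, y):
    """Step 1: Welch t-statistic for one gene (x, y: values per phenotype)."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.mean(x) - statistics.mean(y)) / (vx / len(x) + vy / len(y)) ** 0.5


def gene_set_score(data, labels):
    """data: one list of expression values per gene; labels: 0/1 per sample."""
    transformed = []
    for gene in data:
        x = [v for v, lab in zip(gene, labels) if lab == 0]
        y = [v for v, lab in zip(gene, labels) if lab == 1]
        transformed.append(abs(t_statistic(x, y)))  # step 2: transform
    return statistics.mean(transformed)  # step 3: summarize


def permutation_p_value(data, labels, n_perm=1000, seed=0):
    """Compare the observed score with scores under shuffled sample labels."""
    rng = random.Random(seed)
    observed = gene_set_score(data, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if gene_set_score(data, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

The sketch assumes that no gene has zero variance within a phenotype; a moderated statistic would avoid that restriction.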

Gene-level GSA-DE tests that combine genes P-values

Gene-level tests for GSA that combine P-values from individual tests for microarray data were studied in [40]. As a gene-level test, the authors used an F-statistic for the correlation r between the gene expression and the phenotype, F = (N − 2)[r²/(1 − r²)] (not to be confused with the F-test), and compared several approaches for combining P-values: Fisher’s method (FM) [12], Stouffer’s method (SM) [41], tail strength (TS) [42], and a modified tail strength statistic (MTS) [40]. It was found that FM outperformed all the other methods for combining P-values [40].
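As a minimal sketch of Fisher’s method, the combined statistic is −2 Σ ln p_i, which follows a chi-square distribution with 2p degrees of freedom when the p P-values are independent and the null hypothesis holds. The function names below are hypothetical. Because genes in a set are usually correlated, the chi-square reference shown here is only illustrative; in the gene set setting the null distribution is obtained by permuting sample labels.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function of the chi-square distribution for even df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Return Fisher's statistic and its chi-square P-value (independence assumed)."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)  # requires every p > 0
    return statistic, chi2_sf_even_df(statistic, 2 * len(p_values))
```

With a single P-value the combined P-value equals the input, which is a convenient sanity check.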

Gene-level tests for GSA that combine P-values from individual tests for RNA-seq data were studied in [24]. In what follows, we briefly reiterate the conclusions from comparative power and Type I error rate analyses of different gene-level GSA tests [24]. There are two popular univariate tests specifically developed for RNA-seq data that rely on Negative Binomial model for read counts: edgeR [43] and DESeq [44]. Empirical Bayes method eBayes [45] correctly identifies hypervariable genes and can be adapted for RNA-seq data through VOOM normalization [46]. When applied correctly the gene-level test does not per se influence the performance of a gene-level GSA approach as much as the procedure used to combine univariate statistics into a single test score does [24]. Among many approaches available for combining P-values from gene-level tests, we have shown that, similar to the results for microarray data, the safest option is to use FM [24, 47]. Here, for comparative power and Type I error rate estimates eBayes in combination with FM is selected.

Gene-level GSA-DE test that combines statistics

In the analysis of microarrays, shrinking the standard error of a test statistic (e.g., a t-test) in testing DE of individual genes improves the power of the test. Several shrinkage approaches at the level of individual genes were suggested, including the Significance Analysis of Microarrays (SAM) test [48], the regularized t-test [49], and the moderated t-test [50]. In particular, an extension of SAM test to gene set analysis (SAM-GS) has been demonstrated to outperform several conventional self-contained tests and even the original competitive GSEA approach for microarray data [10, 37, 51, 52].

SAM-GS can be applied to RNA-seq count data by using the VOOM normalization [46] prior to the test to find the log-scale counts per million (CPM) of the raw counts normalized for library sizes. The test statistic is the L2-norm of the moderated t-statistics for the gene expressions:

$$T_{SAMGS} = \sum_{i=1}^{p} \left( \frac{\bar{X}_i - \bar{Y}_i}{s_i + s_0} \right)^2$$

where X̄i and Ȳi are respectively the mean expression levels for gene i under phenotypes X and Y, si is a pooled standard deviation over the samples in the two phenotypes, s0 is a small positive constant to adjust for small variability, and p is the number of genes in the gene set.
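A minimal Python sketch of this statistic follows. Two points are assumptions of the example rather than statements about SAM-GS itself: s_i is computed as the SAM-style pooled standard error of the difference in means, and s0 is passed as a fixed illustrative constant instead of being estimated from the data. The function name `sam_gs` is hypothetical.

```python
import statistics


def sam_gs(x, y, s0=0.1):
    """Sum of squared moderated t-like statistics over the genes in a set.

    x, y: one list of expression values per gene, for phenotypes X and Y.
    s0: small positive constant (illustrative fixed value, an assumption).
    """
    total = 0.0
    for gx, gy in zip(x, y):
        n1, n2 = len(gx), len(gy)
        pooled_var = ((n1 - 1) * statistics.variance(gx)
                      + (n2 - 1) * statistics.variance(gy)) / (n1 + n2 - 2)
        s_i = (pooled_var * (1.0 / n1 + 1.0 / n2)) ** 0.5  # assumed form of s_i
        d_i = (statistics.mean(gx) - statistics.mean(gy)) / (s_i + s0)
        total += d_i ** 2
    return total
```

As with the other self-contained tests, the observed value would be compared with its distribution under permuted sample labels.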

SELF-CONTAINED MULTIVARIATE TESTS FOR GSA

Based on their high power and popularity we consider two multivariate test statistics.

N-statistic

N-statistic [53, 54] tests the most general hypothesis H0: Fx = Fy against a two-sided alternative Fx ≠ Fy:

$$N_{n_1 n_2} = \frac{n_1 n_2}{n_1 + n_2} \left[ \frac{1}{n_1 n_2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} L(X_i, Y_j) - \frac{1}{2 n_1^2} \sum_{i=1}^{n_1} \sum_{j=1}^{n_1} L(X_i, X_j) - \frac{1}{2 n_2^2} \sum_{i=1}^{n_2} \sum_{j=1}^{n_2} L(Y_i, Y_j) \right]^{1/2}$$

Here, we consider only L(X, Y) = ‖X − Y‖, the Euclidean distance in R^p. N-statistic was applied to microarray data and was shown to outperform other univariate and multivariate GSA-DE tests under different parameter settings [10, 28]. After VOOM normalization [46] N-statistic can also be applied to RNA-seq data, where it was also shown to outperform other GSA-DE tests [24, 47].
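The N-statistic with the Euclidean kernel can be computed directly from pairwise distances, as in the following Python sketch. The function name `n_statistic` is hypothetical; the bracketed term is clipped at zero only to guard against floating-point round-off, and significance would again be assessed by permuting sample labels.

```python
import math


def n_statistic(x, y):
    """N-statistic with L(X, Y) = Euclidean distance.

    x, y: lists of p-dimensional samples (sequences of floats),
    one list per phenotype.
    """
    n1, n2 = len(x), len(y)
    between = sum(math.dist(a, b) for a in x for b in y) / (n1 * n2)
    within_x = sum(math.dist(a, b) for a in x for b in x) / (2 * n1 ** 2)
    within_y = sum(math.dist(a, b) for a in y for b in y) / (2 * n2 ** 2)
    bracket = max(between - within_x - within_y, 0.0)  # round-off guard
    return (n1 * n2 / (n1 + n2)) * math.sqrt(bracket)
```

The double loops cost O((n1 + n2)² p) operations, which is negligible for typical sample sizes.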

ROAST

In the context of microarray data, a parametric multivariate rotation gene set test (ROAST) has become popular for the self-contained GSA approaches [55]. ROAST uses the framework of linear models and tests whether for all genes in a set, a particular contrast of the coefficients is nonzero [55]. It can account for correlations between genes and has the flexibility of using different alternative hypotheses, testing whether the direction of changes in mean is up, down, or mixed (up or down) [55]. For microarrays it was shown that when correlations are low ROAST performance is similar to N-statistic [10]. Using ROAST with RNA-seq count data requires proper normalization. The VOOM normalization [46] was proposed specifically for this purpose where log counts per million, normalized for library size are used. In addition to counts normalization, VOOM calculates associated precision weights that can be incorporated into the linear modeling process within ROAST to eliminate the mean-variance trend in the normalized counts [46].

SUPERVISED COMPETITIVE TESTS FOR GSA

ROMER

The first competitive GSA test for microarray data analysis GSEA [1] was developed a decade ago. The original GSEA was sensitive to the gene set size and the influence of other gene sets [56], so it was subsequently upgraded into GSEA-P that used a correlation-weighted KS statistic, an improved enrichment normalization, and an FDR-based estimate of significance [34, 38]. For the sake of simplicity, we will only consider the GSEA version implemented in Bioconductor package limma function ROMER (the rotation testing using mean ranks) [57]. ROMER is a parametric method developed originally for microarray data and uses the framework of linear models [46] and rotations instead of permutations (see ref. 55 for more detail). In contrast to ROAST, the limma implementation of ROMER does not incorporate the weights, estimated by VOOM into the linear modeling process to account for the mean-variance trend in the data.

UNSUPERVISED COMPETITIVE TESTS FOR GSA

The goal of unsupervised competitive approaches is to characterize the degree of expression enrichment of a gene set in each sample within a given data set [39]. The term “competitive” is reminiscent of the way the enrichment score is calculated: as a function of gene expression inside and outside the gene set.

Gene set variation analysis (GSVA)

GSVA can be applied to microarray expression values or RNA-seq counts. Depending on the data type, expression values (counts) are first transformed using a Gaussian (or discrete Poisson) kernel into expression-level statistics [30]. The sample-wise enrichment score for a gene set is calculated using KS-like random walk statistic. An enrichment statistic (GSVA score) can be calculated as its maximum deviation from zero over all genes (similar to the original GSEA) or as the difference between the largest positive and negative deviations from zero (see ref. 30 for more detail).

Single sample extension of GSEA (ssGSEA)

The difference between GSVA and ssGSEA stems from the way an enrichment score is calculated. In ssGSEA the enrichment score for a gene set under one sample is calculated as a sum of the differences between two weighted empirical cumulative distribution functions of gene expressions inside and outside the set [39]. The approach, together with GSVA, is implemented in the Bioconductor GSVA package [30].

2.2 Gene Set Analysis Approaches for Testing Differential Variability (GSA-DV)

It is well recognized that multivariate statistics have more power than univariate statistics in the case of GSA-DE when intergene correlations are high [24, 27–29]; however, in the case of GSA-DV, this question has not been studied at all. Here, we address this gap by providing a comparative power analysis for RKS, the N-statistic, and a gene-level approach for GSA-DV (see below).

Null Hypotheses

H0 for GSA-DV

While H0 for RKS is the same general hypothesis tested, e.g., by the N-statistic, namely H0: Fx = Fy, the alternative in this case is not Fx ≠ Fy or μ̄x ≠ μ̄y but σ̄x ≠ σ̄y, i.e., differences in scale. The N-statistic tests the alternative Fx ≠ Fy. Because this general alternative implicitly includes inequality of variances for the distribution functions Fx and Fy, the N-statistic can also capture differences in scale, so if H0 is rejected by the N-statistic the true alternative is unknown. For this reason, the N-statistic is included in the comparative power analysis for GSA-DV.

The gene-level GSA-DV approach we suggest here tests the null hypothesis that the gene-set-associated score does not differ between phenotypes. The score here is calculated by applying FM to combine P-values from gene-level F-tests of the equality of two variances.

Gene-level GSA-DV test that combines genes P-values

To find genes with significant differential variability we suggest using an F-test, similar to what was described for individual genes by Ho and colleagues [3]. The gene-level GSA-DV test is designed by combining the P-values of individual F-tests for the genes in a pathway. Because FM was found to be the best-performing approach for combining P-values in gene-level GSA-DE [24, 40], FM is also applied here to combine the P-values of the F-tests. This method tests the alternative hypothesis that some genes in the set are DV between the two phenotypes.

Radial Kolmogorov Smirnov (RKS)

The basic operational procedure employed in the univariate Kolmogorov-Smirnov (KS) test is to sort the pooled observations in ascending order. The difficulty in extending this procedure to multivariate observations is that the notion of a sorted list cannot be immediately generalized [9]. Friedman and Rafsky suggested overcoming this difficulty by using Minimum Spanning Trees (MSTs) [9]. The multivariate generalization of KS ranks multivariate observations based on their MST. The purpose of MST ranking is to obtain a strong relation between the differences in the ranks of observations and their distances in R^p. The ranking algorithm can be designed specifically to give the test more power against a particular alternative hypothesis. The general scheme is to root the MST at a node with the largest geodesic distance and then rank the nodes in the “height directed preorder” traversal of the tree. If one is interested in a test with high power toward changes in the variance structure of the distribution, the ranking is implemented differently, aiming to give higher ranks to more distant points in R^p. That is, the MST is rooted at the node with the smallest geodesic distance (the centroid), and nodes with the largest depths are assigned higher ranks [9]. This “radial” Kolmogorov-Smirnov (RKS) test is sensitive to alternatives having similar mean vectors but differences in scale. The test statistic for N samples under two phenotypes X and Y is the maximum absolute difference

$$D = \max_{1 \le i \le N} \left| \frac{s_X(i)}{N_X} - \frac{s_Y(i)}{N_Y} \right|$$

where sX(i) and sY(i) are respectively the number of observations in X and Y ranked lower than i, 1 ≤ i ≤ N, and NX and NY are respectively the number of samples under phenotypes X and Y. The null distribution of the test statistic is estimated by a permutation procedure and the P-value is defined as

$$P\text{-value} = \frac{\sum_{k=1}^{N_{perm}} I\left[ D_{perm}(k) \ge D_{obs} \right] + 1}{N_{perm} + 1}$$

where Dperm(k) is the test statistic of permutation k, Dobs is the observed test statistic from the original data, Nperm is the number of permutations, and I is the indicator function. RKS is implemented in the Bioconductor package Gene Set Analysis in R (GSAR) [10, 18].
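The statistic D and its permutation P-value can be sketched in Python once a ranking of the pooled samples is available. The sketch assumes that the MST-based radial ranking has already been computed and is supplied as a list of phenotype labels in rank order; building and traversing the MST is not shown. Because that ranking depends only on the pooled samples, permuting sample labels is represented here by shuffling the labels over the fixed ranking. Function names are hypothetical.

```python
import random


def ks_max_difference(ranked_labels):
    """Maximum absolute difference between the two cumulative proportions.

    ranked_labels: list of "X"/"Y" phenotype labels listed in rank order.
    """
    n_x = ranked_labels.count("X")
    n_y = len(ranked_labels) - n_x
    seen_x = seen_y = 0
    d = 0.0
    for label in ranked_labels:
        if label == "X":
            seen_x += 1
        else:
            seen_y += 1
        d = max(d, abs(seen_x / n_x - seen_y / n_y))
    return d


def permutation_p_value(ranked_labels, n_perm=1000, seed=0):
    """P-value = (#{D_perm >= D_obs} + 1) / (N_perm + 1)."""
    rng = random.Random(seed)
    d_obs = ks_max_difference(ranked_labels)
    shuffled = list(ranked_labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if ks_max_difference(shuffled) >= d_obs:
            count += 1
    return d_obs, (count + 1) / (n_perm + 1)
```

Adding one to both numerator and denominator keeps the estimated P-value strictly positive, so the smallest attainable value is 1/(Nperm + 1).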

2.3 Gene Set Analysis Approaches for Testing Differential Co-Expression (GSA-DC)

Null Hypotheses

Each individual GSA-DC approach we consider has its own null hypothesis (see below).

Gene Sets Co-Expression Analysis (GSCA)

Briefly, GSCA works as follows [17]. For all p(p−1)/2 gene pairs, GSCA calculates intergene correlations under the two biological conditions. The test statistic is the Euclidean distance, adjusted for the size of a gene set,

$$D_{GSCA} = \sqrt{ \frac{1}{p(p-1)/2} \sum_{k=1}^{p(p-1)/2} \left( \rho_k^{(1)} - \rho_k^{(2)} \right)^2 }$$

where k is the index of the gene pair within the gene set and ρk(i) denotes the correlation of gene pair k in condition i. GSCA tests the hypothesis H0: DGSCA = 0 against the alternative H1: DGSCA ≠ 0.
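A minimal Python sketch of the GSCA distance is given below, assuming Pearson correlations and the square-root (Euclidean) form of the size-adjusted distance. The helper names are hypothetical, and the permutation step used to assess significance is not shown.

```python
import itertools
import math
import statistics


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    ss_a = sum((u - ma) ** 2 for u in a)
    ss_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(ss_a * ss_b)


def gsca_distance(cond1, cond2):
    """Size-adjusted Euclidean distance between two correlation vectors.

    cond1, cond2: one list of expression values per gene, with the same
    gene order in both conditions.
    """
    pairs = list(itertools.combinations(range(len(cond1)), 2))
    squared = sum(
        (pearson(cond1[i], cond1[j]) - pearson(cond2[i], cond2[j])) ** 2
        for i, j in pairs
    )
    return math.sqrt(squared / len(pairs))
```

For a single gene pair whose correlation flips from +1 to −1 the distance reaches its maximum value of 2.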

Gene Sets Net Correlations Analysis (GSNCA)

In order to quantitatively characterize the importance of gene i in a correlation network, we introduce a weight (wi) and set wi to be proportional to a gene’s cross-correlation with all the other genes in the gene set [24]. Then, the objective is to find a weight vector w, which achieves equality between a gene weight and the sum of its weighted cross-correlations for all genes simultaneously. Thus, genes with high cross-correlations will have high weights that may indicate their regulatory importance. This problem can be formulated as a system of linear equations

$$w_i = \sum_{j \ne i} w_j r_{ij}, \quad 1 \le i \le p$$

where rij is the absolute correlation coefficient between genes i and j, and p is the gene set size. Equivalently, this system of linear equations can be represented in the matrix form

$$(R - I) w = w$$

where R is the correlation matrix and I is the identity matrix. This is an eigenvector problem that has a unique solution w > 0 when (R − I) has the eigenvalue λ = 1. Because the matrix (R − I) is not guaranteed to have the eigenvalue λ = 1, we introduce a multiplicative factor, γ, which ensures a proper scaling for eigenvalues and solves the following problem:

$$\gamma (R - I) w = w$$

The unique solution w is an eigenvector of the matrix γ(R − I) corresponding to the eigenvalue λ = 1 [24]. As a test statistic, wGSNCA, we use the L1 norm between the scaled weight vectors w(1) and w(2) (each vector is multiplied by its norm to scale the weight factor values around one) between two conditions,

$$w_{GSNCA} = \sum_{i=1}^{p} \left| w_i^{(1)} - w_i^{(2)} \right|$$

This statistic tests the hypothesis H0: wGSNCA = 0 against the alternative H1: wGSNCA ≠ 0. P-values for the test statistic are obtained by comparing the observed value of the test statistic to its null distribution, which is estimated using a permutation approach. GSNCA is implemented in Bioconductor package GSAR [10, 18].
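The weight vectors can be approximated by power iteration, as in the Python sketch below. Several choices are assumptions of the example: the dominant eigenvector of the zero-diagonal matrix of absolute Pearson correlations is taken as w (so that γ is the reciprocal of the largest eigenvalue), the weights are rescaled to average one, and the gene set has at least three genes with nonzero pairwise correlations so that the iteration converges. Function names are hypothetical and the permutation step is not shown.

```python
import math
import statistics


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return cov / math.sqrt(sum((u - ma) ** 2 for u in a)
                           * sum((v - mb) ** 2 for v in b))


def gsnca_weights(genes, n_iter=500):
    """Dominant eigenvector of (R - I), with R holding |r_ij|; mean weight is 1."""
    p = len(genes)
    m = [[0.0 if i == j else abs(pearson(genes[i], genes[j]))
          for j in range(p)] for i in range(p)]
    w = [1.0] * p
    for _ in range(n_iter):
        new = [sum(m[i][j] * w[j] for j in range(p)) for i in range(p)]
        total = sum(new)
        w = [p * v / total for v in new]  # assumed scaling: weights sum to p
    return w


def gsnca_statistic(cond1, cond2):
    """L1 distance between the weight vectors of the two conditions."""
    w1, w2 = gsnca_weights(cond1), gsnca_weights(cond2)
    return sum(abs(a - b) for a, b in zip(w1, w2))
```

Genes whose weights change most between conditions contribute most to the statistic, which is what makes the weights interpretable at the gene level.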

Co-expression Graph Analyzer (CoGA)

Let G = (V, E) be an undirected graph with the adjacency matrix A. The spectrum of G is the set of eigenvalues of its adjacency matrix A [20]. The spectrum of a graph describes several of its structural properties, such as diameter, number of walks, and cliques. Takahashi and colleagues [20] suggested that the graph spectrum distribution is a better characterization of a graph’s properties than conventionally used measures such as number of edges, average path length, and clustering coefficient. Co-expression Graph Analyzer (CoGA) constructs co-expression graphs and identifies differentially co-expressed gene sets by testing the equality of the spectral distributions of two graphs, calculating the Jensen-Shannon divergence between the spectral densities of the two adjacency matrices [19]. Let Θ measure the distance between the structural properties of two graphs. CoGA tests H0: Θ = 0 against H1: Θ > 0 [19]. CoGA is implemented in Bioconductor package CoGA [20].
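The distance step can be illustrated with a short Python sketch of the Jensen-Shannon divergence between two discretized densities. It assumes that the two spectral densities have already been estimated and evaluated on a common grid so that each sums to one; computing eigenvalues and estimating the densities is not shown, and the function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence sum_i p_i * ln(p_i / q_i), skipping p_i = 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    p, q: probabilities on the same grid, each summing to one.
    """
    mixture = [(a + b) / 2.0 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
```

With natural logarithms the divergence is zero for identical distributions and is bounded above by ln 2, which it attains for distributions with disjoint support.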

3 Data Analysis

3.1 Comparative Power Analysis and Type I Error Rate: Simulation Setup

Simulation Setup for GSA-DE

Due to the increasing popularity of RNA-seq data as compared to microarrays the simulation setup here is presented in the context of RNA-seq data. It is conventionally assumed that RNA-seq count data follow Poisson or Negative Binomial (NB) distribution. Here, the count for gene i in sample j is modeled by a random variable Cij with NB distribution

$$C_{ij} \sim NB\left(\text{mean} = \mu_{ij},\ \text{var} = \mu_{ij}(1 + \mu_{ij}\varphi_{ij})\right) = NB(\mu_{ij}, \varphi_{ij})$$

where μij and φij are respectively the mean and dispersion parameters of gene i in sample j. For each gene, a vector of realistic values of mean count, dispersion, and gene length information (μi, φi, Li) is randomly picked from a pool of vectors derived from a real RNA-seq dataset. As a real dataset, we selected a subset of the Pickrell et al. [58] dataset of sequenced cDNA libraries generated from 69 lymphoblastoid cell lines that were derived from Yoruban Nigerian individuals. Samples from 58 unrelated individuals were considered (29 males and 29 females). Dispersion parameters for individual genes were estimated using the Bioconductor edgeR package [43].
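Counts with this mean-variance relation can be generated as a Gamma-Poisson mixture: drawing a rate from a Gamma distribution with shape 1/φ and scale μφ, and then a Poisson count with that rate, gives mean μ and variance μ(1 + μφ). The Python sketch below uses only the standard library; the function names are hypothetical, and the simple Poisson sampler (counting unit-rate exponential arrivals) is chosen for clarity rather than speed.

```python
import random


def sample_poisson(lam, rng):
    """Poisson(lam) draw: number of unit-rate arrivals in the interval [0, lam)."""
    count = 0
    arrival = rng.expovariate(1.0)
    while arrival < lam:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def sample_nb(mu, phi, rng):
    """NB draw with mean mu and variance mu * (1 + mu * phi); requires mu > 0."""
    if phi <= 0:
        return sample_poisson(mu, rng)  # zero dispersion reduces to Poisson
    rate = rng.gammavariate(1.0 / phi, mu * phi)  # shape 1/phi, scale mu*phi
    return sample_poisson(rate, rng)
```

For example, `sample_nb(20.0, 0.1, random.Random(1))` draws one count for a gene with mean 20 and dispersion 0.1, for which the variance is 20 × (1 + 2) = 60.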

Type I error rate

To simulate the null hypothesis H0: Fx = Fy, we generated a dataset consisting of N samples (equally separated between two different phenotypes) and S = 1000 nonoverlapping gene sets, each of size p. The randomly selected parameter vector (μi, φi, Li) is used to generate counts from the Negative Binomial distribution for gene i under all the samples in the dataset. Gene length information is used for expression normalization if necessary. To examine the effects of different sample and gene set sizes, we estimated Type I error rate under different parameter settings. We chose p ∈ {16, 60, 100} and N ∈ {10, 20, 40, 60}. Type I error rate for a statistical test is calculated as the proportion of gene sets detected by the test. The results were averaged over ten independent datasets to obtain more stable estimates.
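The estimate described above is simply a proportion averaged over replicate datasets. A minimal sketch, with hypothetical P-values standing in for the output of a GSA test on null gene sets:

```python
def type_one_error_rate(p_values, alpha=0.05):
    """Proportion of null gene sets whose P-value falls below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)

def averaged_error_rate(datasets, alpha=0.05):
    """Average the per-dataset rates over independent simulated datasets."""
    rates = [type_one_error_rate(p_values, alpha) for p_values in datasets]
    return sum(rates) / len(rates)

# Toy input: two replicate datasets of five null gene sets each (hypothetical values).
replicates = [[0.01, 0.20, 0.50, 0.70, 0.90],
              [0.30, 0.04, 0.03, 0.60, 0.80]]
rate = averaged_error_rate(replicates)
```

In the setting of this section, each inner list would hold S = 1000 P-values and the outer list ten datasets.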

Detection Power

A differentially expressed (DE) gene set in real data may include up-regulated, down-regulated, and equally regulated genes between two phenotypes. To mimic real data we introduce three simulation parameters: β, the proportion of gene sets in the dataset that have truly DE genes; γ, the proportion of truly DE genes in each gene set; and FC, the fold change in gene counts between two phenotypes. We consider β ∈ {0.05, 0.25}, γ ∈ {0.125, 0.25, 0.5}, and FC ∈ [1.2, 3]. Two different biological conditions are represented by two groups of samples of equal size N/2, where N = 40. Under each condition, S = 1000 nonoverlapping gene sets were formed, each consisting of p = 16 random realizations from the Negative Binomial distribution. The power of all statistical methods was estimated by testing the hypothesis H0: μx = μy (or H0: FC = 1) against the alternative H1: μx ≠ μy (or H1: FC ≠ 1) for all gene sets. For each of the (1 − β)S non-DE gene sets, p random realizations of NB(μi, φi), 1 ≤ i ≤ p, were sampled under both phenotypes. For each of the βS gene sets that have truly DE genes, half of the γp DE genes were up-regulated and half were down-regulated between the two phenotypes. Specifically, γp/2 random realizations from NB(μi, φi) and NB(FC μi, φi) were sampled under phenotype 1 and phenotype 2, respectively, for 1 ≤ i ≤ γp/2, and another γp/2 random realizations from NB(FC μi, φi) and NB(μi, φi) were sampled under phenotype 1 and phenotype 2, respectively, for (γp/2) + 1 ≤ i ≤ γp.
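The assignment of fold changes within one DE gene set can be sketched as follows. This is an illustrative reading of the setup, not the authors' simulation code; the function name and the constant values of μ and φ in the example are hypothetical.

```python
import numpy as np

def simulate_de_gene_set(mu, phi, n_per_group, gamma, fc, rng):
    """Return (phenotype 1, phenotype 2) count matrices of shape (p, n_per_group)."""
    mu = np.asarray(mu, dtype=float)
    phi = np.asarray(phi, dtype=float)
    p = len(mu)
    n_de = int(round(gamma * p))
    half = n_de // 2
    mean1, mean2 = mu.copy(), mu.copy()
    mean2[:half] *= fc          # first half of DE genes: higher mean in phenotype 2
    mean1[half:n_de] *= fc      # second half of DE genes: higher mean in phenotype 1

    def draw(mean):
        size = 1.0 / phi
        prob = size / (size + mean)
        return rng.negative_binomial(size[:, None], prob[:, None],
                                     size=(p, n_per_group))

    return draw(mean1), draw(mean2)

rng = np.random.default_rng(1)
x1, x2 = simulate_de_gene_set(mu=np.full(16, 50.0), phi=np.full(16, 0.1),
                              n_per_group=5000, gamma=0.5, fc=2.0, rng=rng)
```

The large group size in the example is used only to make the group means visibly differ; the setup in this section uses N/2 = 20 samples per group.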

Simulation Setup for GSA-DV

Typically, RNA-seq counts are modeled using Poisson or Negative Binomial (NB) distribution. Since in the case of Poisson distribution variance is equal to the mean and in the case of NB distribution variance depends on the mean, there is no GSA-DV test for RNA-seq data. Therefore, we present simulation setup assuming multivariate normal distribution of gene expressions that is a standard assumption for microarray data.

Type I error rate

We generated two samples of equal size N/2 from the p-dimensional normal distribution N(0, Ip×p), where Ip×p is the p × p identity matrix and p represents the gene set size. We generated 1000 non-overlapping gene sets, and the Type I error rate for a statistical test is calculated as the proportion of gene sets detected by the test. We consider p ∈ {20, 60, 100} and N ∈ {20, 40, 60}.

Detection Power

In a real gene set, the proportion of DV genes, the amount of difference in variance, and the intergene correlation vary. Therefore, three parameters were introduced: γ, the proportion of genes truly DV in a gene set; σ, the fold change in variance; and r, the strength of intergene correlation. We examine how these parameters influence the power of different tests. Two groups of samples of equal size N/2 were generated from the p-dimensional normal distributions N(0, Σx) and N(0, Σy) to represent two biological phenotypes. We use the relationship between the covariance and correlation matrices, R = D⁻¹ΣD⁻¹, where R is the correlation matrix, Σ is the covariance matrix, and D = diag(Σ)^(1/2) is the diagonal matrix of standard deviations.

Let Σx and Σy be p × p positive definite and symmetric covariance matrices. The diagonal elements of Σx are equal to 1 and off-diagonal elements are equal to r. Matrix Σy is defined as

$$\Sigma_y = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

where A is a γp × γp matrix with Aij = σ for i = j and Aij = σr for i ≠ j; B and C are respectively γp × (1−γ)p and (1−γ)p × γp matrices with Bij = Cij = √σ r for all i and j; and D is a (1−γ)p × (1−γ)p matrix with Dij = 1 for i = j and Dij = r for i ≠ j. With these entries, every intergene correlation in Σy equals r, and only the variances of the first γp genes change by the factor σ. We consider the parameters γ ∈ {0.25, 0.5, 0.75, 1}, r ∈ {0.1, 0.5, 0.9}, σ ∈ [1, 5], p = 20, and N = 40.
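A minimal sketch of this covariance construction is given below. It assumes that the cross-block covariance is √σ·r, which is the value that keeps every correlation in Σy equal to r and keeps Σy positive definite; the function name and the example parameters are illustrative.

```python
import numpy as np

def dv_covariances(p, gamma, sigma, r):
    """Build (Sigma_x, Sigma_y) for the differential variability setup."""
    k = int(round(gamma * p))            # number of DV genes
    sigma_x = np.full((p, p), r)
    np.fill_diagonal(sigma_x, 1.0)
    scale = np.ones(p)
    scale[:k] = np.sqrt(sigma)           # standard deviations of the DV genes
    # Rescaling rows and columns changes the variances but not the correlations.
    sigma_y = sigma_x * np.outer(scale, scale)
    return sigma_x, sigma_y

sigma_x, sigma_y = dv_covariances(p=20, gamma=0.25, sigma=4.0, r=0.5)
rng = np.random.default_rng(2)
group_x = rng.multivariate_normal(np.zeros(20), sigma_x, size=20)   # N/2 samples
group_y = rng.multivariate_normal(np.zeros(20), sigma_y, size=20)
```

Writing Σy as a rescaling of Σx makes the link to R = D⁻¹ΣD⁻¹ explicit: both phenotypes share the same correlation matrix and differ only in D.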

Simulation Setup for GSA-DC

Since GSA-DC approaches are not yet frequently applied to RNA-seq data, the simulation setup is again presented for microarray data, assuming a multivariate normal distribution of gene expressions. Let X and Y be independent p-dimensional vectors with distribution functions Fx = N(0, Σx) and Fy = N(0, Σy).

Type I error rate

To simulate the null hypothesis H0: Σx = Σy, we generated two samples of equal size, N/2 from the p-dimensional normal distribution N(0, Ip×p) where Ip×p is a p × p identity matrix. We generated 1000 gene sets and Type I error rate for a statistical test is the proportion of gene sets detected by the test. We consider p ∈ {20, 100, 200} and N ∈ {20, 40, 60}.

Detection Power

In a real biological setting, the proportion of co-expressed genes in a gene set varies and intergene correlations vary in strength. Therefore, two parameters: γ, the proportion of genes truly co-expressed in a gene set, and r, the strength of intergene correlation were introduced. We examine how these parameters influence the power of different tests. We simulated two groups of samples of equal size, N/2 (N = 40) from p-dimensional normal distributions N(0, Σx) and N(0, Σy) to represent two biological phenotypes where p ∈ {20, 100, 200}. We test the null hypothesis H0: Σx = Σy, against the alternative Σx ≠ Σy. To ensure that Σx and Σy are positive-definite and symmetric, two different scenarios for the alternative hypothesis were studied.

First, Σx was set to Ip×p and Σy was set such that its elements are

$$\sigma_{ij} = \begin{cases} r, & i \neq j,\ i, j \le \gamma p \\ 0, & i \neq j,\ i, j > \gamma p \\ 1, & i = j \end{cases}$$

We consider γ ∈ {0.25, 0.5, 0.75, 1} and r ∈ {0.1, 0.2, …, 0.9}. Figure 2 (parts A and B) depicts the covariance matrices Σx and Σy under this scenario for p = 20 and γ = 0.25. Dark and light colors represent high and low correlations, respectively. This design presents a gene set with low intergene correlations under one phenotype (Fig. 2A) and one group of highly co-expressed genes under the second phenotype (Fig. 2B).
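The first scenario can be sketched in a few lines. The sketch assumes that every off-diagonal entry outside the leading γp × γp block is zero, which is consistent with the matrices described for Fig. 2A and 2B; the function name is hypothetical.

```python
import numpy as np

def dc_setup1(p, gamma, r):
    """Sigma_x is the identity; Sigma_y has one block of gamma*p co-expressed genes."""
    k = int(round(gamma * p))
    sigma_x = np.eye(p)
    sigma_y = np.eye(p)
    sigma_y[:k, :k] = r                 # co-expressed block in the upper-left corner
    np.fill_diagonal(sigma_y, 1.0)      # restore unit variances
    return sigma_x, sigma_y

sigma_x, sigma_y = dc_setup1(p=20, gamma=0.25, r=0.9)
```

For r < 1 the block is an equicorrelation matrix with eigenvalues 1 − r and 1 + (γp − 1)r, so Σy is positive definite.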

Fig. 2.

The correlation matrices of the two simulation setups with sample size N = 40 and gene set size p = 20. Parts (A) and (B) respectively represent the correlation matrices of two conditions when the alternative hypothesis of setup 1 is true and γ = 0.25. Parts (C) and (D) respectively represent the correlation matrices of two conditions when the alternative hypothesis of setup 2 is true, β = 0.25 and γ = 0.6. Dark and light colors respectively represent high and low correlation coefficients

Second, both Σx and Σy are set such that they have diagonal blocks of equal size βp, where β is the ratio of block size to gene set size. For each of the diagonal blocks, the first scenario is reproduced. Therefore, each diagonal block has γβp genes with intergene correlation specified by r while all the other genes in the block have zero correlations. The locations of the γβp co-expressed genes inside each block are assigned differently for Σx and Σy under the alternative hypothesis. While for Σx these genes occupy the upper-left corner of the block, for Σy they occupy the lower-right corner. Figure 2 (C, D) depicts this scenario for p = 20, β = 0.25, and γ = 0.6. Dark and light colors represent high and low correlations, respectively. Depending on γ, the diagonal blocks in Σx and Σy may have a few common genes (when γ > 0.5) or may be exclusive (when γ ≤ 0.5). We consider the case β = 0.25 and let γ = 0.6, 0.4, and 0.5 respectively when p = 20, 100, and 200 so that γβp is an integer. These settings yield diagonal blocks of 3, 10, and 25 genes respectively when p = 20, 100, and 200. All intergene correlations outside the diagonal blocks are set to zero. This setup presents a gene set with low intergene correlations except for a selected group of highly co-expressed genes, and the membership of the genes in this group changes between the two phenotypes.
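A sketch of the second scenario is shown below. It assumes that p is a multiple of the block size βp; the function name is illustrative.

```python
import numpy as np

def dc_setup2(p, beta, gamma, r):
    """Co-expressed genes sit upper-left (Sigma_x) or lower-right (Sigma_y) in each block."""
    block = int(round(beta * p))        # size of each diagonal block
    k = int(round(gamma * block))       # co-expressed genes per block (gamma*beta*p)
    sigma_x = np.eye(p)
    sigma_y = np.eye(p)
    for start in range(0, p, block):
        end = start + block
        sigma_x[start:start + k, start:start + k] = r
        sigma_y[end - k:end, end - k:end] = r
    np.fill_diagonal(sigma_x, 1.0)
    np.fill_diagonal(sigma_y, 1.0)
    return sigma_x, sigma_y

sigma_x, sigma_y = dc_setup2(p=20, beta=0.25, gamma=0.6, r=0.9)
eigvals_x = np.linalg.eigvalsh(sigma_x)
eigvals_y = np.linalg.eigvalsh(sigma_y)
```

With p = 20, β = 0.25, and γ = 0.6, each block of 5 genes has 3 co-expressed genes, and the two phenotypes share one gene per block. Because Σy is a relabeling of the genes in Σx, the two matrices have the same eigenvalues even though the co-expressed gene groups differ.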

3.2 Comparative Power Analysis and Type I Error Rate: Results

Results for GSA-DE

Type I error rate

Table 1 presents the estimates of the attained significance levels for all GSA tests considered (α = 0.05). Overall, self-contained and competitive tests control the Type I error rate near the nominal α = 0.05. For a more detailed discussion of Type I error rates for self-contained and competitive GSA-DE tests, see [47].

Table 1.

Estimated Type I error rates for GSA-DE methods, α = 0.05

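| N | Type | Method | p = 16 | p = 60 | p = 100 |
|---|---|---|---|---|---|
| 10 | Self-contained | N-statistic | 0.049 | 0.048 | 0.048 |
| 10 | Self-contained | SAM-GS | 0.044 | 0.045 | 0.045 |
| 10 | Self-contained | ROAST | 0.084 | 0.042 | 0.041 |
| 10 | Competitive | GSVA | 0.025 | 0.017 | 0.013 |
| 10 | Competitive | ssGSEA | 0.042 | 0.047 | 0.045 |
| 10 | Competitive | ROMER | 0.047 | 0.050 | 0.047 |
| 10 | Combined P-value | eBayes_FM | 0.047 | 0.042 | 0.044 |
| 20 | Self-contained | N-statistic | 0.052 | 0.055 | 0.051 |
| 20 | Self-contained | SAM-GS | 0.046 | 0.050 | 0.055 |
| 20 | Self-contained | ROAST | 0.044 | 0.047 | 0.050 |
| 20 | Competitive | GSVA | 0.040 | 0.038 | 0.037 |
| 20 | Competitive | ssGSEA | 0.047 | 0.041 | 0.050 |
| 20 | Competitive | ROMER | 0.051 | 0.054 | 0.053 |
| 20 | Combined P-value | eBayes_FM | 0.048 | 0.051 | 0.054 |
| 40 | Self-contained | N-statistic | 0.054 | 0.047 | 0.050 |
| 40 | Self-contained | SAM-GS | 0.054 | 0.047 | 0.053 |
| 40 | Self-contained | ROAST | 0.051 | 0.044 | 0.055 |
| 40 | Competitive | GSVA | 0.051 | 0.057 | 0.060 |
| 40 | Competitive | ssGSEA | 0.044 | 0.048 | 0.049 |
| 40 | Competitive | ROMER | 0.050 | 0.045 | 0.052 |
| 40 | Combined P-value | eBayes_FM | 0.051 | 0.047 | 0.055 |
| 60 | Self-contained | N-statistic | 0.051 | 0.046 | 0.049 |
| 60 | Self-contained | SAM-GS | 0.051 | 0.047 | 0.054 |
| 60 | Self-contained | ROAST | 0.052 | 0.048 | 0.054 |
| 60 | Competitive | GSVA | 0.060 | 0.061 | 0.066 |
| 60 | Competitive | ssGSEA | 0.046 | 0.051 | 0.047 |
| 60 | Competitive | ROMER | 0.051 | 0.049 | 0.050 |
| 60 | Combined P-value | eBayes_FM | 0.052 | 0.046 | 0.055 |

Power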

Figure 3 presents the power estimates when H1: μx ≠ μy is true (N = 20, p = 16) (see [47] for more detail). Self-contained methods have higher power than competitive methods. Because they test a hypothesis about a single gene set by considering only its gene expressions and ignoring the rest of the dataset, they are not affected by the proportion of gene sets in the dataset that have truly differentially expressed genes (parameter β). Overall, all self-contained GSA-DE tests (ROAST, N-statistic, SAM-GS, eBayes_FM) have virtually the same power. It should be noted that the simulation setup here does not include intergene correlations, which is why there is no difference in power between multivariate and univariate self-contained approaches. For a simulation setup that includes intergene correlations, we refer the reader to [10, 28]. The power of ROMER depends on the proportion of truly DE genes in a gene set (parameter γ): it is relatively low at γ = 0.125 and increases drastically at higher γ values. Competitive methods, especially ROMER, have slightly lower power for higher values of β. This observation can be explained by the fact that competitive methods are influenced by adding more genes to the dataset: adding non-DE genes enhances their power [36], while adding DE genes may decrease it.

Fig. 3.

The power of different DE tests to detect differential expression between two phenotypes of samples when the alternative hypothesis μ̄x ≠ μ̄y is true with different settings (values of β, γ and FC). The gene set size is p = 16 and the sample size in each group is N/2 (N = 20). Half of the γ × p DE genes in a gene set are up-regulated under one phenotype and the other half are up-regulated under the other phenotype

The lack of power under all settings demonstrated by unsupervised competitive methods (especially GSVA) can be explained by the sample-wise ranking they perform to calculate the enrichment scores for gene sets [47]. While half of the genes in the gene set are up-regulated under one phenotype, the other half are up-regulated under the other phenotype. This setup maintains a stable enrichment score for the gene set under all samples and hence the gene set is found non-DE between the two phenotypes. When all DE genes in the gene set are up-regulated under one phenotype only, samples under that phenotype would have had higher gene set enrichment scores compared to the samples under the second phenotype. To substantiate this explanation with simulation results, we consider two hypothetical cases of expression patterns in a gene set consisting of 16 genes. In the first case, all DE genes in a gene set are up-regulated in phenotype 1 compared to phenotype 2. These genes normally have higher ranks in samples under phenotype 1 compared to samples under phenotype 2, and hence the gene set has higher enrichment score under phenotype 1 as compared to phenotype 2. This case is expected to demonstrate high power as shown in Fig. 4. Consider the second case where DE genes in a gene set are equally divided into up-regulated genes between phenotype 1 and phenotype 2, similar to the simulation setup that produced Fig. 3. While the up-regulated genes under phenotype 1 have higher ranks under phenotype 1 as compared to phenotype 2, the up-regulated genes under phenotype 2 are exactly the opposite. This case yields high (however lower than the first case) enrichment score for the gene set under all samples. Due to the expected small difference (if any) in average enrichment score between the two phenotypes, low power is expected (see Fig. 4). 
Since a real gene set is more likely to contain both up-regulated and down-regulated genes between two phenotypes than only up-regulated or only down-regulated genes, the power of unsupervised competitive methods is likely to be consistently lower than that of other methods for real expression data. It should be noted that the authors of the ssGSEA method expected their enrichment score to be slightly more robust and more sensitive to differences in the tails of the distributions compared to the Kolmogorov–Smirnov-like statistic [39]. The simulation results in Fig. 4 confirm this expectation.
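The cancellation argument can be reproduced with a deliberately simplified toy example. The score used here is the mean within-sample rank of the gene set, a stand-in chosen for illustration; it is not the GSVA or ssGSEA enrichment score. The baseline expression values, the fold change of 2, and the single noise-free sample per phenotype are all assumptions.

```python
def mean_set_rank(sample, gene_set):
    """Mean rank (1 = lowest expression) of the gene set within one sample."""
    order = sorted(range(len(sample)), key=lambda g: sample[g])
    rank = {g: i + 1 for i, g in enumerate(order)}
    return sum(rank[g] for g in gene_set) / len(gene_set)

def score_difference(up_in_1, up_in_2, n_genes=100, set_size=16, fc=2.0):
    """Difference in gene set score between one sample of each phenotype."""
    baseline = [10.0 + 0.1 * g for g in range(n_genes)]
    gene_set = range(set_size)
    pheno1 = [v * fc if g in up_in_1 else v for g, v in enumerate(baseline)]
    pheno2 = [v * fc if g in up_in_2 else v for g, v in enumerate(baseline)]
    return mean_set_rank(pheno1, gene_set) - mean_set_rank(pheno2, gene_set)

# Case 1: all 8 DE genes are up-regulated in phenotype 1.
case1 = score_difference(up_in_1=set(range(8)), up_in_2=set())          # 42.0
# Case 2: 4 DE genes are up-regulated in phenotype 1 and 4 in phenotype 2.
case2 = score_difference(up_in_1={0, 1, 2, 3}, up_in_2={4, 5, 6, 7})    # 0.0
```

In case 2 the gene set receives the same elevated score in both samples, so a comparison of scores between phenotypes sees no difference, which mirrors the loss of power described above.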

Fig. 4.

The power of unsupervised competitive tests (GSVA and ssGSEA) to detect differences between two phenotypes when the alternative hypothesis μ̄x ≠ μ̄y is true with different settings (values of β, γ, and FC). The gene set size is p = 16 and the sample size in each group is N/2 (N = 20). In case 1, all the γp DE genes in a gene set are up-regulated in phenotype 1 as compared to phenotype 2. In case 2, half of the γp DE genes in a gene set are up-regulated in phenotype 1 and the other half are up-regulated in phenotype 2. Both GSVA and ssGSEA have much higher power under case 1

Results for GSA-DV

Type I error rate

Table 2 presents the estimates of the attained significance levels for all GSA-DV approaches considered (α = 0.05). Overall, the RKS test is slightly more conservative than N-statistic and the gene-level GSA-DV test that combines F-test P-values with FM.

Table 2.

Estimated Type I error rates for GSA-DV methods, α = 0.05

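| Method | N | p = 20 | p = 60 | p = 100 |
|---|---|---|---|---|
| N-statistic | 20 | 0.050 | 0.047 | 0.050 |
| N-statistic | 40 | 0.052 | 0.060 | 0.045 |
| N-statistic | 60 | 0.053 | 0.035 | 0.049 |
| F-test | 20 | 0.044 | 0.040 | 0.043 |
| F-test | 40 | 0.057 | 0.040 | 0.052 |
| F-test | 60 | 0.053 | 0.054 | 0.056 |
| RKS | 20 | 0.036 | 0.025 | 0.049 |
| RKS | 40 | 0.047 | 0.038 | 0.039 |
| RKS | 60 | 0.041 | 0.038 | 0.034 |

Power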

Figure 5 presents the power estimates for the three GSA-DV approaches considered against the alternative hypothesis σ̄x ≠ σ̄y. In contrast with GSA-DE approaches, where multivariate tests always outperform univariate tests when correlation increases, the multivariate N-statistic and RKS have lower power than the gene-level GSA-DV test that combines F-test P-values with FM in all settings. The gene-level GSA-DV test has the highest power, the RKS test has intermediate power, and N-statistic has the lowest power in all settings (Fig. 5). This pattern can be explained by the fact that, under our simulation setup, it is much easier to satisfy the alternative hypothesis tested by the gene-level GSA-DV test than the alternatives tested by N-statistic and RKS. N-statistic and RKS both test H0: Fx = Fy, with different alternatives, Fx ≠ Fy and σ̄x ≠ σ̄y, respectively. Thus, N-statistic can reject H0 when μ̄x ≠ μ̄y, when σ̄x ≠ σ̄y, or when other higher-order moments of Fx and Fy are not equal. The RKS test is expected to reject H0 when σ̄x ≠ σ̄y, but not necessarily only then, because the RKS test is just “more sensitive” to “differences in scale” as compared to “shift differences” [9]. In other words, neither test is sensitive to strictly one alternative, while the gene-level GSA-DV test that combines F-test P-values with FM is sensitive only to the case when genes in a gene set are DV between two conditions. Figure 6 illustrates this point by showing the estimated power when the alternative hypothesis μ̄x ≠ μ̄y is true. The power trend is just the opposite of the trend presented in Fig. 5: N-statistic has the highest power, the RKS test has intermediate power, and the gene-level GSA-DV test has the lowest power in all settings (Fig. 6).

Fig. 5.

The power of three GSA-DV tests to detect differential variability between two phenotypes of samples when the alternative hypothesis σ̄x ≠ σ̄y is true with different settings (values of β, γ, and σ). The gene set size is p = 20 and the sample size in each group is N/2 (N = 20)

Fig. 6.

The power of three GSA-DV tests to detect differential expression between two phenotypes of samples when the alternative hypothesis μ̄x ≠ μ̄y is true with different settings (values of β, γ, and σ). The gene set size is p = 20 and the sample size in each group is N/2 (N = 20)

Results for GSA-DC

Type I error rate

Table 3 presents the estimates of the attained significance levels for the three GSA-DC tests considered (α = 0.05). Overall, all tests control the Type I error rate near the nominal α = 0.05.

Table 3.

Estimated Type I error rates for GSA-DC methods, α = 0.05

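| Method | N | p = 20 | p = 100 | p = 200 |
|---|---|---|---|---|
| GSCA | 20 | 0.057 | 0.050 | 0.044 |
| GSCA | 40 | 0.046 | 0.045 | 0.059 |
| GSCA | 60 | 0.052 | 0.049 | 0.047 |
| GSNCA | 20 | 0.052 | 0.042 | 0.045 |
| GSNCA | 40 | 0.036 | 0.051 | 0.051 |
| GSNCA | 60 | 0.054 | 0.047 | 0.054 |
| CoGA | 20 | 0.053 | 0.048 | 0.056 |
| CoGA | 40 | 0.043 | 0.052 | 0.050 |
| CoGA | 60 | 0.043 | 0.048 | 0.050 |

Power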

Figure 7 presents power estimates under the first simulation setup (see the simulation setup for GSA-DC) for different parameter settings. For each parameter setting, the results are obtained from 1000 independent gene sets. First, consider the case when only 25% of genes in a gene set are co-expressed (γ = 0.25). This case is highly plausible in real expression data since only a few genes in a gene set are expected to be highly co-expressed [25, 27]. GSNCA has the highest power followed respectively by GSCA and CoGA for all settings (p = {20, 100, 200}). Second, consider the case when 50% of genes in a gene set are co-expressed (γ = 0.5). While all tests show similar power when the size of gene set is relatively small (p = 20), GSNCA outperforms both GSCA and CoGA when the size of gene set is relatively large (p = 100 and p = 200). Third, consider the case when 75% of genes in a gene set are co-expressed (γ = 0.75). While GSCA and CoGA outperform GSNCA when the size of gene set is relatively small (p = 20), all tests have virtually the same power when the number of genes is relatively large (p = 100 and p = 200). Fourth, consider the case when 100% of genes in a gene set are co-expressed (γ = 1). GSCA and CoGA have similar power and GSNCA has virtually no power.

Fig. 7.

The power of three GSA-DC tests to detect differential co-expression between two phenotypes of samples when the alternative hypothesis of the first simulation setup is true with different settings (values of γ and r). The gene set size is p = {20, 100, 200} and the sample size in each group is N/2 (N = 40)

The statistic used in GSCA depends on the average pairwise correlation difference between the two phenotypes. Hence, power increases when γ becomes higher as shown in Fig. 7. Similar argument can be applied to CoGA where larger γ causes larger changes in the spectral distribution of the correlation matrix in one phenotype as compared to the other. When intergene correlation (r) is uniformly low in one phenotype and uniformly high in another phenotype (γ = 1 and r is high), eigenvectors corresponding to the largest eigenvalues for both correlation matrices remain unchanged while the eigenvalues (spectral distribution) change. Therefore, GSNCA does not detect changes regardless of the value of r when γ = 1, while CoGA shows high power. This case illustrates the fundamental difference between GSNCA and both GSCA and CoGA. Both GSCA and CoGA detect any differences in pairwise correlations, while GSNCA detects differences in the co-expression structure, i.e., when some pairwise correlations change relative to others in the same phenotypes. The greatest change in the co-expression structure between two phenotypes in the first simulation setup occurs when γ = 0.5 and hence GSNCA is expected to show highest power as shown in Fig. 7.
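The statement about eigenvectors and eigenvalues under uniform correlation can be checked numerically. The sketch below compares two equicorrelation matrices; it illustrates the linear-algebra fact only and does not reproduce the GSNCA weight computation. The values r = 0.1 and r = 0.9 are illustrative.

```python
import numpy as np

def equicorrelation(p, r):
    """Correlation matrix with every off-diagonal entry equal to r."""
    mat = np.full((p, p), r)
    np.fill_diagonal(mat, 1.0)
    return mat

def leading_eigenpair(mat):
    """Largest eigenvalue and its eigenvector (absolute values, sign-free)."""
    vals, vecs = np.linalg.eigh(mat)    # eigenvalues in ascending order
    return vals[-1], np.abs(vecs[:, -1])

val_low, vec_low = leading_eigenpair(equicorrelation(20, 0.1))
val_high, vec_high = leading_eigenpair(equicorrelation(20, 0.9))
```

The leading eigenvalue moves from 1 + 19 × 0.1 = 2.9 to 1 + 19 × 0.9 = 18.1, while the leading eigenvector stays proportional to the vector of ones. A method driven by eigenvalues therefore sees a large change, and a method driven by the leading eigenvector sees none.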

Figure 8 presents power estimates under the second simulation setup (see simulation setup for GSA-DC) for different parameter settings. When p = 20 and γ = 0.6 (diagonal block size γβp = 3), GSCA outperforms GSNCA. When p = 100 and γ = 0.4 (diagonal block size = 10), GSCA and GSNCA show similar power. When p = 200 and γ = 0.5 (diagonal block size = 25), GSNCA outperforms GSCA. The larger diagonal block of differential correlations results in increased detection power as the gene set size increases. When p = 200 and γ = 0.5, power follows a pattern similar to that shown in Fig. 7 for γ = 0.5, i.e., GSNCA outperforms GSCA. The difference in power between GSNCA and GSCA when p = 20 and γ = 0.6 follows a pattern similar to that observed in Fig. 7 for γ = 0.75 and could be attributed to the correlation matrix in one phenotype moving closer to a uniformly high correlation pattern. CoGA has almost no power for all settings. This is explained by the fact that, unlike the eigenvectors, the eigenvalues remain unchanged when the number of pairwise intergene correlations with value r remains unchanged but the set of pairwise correlations having value r differs between phenotypes.

Fig. 8.

The power of three GSA-DC tests to detect differential co-expression between two phenotypes of samples when the alternative hypothesis of the second simulation setup is true with different settings (values of β, γ, and r). The gene set size is p = {20, 100, 200} and the sample size in each group is N/2 (N = 40)

3.3 Application to Expression Data

We illustrate the use of GSA-DE, GSA-DV, and GSA-DC tests applied to the NCI-60 cell lines (p53) dataset. The p53 dataset comprises 50 samples of NCI-60 cell lines differentiated based on the status of the TP53 gene: 17 cell lines carrying the normal (wild type, WT) TP53 gene and 33 cell lines carrying mutated TP53 (MUT) [38, 59]. For this dataset, probe-level intensities were quantile normalized and transformed to the log scale. Gene sets were taken from the C2 pathways set of the Molecular Signatures Database (MSigDB version 5.1) [38, 60, 61]. Pathways with fewer than 10 or more than 500 genes were discarded, and the resulting dataset comprised 4256 gene sets.

Results for GSA-DE

To find pathways differentially expressed between cancer cell lines with and without p53 mutation, we applied SAM-GS. We chose SAM-GS because it tests a fairly simple null hypothesis, namely that the difference in moderated t-statistics, averaged over all pathway genes, is zero between two phenotypes. SAM-GS detected 44 gene sets at the given significance level (P < 0.001) (Table 4). All but one of the detected pathways were significantly enriched with p53 target genes (Table 4). This is not a surprise: if the expression level of a regulator changes, so do the levels of the regulated genes, leading to significant differences in the average expression of pathways enriched with p53 targets.

Table 4.

Pathways differentially expressed between p53WT and p53MUT cell lines (see footnote a)

No.  Gene set name  Size  TP53 targets  Phypergeo  Target genes
1 KEGG_P53_SIGNALING_PATHWAY 53 34 1.1E-26 SFN_TSC2_CDK1_RCHY1_IGFBP3_SERPINB5_PPM1D_BID_MDM4
_BAX_MDM2_TP53I3_PMAIP1_ATM_ATR_CHEK2_APAF1_CDKN2A_
RRM2_CDKN1A_BBC3_GADD45A_TNFRSF10B_DDB2_SIAH1_PTEN
_CDK2_CHEK1_SERPINE1_TP53_CCNE2_CCNB1_CCNG1_CCNG2
2 KEGG_AMYOTROPHIC_LATERAL_SCLEROSIS_ALS 48 12 9.3E-05 MAPK11_BID_MAPK13_BAX_DAXX_MAPK14
_MAPK12_APAF1_GPX1_BCL2_BCL2L1_TP53
3 BIOCARTA_CHEMICAL_PATHWAY 22 12 4.7E-09 PTK2_BCL2_BID_BCL2L1_STAT1_
PRKCA_APAF1_BAX_CASP6_TP53_ATM_PARP1
4 BIOCARTA_ATM_PATHWAY 19 15 1.4E-14 JUN_CHEK2_MAPK8_MRE11A_BRCA1
_RELA_CHEK1_MDM2_ABL1_CDKN1A_
TP53_GADD45A_ATM_RAD51_NFKBIA
5 BIOCARTA_CERAMIDE_PATHWAY 22 6 3.4E-03 BCL2_MAPK8_RELA_BAX_MAPK3_MAPK1
6 BIOCARTA_HIVNEF_PATHWAY 56 17 1.6E-07 CHUK_CFLAR_PRKCD_PTK2_BID_MDM2
_CASP6_RB1_DAXX_PRKDC_NFKBIA_
MAPK8_RELA_APAF1_TRAF1_BCL2_PARP1
7 BIOCARTA_P53HYPOXIA_PATHWAY 21 17 1.0E-16 MAPK8_CSNK1D_CSNK1A1_EP300_BAX_HIF1A
_HIC1_MDM2_TAF1_CDKN1A_TP53_NQO1_
GADD45A_HSP90AA1_IGFBP3_ATM_RPA1
8 BIOCARTA_IL22BP_PATHWAY 13 2 2.3E-01 STAT1_STAT6
9 BIOCARTA_P53_PATHWAY 16 12 2.0E-11 BCL2_E2F1_APAF1_BAX_CDK2_MDM2_
RB1_CDKN1A_TP53_GADD45A_ATM_PCNA
10 BIOCARTA_BAD_PATHWAY 26 6 8.3E-03 IGF1R_BCL2_BCL2L1_BAX_MAPK3_MAPK1
11 SA_G1_AND_S_PHASES 13 6 1.3E-04 E2F1_CDK2_MDM2_TP53_CDKN2A_CDKN1A
12 SA_PROGRAMMED_CELL_DEATH 10 6 2.0E-05 BCL2_BAX_BCL2L1_BAK1_BID_APAF1
13 PID_P73PATHWAY 69 41 5.0E-30 TP53I3_PML_GRAMD4_WT1_EP300_GDF15_RCHY1_BAX_MDM2_PLK3_
CCNB1_S100A2_AFP_MAPK11_KAT5_SERPINE1_WWOX_BRCA2_CCNA2_
PIN1_MYC_CDK1_MAPK14_BBC3_ABL1_CDK2_PLK1_SFN_RELA_ITCH_
SP1_RB1_RAD51_TP63_CHEK1_BAK1_UBE4B_CCNE2_BUB1_FOXO3_CDKN1A
14 PID_HDAC_CLASSIII_PATHWAY 18 11 4.1E-09 FOXO1_CREBBP_HDAC4_BAX_TP53_XRCC6
_FOXO3_EP300_KAT2B_CDKN1A_TUBB2A
15 PID_REG_GR_PATHWAY 78 36 1.9E-21 BAX_NR4A1_MAPK8_HDAC1_STAT1_HDAC2_EGR1_NCOA2_GSK3B_
SMARCD1_RELA_SUMO2_TP53_SMARCA4_TBP_SFN_MDM2_MAPK3_
CREBBP_EP300_FOS_MAPK9_MAPK10_NR3C1_HSP90AA1_TSG101
_MAPK1_MAPK14_MAPK11_NCOA1_SMARCC1_JUN_AFP_CREB1_CDKN1A_CDK5
16 PID_P53_DOWNSTREAM_PATHWAY 110 97 1.6E-99 DDIT4_FDXR_SERPINB5_IGFBP3_GPX1_ATF3_MAP4K4_BNIP3L_
TSC2_BCL2_TP63_TP53_MMP2_TNFRSF10D_SP1_BBC3_PRKAB1_
TYRP1_CEBPZ_NFYB_MLH1_PCNA_SMARCA4_BAK1_JUN_CARM1_
VCAN_TAF9_AFP_CSE1L_IRF5_PRDM1_NFYA_CCNG1_MDM2_MET
_PTEN_TFDP1_CAV1_CCNB1_CX3CL1_ARID3A_PML_DDX5_BDKRB2
_TP53I3_HIC1_GADD45A_TGFA_APC_NFYC_SERPINE1_PRMT1_
BTG2_SH2D1A_BAX_TRIAP1_RB1_VDR_KAT2A_TRRAP_TNFRSF10B
_EP300_HTT_NDRG1_MSH2_PPP1R13B_DDB2_CASP10_GDF15_
LIF_CASP6_EGFR_PLK3_SNAI2_DKK1_CTSD_EPHA2_COL18A1_
RCHY1_PCBP4_SFN_BCL2L2_E2F1_BCL6_BID_S100A2_BCL2L1_
DUSP5_CDKN1A_APAF1_CREBBP_TP53BP2_HDAC2_MCL1_EDN2_PMAIP1
17 PID_RXR_VDR_PATHWAY 25 9 3.0E-05 NR4A1_VDR_NCOA1_THRB_RARA_TGFB1_SREBF1_BCL2_MED1
18 PID_TAP63_PATHWAY 48 29 7.6E-22 EP300_NQO1_SERPINB5_YWHAQ_S100A2_CDKN1A_PML_BBC3_
GADD45A_CHUK_TP63_SP1_MDM2_NOC2L_PMAIP1_ITCH_PRKCD_IGFBP3
_GPX2_TFAP2C_TP53I3_CDKN2A_VDR_PLK1_BAX_IKBKB_FDXR_GDF15_ABL1
19 PID_P53_REGULATION_PATHWAY 46 43 1.0E-46 NEDD8_PPM1D_HIPK2_CHEK2_TP53_TRIM28_CCNG1_HUWE1_
CSNK1E_SKP2_ATM_CSE1L_CSNK1D_CSNK1A1_CDK2_PIN1_CHEK1
_MDM4_KAT5_CCNA2_SMYD2_EP300_KAT2B_KAT8_DAXX_DYRK2
_RPL11_PRKCD_MDM2_CDKN2A_ATR_ABL1_CSNK1G2_GSK3B_
CREBBP_MAPK14_MAPK9_RPL23_USP7_RCHY1_UBE2D1_YY1_MAPK8
20 PID_RB_1PATHWAY 58 28 1.3E-17 SMARCB1_CDKN1A_JUN_CREBBP_TBP_SMARCA4_MAPK14_TFDP1_DNMT1
_CTBP1_PAX3_MET_MAPK11_SKP2_RBBP4_ABL1_EP300_HDAC3
_CDKN2A_CCNA2_E2F1_CDK2_HDAC1_TAF1_MAPK9_MDM2_RB1_CEBPB
21 REACTOME_SIGNALING_BY_ERBB2 85 19 5.4E-06 CDKN1A_CHUK_CREB1_EGFR_FOXO1_FOXO3_MTOR_GRB2_NR4A1_
HSP90AA1_MDM2_PRKCA_PRKCD_MAPK1_MAPK3_PTEN_RPS27A_TSC2_CDK1
22 REACTOME_PI3K_EVENTS_IN_ERBB2_SIGNALING 36 12 3.6E-06 CDKN1A_CHUK_CREB1_EGFR_FOXO1_FOXO3_MTOR_GRB2_NR4A1_MDM2_PTEN_TSC2
23 REACTOME_PI3K_AKT_ACTIVATION 31 10 3.3E-05 CDKN1A_CHUK_CREB1_FOXO1_FOXO3_MTOR_NR4A1_MDM2_PTEN_TSC2
24 REACTOME_AKT_PHOSPHORYLATES_TARGETS_IN_THE_CYTOSOL 11 4 5.5E-03 CDKN1A_CHUK_MDM2_TSC2
25 REACTOME_GAB1_SIGNALOSOME 30 12 3.7E-07 CDKN1A_CHUK_CREB1_EGFR_FOXO1_FOXO3_MTOR_GRB2_NR4A1_MDM2_PTEN_TSC2
26 REACTOME_DOWNSTREAM_SIGNAL_TRANSDUCTION 81 18 1.0E-05 CDKN1A_CHUK_CREB1_FOXO1_FOXO3_MTOR_GRB2_NR4A1_MDM2_
PRKCA_PRKCD_MAPK1_MAPK3_PTEN_STAT1_STAT6_TSC2_CDK1
27 REACTOME_PIP3_ACTIVATES_AKT_SIGNALING 22 10 8.7E-07 CDKN1A_CHUK_CREB1_FOXO1_FOXO3_MTOR_NR4A1_MDM2_PTEN_TSC2
28 REACTOME_INTRINSIC_PATHWAY_FOR_APOPTOSIS 26 13 4.3E-09 E2F1_BBC3_APAF1_NMT1_PMAIP1_MAPK8_BAK1_BAX_BCL2
_BCL2L1_BID_TFDP1_TP53
29 GAZDA_DIAMOND_BLACKFAN_ANEMIA_MYELOID_UP 26 4 1.0E-01 BAX_TNFRSF10B_DDB2_SRSF1
30 GARGALOVIC_RESPONSE_TO_OXIDIZED_PHOSPHOLIPIDS_BLACK_UP 20 3 1.6E-01 PLAGL1_CDKN1A_UBB
31 NUNODA_RESPONSE_TO_DASATINIB_IMATINIB_DN 13 6 1.3E-04 CDKN1A_IKBKB_CASP10_STAT6_STAT1_BAX
32 GALLUZZI_PERMEABILIZE_MITOCHONDRIA 38 19 1.0E-12 NR4A1_BBC3_PRKCD_SERPINB5_FDXR_BAK1_SIVA1_BCL2L1_
PPID_BAX_BID_STK11_MAPK8_ABL1_TP53_MCL1_BCL2_PMAIP1_GSK3B
33 DUTTA_APOPTOSIS_VIA_NFKB 30 9 1.5E-04 BAX_TP53_MDM2_CFLAR_BCL2L1_AFP_TNFRSF10B_TRAF1_BCL2
34 GALLUZZI_PREVENT_MITOCHONDIAL_PERMEABILIZATION 20 9 3.4E-06 BCL2_MAPK14_MCL1_BCL2L1_BAK1_MUC1_TXN_BAX_BCL2L2
35 SCHAVOLT_TARGETS_OF_TP53_AND_TP63 14 11 6.1E-11 CDKN1A_SERPINB5_BAX_TP53I3_SFN_PMAIP1_EPHA2_VCAN_FDXR_MDM2_PCNA
36 AMUNDSON_DNA_DAMAGE_RESPONSE_TP53 15 8 2.4E-06 LIF_DDB2_MDM2_CDKN1A_CCNG1_CTSD_BTG2_PPM1D
37 FLECHNER_BIOPSY_KIDNEY_TRANSPLANT_OK_VS_DONOR_DN 25 5 2.8E-02 BAX_FOS_XRCC6_MYC_EIF2AK2
38 MA_MYELOID_DIFFERENTIATION_UP 35 9 5.6E-04 BNIP3L_PPM1G_CDKN1A_UBB_MT1A_RB1_CCT5_MDM2_NCL
39 GENTILE_UV_LOW_DOSE_UP 24 8 1.6E-04 CDKN1A_SOX4_BTG2_FDXR_SAT1_GDF15_PMAIP1_CSNK1G2
40 INGA_TP53_TARGETS 15 12 5.3E-12 MDM2_SFN_BAX_TNFRSF10B_FOS_PCNA_CCNG1_
PMAIP1_BBC3_IGFBP3_CDKN1A_GADD45A
41 DELLA_RESPONSE_TO_TSA_AND_BUTYRATE 19 5 8.8E-03 HSPB1_CDKN1A_PRKCD_NR4A1_GADD45A
42 ONO_FOXP3_TARGETS_DN 34 5 8.8E-02 PARP1_CCNA2_CDKN1A_E2F1_BCL2
43 WARTERS_RESPONSE_TO_IR_SKIN 37 13 7.2E-07 MDM2_DDB2_FDXR_TRIAP1_PPM1D_TP53TG1_BTG2_
GDF15_BBC3_CDKN1A_CCNG1_STRA13_SERPINB5
44 WARTERS_IR_RESPONSE_5GY 21 12 2.3E-09 MDM2_CDKN1A_GDF15_FDXR_BBC3_DDB2_PPM1D_
PLK3_ATF3_BTG2_PCNA_GADD45A
a) P-values were calculated using the hypergeometric test. The names of the p53 target genes in each pathway are listed
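An enrichment P-value of the kind reported in the Phypergeo column can be computed as an upper-tail hypergeometric probability. The sketch below uses only the Python standard library. The universe size and the total number of p53 targets are not stated in this section, so the numbers in the example are hypothetical and are not intended to reproduce Table 4.

```python
from math import comb

def hypergeom_upper_tail(k, set_size, n_targets, n_universe):
    """P(X >= k) when set_size genes are drawn from n_universe genes,
    of which n_targets are regulator targets."""
    total = comb(n_universe, set_size)
    upper = min(set_size, n_targets)
    hits = sum(comb(n_targets, i) * comb(n_universe - n_targets, set_size - i)
               for i in range(k, upper + 1))
    return hits / total

# Hypothetical example: a 3-gene pathway with 2 targets, 4 targets among 10 genes.
p_small = hypergeom_upper_tail(k=2, set_size=3, n_targets=4, n_universe=10)
```

In the toy example the probability is (36 + 4)/120 = 1/3. Exact integer arithmetic is used before the final division, so the function stays accurate for the very small P-values typical of strongly enriched pathways.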

Results for GSA-DV

To find pathways with differential variability between cancer cell lines with and without p53 mutation, the gene-level test combining F-test P-values was applied. It detected only three pathways between WT and MUT phenotypes at a significance level P < 0.001. These pathways are “BANDRES RESPONSE TO CARMUSTIN WITHOUT MGMT 24HR UP,” “BANDRES RESPONSE TO CARMUSTIN MGMT 48HR UP,” and “ONGUSAHA BRCA1 TARGETS DN.” These pathways are not significantly enriched in p53 targets. The first two pathways represent the cellular response to carmustine treatment, which involves regulation of complex pathways responsible for cell death [62]. All of them depend, directly or indirectly, on expression of the p53 gene; as expected, mutation in this gene results in differential variability in these pathways. The “ONGUSAHA BRCA1 TARGETS DN” pathway consists of BRCA1 target genes [63]. Because the p53 protein regulates BRCA1 transcription, mutation in p53 interferes with the gene’s functions, in particular regulation of BRCA1. This may cause indirect mixed effects on regulation of BRCA1 targets.
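The combination step of the gene-level test can be sketched as follows, assuming that FM denotes Fisher's method for combining P-values. The chi-square reference distribution used here assumes independent gene-level tests, which correlated genes in a pathway may violate; the input P-values are hypothetical and the function name is illustrative.

```python
from math import exp, log

def fisher_combined_pvalue(p_values):
    """Combine k P-values: X = -2 * sum(ln p) is chi-square with 2k degrees
    of freedom under independence; return the upper-tail probability of X."""
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)    # equals X / 2
    # Closed-form chi-square survival function for an even number of
    # degrees of freedom: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return exp(-half_stat) * total

combined = fisher_combined_pvalue([0.5, 0.5])     # 0.25 * (1 + 2 ln 2)
```

A single P-value is returned unchanged, which is a quick sanity check on the formula.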

Results for GSA-DC

To find pathways differentially co-expressed between cancer cell lines with and without p53 mutation, the GSNCA test was applied. GSNCA detected only four pathways differentially co-expressed between the two phenotypes at a significance level P < 0.001. Two of them (“KEGG PEROXISOME,” “REACTOME NOREPINEPHRINE NEUROTRANSMITTER RELEASE CYCLE”) are related to crucial metabolic processes such as fatty acid oxidation, biosynthesis of ether lipids, free radical detoxification, and noradrenaline synaptic vesicle release. One is related to changes in DNA methylation and histone acetylation (“ZHONG RESPONSE TO AZACITIDINE AND TSA UP”) and one to changes in gene expression related to the intercellular matrix (“PEDERSEN METASTASIS BY ERBB2 ISOFORM 4”). These pathways are also not significantly enriched in p53 target genes.

In addition to detecting DC pathways, GSNCA identifies hub genes, that is, genes with the largest weights in each pathway. Hub genes provide useful biological information beyond the test result that a pathway is differentially co-expressed between two conditions. For example, pathway KEGG PEROXISOME (Fig. 9) contains genes that play key roles in redox signaling and lipid homeostasis. For p53 wild-type data, the hub gene is MVK (Fig. 9A), which encodes the peroxisomal enzyme mevalonate kinase, a key early enzyme in isoprenoid and sterol synthesis. When p53 is mutated (Fig. 9B), the hub gene becomes ACOX1 (acyl-coenzyme A oxidase 1, palmitoyl), the first enzyme of the fatty acid beta-oxidation pathway, which catalyzes the desaturation of acyl-CoAs to 2-trans-enoyl-CoAs. That is, in the p53 MUT phenotype a shift from isoprenoid and sterol synthesis to the fatty acid beta-oxidation pathway may occur. For more discussion of hub genes the reader is referred to [18].

Fig. 9.

MST2s of the co-expression network of pathway “KEGG PEROXISOME” under (A) wild-type (WT) and (B) mutated (MUT) p53 phenotypes

4 Conclusion

In this chapter, we provided an in-depth review of univariate and multivariate Gene Set Analysis approaches (GSA-DE, GSA-DV, GSA-DC) for testing different statistical hypotheses. A comparative power analysis and Type I error rate estimates for different approaches in each major type of GSA provide concise guidelines for selecting the GSA approaches that perform best under particular experimental settings. We presented an example applying the GSA-DE, GSA-DV, and GSA-DC methods to a p53 data set. This analysis demonstrated that different GSA types yield new and complementary biological information from the same underlying data set.

Acknowledgments

We would like to thank Bárbara Macías Solís for proofreading the manuscript. Support has been provided in part by the Arkansas INBRE program, with grants from the National Center for Research Resources (P20RR016460) and the National Institute of General Medical Sciences (P20 GM103429) from the National Institutes of Health. Large-scale computer simulations were implemented using the High Performance Computing (HPC) resources at the UALR Computational Research Center supported by the following grants: National Science Foundation grants CRI CNS-0855248, EPS-0701890, MRI CNS-0619069 and OISE-0729792.

References

  • 1. Mootha VK, et al. PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet. 2003;34(3):267–273. doi: 10.1038/ng1180.
  • 2. Bar HY, Booth JG, Wells MT. A mixture-model approach for parallel testing for unequal variances. Stat Appl Genet Mol Biol. 2012;11(1):Article 8. doi: 10.2202/1544-6115.1762.
  • 3. Ho JW, et al. Differential variability analysis of gene expression and its application to human diseases. Bioinformatics. 2008;24(13):i390–i398. doi: 10.1093/bioinformatics/btn142.
  • 4. Hulse AM, Cai JJ. Genetic variants contribute to gene expression variability in humans. Genetics. 2013;193(1):95–108. doi: 10.1534/genetics.112.146779.
  • 5. Mar JC, et al. Variance of gene expression identifies altered network constraints in neurological disease. PLoS Genet. 2011;7(8):e1002207. doi: 10.1371/journal.pgen.1002207.
  • 6. Xu Z, et al. Antisense expression increases gene expression variability and locus interdependency. Mol Syst Biol. 2011;7:468. doi: 10.1038/msb.2011.1.
  • 7. Bravo HC, et al. Gene expression anti-profiles as a basis for accurate universal cancer signatures. BMC Bioinform. 2012;13:272. doi: 10.1186/1471-2105-13-272.
  • 8. Dinalankara W, Bravo HC. Gene expression signatures based on variability can robustly predict tumor progression and prognosis. Cancer Informat. 2015;14:71–81. doi: 10.4137/CIN.S23862.
  • 9. Friedman JH, Rafsky LC. Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests. Ann Stat. 1979;7(4):697–717.
  • 10. Rahmatallah Y, Emmert-Streib F, Glazko G. Gene set analysis for self-contained tests: complex null and specific alternative hypotheses. Bioinformatics. 2012;28(23):3073–3080. doi: 10.1093/bioinformatics/bts579.
  • 11. Afsari B, Geman D, Fertig EJ. Learning dysregulated pathways in cancers from differential variability analysis. Cancer Informat. 2014;13(Suppl 5):61–67. doi: 10.4137/CIN.S14066.
  • 12. Fisher R. Statistical methods for research workers. Oliver and Boyd; Edinburgh: 1932.
  • 13. Stadler N, Mukherjee S. Multivariate gene-set testing based on graphical models. Biostatistics. 2015;16(1):47–59. doi: 10.1093/biostatistics/kxu027.
  • 14. Friedman J, Hastie T, Tibshirani R. Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 2008;9(3):432–441. doi: 10.1093/biostatistics/kxm045.
  • 15. Meinshausen N, Bühlmann P. High-dimensional graphs and variable selection with the lasso. Ann Stat. 2006;34(3):1436–1462.
  • 16. Schäfer J, Strimmer K. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat Appl Genet Mol Biol. 2005;4(1):Article 32. doi: 10.2202/1544-6115.1175.
  • 17. Choi Y, Kendziorski C. Statistical methods for gene set co-expression analysis. Bioinformatics. 2009;25(21):2780–2786. doi: 10.1093/bioinformatics/btp502.
  • 18. Rahmatallah Y, Emmert-Streib F, Glazko G. Gene Sets Net Correlations Analysis (GSNCA): a multivariate differential coexpression test for gene sets. Bioinformatics. 2014;30(3):360–368. doi: 10.1093/bioinformatics/btt687.
  • 19. Santos Sde S, et al. CoGA: an R package to identify differentially co-expressed gene sets by analyzing the graph spectra. PLoS One. 2015;10(8):e0135831. doi: 10.1371/journal.pone.0135831.
  • 20. Takahashi DY, et al. Discriminating different classes of biological networks by analyzing the graphs spectra distribution. PLoS One. 2012;7(12):e49949. doi: 10.1371/journal.pone.0049949.
  • 21. Goeman JJ, Bühlmann P. Analyzing gene expression data in terms of gene sets: methodological issues. Bioinformatics. 2007;23(8):980–987. doi: 10.1093/bioinformatics/btm051.
  • 22. Tian L, et al. Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci U S A. 2005;102(38):13544–13549. doi: 10.1073/pnas.0506577102.
  • 23. Ackermann M, Strimmer K. A general modular framework for gene set enrichment analysis. BMC Bioinform. 2009;10(1):47. doi: 10.1186/1471-2105-10-47.
  • 24. Rahmatallah Y, Emmert-Streib F, Glazko G. Comparative evaluation of gene set analysis approaches for RNA-Seq data. BMC Bioinform. 2014;15(1):397. doi: 10.1186/s12859-014-0397-8.
  • 25. Montaner D, et al. Gene set internal coherence in the context of functional profiling. BMC Genomics. 2009;10:197. doi: 10.1186/1471-2164-10-197.
  • 26. Gatti DM, et al. Heading down the wrong pathway: on the influence of correlation within gene sets. BMC Genomics. 2010;11:574. doi: 10.1186/1471-2164-11-574.
  • 27. Tripathi S, Emmert-Streib F. Assessment method for a power analysis to identify differentially expressed pathways. PLoS One. 2012;7(5):e37510. doi: 10.1371/journal.pone.0037510.
  • 28. Glazko GV, Emmert-Streib F. Unite and conquer: univariate and multivariate approaches for finding differentially expressed gene sets. Bioinformatics. 2009;25(18):2348–2354. doi: 10.1093/bioinformatics/btp406.
  • 29. Wang X, et al. Linear combination test for hierarchical gene set analysis. Stat Appl Genet Mol Biol. 2011;10(1):Article 13. doi: 10.2202/1544-6115.1641.
  • 30. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 2013;14:7. doi: 10.1186/1471-2105-14-7.
  • 31. Khatri P, Sirota M, Butte AJ. Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Comput Biol. 2012;8(2):e1002375. doi: 10.1371/journal.pcbi.1002375.
  • 32. Maciejewski H. Gene set analysis methods: statistical models and methodological differences. Brief Bioinform. 2014;15(4):504–518. doi: 10.1093/bib/bbt002.
  • 33. Nam D, Kim SY. Gene-set approach for expression pattern analysis. Brief Bioinform. 2008;9(3):189–197. doi: 10.1093/bib/bbn001.
  • 34. Tamayo P, et al. The limitations of simple gene set enrichment analysis assuming gene independence. Stat Methods Med Res. 2012;25(1):472–487. doi: 10.1177/0962280212460441.
  • 35. Tarca AL, Bhatti G, Romero R. A comparison of gene set analysis methods in terms of sensitivity, prioritization and specificity. PLoS One. 2013;8(11):e79217. doi: 10.1371/journal.pone.0079217.
  • 36. Tripathi S, Glazko GV, Emmert-Streib F. Ensuring the statistical soundness of competitive gene set approaches: gene filtering and genome-scale coverage are essential. Nucleic Acids Res. 2013;41(7):e82. doi: 10.1093/nar/gkt054.
  • 37. Dinu I, et al. Improving gene set analysis of microarray data by SAM-GS. BMC Bioinform. 2007;8:242. doi: 10.1186/1471-2105-8-242.
  • 38. Subramanian A, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–15550. doi: 10.1073/pnas.0506580102.
  • 39. Barbie DA, et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature. 2009;462(7269):108–112. doi: 10.1038/nature08460.
  • 40. Fridley BL, Jenkins GD, Biernacka JM. Self-contained gene-set analysis of expression data: an evaluation of existing and novel methods. PLoS One. 2010;5(9):e12693. doi: 10.1371/journal.pone.0012693.
  • 41. Stouffer S, DeVinney L, Suchman E. The American soldier: adjustment during army life. Vol. 1. Princeton University Press; Princeton, NJ: 1949.
  • 42. Taylor J, Tibshirani R. A tail strength measure for assessing the overall univariate significance in a dataset. Biostatistics. 2006;7(2):167–181. doi: 10.1093/biostatistics/kxj009.
  • 43. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–140. doi: 10.1093/bioinformatics/btp616.
  • 44. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11(10):R106. doi: 10.1186/gb-2010-11-10-r106.
  • 45. Smyth G. Limma: linear models for microarray data. In: Smyth G, Gentleman R, Carey V, Dudoit S, Irizarry R, Huber W, editors. Bioinformatics and computational biology solutions using R and Bioconductor. Springer; New York: 2005. pp. 397–420.
  • 46. Law CW, et al. Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014;15(2):R29. doi: 10.1186/gb-2014-15-2-r29.
  • 47. Rahmatallah Y, Emmert-Streib F, Glazko G. Gene set analysis approaches for RNA-seq data: performance evaluation and application guideline. Brief Bioinform. 2016;17(3):393–407. doi: 10.1093/bib/bbv069.
  • 48. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A. 2001;98(9):5116–5121. doi: 10.1073/pnas.091062498.
  • 49. Baldi P, Long AD. A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes. Bioinformatics. 2001;17(6):509–519. doi: 10.1093/bioinformatics/17.6.509.
  • 50. Smyth GK. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004;3:3. doi: 10.2202/1544-6115.1027.
  • 51. Dinu I, et al. Gene-set analysis and reduction. Brief Bioinform. 2009;10(1):24–34. doi: 10.1093/bib/bbn042.
  • 52. Liu Q, et al. Comparative evaluation of gene-set analysis methods. BMC Bioinform. 2007;8:431. doi: 10.1186/1471-2105-8-431.
  • 53. Baringhaus L, Franz C. On a new multivariate two-sample test. J Multivar Anal. 2004;88:190–206.
  • 54. Klebanov L, et al. A multivariate extension of the gene set enrichment analysis. J Bioinforma Comput Biol. 2007;5(5):1139–1153. doi: 10.1142/s0219720007003041.
  • 55. Wu D, et al. ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics. 2010;26(17):2176–2182. doi: 10.1093/bioinformatics/btq401.
  • 56. Damian D, Gorfine M. Statistical concerns about the GSEA procedure. Nat Genet. 2004;36(7):663; author reply 663. doi: 10.1038/ng0704-663a.
  • 57. Ritchie ME, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47. doi: 10.1093/nar/gkv007.
  • 58. Pickrell JK, et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature. 2010;464(7289):768–772. doi: 10.1038/nature08872.
  • 59. Olivier M, et al. The IARC TP53 database: new online mutation analysis and recommendations to users. Hum Mutat. 2002;19(6):607–614. doi: 10.1002/humu.10081.
  • 60. Liberzon A, et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011;27(12):1739–1740. doi: 10.1093/bioinformatics/btr260.
  • 61. Wu D, Smyth GK. Camera: a competitive gene set test accounting for inter-gene correlation. Nucleic Acids Res. 2012;40(17):e133. doi: 10.1093/nar/gks461.
  • 62. Bandres E, et al. Gene expression profile induced by BCNU in human glioma cell lines with differential MGMT expression. J Neuro-Oncol. 2005;73(3):189–198. doi: 10.1007/s11060-004-5174-5.
  • 63. Ongusaha PP, et al. BRCA1 shifts p53-mediated cellular outcomes towards irreversible growth arrest. Oncogene. 2003;22(24):3749–3758. doi: 10.1038/sj.onc.1206439.
