DANUBE: Data-driven meta-ANalysis using UnBiased Empirical distributions—applied to biological pathway analysis

Tin Nguyen; Cristina Mitrea; Rebecca Tagett; Sorin Draghici

doi:10.1109/jproc.2015.2507119

Author manuscript; available in PMC: 2018 Apr 27.

Published in final edited form as: Proc IEEE Inst Electr Electron Eng. 2016 Mar 31;105(3):496–515. doi: 10.1109/jproc.2015.2507119

PMCID: PMC5919277 NIHMSID: NIHMS854489 PMID: 29706661

Abstract

Identifying the pathways and mechanisms that are significantly impacted in a given phenotype is challenging. Issues include patient heterogeneity and noise. Many experiments do not have a large enough sample size to achieve the statistical power necessary to identify significantly impacted pathways. Meta-analysis based on combining p-values from individual experiments has been used to improve power. However, all classical meta-analysis approaches work under the assumption that the p-values produced by experiment-level statistical tests follow a uniform distribution under the null hypothesis. Here we show that this assumption does not hold for three mainstream pathway analysis methods, and significant bias is likely to affect many, if not all such meta-analysis studies. We introduce DANUBE, a novel and unbiased approach to combine statistics computed from individual studies. Our framework uses control samples to construct empirical null distributions, from which empirical p-values of individual studies are calculated and combined using either a Central Limit Theorem approach or the additive method. We assess the performance of DANUBE using four different pathway analysis methods. DANUBE is compared with five meta-analysis approaches, as well as with a pathway analysis approach that employs multiple datasets (MetaPath). The 25 approaches have been tested on 16 different datasets related to two human diseases, Alzheimer’s disease (7 datasets) and acute myeloid leukemia (9 datasets). We demonstrate that DANUBE overcomes bias in order to consistently identify relevant pathways. We also show how the framework improves results in more general cases, compared to classical meta-analysis performed with common experiment-level statistical tests such as Wilcoxon and t-test.

Index Terms: meta-analysis, p-values, empirical distribution, pathway analysis, Alzheimer’s disease, acute myeloid leukemia

I. Introduction

The proliferation of high-throughput genomics technologies has resulted in an abundance of data, for many different biomedical conditions. Large public repositories such as Gene Expression Omnibus [1, 2], The Cancer Genome Atlas (cancergenome.nih.gov), ArrayExpress [3, 4], and Therapeutically Applicable Research to Generate Effective Treatments (ocg.cancer.gov/programs/target) store thousands of datasets, within which there are independent experimental series with similar patient cohorts and experiment design. Gene expression data, as measured by microarrays, are particularly prevalent in public databases, such that some disease conditions are represented by half a dozen studies or more.

Experiments comparing two phenotypes, such as disease and control, yield lists of genes that are differentially expressed (DE). However, lists of DE genes obtained from similar but independent experiments tend to have little in common, and taken alone, they usually fail to elucidate the underlying biological mechanisms. Effective meta-analysis approaches are needed to unify the biological knowledge spread out over such similar studies with apparently incongruent results.

The goal of the meta-analysis is to combine the results of independent but related studies and provide increased statistical power and robustness compared to individual studies analyzed alone [5, 6]. In spite of the numerous sophisticated tools for meta-analysis, many biological applications still use only Venn diagrams (intersection/union) or vote counting for combining multiple studies [7, 8]. Such approaches are useful for demonstrating consistency when combining a few studies. However, when combining many studies, Venn diagrams are either too conservative (for intersection) or too anti-conservative (for union), while vote counting is statistically inefficient [5, 9, 10]. Regarding microarray data, meta-analysis has been used at both gene level [5, 7, 11–13] and pathway level [11, 14]. Pathway analysis [15–18] was developed to correlate differential gene expression evidence with a-priori defined functional modules, organized into biological pathway databases, such as Kyoto Encyclopedia of Genes and Genomes (KEGG) [19, 20], Reactome [21], Biocarta (www.biocarta.com), or Molecular Signatures Database (MSigDB) [22].

One straightforward and flexible way of integrating diverse studies is to combine the individual p-values provided by each study. Classical meta-analysis methods of combining p-values have been reviewed and compared in [23]. These include Fisher’s method based on the chi-squared distribution [24], the additive method [25] using the Irwin-Hall distribution [26, 27], minP [28], and maxP [29].

In an early study, Rhodes and others [13] collected multiple prostate cancer microarray datasets and combined p-values using Fisher’s method. Since then, other sophisticated approaches have been proposed including the weighted Fisher’s method [30] and the latent variable approach [31, 32].

The major drawback of the available p-value-based meta-analysis frameworks is that they work under the assumption that the p-values provided by the individual statistical tests follow a uniform distribution under the null hypothesis. Previous reports describe non-uniform distributions of p-values under the null as due to specific factors such as improper normalization, cross-hybridization, poorly characterized variance, and heteroskedasticity in microarray data analysis [33, 34], or even due to properties of some more general distributions [35]. Here we show that this assumption also does not hold in the realm of pathway analysis methods, severely compromising the reliability of the results. In addition to strong statistical assumptions, the current methods for combining p-values are sensitive to outliers. For example, using Fisher’s method, a p-value of zero in one individual case will result in a combined p-value of zero regardless of the other p-values. The same is true for the minP and maxP statistics, where outliers greatly influence the combined p-value.

Here we propose DANUBE (Data-driven meta-ANalysis using UnBiased Empirical distributions), a new meta-analysis framework that combines the p-values of multiple studies without relying on the assumption that they are uniformly distributed under the null hypothesis. Our contribution is two-fold. First, we use empirical null distributions to calculate p-values for individual studies. This approach learns from the data under the null hypothesis and compensates for any bias potentially introduced by an individual pathway analysis method. Second, we combine the individual p-values using a method based on the Central Limit Theorem. This is less sensitive to outliers and provides more reliable results. Our simulation experiments demonstrate that both type I and type II errors of DANUBE are better than those of classical meta-analysis approaches using both parametric and non-parametric tests.

We apply DANUBE in the context of pathway analysis using 16 public gene expression datasets from two biological conditions, and 4 different pathway analysis methods. Gene Set Enrichment Analysis (GSEA) [36] and Gene Set Analysis (GSA) [37] are Functional Class Scoring methods [36–39], Down-weighting of Overlapping Genes (PADOG) [38] is an enrichment method [40–42], and Signaling Pathway Impact Analysis (SPIA) [43, 44] is a topology-aware method [43, 45]. These pathway analysis methods are applied on the human signaling pathways from KEGG [19, 20].

We show that, with the exception of GSEA, each of the other three methods (GSA, SPIA, and PADOG) has a different bias, leading to non-uniform distributions of p-values under the null hypothesis. Not surprisingly, when combining p-values using classical methods such as Fisher’s or the additive method, each of the three pathway analysis methods (GSA, SPIA, and PADOG) yields a very different list of significantly impacted pathways. We then apply the DANUBE framework using the empirical distributions characteristic of each of these methods. The DANUBE results yield much more consistent lists of significant pathways that are also pertinent to the phenotypes.

II. Background

We first recapitulate the classical methods of combining p-values, such as Fisher’s method [24] and the additive method [25–27]. We then demonstrate the shortcomings of existing approaches in pathway analysis.

A. Fisher’s method

Fisher’s method [24] is one of the most widely used methods for combining independent p-values. Considering a set of m independent significance tests, the resulting p-values P₁, P₂, …, P_m are independent and uniformly distributed on the interval [0, 1] under the null hypothesis. Denoting X_i = −2 ln P_i (i ∈ {1, 2, …, m}) as new random variables, the cumulative distribution function of X_i can be calculated as follows:

\begin{array}{l} F_{i}(x) = \Pr(X_{i} \leq x) = \Pr(-2 \ln P_{i} \leq x) = \Pr(P_{i} \geq e^{-\frac{x}{2}}) \\ = \int_{e^{-\frac{x}{2}}}^{1} f(p) \, dp = 1 - e^{-\frac{x}{2}} \end{array}

The above function is the cumulative distribution function of a chi-squared distribution with two degrees of freedom $(χ_{2}^{2})$ . Since the sum of independent chi-squared random variables is also a chi-squared random variable, with the degrees of freedom added, $-2 \sum_{i = 1}^{m} \ln (P_{i})$ follows a chi-squared distribution with 2m degrees of freedom $(χ_{2 m}^{2})$ . In summary, minus twice the log of the product of m independent p-values follows a chi-squared distribution with 2m degrees of freedom:

X = -2 \sum_{i = 1}^{m} \ln (P_{i}) \sim χ_{2 m}^{2}

(1)

We note that if one of the individual p-values approaches zero, which is often the case for empirical p-values, then the combined p-value approaches zero as well, regardless of the other individual p-values. For example, if P₁ → 0, then X → ∞ and therefore the combined p-value, i.e., the upper-tail probability of the $χ_{2 m}^{2}$ distribution at X, approaches zero regardless of P₂, P₃, …, P_m. Fisher’s method is therefore sensitive to outliers.

In practice, most pathway analysis methods use some kind of permutation or bootstrap approach to construct an empirical distribution of a statistic under the null. For example, the empirical null distribution of the t statistic is ξ_t = {t₁, t₂, …, t_N}. The empirical p-value calculated from such a distribution is the fraction of the statistics’ values in the N random trials performed that are more extreme than the observed one. Many times, there are no occurrences of values more extreme than the observed one, yielding an empirical p-value of zero. In this situation, the combined p-value calculated using Fisher’s method will be zero, even if all other p-values are equal to one. It is important to note that this phenomenon occurs because many methods choose to round the reported empirical p-value down to zero (when in fact, the real p-value is somewhere in the interval [0, 1/N]), and not because of the mathematical formulation of Fisher’s method.
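The sensitivity described above can be illustrated with a minimal Python sketch of Equation (1). The function name `fisher_combine` is an illustrative assumption, and the closed-form chi-squared survival function (valid for even degrees of freedom) stands in for a statistics library call so that the snippet is self-contained.

```python
import math

def fisher_combine(pvalues):
    """Combine independent p-values with Fisher's method (Equation 1).

    Illustrative helper; the name and interface are assumptions.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    # A reported p-value of exactly zero makes X infinite, so the
    # combined p-value is zero whatever the other studies report.
    if any(p == 0.0 for p in pvalues):
        return 0.0
    half_x = -sum(math.log(p) for p in pvalues)  # X / 2
    # Survival function of chi-squared with 2m degrees of freedom:
    # exp(-X/2) * sum_{j=0}^{m-1} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, m):
        term *= half_x / j
        total += term
    return min(1.0, math.exp(-half_x) * total)

balanced = fisher_combine([0.5, 0.5, 0.5, 0.5])         # no evidence anywhere
one_outlier = fisher_combine([1e-12, 1.0, 1.0, 1.0])    # one extreme study
rounded_to_zero = fisher_combine([0.0, 1.0, 1.0, 1.0])  # empirical p rounded to 0
```

In this sketch a single extreme study dominates the result even when the three remaining p-values are all equal to one, and a p-value rounded down to zero forces the combined p-value to zero.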

B. Additive method

The additive method proposes an alternative approach that uses the sum of p-values instead of the log product. Consider m random variables P₁, P₂, …, P_m that are independent and uniformly distributed on the interval [0, 1]. Denoting $X = \sum_{i = 1}^{m} P_{i}$ as a new random variable, then X follows the Irwin-Hall distribution [26, 27]. The cumulative distribution function of X can be calculated as follows:

F(x) = \frac{1}{2} + \frac{1}{2 m!} \sum_{i = 0}^{m} (-1)^{i} \binom{m}{i} (x - i)^{m} \, sgn(x - i)

(2)

Using the above cumulative distribution function, we can calculate the probability of observing the sum $X = \sum_{i = 1}^{m} P_{i}$ . We note that the concept of the additive method was also presented in [25] with a slightly different formulation and proof than in [26, 27]. However, they are equivalent and can be transformed into one another.
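Equation (2) can be evaluated directly, as in the following minimal Python sketch. The function names `irwin_hall_cdf` and `additive_combine` are illustrative assumptions; the combined p-value is taken to be the probability of observing a sum at least as small as the one observed.

```python
import math

def irwin_hall_cdf(x, m):
    """CDF of the sum of m independent U(0, 1) variables (Equation 2)."""
    if x <= 0:
        return 0.0
    if x >= m:
        return 1.0
    total = 0.0
    for i in range(m + 1):
        d = x - i
        sign = (d > 0) - (d < 0)  # sgn(x - i)
        total += (-1) ** i * math.comb(m, i) * d ** m * sign
    return 0.5 + total / (2 * math.factorial(m))

def additive_combine(pvalues):
    """Combined p-value of the additive method: F(sum of the p-values)."""
    return irwin_hall_cdf(sum(pvalues), len(pvalues))

p_small = additive_combine([0.05, 0.10, 0.02])  # consistently small p-values
p_mixed = additive_combine([0.0, 1.0, 1.0])     # one zero, two ones
```

Here a single p-value of zero combined with two p-values of one gives 5/6 rather than zero. The alternating sum can lose floating-point precision as m grows, so this direct evaluation is best suited to a small number of studies.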

The additive method is not as sensitive to extremely small individual p-values as Fisher’s method. However, both methods assume the uniformity of the p-values under the null hypothesis. We will show that this assumption does not hold for three mainstream pathway analysis methods. The inherent bias of these pathway analysis methods is most likely to affect the classical meta-analysis in most cases, and thus lead to systematic bias in identifying significant pathways.

C. Pitfalls of the existing approaches

Null distributions are used to model populations so that statistical tests can determine whether an observation is unlikely to occur by chance. The p-values produced by a sound statistical test must be uniformly distributed in the interval [0,1] when the null hypothesis is true [33–35, 46]. For example, the p-values that result from comparing two groups using a t-test should be distributed uniformly if the data are normally distributed [35]. When the assumptions of statistical models do not hold, the resulting p-values are not uniformly distributed under the null hypothesis. We will demonstrate this fact using gene expression data and pathway analysis.

Using only the control samples from 7 publicly available Alzheimer’s datasets (N=74), we simulate 40,000 datasets as follows. We randomly label 37 as “control” samples and the remaining 37 as “disease” samples. We repeat this procedure 10,000 times to generate different groups of 37 control and 37 disease samples. To make the simulation more general, we also create 10,000 datasets consisting of 10 control and 10 disease samples, 10,000 datasets consisting of 10 control and 20 disease samples, and 10,000 datasets consisting of 20 control and 10 disease samples. We then calculate the p-values of the KEGG (version 65) human signaling pathways (extracted as graph objects by the R package ROntoTools [44], version 1.2.0) using the following methods: GSEA [36], GSA [37], SPIA [43, 44], and PADOG [38].
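The resampling scheme can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the sample identifiers are placeholders, the function name `simulate_null_datasets` is hypothetical, sampling is assumed to be without replacement within each simulated dataset, and only 100 datasets per design are drawn instead of 10,000 to keep the example fast.

```python
import random

def simulate_null_datasets(control_ids, n_control, n_disease, n_datasets, seed=0):
    """Draw random 'control' vs 'disease' groups from pooled control samples."""
    if n_control + n_disease > len(control_ids):
        raise ValueError("not enough pooled samples for the requested groups")
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_datasets):
        # Sampling without replacement keeps the two groups disjoint.
        drawn = rng.sample(control_ids, n_control + n_disease)
        datasets.append((drawn[:n_control], drawn[n_control:]))
    return datasets

pool = [f"ctrl_{i:02d}" for i in range(74)]  # placeholder IDs for the 74 controls
designs = [(37, 37), (10, 10), (10, 20), (20, 10)]  # (control, disease) group sizes

simulated = []
for n_c, n_d in designs:
    # 100 datasets per design here; the text uses 10,000 per design.
    simulated.extend(
        simulate_null_datasets(pool, n_c, n_d, 100, seed=100 * n_c + n_d)
    )
```

Because both groups in every simulated dataset come from healthy subjects, any pathway analysis method applied to these datasets is evaluated under the null hypothesis.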

Figure 1 displays the empirical null distributions of p-values using GSA, SPIA, and PADOG. The horizontal axes represent p-values while the vertical axes represent p-value densities. Blue panels (A0–A6) show p-value distributions from GSA, while purple (B0–B6) and green (C0–C6) panels show p-value distributions from SPIA and PADOG, respectively. For each method, the larger panel (A0, B0, and C0) shows the cumulative p-values from all KEGG signaling pathways. The small panels, 6 per method, display extreme examples of non-uniform p-value distributions for specific pathways. For each method, we show three distributions severely biased towards zero (e.g., A1–A3), and three distributions severely biased towards one (e.g., A4–A6).

Fig. 1 — The empirical null distributions of p-values using: Gene Set Analysis (GSA) - top, Signaling Pathway Impact Analysis (SPIA) - middle, and Down-weighting of Overlapping Genes (PADOG) - bottom. The distributions are generated by re-sampling from 74 control samples obtained from 7 public Alzheimer’s datasets. The horizontal axes display the p-values while the vertical axes display the p-value densities. Panels A0–A6 (blue) show the distributions of p-values from GSA; panels B0–B6 (purple) show the distribution of p-values from SPIA; panels C0–C6 (green) show the distribution of p-values from PADOG. The large panels on the left, A0, B0, and C0, display the distributions of p-values cumulated from all KEGG signaling pathways. The smaller panels on the right display the p-value distributions of selected individual pathways, which are extreme cases. For each method, the upper three distributions, for example A1–A3, are biased towards zero and the lower three distributions, for example A4–A6, are biased towards one. Since none of these p-value distributions are uniform, there will be systematic bias in identifying significant pathways using any one of the methods. Pathways that have p-values biased towards zero will often be falsely identified as significant (false positives). Likewise, pathways that have p-values biased towards one are more likely to be among false negative results even if they may be implicated in the given phenotype.

These results show that, contrary to generally accepted beliefs, the p-values are not uniformly distributed for three out of the four methods considered. Therefore one should expect a very strong and systematic bias in identifying significant pathways for each of these methods. Pathways that have p-values biased towards zero will often be falsely identified as significant (false positives). Likewise, pathways that have p-values biased towards one are likely to rarely meet the significance requirements, even when they are truly implicated in the given phenotype (false negatives). Systematic bias, due to non-uniformity of p-value distributions, results in failure of the statistical methods to correctly identify the biological pathways implicated in the condition, and also leads to inconsistent and incorrect results. For example, all three of the zero-biased GSA pathways shown in Figure 1: Prostate cancer (A1), Adherens junction (A2), and Pathways in cancer (A3), are reported as statistically significant in the results shown in Table I even though these data were collected in an experiment comparing Alzheimer’s disease patients vs. healthy subjects, an experiment that has nothing to do with cancer.

TABLE I.

The 17 top ranked pathways and FDR-corrected p-values obtained by combining the GSA p-values using 6 meta-analysis methods for Alzheimer’s disease. Stouffer’s method, the additive method, and DANUBE identify the target pathway as significant and rank it in positions 11th, 6th, and 2nd, respectively. DANUBE yields the best ranking.

	GSA + Stouffer’s method		GSA + Z-method		GSA + Brown’s method
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr
1	Vasopressin-regulated water reabsorption	< 10⁻⁴	Vasopressin-regulated water reabsorption	< 10⁻⁴	Vasopressin-regulated water reabsorption	< 10⁻⁴
2	Pathogenic Escherichia coli infection	< 10⁻⁴	Pathogenic Escherichia coli infection	< 10⁻⁴	Pathogenic Escherichia coli infection	< 10⁻⁴
3	Prostate cancer	< 10⁻⁴	Prostate cancer	0.0307	Prostate cancer	0.0418
4	Pathways in cancer	0.0003	Pathways in cancer	0.1352	Adherens junction	0.1722
5	Adherens junction	0.0003	Adherens junction	0.1352	Pathways in cancer	0.1722
6	Hippo signaling pathway	0.0004	Hippo signaling pathway	0.1352	Hippo signaling pathway	0.1765
7	Synaptic vesicle cycle	0.0032	Synaptic vesicle cycle	0.2443	Synaptic vesicle cycle	0.2625
8	Vibrio cholerae infection	0.0032	Vibrio cholerae infection	0.2443	Endocrine and other factor-regulated calcium reabsorption	0.2625
9	Endocrine and other factor-regulated calcium reabsorption	0.0032	Endocrine and other factor-regulated calcium reabsorption	0.2443	Vibrio cholerae infection	0.2625
10	Shigellosis	0.0071	Shigellosis	0.2808	Pancreatic cancer	0.2625
11	Alzheimer’s disease	0.0073	Alzheimer’s disease	0.2808	Focal adhesion	0.2950
12	Bacterial invasion of epithelial cells	0.0073	Bacterial invasion of epithelial cells	0.2808	Shigellosis	0.3027
13	Pancreatic cancer	0.0095	Pancreatic cancer	0.2808	Bacterial invasion of epithelial cells	0.3034
14	Focal adhesion	0.0112	Focal adhesion	0.2808	Notch signaling pathway	0.3254
15	Parkinson’s disease	0.0112	Parkinson’s disease	0.2808	Alzheimer’s disease	0.3254
16	Huntington’s disease	0.0112	Huntington’s disease	0.2808	HIF-1 signaling pathway	0.3274
17	Wnt signaling pathway	0.0112	Wnt signaling pathway	0.2808	SNARE interactions in vesicular transport	0.3274

	GSA + Fisher’s method		GSA + Additive method		GSA + DANUBE
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr

1	Vasopressin-regulated water reabsorption	< 10⁻⁴	Prostate cancer	< 10⁻⁴	Cardiac muscle contraction	0.0014
2	Pathogenic Escherichia coli infection	< 10⁻⁴	Pathways in cancer	0.0002	Alzheimer’s disease	0.0014
3	Prostate cancer	< 10⁻⁴	Hippo signaling pathway	0.0005	Huntington’s disease	0.0014
4	Adherens junction	0.0019	Adherens junction	0.0015	Parkinson’s disease	0.0014
5	Pathways in cancer	0.0023	Endocrine and other factor-regulated calcium reabsorption	0.0042	Hippo signaling pathway	0.0025
6	Hippo signaling pathway	0.0030	Alzheimer’s disease	0.0042	Vibrio cholerae infection	0.0047
7	Synaptic vesicle cycle	0.0097	Vibrio cholerae infection	0.0057	Synaptic vesicle cycle	0.0081
8	Vibrio cholerae infection	0.0121	Shigellosis	0.0057	Prostate cancer	0.0112
9	Endocrine and other factor-regulated calcium reabsorption	0.0133	Huntington’s disease	0.0057	Vasopressin-regulated water reabsorption	0.0112
10	Pancreatic cancer	0.0133	Bacterial invasion of epithelial cells	0.0057	Epithelial cell signaling in Helicobacter pylori infection	0.0118
11	Focal adhesion	0.0190	Parkinson’s disease	0.0057	Systemic lupus erythematosus	0.0150
12	Shigellosis	0.0222	Glioma	0.0057	Amyotrophic lateral sclerosis (ALS)	0.0174
13	Bacterial invasion of epithelial cells	0.0245	Vasopressin-regulated water reabsorption	0.0057	Shigellosis	0.0193
14	Alzheimer’s disease	0.0334	Cardiac muscle contraction	0.0057	Endocrine and other factor-regulated calcium reabsorption	0.0193
15	Notch signaling pathway	0.0334	Wnt signaling pathway	0.0057	Phagosome	0.0302
16	SNARE interactions in vesicular transport	0.0465	Synaptic vesicle cycle	0.0057	Lysosome	0.0302
17	Wnt signaling pathway	0.0465	Dorso-ventral axis formation	0.0119	Ribosome biogenesis in eukaryotes	0.0302

The horizontal lines show the 1% significance threshold. The target pathway Alzheimer’s disease is highlighted in green. Pathways highlighted in red are examples of false positives. These pathways were expected to be reported as false positives because their null distributions are very skewed toward zero (see Figure 1 panels A1–A3 and Supplementary Figure S3). These include Adherens junction and several cancer-related pathways, which are not considered to be implicated in Alzheimer’s disease.

The effect of combining control (i.e. healthy) samples from different experiments is to uniformly distribute all sources of bias among the random groups of samples. If we compare groups of control samples based on experiments, there could be true differences due to batch effects. By pooling them together, we form a population which is considered the reference population. This approach is similar to selecting from a large group of people that may contain different sub-groups (e.g. different ethnicities, gender, race, or living conditions). When we randomly select samples (for the two random groups to be compared) from the reference population, we expect all bias (e.g. ethnic subgroups) to be represented equally in both random groups and therefore, we should see no difference between these random groups, no matter how many distinct ethnic subgroups were present in the population at large. Therefore, the p-values of a test for difference between the two randomly selected groups should be equally probable between zero and one (see Supplementary Section 4 and Figures S10–S11 for more discussion).

We apply this procedure for the popular Gene Set Enrichment Analysis (GSEA) [36] using the exact same 40,000 datasets simulated from the pool of control samples of Alzheimer’s data. The resulting p-value distributions are uniform, as displayed in Supplementary Figure S1, showing not only that our resampled data correctly models the null, but also that GSEA is an unbiased test. This supports the idea that the non-uniformity of the distributions is due to the methods rather than the data. We also plot the top 24 most biased null distributions of GSEA (Figure S2) using the exact same data and exact same random grouping of samples. In each figure, the panels are sorted by the distribution means. The distributions of GSEA (Figures S2, S6) are uniform while those of GSA (Figures S3, S7), SPIA (Figures S4, S8), and PADOG (Figures S5, S9) are biased. Therefore, the bias is indeed due to the methods and not to one specific pathway.

III. Methods

In this section we introduce the DANUBE framework and its application in the context of pathway analysis.

A. The DANUBE framework

We propose a new framework for meta-analysis that makes no assumptions about the data and is therefore expected to perform much better than any of the classical methods when the individual p-values are not uniformly distributed, as we have shown to be the case for the pathway analysis methods. Figure 2 displays a flowchart comparison between classical meta-analysis and DANUBE. Both approaches take m independent studies as input. The pipeline marked by blue arrows (I–II) shows the classical meta-analysis, and the one marked by black arrows (1–4) is DANUBE.

Fig. 2 — The DANUBE framework for meta-analysis. The blue arrows (I and II) show the classical meta-analysis pipeline while black arrows (1–4) show the pipeline of DANUBE. The first step (I) of the classical approach is to perform a parametric or non-parametric test for each study. This step provides individual p-values which are independent and identically distributed (i.i.d.), but not necessarily uniformly distributed under the null, as shown in Fig. 1. The second step (II) of the classical approach is to use a classical method, such as Fisher’s, to combine the individual p-values, relying heavily on the assumption of uniformity under the null. In step (1) of DANUBE, we choose the discriminating statistic and calculate the values of this statistic in each study (t₁, t₂, …, t_m). In step (2), we generate the empirical distribution ξ_T of the discriminating statistic under the null hypothesis. In step (3), we calculate the probability of observing t₁, t₂, …, t_m using ξ_T. In step (4), we combine the m empirical p-values using either the additive method or the Central Limit Theorem (CLT).

The classical approach first calculates a p-value for each study using a parametric or non-parametric test, then it combines the individual p-values into one. The main limitation of the classical approach is that it relies on the assumption of uniformity of the p-values under the null hypothesis, which often does not hold true. As shown in Figure 1, this assumption is not true for real transcriptomics data and KEGG pathways.

In the DANUBE framework, instead of modeling the data under a specific assumption, we construct empirical distributions and use them to calculate empirical p-values. Following the black arrows (1–4) in Figure 2, we initially calculate the values t₁, t₂, …, t_m of the discriminating statistic for the m studies in step (1). For example, instead of using a statistical test to directly calculate the p-values, we could calculate the means of the data samples over the m studies. In step (2), we construct the empirical null distribution ξ_T for the chosen statistic. In step (3), we calculate the empirical p-values ep₁, ep₂, …, ep_m for the m studies with respect to the empirical null distribution ξ_T. For all i ∈ {1, 2, …, m}, ep_i is calculated as the number of elements in ξ_T more extreme than t_i, divided by the total number of elements in ξ_T. We will prove that the resulting empirical p-values are uniformly distributed under the null hypothesis.

Lemma 1

Let T be a random variable with the empirical distribution ξ_T and the cumulative distribution function F_T (T). We define the new random variable X as follows:

X = \frac{| \{ x : x \in ξ_{T} \land x \leq T \} |}{| ξ_{T} |}

(3)

where the numerator represents the number of elements of ξ_T that are smaller than or equal to T. If ξ_T consists of enough data points to be considered as continuous, then X is uniformly distributed on the interval [0,1].

Proof

Denote F_T (T) as the cumulative distribution function of T. For any value t ∈ ξ_T, F_T(t) can be calculated as follows:

F_{T}(t) = \frac{| \{ x : x \in ξ_{T} \land x \leq t \} |}{| ξ_{T} |}

(4)

We can see that X = F_T (T). In addition, F_T(t) is a strictly increasing function for all values t ∈ ξ_T. Let F_X be the cumulative distribution function of X. For any x = F_T(t) with t ∈ ξ_T, we have:

\begin{array}{l} F_{X}(x) = \Pr(X \leq x) \\ = \Pr(F_{T}(T) \leq F_{T}(t)) \\ = \Pr(T \leq t) = F_{T}(t) = x \end{array}

(5)

We note that F_X(x) = x is the cumulative distribution function of the continuous uniform distribution on [0,1]. Therefore, if we have enough data for F_T(T) to be considered continuous, then X will be a uniformly distributed random variable. ■
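Lemma 1 can be checked numerically with a small Python sketch. The biased null below (squared uniform draws) is a made-up example chosen only because its values concentrate near zero, and the function name `empirical_p` is an illustrative assumption.

```python
import bisect
import random

def empirical_p(null_values_sorted, observed):
    """Fraction of null values <= observed (Equations 3 and 4)."""
    rank = bisect.bisect_right(null_values_sorted, observed)
    return rank / len(null_values_sorted)

rng = random.Random(1)
# A deliberately non-uniform null: squaring pushes values toward zero.
null = sorted(rng.random() ** 2 for _ in range(20000))
# Fresh draws from the same biased distribution play the role of T.
draws = [rng.random() ** 2 for _ in range(5000)]
transformed = [empirical_p(null, t) for t in draws]

# Share of values at or below 0.25 before and after the transformation.
raw_below = sum(t <= 0.25 for t in draws) / len(draws)
emp_below = sum(x <= 0.25 for x in transformed) / len(transformed)
```

About half of the raw values fall at or below 0.25, whereas about a quarter of the transformed values do, which is what a uniform distribution on [0,1] predicts.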

In step (4), we combine the empirical p-values using either the additive method or the Central Limit Theorem (CLT). According to Lemma 1, the resulting p-values after step (3) are now truly uniformly distributed under the null hypothesis and thus can be combined using the additive method as described in equation (2). However, the additive method can be computationally intensive when m is large. For this reason, we use the CLT to approximate the combined p-value [47]. The uniform distribution has mean and variance of $\frac{1}{2}$ and $\frac{1}{12}$ , respectively. According to the CLT, the average of m independent and identically distributed (i.i.d.) variables (with large m) follows a normal distribution with mean $μ = \frac{1}{2}$ and variance $σ^{2} = \frac{1}{12 m}$ . By default, we use this to approximate the combined p-value when m ≥ 20. We note that the additive method of combining p-values in our framework may be substituted by any other method of combining p-values.
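The CLT approximation can be sketched in a few lines of Python. The function name `clt_combine` is an illustrative assumption, and the normal cumulative distribution function is written with `math.erfc` to keep small tail probabilities accurate.

```python
import math

def clt_combine(pvalues):
    """Approximate the combined p-value of uniform p-values via the CLT."""
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    mean = sum(pvalues) / m
    sd = math.sqrt(1.0 / (12.0 * m))  # std. dev. of the mean of m U(0, 1) draws
    z = (mean - 0.5) / sd
    # Normal CDF at z: probability of a mean at least as small as observed.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

centered = clt_combine([0.5] * 20)  # mean equals the null expectation
low = clt_combine([0.1] * 20)       # consistently small p-values
```

With m = 20 studies, a mean p-value equal to the null expectation of 1/2 gives a combined p-value of 0.5, while a mean of 0.1 lies more than six standard deviations below it.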

B. The application of DANUBE in pathway analysis

Here we present the application of DANUBE in the context of pathway analysis (Figure 3). Let us consider a method M, which can be GSEA, GSA, SPIA, or PADOG, or any other method that outputs a p-value for each pathway in the pathway database. We treat this p-value as the discriminating statistic. In step (1), we calculate the p-values of the pathways using the method M. A pathway i will have m p-values (p_i₁, p_i₂, …, p_im) for the m studies. The m p-values for a pathway are independent and identically distributed (i.i.d.). However, these p-values are not necessarily uniformly distributed under the null hypothesis (see Figure 1). Therefore, combining these p-values will lead to systematic bias in identifying significant pathways as shown in Section II-C and as will be further illustrated in Section IV. Instead of combining these p-values, we treat them as observed values of the discriminating statistic.

Fig. 3 — DANUBE’s application in pathway analysis. The input is m studies (datasets), and a pathway database, such as KEGG. Each dataset has a certain number of control and disease samples. Step (1): perform pathway analysis using a method M (e.g., GSA, SPIA, or PADOG). For each pathway, the resulting m p-values are independent and identically distributed (i.i.d.). However, these p-values are not uniformly distributed under the null hypothesis (see Figure 1), and therefore combining them would result in systematic bias. Step (2): pool the control samples from the m datasets to produce a large set of control samples. Step (3): generate k simulated datasets by randomly sampling from the pool. Since the “disease” and “control” samples in each of the simulated datasets were chosen only from the control samples of the original m studies, the resulting p-values are calculated under the null hypothesis. Step (4): perform pathway analysis on the simulated data. Step (5): build an empirical distribution for each pathway, which consists of k p-values obtained under the null hypothesis. Step (6): calculate an empirical p-value for each p-value obtained from step (1). For example, using the empirical distribution ξ₁, we calculate the empirical p-value ep₁₁ as the probability of observing a p-value more extreme than p₁₁, i.e., ep₁₁ = |{i ∈ [1..k] : sp₁ᵢ ≤ p₁₁}| / k. Step (7): combine the m empirical p-values obtained for each pathway using either the additive method or the Central Limit Theorem.

To calculate the probability of observing such values, we need to construct the empirical distribution under the null hypothesis as described in steps (2–5) above. In step (2), we take all of the control samples from the m studies to create a set of control samples as shown in (C) in Figure 3. In step (3), we generate the k synthetic datasets by random sampling from the pool of control samples. For example, for a simulation, we choose two groups of samples from the pool and label them as controls and diseases. In our case study using the Alzheimer’s datasets, as described in Section II-C, we generated 10,000 simulations of 10 control and 10 disease samples, 10,000 simulations of 10 control and 20 disease samples, 10,000 of 20 control and 10 disease samples, and 10,000 of 37 control and 37 disease samples, for a total of 40,000 simulations.
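Steps (2) and (3) can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function name, the sample identifiers, and the choice to draw the two pseudo-groups without replacement (so that they are disjoint within a simulation) are assumptions made here for clarity.

```python
import random

def simulate_null_datasets(control_pool, n_control, n_disease, k, seed=0):
    """Draw k simulated datasets using pooled control samples only.

    Each simulated dataset is a pair (pseudo_controls, pseudo_diseases).
    Both groups come from real controls, so any statistic computed on
    them is a draw from the null distribution.
    """
    if n_control + n_disease > len(control_pool):
        raise ValueError("not enough pooled controls for the requested group sizes")
    rng = random.Random(seed)
    datasets = []
    for _ in range(k):
        # Assumption: sample without replacement so the two groups do not overlap.
        drawn = rng.sample(control_pool, n_control + n_disease)
        datasets.append((drawn[:n_control], drawn[n_control:]))
    return datasets

# Hypothetical identifiers standing in for the 74 pooled Alzheimer's controls.
pool = [f"ctrl_{i}" for i in range(74)]
sims = simulate_null_datasets(pool, n_control=10, n_disease=20, k=5)
```

In the case study, k would be 10,000 for each of the four group-size configurations; a small k is used here only to keep the example fast.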

After generating k simulations from the control samples, we proceed to calculate the p-values for each pathway and each simulation using the same method M. For a pathway i, we have a set of p-values sp_i₁, sp_i₂, …, sp_ik. Since all of these p-values are calculated from the real control samples (i.e. healthy people), they can be considered as p-values under the null hypothesis. These p-values will be used to construct the empirical distribution ξ_i in step (5). In summary, steps (2–5) produce an empirical distribution for each pathway, resulting in a total of n empirical distributions for n pathways. These distributions will be used to calculate the empirical p-values of the measurements done in step (1).

After steps (1–5), for a pathway i, we have m p-values p_i₁, p_i₂, …, p_im and an empirical distribution ξ_i. Using the formula described in Equation (2), we calculate the empirical p-values ep_i₁, ep_i₂, …, ep_im. As we showed in the Methods section, these empirical p-values are independent and uniformly distributed under the null hypothesis. In step (7), we combine these empirical p-values using the additive method to have a single p-value pDANUBE_i for pathway i.
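Steps (6) and (7) can be illustrated as follows. This is a sketch under stated assumptions: the empirical p-value is taken as the fraction of null p-values that are less than or equal to the observed one (the form given in the caption of Figure 3), the additive method is implemented as the Irwin-Hall cumulative distribution of the sum of m uniform variables, and the Central Limit Theorem variant approximates the mean of m uniform variables by a normal distribution with mean 0.5 and variance 1/(12m). Function names are hypothetical.

```python
import bisect
import math
from statistics import NormalDist

def empirical_p_value(observed_p, sorted_null_p_values):
    """Fraction of null p-values that are <= the observed p-value."""
    count = bisect.bisect_right(sorted_null_p_values, observed_p)
    return count / len(sorted_null_p_values)

def additive_method(p_values):
    """Combine p-values via the Irwin-Hall CDF evaluated at their sum."""
    m = len(p_values)
    s = sum(p_values)
    total = 0.0
    for j in range(int(math.floor(s)) + 1):
        total += (-1) ** j * math.comb(m, j) * (s - j) ** m
    return total / math.factorial(m)

def clt_method(p_values):
    """Combine p-values via a normal approximation to their mean."""
    m = len(p_values)
    mean = sum(p_values) / m
    return NormalDist(mu=0.5, sigma=math.sqrt(1.0 / (12 * m))).cdf(mean)

# Toy null distribution for one pathway and two observed p-values.
null_p = sorted([0.1, 0.2, 0.3, 0.4])
eps = [empirical_p_value(p, null_p) for p in (0.25, 0.15)]
combined = additive_method(eps)
```

The alternating sum in the Irwin-Hall formula can lose precision when m is large, which is one reason to prefer the normal approximation when many studies are combined.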

IV. Results and Validation

In this section we illustrate the limitations of combining p-values using classical meta-analysis approaches, and show that DANUBE overcomes these limitations. Sections IV-A and IV-B compare the classical approaches with DANUBE for the specific application domain of pathway analysis. Sections IV-C and IV-D compare the classical meta-analysis approaches with DANUBE in the general case, applicable to any meta-analysis.

For the pathway analysis applications on which we focus in this paper, we compare DANUBE with 5 other classical meta-analysis methods: Stouffer’s, Z-method, Brown’s, Fisher’s, and the additive method [14, 24, 48, 49], each of them combined with each of the 4 pathway analysis methods (GSEA, GSA, SPIA, and PADOG). We also compare these methods with a stand-alone meta-analysis method, MetaPath. In total, we analyze the results of 25 approaches: 6 meta-analyses combined with 4 pathway analysis methods, plus MetaPath [11, 50]. Each of these methods is tested on two diseases: Alzheimer’s disease, with 7 datasets, and acute myeloid leukemia (AML), with 9 datasets. These conditions were selected for two reasons. First, there is a pathway in KEGG for each of the diseases. We refer to this as the target pathway, and use it to validate the methods. Second, there are multiple experiments available in the public domain for both of these diseases.
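For readers unfamiliar with the classical methods named above, the sketch below shows textbook forms of Fisher's and Stouffer's methods using only the Python standard library. These are assumptions about the standard definitions; this section does not specify the exact variants used in the comparison (for instance, how the Z-method differs from Stouffer's), so the code should not be read as a reproduction of the paper's pipeline. Both functions require p-values strictly between 0 and 1.

```python
import math
from statistics import NormalDist

def fisher_method(p_values):
    """Fisher's method: X = -2 * sum(log p) ~ chi-square with 2m df under the null."""
    m = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Chi-square survival function with even df 2m has a closed form:
    # exp(-X/2) * sum_{j=0}^{m-1} (X/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, m):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total

def stouffer_method(p_values):
    """Stouffer's method: sum of z-scores divided by sqrt(m) is standard normal under the null."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in p_values) / math.sqrt(len(p_values))
    return 1.0 - nd.cdf(z)

example = [0.04, 0.10, 0.03]
p_fisher = fisher_method(example)
p_stouffer = stouffer_method(example)
```

Both methods are valid only when the input p-values are uniform under the null hypothesis, which is the assumption examined in this paper.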

A. Pathway analysis applications: Alzheimer’s disease

The Alzheimer’s datasets we use in our data analysis are GSE28146 (hippocampus) and GSE5281 (6 different tissues: entorhinal cortex (EC), hippocampus (HIP), medial temporal gyrus (MTG), posterior cingulate (PC), superior frontal gyrus (SFG), and primary visual cortex (VCX)). The 4 pathway analysis methods, GSEA, GSA, SPIA, and PADOG, were used to process the expression data in each study and output a p-value for each study and for each pathway. Details of all datasets are provided in Supplementary Section 3.

The rankings and FDR-corrected p-values of the target pathway Alzheimer’s disease for the 7 Alzheimer’s datasets are displayed in Figure 4. The graphs demonstrate that the adjusted p-values and rankings of the target pathway vary substantially between the 4 methods for a given study, and from one study to the next. Furthermore, both GSA and PADOG report the target pathway Alzheimer’s disease as not significant in all 7 studies.

Fig. 4 — Ranks (panel A) and p-values (panel B) of the KEGG target pathway, *Alzheimer’s disease*, for 7 Alzheimer’s datasets, using the pathway analysis methods: Gene Set Enrichment Analysis (GSEA), Gene Set Analysis (GSA), Signaling Pathway Impact Analysis (SPIA), and Down-weighting of Overlapping Genes (PADOG). The horizontal axes show the 7 Alzheimer’s datasets. The vertical axis in panel (A) shows the rankings of the target pathway for each dataset using the 4 methods. The vertical axis in panel (B) shows the FDR-corrected p-values of the target pathway. The red horizontal line in (B) shows the threshold 0.01. Note how the rankings and p-values of the target pathway vary greatly across different datasets and methods, making the interpretation of the results very difficult.

We combine the 4 pathway analysis methods with 6 meta-analyses: Stouffer’s, Z-method, Brown’s, Fisher’s, the additive method, and DANUBE. Using a pathway analysis method M, each pathway has 7 p-values – one per study. These 7 p-values are combined using each of the 6 meta-analysis methods. Therefore, each pathway analysis method produces 6 lists of pathways. Each list has 150 pathways ranked according to the combined p-values. We then adjusted the combined p-values for multiple comparisons in each list using FDR.
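The FDR adjustment applied to each list can be sketched as below. The text says only "FDR"; the Benjamini-Hochberg step-up procedure is assumed here because it is the most common choice, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Toy list of combined p-values for four pathways.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In the analysis described here, the input would be the 150 combined p-values of one list, and pathways with an adjusted value below 0.01 would be reported as significant.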

In order to run DANUBE, we generated the null distributions from control samples as described in Section III-B. We took the 74 control samples from the 7 Alzheimer’s datasets, and randomly divided them into “control” and “disease” subgroups. We generated 10,000 simulations of 10 controls and 10 diseases, 10,000 simulations of 10 controls and 20 diseases, 10,000 of 20 controls and 10 diseases, and 10,000 of 37 controls and 37 diseases, for a total of 40,000 simulations. For each pathway analysis method, we constructed 150 empirical distributions for 150 KEGG signaling pathways (a total of 600 empirical distributions for the 4 methods GSEA, GSA, SPIA, and PADOG). We used these empirical distributions to calculate the empirical p-values before applying the additive method to combine the empirical p-values for each pathway, resulting in 150 combined p-values. We then adjusted the combined p-values for multiple comparisons using FDR. Running time is reported in Supplementary Section 5 and Tables S1–S2.

Table I displays the results using GSA combined with the 6 meta-analysis methods. The horizontal line across each list marks the 1% significance threshold. The pathway highlighted green is the target pathway Alzheimer’s disease. Pathways highlighted in red are examples of false positives. These pathways were expected to be reported as false positives because their null distribution is very skewed towards zero (see Figure 1 panels A1–A3 and Supplementary Figure S3). These include Adherens junction and several cancer-related pathways, none of which are known to be implicated in Alzheimer’s disease. Stouffer’s method, the additive method, and DANUBE identify the target pathway as significant. DANUBE yields the best ranking.

Both Stouffer’s and the additive method identify the target pathway as significant using GSA, as shown in Table I. However, the inherent bias of the null distribution brings irrelevant results into the list of significant pathways. For Stouffer’s method, pathways having p-values biased toward zero, such as Prostate cancer, Adherens junction, Pathways in cancer, and Pancreatic cancer are still among the significant pathways. For the additive method, pathways having p-values biased toward zero, such as Prostate cancer, Adherens junction and Pathways in cancer are still among the significant pathways.

Table II displays the results using PADOG combined with the 6 meta-analysis methods. Only DANUBE identifies the target pathway as significant. Z-method and Brown’s method return no significant pathways. For Stouffer’s, Fisher’s, and the additive method, the systematic bias of the pathway analysis method greatly influences the outcome of the meta-analyses. Pathways having p-values biased toward zero, such as Adherens junction and cancer related pathways (see Figure 1 panels C1–C3 and Supplementary Figure S5) are among the significant pathways.

TABLE II.

The 20 top ranked pathways and FDR-corrected p-values obtained by combining the PADOG p-values using 6 meta-analysis methods for Alzheimer’s disease. Only DANUBE identifies the target pathway Alzheimer’s disease as significant and ranks it 6th.

	PADOG + Stouffer’s method		PADOG + Z-method		PADOG + Brown’s method
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr
1	Adherens junction	< 10⁻⁴	Adherens junction	0.6725	HIF-1 signaling pathway	0.6495
2	Shigellosis	0.0002	Shigellosis	0.6725	Adherens junction	0.6495
3	Renal cell carcinoma	0.0002	Renal cell carcinoma	0.6725	Gap junction	0.6495
4	Prostate cancer	0.0005	Prostate cancer	0.6725	Long-term potentiation	0.6495
5	Bacterial invasion of epithelial cells	0.0014	Bacterial invasion of epithelial cells	0.6725	Long-term depression	0.6495
6	Long-term depression	0.0036	Long-term depression	0.6725	Endocrine and other factor-regulated calcium reabsorption	0.6495
7	Pathogenic Escherichia coli infection	0.0036	Pathogenic Escherichia coli infection	0.6725	Bacterial invasion of epithelial cells	0.6495
8	Colorectal cancer	0.0036	Colorectal cancer	0.6725	Vibrio cholerae infection	0.6495
9	Gap junction	0.0036	Gap junction	0.6725	Pathogenic Escherichia coli infection	0.6495
10	Glioma	0.0036	Glioma	0.6725	Shigellosis	0.6495
11	Pancreatic cancer	0.0036	Pancreatic cancer	0.6725	Colorectal cancer	0.6495
12	Vibrio cholerae infection	0.0036	Vibrio cholerae infection	0.6725	Renal cell carcinoma	0.6495
13	Endocrine and other factor-regulated calcium reabsorption	0.0043	Endocrine and other factor-regulated calcium reabsorption	0.6725	Pancreatic cancer	0.6495
14	ErbB signaling pathway	0.0053	ErbB signaling pathway	0.6725	Endometrial cancer	0.6495
15	Endometrial cancer	0.0063	Endometrial cancer	0.6725	Glioma	0.6495
16	HIF-1 signaling pathway	0.0063	HIF-1 signaling pathway	0.6725	Prostate cancer	0.6495
17	Neurotrophin signaling pathway	0.0067	Neurotrophin signaling pathway	0.6725	ErbB signaling pathway	0.6533
18	Long-term potentiation	0.0076	Long-term potentiation	0.6725	Neurotrophin signaling pathway	0.6533
19	Synaptic vesicle cycle	0.0160	Synaptic vesicle cycle	0.7324	mRNA surveillance pathway	0.7157
20	VEGF signaling pathway	0.0317	VEGF signaling pathway	0.7324	MAPK signaling pathway	0.7157

	PADOG + Fisher’s method		PADOG + Additive method		PADOG + DANUBE
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr

1	Adherens junction	0.0008	Adherens junction	< 10⁻⁴	Vibrio cholerae infection	< 10⁻⁴
2	Shigellosis	0.0022	Renal cell carcinoma	< 10⁻⁴	Shigellosis	< 10⁻⁴
3	Renal cell carcinoma	0.0022	Shigellosis	< 10⁻⁴	Parkinson’s disease	0.0007
4	Prostate cancer	0.0049	Prostate cancer	0.0001	Synaptic vesicle cycle	0.0007
5	Bacterial invasion of epithelial cells	0.0065	Long-term depression	0.0006	Gap junction	0.0007
6	Pathogenic Escherichia coli infection	0.0149	Colorectal cancer	0.0009	Alzheimer’s disease	0.0007
7	Endocrine and other factor-regulated calcium reabsorption	0.0199	Gap junction	0.0011	Pathogenic Escherichia coli infection	0.0007
8	Glioma	0.0199	ErbB signaling pathway	0.0013	Cardiac muscle contraction	0.0007
9	Pancreatic cancer	0.0199	Bacterial invasion of epithelial cells	0.0013	Epithelial cell signaling in Helicobacter pylori infection	0.0009
10	Long-term depression	0.0199	Vibrio cholerae infection	0.0013	Huntington’s disease	0.0013
11	Gap junction	0.0199	Pancreatic cancer	0.0021	Renal cell carcinoma	0.0024
12	Colorectal cancer	0.0199	Glioma	0.0022	Vasopressin-regulated water reabsorption	0.0047
13	Vibrio cholerae infection	0.0199	Neurotrophin signaling pathway	0.0028	VEGF signaling pathway	0.0052
14	Long-term potentiation	0.0226	HIF-1 signaling pathway	0.0037	Endocrine and other factor-regulated calcium reabsorption	0.0072
15	Endometrial cancer	0.0226	Pathogenic Escherichia coli infection	0.0042	Bacterial invasion of epithelial cells	0.0078
16	HIF-1 signaling pathway	0.0257	Endometrial cancer	0.0052	GABAergic synapse	0.0102
17	ErbB signaling pathway	0.0326	VEGF signaling pathway	0.0052	Adherens junction	0.0103
18	Neurotrophin signaling pathway	0.0352	Endocrine and other factor-regulated calcium reabsorption	0.0052	Long-term depression	0.0103
19	Synaptic vesicle cycle	0.0600	Synaptic vesicle cycle	0.0086	Salmonella infection	0.0134
20	Dopaminergic synapse	0.1305	Long-term potentiation	0.0106	Colorectal cancer	0.0198


Supplementary Table S3 displays the results using SPIA combined with the 6 meta-analysis methods. The target pathway is significant and is ranked near the top for all methods. DANUBE yields the shortest list of significant pathways. All 5 significant pathways, Parkinson’s disease, Alzheimer’s disease, Synaptic vesicle cycle, Cardiac muscle contraction, and Huntington’s disease, are also significant when we combine DANUBE with GSA and PADOG.

Supplementary Table S4 displays the results using GSEA combined with the 6 meta-analysis methods. The horizontal line across each list marks the cutoff FDR = 0.01. The pathway highlighted green is the target pathway Alzheimer’s disease. The target pathway is significant for all 6 meta-analysis methods. Because GSEA is unbiased, the additive method and DANUBE have equivalent results. These two methods have a shorter list of significant pathways and rank the target pathway higher than other methods. In addition, all 4 significant pathways, Cardiac muscle contraction, Huntington’s disease, Alzheimer’s disease, and Parkinson’s disease, appear in the lists of significant pathways when we combine DANUBE with GSA, PADOG, and SPIA.

There is no gold standard for assigning true or false values to each of the results, apart from the expectation that a disease under study should impact its namesake pathway. Indeed, the target pathway Alzheimer’s disease is ranked as significant for all 4 pathway analysis methods when combined with DANUBE. The target pathway is also ranked higher when using DANUBE compared to the results of the other 5 meta-analysis methods. In addition, the pathways Parkinson’s disease, Alzheimer’s disease, Cardiac muscle contraction, and Huntington’s disease consistently appear as significant in the results of all 4 pathway analysis methods when combined with DANUBE.

Alzheimer’s, Parkinson’s, and Huntington’s diseases are three neurological disorders that have many commonalities including abnormal protein folding, endoplasmic reticulum stress, and ubiquitin mediated breakdown of proteins, leading to programmed cell death. Given that the pathway Alzheimer’s disease is influenced by the mitochondrial compartment, which is strongly implicated in the disease [51–54], it is not surprising that other pathways with strong mitochondrial components also garner high rankings. Previous studies [55] have shown the presence of a cross-talk that makes the neurological disease pathways, Alzheimer’s disease, Parkinson’s disease and Huntington’s disease, along with Cardiac muscle contraction, appear as significant simultaneously, due to their dominant mitochondrial module. Cardiac muscle contraction has a strong mitochondrial component and is highly dependent on calcium signaling, which is also prevalent in Synaptic vesicle cycle, Alzheimer’s disease, and Huntington’s disease. Ca2+ regulates mitochondrial metabolism, but calcium overload to mitochondria can result in cell damage from reactive oxygen [56].

We also use MetaPath to combine the 7 studies. MetaPath is a stand-alone meta-analysis method, which does not need an external pathway analysis tool. This method performs meta-analysis at both gene (MAPE_G) and pathway levels (MAPE_P), and then combines the results (MAPE_I) to give the final p-value and ranking of pathways. Supplementary Table S5 shows the top 7 pathways using MetaPath for the 7 Alzheimer’s datasets. The target pathway Alzheimer’s disease is not significant and is outranked by 6 other pathways.

B. Pathway analysis applications: AML

The AML datasets we use in our data analysis are GSE14924 (CD4 and CD8 T cells), GSE17054 (stem cells), GSE12662 (CD34+ cells, promyelocytes, neutrophils, and PR9 cell line), GSE57194 (CD34+ cells), GSE33223 (peripheral blood, bone marrow), GSE42140 (peripheral blood, bone marrow), GSE8023 (CD34+ cells), and GSE15061 (bone marrow). The rankings and FDR-corrected p-values of the target pathway Acute myeloid leukemia for the 9 AML datasets are displayed in Supplementary Figure S12. The graphs demonstrate that the adjusted p-values and rankings of the target pathway vary substantially between the 4 methods for a given study, and from one study to the next. Furthermore, the AML pathway was not found to be significant by any method in any dataset.

We combine the 4 pathway analysis methods with the 6 meta-analysis methods. Using a pathway analysis method M, each pathway has 9 p-values – one per study. These 9 p-values are combined using each of the 6 meta-analysis methods. Therefore, each pathway analysis method produces 6 lists of pathways. Each list has 150 pathways ranked according to the combined p-values. We then adjust the combined p-values for multiple comparisons in each list using FDR.

In order to run DANUBE, we generated the null distributions from control samples as described in Section III-B. We took the 140 control samples of the 9 AML datasets, and randomly designated “control” and “disease” subgroups. We generated 10,000 simulations of 10 controls and 10 diseases, 10,000 simulations of 30 controls and 50 diseases, 10,000 of 50 controls and 30 diseases, and 10,000 of 70 controls and 70 diseases, for a total of 40,000 simulations. For each pathway analysis method, we constructed 150 empirical distributions for 150 KEGG signaling pathways (a total of 600 empirical distributions for the 4 pathway analysis methods). We then used the empirical distributions to calculate the empirical p-values before applying the additive method to combine the empirical p-values for each pathway, resulting in 150 combined p-values. Finally, we adjusted the combined p-values for multiple comparisons using FDR.

Table III displays the results of GSA combined with the 6 meta-analysis methods, ordered by the FDR-corrected p-values. We place a horizontal line across each list to mark our 1% cutoff. Stouffer’s method, the additive method, and DANUBE identify the target pathway as significant. DANUBE yields the best ranking (ranked 1st), followed by the additive method (2nd) and Stouffer’s method (13th). In addition, the target pathway is the only significant pathway in DANUBE’s result.

TABLE III.

The 21 top ranked pathways and FDR-corrected p-values obtained by combining the GSA p-values using 6 meta-analysis methods for acute myeloid leukemia (AML). The target pathway Acute myeloid leukemia is significant for Stouffer’s, the additive method, and DANUBE with rankings 13th, 2nd, and 1st, respectively.

	GSA + Stouffer’s method		GSA + Z-method		GSA + Brown’s method
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr
1	ErbB signaling pathway	< 10⁻⁴	ErbB signaling pathway	< 10⁻⁴	ErbB signaling pathway	< 10⁻⁴
2	Sulfur relay system	< 10⁻⁴	Sulfur relay system	< 10⁻⁴	Sulfur relay system	< 10⁻⁴
3	Adherens junction	< 10⁻⁴	Adherens junction	< 10⁻⁴	Adherens junction	< 10⁻⁴
4	Tight junction	< 10⁻⁴	Tight junction	< 10⁻⁴	Tight junction	< 10⁻⁴
5	Circadian rhythm	< 10⁻⁴	Circadian rhythm	< 10⁻⁴	Circadian rhythm	< 10⁻⁴
6	Alcoholism	< 10⁻⁴	Alcoholism	< 10⁻⁴	Alcoholism	< 10⁻⁴
7	Shigellosis	< 10⁻⁴	Shigellosis	< 10⁻⁴	Shigellosis	< 10⁻⁴
8	Transcriptional misregulation in cancer	< 10⁻⁴	Transcriptional misregulation in cancer	< 10⁻⁴	Transcriptional misregulation in cancer	< 10⁻⁴
9	Renal cell carcinoma	< 10⁻⁴	Renal cell carcinoma	< 10⁻⁴	Renal cell carcinoma	< 10⁻⁴
10	Glioma	< 10⁻⁴	Glioma	< 10⁻⁴	Glioma	< 10⁻⁴
11	Systemic lupus erythematosus	< 10⁻⁴	Systemic lupus erythematosus	< 10⁻⁴	Systemic lupus erythematosus	< 10⁻⁴
12	Non-small cell lung cancer	0.0003	Non-small cell lung cancer	0.0606	Non-small cell lung cancer	0.1250
13	Acute myeloid leukemia	0.0012	Acute myeloid leukemia	0.1011	mTOR signaling pathway	0.2120
14	VEGF signaling pathway	0.0017	VEGF signaling pathway	0.1139	VEGF signaling pathway	0.2120
15	Endometrial cancer	0.0025	Endometrial cancer	0.1298	Pathways in cancer	0.2120
16	Pathways in cancer	0.0029	Pathways in cancer	0.1352	Acute myeloid leukemia	0.2120
17	mTOR signaling pathway	0.0033	mTOR signaling pathway	0.1386	HIF-1 signaling pathway	0.2252
18	Chronic myeloid leukemia	0.0081	Chronic myeloid leukemia	0.1933	Endometrial cancer	0.2252
19	Prostate cancer	0.0081	Prostate cancer	0.1933	Prostate cancer	0.2252
20	Pancreatic cancer	0.0097	Pancreatic cancer	0.2037	Insulin signaling pathway	0.2379
21	HIF-1 signaling pathway	0.0150	HIF-1 signaling pathway	0.2394	Pancreatic cancer	0.2628

	GSA + Fisher’s method		GSA + Additive method		GSA + DANUBE
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr

1	ErbB signaling pathway	< 10⁻⁴	Non-small cell lung cancer	0.0003	Acute myeloid leukemia	0.0065
2	Sulfur relay system	< 10⁻⁴	Acute myeloid leukemia	0.0003	Transcriptional misregulation in cancer	0.0231
3	Adherens junction	< 10⁻⁴	VEGF signaling pathway	0.0005	VEGF signaling pathway	0.0489
4	Tight junction	< 10⁻⁴	ErbB signaling pathway	0.0005	Alcoholism	0.1161
5	Circadian rhythm	< 10⁻⁴	Endometrial cancer	0.0008	Non-small cell lung cancer	0.5968
6	Alcoholism	< 10⁻⁴	Transcriptional misregulation in cancer	0.0020	Bladder cancer	0.5968
7	Shigellosis	< 10⁻⁴	Chronic myeloid leukemia	0.0038	HIF-1 signaling pathway	0.5968
8	Transcriptional misregulation in cancer	< 10⁻⁴	mTOR signaling pathway	0.0043	Apoptosis	0.5968
9	Renal cell carcinoma	< 10⁻⁴	Pathways in cancer	0.0043	mTOR signaling pathway	0.5968
10	Glioma	< 10⁻⁴	Colorectal cancer	0.0084	Cocaine addiction	0.5968
11	Systemic lupus erythematosus	< 10⁻⁴	Glioma	0.0108	Autoimmune thyroid disease	0.6141
12	Non-small cell lung cancer	0.0048	Pancreatic cancer	0.0108	Amyotrophic lateral sclerosis (ALS)	0.6458
13	Pathways in cancer	0.0153	Prostate cancer	0.0108	Notch signaling pathway	0.6458
14	Acute myeloid leukemia	0.0181	Small cell lung cancer	0.0177	ErbB signaling pathway	0.6458
15	mTOR signaling pathway	0.0188	Bacterial invasion of epithelial cells	0.0177	HTLV-I infection	0.6458
16	VEGF signaling pathway	0.0188	Adherens junction	0.0184	Natural killer cell mediated cytotoxicity	0.6458
17	Endometrial cancer	0.0243	Renal cell carcinoma	0.0239	Chronic myeloid leukemia	0.6458
18	HIF-1 signaling pathway	0.0252	Melanoma	0.0326	Endocytosis	0.6458
19	Prostate cancer	0.0252	Endocytosis	0.0403	Small cell lung cancer	0.6458
20	Insulin signaling pathway	0.0295	HIF-1 signaling pathway	0.0447	Fc gamma R-mediated phagocytosis	0.6458
21	Pancreatic cancer	0.0378	Circadian rhythm	0.0447	African trypanosomiasis	0.6458


The horizontal lines show the 1% significance threshold. The target pathway Acute myeloid leukemia is highlighted in green.

Table IV shows the results of PADOG combined with the 6 meta-analysis methods. The target pathway is significant for 4 methods: DANUBE, Stouffer’s, Fisher’s, and the additive method. For DANUBE, Acute myeloid leukemia is ranked 1st, compared to 7th using the other three meta-analysis methods. There are no significant pathways using the Z-method and Brown’s method.

TABLE IV.

The 23 top ranked pathways and FDR-corrected p-values obtained by combining the PADOG p-values using 6 meta-analysis methods for acute myeloid leukemia (AML). The target pathway Acute myeloid leukemia is significant for Stouffer’s, Fisher’s, the additive method and DANUBE. DANUBE yields the best ranking.

	PADOG + Stouffer’s method		PADOG + Z-method		PADOG + Brown’s method
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr
1	Non-small cell lung cancer	< 10⁻⁴	Non-small cell lung cancer	0.0705	Chronic myeloid leukemia	0.0412
2	Chronic myeloid leukemia	< 10⁻⁴	Chronic myeloid leukemia	0.0705	Non-small cell lung cancer	0.0412
3	Glioma	< 10⁻⁴	Glioma	0.2152	Glioma	0.1240
4	ErbB signaling pathway	< 10⁻⁴	ErbB signaling pathway	0.2239	ErbB signaling pathway	0.2149
5	Colorectal cancer	< 10⁻⁴	Colorectal cancer	0.2565	VEGF signaling pathway	0.2806
6	Prostate cancer	< 10⁻⁴	Prostate cancer	0.2565	Pathways in cancer	0.2806
7	Acute myeloid leukemia	< 10⁻⁴	Acute myeloid leukemia	0.2565	Colorectal cancer	0.2806
8	VEGF signaling pathway	0.0001	VEGF signaling pathway	0.2565	Pancreatic cancer	0.2806
9	Endometrial cancer	0.0001	Endometrial cancer	0.2565	Prostate cancer	0.2806
10	Pancreatic cancer	0.0001	Pancreatic cancer	0.2565	Acute myeloid leukemia	0.2806
11	Pathways in cancer	0.0001	Pathways in cancer	0.2565	Endometrial cancer	0.3398
12	Transcriptional misregulation in cancer	0.0005	Transcriptional misregulation in cancer	0.3509	mTOR signaling pathway	0.4198
13	T cell receptor signaling pathway	0.0012	T cell receptor signaling pathway	0.4055	T cell receptor signaling pathway	0.4198
14	mTOR signaling pathway	0.0012	mTOR signaling pathway	0.4055	Circadian rhythm	0.4198
15	Circadian rhythm	0.0015	Circadian rhythm	0.4061	Insulin signaling pathway	0.4198
16	Neurotrophin signaling pathway	0.0021	Neurotrophin signaling pathway	0.4184	Transcriptional misregulation in cancer	0.4198
17	Small cell lung cancer	0.0024	Small cell lung cancer	0.4184	Small cell lung cancer	0.4491
18	Renal cell carcinoma	0.0054	Renal cell carcinoma	0.4837	Neurotrophin signaling pathway	0.4568
19	Insulin signaling pathway	0.0063	Insulin signaling pathway	0.4837	mRNA surveillance pathway	0.4695
20	Endocytosis	0.0070	Endocytosis	0.4837	MAPK signaling pathway	0.4695
21	Adherens junction	0.0070	Adherens junction	0.4837	HIF-1 signaling pathway	0.4695
22	Wnt signaling pathway	0.0168	Wnt signaling pathway	0.5674	Endocytosis	0.4695
23	Melanoma	0.0195	Melanoma	0.5674	Wnt signaling pathway	0.4695

	PADOG + Fisher’s method		PADOG + Additive method		PADOG + DANUBE
	Pathway	pvalue.fdr	Pathway	pvalue.fdr	Pathway	pvalue.fdr

1	Chronic myeloid leukemia	< 10⁻⁴	Non-small cell lung cancer	< 10⁻⁴	Acute myeloid leukemia	< 10⁻⁴
2	Non-small cell lung cancer	< 10⁻⁴	Chronic myeloid leukemia	< 10⁻⁴	VEGF signaling pathway	0.0007
3	Glioma	< 10⁻⁴	ErbB signaling pathway	< 10⁻⁴	Non-small cell lung cancer	0.0008
4	ErbB signaling pathway	< 10⁻⁴	Endometrial cancer	< 10⁻⁴	T cell receptor signaling pathway	0.0021
5	Colorectal cancer	0.0003	Glioma	< 10⁻⁴	Colorectal cancer	0.0023
6	Prostate cancer	0.0006	Colorectal cancer	< 10⁻⁴	Chronic myeloid leukemia	0.0027
7	Acute myeloid leukemia	0.0006	Acute myeloid leukemia	< 10⁻⁴	Endometrial cancer	0.0057
8	Pancreatic cancer	0.0007	Prostate cancer	< 10⁻⁴	Transcriptional misregulation in cancer	0.0095
9	VEGF signaling pathway	0.0007	Transcriptional misregulation in cancer	0.0001	Glioma	0.0153
10	Pathways in cancer	0.0009	VEGF signaling pathway	0.0001	mTOR signaling pathway	0.0160
11	Endometrial cancer	0.0021	Pathways in cancer	0.0001	Prostate cancer	0.0203
12	Transcriptional misregulation in cancer	0.0056	Pancreatic cancer	0.0002	Apoptosis	0.0239
13	T cell receptor signaling pathway	0.0080	mTOR signaling pathway	0.0005	ErbB signaling pathway	0.0390
14	mTOR signaling pathway	0.0098	Neurotrophin signaling pathway	0.0005	B cell receptor signaling pathway	0.0464
15	Insulin signaling pathway	0.0098	Renal cell carcinoma	0.0006	Circadian rhythm	0.0521
16	Circadian rhythm	0.0098	T cell receptor signaling pathway	0.0006	Thyroid cancer	0.0844
17	Small cell lung cancer	0.0138	Circadian rhythm	0.0006	Progesterone-mediated oocyte maturation	0.1040
18	Neurotrophin signaling pathway	0.0165	Small cell lung cancer	0.0011	Oocyte meiosis	0.1040
19	Adherens junction	0.0318	Endocytosis	0.0036	Systemic lupus erythematosus	0.1441
20	Endocytosis	0.0356	Adherens junction	0.0052	Neurotrophin signaling pathway	0.1697
21	Renal cell carcinoma	0.0502	Melanoma	0.0072	Shigellosis	0.1697
22	Axon guidance	0.0564	Bacterial invasion of epithelial cells	0.0081	Fc epsilon RI signaling pathway	0.1697
23	Wnt signaling pathway	0.0564	Wnt signaling pathway	0.0128	Pancreatic cancer	0.2083


The horizontal lines show the 1% significance threshold. The target pathway Acute myeloid leukemia is highlighted in green.

Supplementary Table S6 shows the results of SPIA combined with the 6 meta-analysis methods, ordered by the FDR corrected p-value. Again, the target pathway is significant using Stouffer’s, Fisher’s, the additive method, and DANUBE. The additive method and DANUBE have the same list of significant pathways. In addition, both methods place the target pathway higher than the other two methods.

Supplementary Table S7 displays the results of GSEA combined with the 6 meta-analysis methods. The target pathway Acute myeloid leukemia is highlighted in green. For all 6 meta-analyses, the target pathway is not significant despite being ranked among the top pathways. Since GSEA has no bias, the additive method and DANUBE yield similar results. In essence, even though it is completely unbiased, GSEA lacks the power to identify the Acute myeloid leukemia pathway as significant in the AML data.

We also use MetaPath to combine the 9 acute myeloid leukemia studies. Supplementary Table S8 shows the top 5 pathways using MetaPath. The target pathway is not significant (p=0.4), and is outranked by 2 other pathways.

Table V summarizes all the results for the 25 approaches (4 pathway analysis methods each combined with one of 6 meta-analysis approaches, plus MetaPath). On average, DANUBE performs best in terms of ranking, as well as in terms of identifying the target pathway as significant at the 1% cutoff.
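The averages behind this summary can be recomputed directly from the ranks and significance flags reported in Table V. The snippet below is a small sketch for that arithmetic only; the data structure and function name are hypothetical, and MetaPath is omitted because it has a single result per disease.

```python
# Ranks and significance flags transcribed from Table V, in column order.
METHODS = ["Stouffer", "Z-method", "Brown", "Fisher", "Additive", "DANUBE"]
Y, N = True, False
TABLE_V = {
    ("Alzheimer's", "GSEA"):  [(4, Y), (4, Y), (4, Y), (4, Y), (3, Y), (3, Y)],
    ("Alzheimer's", "GSA"):   [(11, Y), (11, N), (15, N), (14, N), (6, Y), (2, Y)],
    ("Alzheimer's", "SPIA"):  [(2, Y), (2, Y), (3, Y), (3, Y), (2, Y), (2, Y)],
    ("Alzheimer's", "PADOG"): [(21, N), (21, N), (31, N), (23, N), (21, N), (6, Y)],
    ("AML", "GSEA"):          [(1, N), (1, N), (4, N), (4, N), (1, N), (1, N)],
    ("AML", "GSA"):           [(13, Y), (13, N), (16, N), (14, N), (2, Y), (1, Y)],
    ("AML", "SPIA"):          [(4, Y), (4, N), (6, N), (6, Y), (2, Y), (2, Y)],
    ("AML", "PADOG"):         [(7, Y), (7, N), (10, N), (7, Y), (7, Y), (1, Y)],
}

def summarize(table, methods):
    """Return {method: (mean rank of target pathway, number of significant calls)}."""
    summary = {}
    for col, name in enumerate(methods):
        ranks = [row[col][0] for row in table.values()]
        hits = sum(1 for row in table.values() if row[col][1])
        summary[name] = (sum(ranks) / len(ranks), hits)
    return summary

summary = summarize(TABLE_V, METHODS)
```

With these inputs, DANUBE has a mean target-pathway rank of 2.25 and calls the target significant in 7 of the 8 rows, while the additive method has a mean rank of 5.5 with 6 of 8.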

TABLE V.

Ranking and significance of the target pathway for Alzheimer’s disease and acute myeloid leukemia (AML). The first and second columns show the disease and the pathway analysis methods. The next 6 columns show the ranking of the target pathways for 6 meta-analysis combined with the 4 pathway analysis methods. Each row shows the result of the 6 meta-analysis methods combined with the same pathway analysis method. Each cell shows the ranking of the target pathways. The Y(es) or N(o) letters next to the ranking denote if the target pathway is significant or not. Cells highlighted in green are those that are significant and have the best rankings in their row. The last column shows the result of MetaPath. For both diseases, and for all the 4 pathway analysis methods, the target pathway is significant and is ranked the highest when using DANUBE. The target pathway is not significant for AML data when the GSEA p-values are combined with any of the 6 meta-analysis methods.

	Pathway analysis	Stouffer’s method	Z-method	Brown’s method	Fisher’s method	Additive method	DANUBE	MetaPath
Alzheimer’s	GSEA	4 (Y)	4 (Y)	4 (Y)	4 (Y)	3 (Y)	3 (Y)	7 (N)
	GSA	11 (Y)	11 (N)	15 (N)	14 (N)	6 (Y)	2 (Y)
	SPIA	2 (Y)	2 (Y)	3 (Y)	3 (Y)	2 (Y)	2 (Y)
	PADOG	21 (N)	21 (N)	31 (N)	23 (N)	21 (N)	6 (Y)

AML	GSEA	1 (N)	1 (N)	4 (N)	4 (N)	1 (N)	1 (N)	4 (N)
	GSA	13 (Y)	13 (N)	16 (N)	14 (N)	2 (Y)	1 (Y)
	SPIA	4 (Y)	4 (N)	6 (N)	6 (Y)	2 (Y)	2 (Y)
	PADOG	7 (Y)	7 (N)	10 (N)	7 (Y)	7 (Y)	1 (Y)


We note that for both diseases, DANUBE and the additive method have the same results when combined with GSEA because GSEA is an unbiased method with uniform distributions of p-values under the null. In addition, the results of the two methods for SPIA are almost equivalent because the distributions of the p-values produced by SPIA under the null are closer to the expected uniform. Notably, DANUBE is more useful in conjunction with methods that have more skewed empirical null distributions.

C. General case: t-test and Wilcoxon test

In this section we will demonstrate the generality of the problem, beyond pathway analysis applications. In order to do so, we have used the one sample t-test [57, 58] and the one sample Wilcoxon signed-rank test [59–61], as illustrative examples of parametric and non-parametric tests. Using simulated null distributions, we show that both the t-test and Wilcoxon tests have systematic bias depending on the shape and the symmetry of the null distribution. When the p-values are biased towards zero, combining multiple studies results in an increase of type I error (prevalence of false positives). When the p-values are biased towards one, the test loses power and more evidence is needed to identify true positives.

In Figure 5, panel (a) displays a simulated null distribution H₀ which is not symmetrical and does not follow any standard distribution. Panel (b) displays an alternative distribution H₁, which has the same shape as H₀, but a slightly smaller median. Panel (c) displays another alternative distribution H₂ which has the same shape as H₀ but a slightly larger median. Each population has 100,000 elements. The goal here is to investigate the ability of each approach to distinguish between H₀ and H₁, and between H₀ and H₂, respectively. This is attempted using both a t-test and a Wilcoxon test.

Fig. 5 — Type I and Type II errors of the classical meta-analysis using one sample t-test and Wilcoxon signed-rank test. Panel (a) displays the probability distribution under the null hypothesis H₀. Panel (b) displays an alternative distribution H₁ which has the same shape as the null distribution with a slightly smaller median. Panel (c) displays another alternative distribution H₂ which has the same shape as the null distribution with a slightly larger median. Panels (d–h) display the results using left-tailed t-tests. Panel (d) displays the distribution of p-values using left-tailed t-test for samples drawn from the null distribution H₀. Panel (e) displays the distribution of combined p-values using left-tailed t-test for samples drawn from the null distribution H₀. The red dashed line represents the threshold (0.05) below which the null hypothesis will be rejected. The blue area to the left of the red dashed line is type I error (false positives). Panel (f) displays the distribution of combined p-values using a left-tailed t-test for samples drawn from the alternative distribution H₁. The blue area to the right of the red dashed line is type II error (false negatives). Panel (g) displays the type I error with varying number of studies. Panel (h) displays the type II error with varying number of studies using a left-tailed t-test for samples drawn from the alternative distribution H₁. Similarly, panels (i–m) display the results using right-tailed t-test; panels (n–r) display the results of left-tailed Wilcoxon signed-rank test; panels (s–w) display the results of right-tailed Wilcoxon signed-rank test. In this example, the left-tailed t-test and right-tailed Wilcoxon tests are biased towards 0 as shown in (e,f). Therefore, an increase in the number of studies makes the combined p-values more biased towards 0, causing an increase in type I error as shown in (g,v). On the contrary, the right-tailed t-test and left-tailed Wilcoxon test are biased towards 1. This kind of bias makes the test less powerful. For example, with 10 studies, type II errors using right-tailed t-test and left-tailed Wilcoxon test are 0.51 and 0.61, respectively.

Denoting M₀ and m₀ as the mean and median of the null distribution H₀, M₀ is used as the parameter (mean) for the t-tests, whereas m₀ is used as the parameter (median) for the Wilcoxon test. To make the analysis more general, the sample size is randomized between 3 and 10 every time we pick a sample. Since DANUBE uses the additive method to combine the p-values, we also use the additive method to combine the p-values of the t-test and Wilcoxon test. When the number of studies is greater than or equal to 20, the combined p-values are calculated using the Central Limit Theorem as described in section III.
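The combination step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the additive method evaluates the sum of the p-values against the Irwin-Hall distribution (the sum of n independent uniform variables), and that the Central Limit Theorem variant approximates the mean of the p-values by a normal distribution with mean 1/2 and variance 1/(12n). The function name `additive_combine` and the parameter `clt_threshold` are hypothetical.

```python
import math

def additive_combine(p_values, clt_threshold=20):
    """Combine independent p-values through their sum (additive method sketch)."""
    n = len(p_values)
    s = sum(p_values)
    if n >= clt_threshold:
        # Assumed CLT form: mean of n Uniform(0,1) values is approx. N(1/2, 1/(12n)).
        z = (s / n - 0.5) * math.sqrt(12 * n)
        return 0.5 * math.erfc(-z / math.sqrt(2))  # standard normal CDF at z
    # Exact Irwin-Hall CDF: P(S <= s) = (1/n!) * sum_k (-1)^k * C(n, k) * (s - k)^n
    total = 0.0
    for k in range(int(math.floor(s)) + 1):
        total += (-1) ** k * math.comb(n, k) * (s - k) ** n
    return min(1.0, max(0.0, total / math.factorial(n)))

# Two studies with p = 0.1 each: the combined p-value is 0.2^2 / 2! = 0.02.
combined_small = additive_combine([0.1, 0.1])
```

The threshold of 20 follows the text; below it the exact formula is cheap and numerically stable, which is why the sketch only switches to the normal approximation for larger numbers of studies.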

Panels (d–h) show the results using the one sample left-tailed t-test for the mean; panels (i–m) show the results using the one sample right-tailed t-test for the mean; panels (n–r) show the results using the one sample left-tailed Wilcoxon test for the median; panels (s–w) show the results using one sample right-tailed Wilcoxon test for the median.

Panel (d) shows the distribution of p-values for samples drawn from the null distribution H₀. To plot this panel, we randomly select 100,000 samples from H₀ and then calculate the p-values using the left-tailed t-test. Since the null distribution H₀ is not normal, the resulting p-values are not uniformly distributed. Panel (e) displays the distribution of combined p-values for samples drawn from the null distribution H₀. To calculate a combined p-value, we randomly pick 10 samples from the null population H₀ and then calculate the 10 p-values using the left-tailed t-test. From these 10 p-values, we calculate a combined p-value using the additive method. This procedure is repeated 100,000 times to generate the distribution of the combined p-values under the null hypothesis. Similarly, panel (f) displays the distribution of the combined p-values for samples drawn from the alternative distribution H₁.

The red dashed lines in panels (e, f) show the 0.05 cutoff. Since the combined p-values in (e) are calculated under the null hypothesis, values smaller than the cutoff are false positives. Therefore, the blue area to the left of the red dashed line is the type I error of the classical meta-analysis using the left-tailed t-test. Similarly, combined p-values larger than the cutoff in panel (f) are false negatives. The blue area to the right of the red dashed line in panel (f) represents the type II error.

The results show that combined p-values will be biased towards zero, since p-values of the left-tailed t-test are biased towards zero. To understand the behavior of the meta-analysis, we display type I and type II error in panels (g, h) with varying numbers of studies to be combined. As the number of studies increases, the meta-analysis becomes more biased, and type I error increases. For example, when the number of studies reaches 50, the analysis has more than 60% false positives. Paradoxically, increasing the number of studies will make the meta-analysis less useful due to the increase of type I error.
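The inflation of type I error described above can be reproduced in a small simulation. The sketch below is only illustrative and rests on several assumptions: the null population is an Exponential(1) distribution (a convenient right-skewed stand-in, not the H₀ of Figure 5), the Student t CDF is computed numerically for at least 2 degrees of freedom, and the helper names (`t_cdf`, `left_tailed_t_p`, `combined_type1_error`) are hypothetical. Sample sizes are drawn between 3 and 10 and p-values are combined with the additive method, as in the text.

```python
import math
import random

def t_cdf(t, df, steps=200):
    """Student t CDF for df >= 2: Simpson's rule after u = t / sqrt(df + t^2)."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(math.pi) * math.gamma(df / 2))
    f = lambda u: const * (1.0 - u * u) ** ((df - 2) / 2)
    u_max = abs(t) / math.sqrt(df + t * t)
    h = u_max / steps
    area = f(0.0) + f(u_max)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * f(i * h)
    half = area * h / 3.0  # probability mass between 0 and |t|
    return 0.5 + half if t >= 0 else 0.5 - half

def left_tailed_t_p(sample, mu0):
    """One sample left-tailed t-test p-value for the mean."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return t_cdf((mean - mu0) / math.sqrt(var / n), n - 1)

def additive_combine(p_values):
    """Additive method: Irwin-Hall CDF evaluated at the sum of the p-values."""
    n, s = len(p_values), sum(p_values)
    total = sum((-1) ** k * math.comb(n, k) * (s - k) ** n for k in range(int(s) + 1))
    return total / math.factorial(n)

def combined_type1_error(n_studies=10, reps=400, alpha=0.05, seed=0):
    """Fraction of null meta-analyses rejected at alpha (assumed Exp(1) null, mean 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        ps = []
        for _ in range(n_studies):
            size = rng.randint(3, 10)  # randomized sample size, as in the text
            sample = [rng.expovariate(1.0) for _ in range(size)]
            ps.append(left_tailed_t_p(sample, 1.0))
        hits += additive_combine(ps) < alpha
    return hits / reps

rate = combined_type1_error()
```

Because the assumed null is right-skewed, the left-tailed p-values concentrate near zero, so `rate` is expected to exceed the nominal 0.05; raising `n_studies` makes the gap wider, mirroring the trend in panel (g).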

Panels (i–m) display the results of the right-tailed t-test. Panel (i) displays the distribution of p-values for samples drawn from the null distribution H₀. Panel (j) displays the combined p-values for samples drawn from the null distribution H₀. Panel (k) displays the combined p-values for samples drawn from the alternative distribution H₂. Each combined p-value is calculated from 10 individual p-values. The right-tailed t-test is biased towards one, therefore more evidence is required to identify true positives. Compared to the left-tailed t-test, the right-tailed t-test has smaller type I error but larger type II error (less power). Therefore, many more studies would be required for this test to identify true positives. Panel (m) shows that for the case of combining 10 studies, the type II error of the right-tailed t-test is about 0.5 whereas the type II error of the left-tailed t-test is less than 0.2.

Panels (n–r) display the results of meta-analysis using the one sample left-tailed Wilcoxon test for the median. In this example, the left-tailed Wilcoxon test is biased towards one, so more evidence is required to identify true positives. As shown in panel (r), the expected type II error of the meta-analysis is about 0.6 when combining 10 studies. Interestingly, the behavior of the meta-analysis using the left-tailed Wilcoxon test is similar to that of the right-tailed t-test. In both cases, the meta-analysis needs a large number of studies to identify true positives. Panels (m and r) show that type II error converges to zero as the number of studies increases.
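The bias of the left-tailed Wilcoxon test towards one can be illustrated with an exact signed-rank computation. The following sketch assumes continuous data (no ties or zero differences), uses an Exponential(1) population as a stand-in for a skewed null (its median is ln 2), and compares it with a symmetric standard normal null. The names `signed_rank_left_p` and `mean_null_p` are hypothetical.

```python
import math
import random

def signed_rank_left_p(sample, m0):
    """Exact left-tailed p-value P(W+ <= w) of the one sample signed-rank test."""
    diffs = [x - m0 for x in sample if x != m0]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    # W+ is the sum of the ranks of |difference| over the positive differences.
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    # counts[s] = number of subsets of {1..n} whose ranks sum to s (null distribution).
    max_sum = n * (n + 1) // 2
    counts = [1] + [0] * max_sum
    for r in range(1, n + 1):
        for s in range(max_sum, r - 1, -1):
            counts[s] += counts[s - r]
    return sum(counts[: w_plus + 1]) / 2 ** n

def mean_null_p(draw, m0, reps=4000, seed=0):
    """Average left-tailed p-value when samples really come from the null."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        size = rng.randint(3, 10)  # randomized sample size, as in the text
        total += signed_rank_left_p([draw(rng) for _ in range(size)], m0)
    return total / reps

skewed = mean_null_p(lambda rng: rng.expovariate(1.0), math.log(2))
symmetric = mean_null_p(lambda rng: rng.gauss(0.0, 1.0), 0.0)
```

With the right-skewed null, observations above the median tend to lie farther from it than observations below, so they receive larger ranks and push the left-tailed p-values upward. In this setup `skewed` should therefore come out larger than `symmetric`, which is the sense in which the test needs more evidence to reject.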

Panels (s–w) display the results of meta-analysis using the one sample right-tailed Wilcoxon test for the median. Similar to the left-tailed t-test, the right-tailed Wilcoxon test is biased towards zero. As shown in panels (g, v), type I error using either of the two tests increases as the number of studies increases.

D. General case: DANUBE

In this section, we analyze the performance of DANUBE using the same null and alternative distributions that were used for the t-test and Wilcoxon tests. Figure 6 displays the results using DANUBE. Panels (a, b, c) show the null distribution H₀ and two alternative distributions H₁ and H₂. Panels (d–h) display the results using left-tailed DANUBE for the mean; panels (i–m) display the results using right-tailed DANUBE for the mean; panels (n–r) display the results using left-tailed DANUBE for the median; panels (s–w) display the results using right-tailed DANUBE for the median.

Fig. 6 — Type I and type II errors of DANUBE using mean and median as discriminative statistics. Panel (a) displays the probability distribution under the null hypothesis (H₀). Panel (b) displays an alternative distribution (H₁), which has the same shape as the null distribution but a slightly smaller median. Panel (c) displays an alternative distribution (H₂) which has the same shape as the null distribution but a slightly larger median. Panels (d–h) display the results of the left-tailed DANUBE using mean; panels (i–m) display the results of the right-tailed DANUBE using mean; panels (n–r) display the results of left-tailed DANUBE using median; panels (s–w) display the results of right-tailed DANUBE using median. Panels (d, i, n, s) show the p-value distributions for samples drawn from the null. For all four tests, p-values are uniformly distributed under the null hypothesis. Consequently, the combined p-values (using the additive method) are also uniformly distributed under the null hypothesis as shown in (e, j, o, t). The result is that the type I error equals the threshold (0.05) regardless of the number of studies combined, as shown in (g, l, q, v). Panels (h, m, r, w) show that the type II error converges quickly to zero. Combining 10 studies, the type II errors of left and right-tailed DANUBE for the mean are both less than 0.3 compared to 0.51 for the right-tailed t-test. Similarly, using the median, the type II error of DANUBE is less than 0.2 compared to 0.61 for the left-tailed Wilcoxon test.

We randomly select 10,000 samples from the null distribution and use them to construct the empirical distribution of sample means (panels d–m) and likewise of sample medians (panels n–w). For a given empirical distribution, we calculate the probability of observing the discriminating statistic in a study. Panel (d) displays the distribution of empirical p-values for samples drawn from the null distribution H₀; we see that these are uniformly distributed under the null hypothesis. Panel (e) displays the distribution of combined p-values for samples drawn from the null distribution H₀. Each combined p-value is calculated from 10 individual empirical p-values. The blue area to the left of the red dashed line is type I error. Since the individual p-values are uniformly distributed, the combined p-values are also uniformly distributed. Consequently, the type I error of this test is equal to the threshold. Panel (f) displays the distribution of combined p-values for samples drawn from the alternative distribution H₁. The blue area to the right of the red dashed line is the type II error.
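The construction of the empirical null distribution and of the empirical p-values can be sketched as below. This is a simplified reading of the procedure, under stated assumptions: the empirical p-value is taken as the plain fraction of null statistics at least as extreme as the observed one (the exact estimator is defined elsewhere in the paper and may differ), the null population is an Exponential(1) stand-in, and the function names are hypothetical.

```python
import bisect
import random
import statistics

def build_empirical_null(null_population, statistic, n_samples=10_000, seed=0):
    """Sorted values of the statistic over random samples from the null population."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        size = rng.randint(3, 10)  # same randomized sample size as the studies
        values.append(statistic(rng.sample(null_population, size)))
    values.sort()
    return values

def empirical_left_p(observed, null_values):
    """Fraction of null statistics that are <= the observed statistic."""
    return bisect.bisect_right(null_values, observed) / len(null_values)

def empirical_right_p(observed, null_values):
    """Fraction of null statistics that are >= the observed statistic."""
    return (len(null_values) - bisect.bisect_left(null_values, observed)) / len(null_values)

# Assumed stand-in for the null population; 20,000 elements keep the sketch fast.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(20_000)]
null_means = build_empirical_null(population, statistics.fmean)
null_medians = build_empirical_null(population, statistics.median)

study = pop_rng.sample(population, 6)
p_mean = empirical_left_p(statistics.fmean(study), null_means)
p_median = empirical_left_p(statistics.median(study), null_medians)
```

Keeping the null statistics sorted lets each p-value be read off with a binary search, and swapping `statistics.fmean` for `statistics.median` is all that is needed to move from the mean-based panels to the median-based ones.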

Panels (g, h) display the type I and type II error of DANUBE with varying numbers of combined studies. The graphs show that the type I error of DANUBE consistently equals the threshold while type II error decreases when the number of studies increases. When combining 10 studies, the type I and type II errors of the left-tailed DANUBE for the mean are 0.05 and 0.27, respectively, compared to 0.24 and 0.14 for the left-tailed t-test. When the number of the studies increases over 30, one can expect DANUBE to give a 0.05 type I error and an almost zero type II error.
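The behavior of the two error rates can be checked with a short simulation. The sketch below is an assumption-laden illustration rather than a reproduction of Figure 6: the null is Exponential(1), the alternative is the same distribution shifted left by a hypothetical amount `shift`, empirical p-values are left-tailed fractions of the null sample means, and `empirical_error_rates` is a hypothetical helper.

```python
import bisect
import math
import random
import statistics

def additive_combine(p_values):
    """Additive method: Irwin-Hall CDF evaluated at the sum of the p-values."""
    n, s = len(p_values), sum(p_values)
    total = sum((-1) ** k * math.comb(n, k) * (s - k) ** n for k in range(int(s) + 1))
    return total / math.factorial(n)

def empirical_error_rates(n_studies=10, reps=1000, alpha=0.05, shift=0.3, seed=0):
    """Estimate (type I, type II) error of combined left-tailed empirical p-values."""
    rng = random.Random(seed)
    # One study: 3 to 10 observations from the assumed Exp(1) null.
    draw = lambda: [rng.expovariate(1.0) for _ in range(rng.randint(3, 10))]
    null_means = sorted(statistics.fmean(draw()) for _ in range(10_000))

    def left_p(sample):
        return bisect.bisect_right(null_means, statistics.fmean(sample)) / len(null_means)

    false_pos = false_neg = 0
    for _ in range(reps):
        p_null = [left_p(draw()) for _ in range(n_studies)]
        # Alternative: same shape, location moved down by `shift` (assumed value).
        p_alt = [left_p([x - shift for x in draw()]) for _ in range(n_studies)]
        false_pos += additive_combine(p_null) < alpha
        false_neg += additive_combine(p_alt) >= alpha
    return false_pos / reps, false_neg / reps

type1, type2 = empirical_error_rates()
```

Since the empirical p-values are approximately uniform under the null regardless of skewness, `type1` should stay close to `alpha` for any `n_studies`, while `type2` depends on the assumed `shift` and shrinks as more studies are combined.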

Similar to the left-tailed test, right-tailed DANUBE on the mean has the expected type I error and a reasonable type II error as shown in panels (l, m). With 10 studies to be combined, the right-tailed DANUBE’s type I and type II errors are 0.05 and 0.25, respectively, compared to 0.01 and 0.51 for the right-tailed t-test. The results for the mean show that both left- and right-tailed type I errors are equal to the threshold while the type II error decreases rapidly. On the contrary, the left and right-tailed t-tests have unpredictable behavior due to the skewness of the null distribution.

Panels (n–w) show the results of left- and right-tailed DANUBE for the median. As expected, the type I error for the median is also equal to the threshold, regardless of the number of studies that are combined. The test proves to be powerful for both tails, with a type II error of less than 0.2 for 10 studies. When compared to the left-tailed Wilcoxon test on 10 studies, the DANUBE left-tailed type II error is 0.17 as opposed to 0.61.

V. Conclusions

In this paper, we present a new framework to combine the results of multiple studies in order to gain more statistical power. Our framework first calculates the empirical p-values for each study using the empirical distribution of the discriminating statistic. It then combines the empirical p-values using either the Central Limit Theorem or the additive method. The new framework makes no statistical assumptions about the data and is therefore usable in many practical cases when no simple model is appropriate. In addition, use of the additive method makes the framework more robust to outliers.

The advantage of the new meta-analysis framework is demonstrated using both simulation and real-world data. In our simulation study, we compare the results of DANUBE to the classical additive method using the one sample t-test and Wilcoxon signed-rank test. The skewness and the non-normality of the simulated null distribution produce systematic bias in classical meta-analysis, either increasing type I error or decreasing the power of the test. In contrast, the type I error of DANUBE is equal to the threshold cutoff and type II error declines quickly when the number of studies increases.

To evaluate the proposed framework for pathway analysis applications, we examine 7 Alzheimer’s and 9 acute myeloid leukemia datasets using 25 approaches: 6 meta-analysis methods (Stouffer’s, Z-method, Brown’s, Fisher’s, the additive method, and DANUBE), each combined with 4 representative pathway analysis methods (GSA, SPIA, PADOG, and GSEA), plus an independent meta-analysis method, MetaPath. The results confirm the advantage of DANUBE over classical meta-analysis in identifying pathways relevant to the phenotype.

This work describes an important limitation of current meta-analysis techniques, and provides a general statistical approach to increase the power of an analysis method using empirical distributions. With vast databases of biological data being made available, this framework may be powerful because it lets the data speak for itself. The proposed framework is flexible enough to be applicable to various types of studies, including gene-level analysis, pathway analysis, or clinical trials to assess the effect of a therapy in complex diseases.

Supplementary Material

DANUBE_PIEEE_Suppl

NIHMS854489-supplement-DANUBE_PIEEE_Suppl.pdf (2 MB, PDF)

Acknowledgments

This research was supported in part by the following grants: NIH R01 DK089167, R42 GM087013 and NSF DBI-0965741, and by the Robert J. Sokol Endowment in Systems Biology. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of any of the funding agencies. We thank Diana Diaz for help and useful discussion.

Biographies


Tin Nguyen received the BSc and MSc degrees in computer science from Eotvos Lorand University in Budapest, Hungary. He is currently a PhD Candidate and a member of the Intelligent Systems and Bioinformatics Laboratory (ISBL) in the Department of Computer Science at Wayne State University, Michigan. His research interests include computational and statistical methods for analyzing high-throughput data. His current focus is meta-analysis, multi-omics data integration, and disease subtyping.


Cristina Mitrea is a PhD Candidate and a member of the Intelligent Systems and Bioinformatics Laboratory (ISBL) in the Department of Computer Science at Wayne State University. In 2012, she received the Master of Science in Computer Science from Wayne State University. Her work is focused on research in data mining techniques applied to bioinformatics and computational biology. The main focus of her research is developing bioinformatics tools for cancer studies. Other interests include network discovery and meta-analysis applied to pathway analysis. She is also a student member of IEEE and ACM.


Rebecca Tagett has a Bachelors in Physics, a Masters in Molecular Biology, and 10 years R&D experience in industry as a Computational Biologist. A PhD Candidate and a member of the Intelligent Systems and Bioinformatics Laboratory (ISBL) in the Department of Computer Science at Wayne State University, her research focuses on phenotypic prediction using multi-omics. Her interests are Functional Genomics, Scientific Writing, Bioinformatics and Biostatistics. She is a member of the International Society for Computational Biology (ISCB).


Sorin Draghici is the Associate Dean for Innovation and Entrepreneurship, and Director of the James and Patricia Anderson Engineering Ventures Institute in the College of Engineering at Wayne State University. He currently holds the Robert J. Sokol, MD Endowed Chair in Systems Biology, as well as appointments as full professor in the Department of Computer Science and the Department of Obstetrics and Gynecology, Wayne State University. Professor Draghici is also the head of the Intelligent Systems and Bioinformatics Laboratory (ISBL) in the Department of Computer Science. His work is focused on research in artificial intelligence, machine learning and data mining techniques applied to bioinformatics and computational biology. He has published two best-selling books on data analysis of high throughput genomics data, 8 book chapters and over 160 peer-reviewed journal and conference papers. His research laboratory has a strong track record in developing tools for data analysis of high throughput data. His laboratory has developed 8 analysis tools in this area, tools that have been made available over the web for over 10 years to over 11,000 scientists from 5 continents. He has also co-authored 3 analysis packages in Bioconductor. His top 4 papers in this area have over 2,000 total citations, while his entire body of work has gathered over 7,000 citations. During his 17 years as a faculty member, he was able to attract $8,262,283 as PI and $27,418,291 as co-PI in NIH and NSF grants.

Contributor Information

Tin Nguyen, Department of Computer Science, Wayne State University, Detroit, MI 48202.

Cristina Mitrea, Department of Computer Science, Wayne State University, Detroit, MI 48202.

Rebecca Tagett, Department of Computer Science, Wayne State University, Detroit, MI 48202.

Sorin Draghici, Department of Computer Science and the Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI 48202.

References

1. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, Yefanov A, Lee H, Zhang N, Robertson CL, Serova N, Davis S, Soboleva A. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Research. 2013;41(D1):D991–D995. doi: 10.1093/nar/gks1193.
2. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Research. 2002;30(1):207–210. doi: 10.1093/nar/30.1.207. [Online]. Available: http://nar.oxfordjournals.org/cgi/content/abstract/30/1/207.
3. Rustici G, Kolesnikov N, Brandizi M, Burdett T, Dylag M, Emam I, Farne A, Hastings E, Ison J, Keays M, Kurbatova N, Malone J, Mani R, Mupo A, Pereira RP, Pilicheva E, Rung J, Sharma A, Tang YA, Ternent T, Tikhonov A, Welter D, Williams E, Brazma A, Parkinson H, Sarkans U. ArrayExpress update–trends in database growth and links to data analysis tools. Nucleic Acids Research. 2013;41(D1):D987–D990. doi: 10.1093/nar/gks1174.
4. Brazma A, Parkinson H, Sarkans U, Shojatalab M, Vilo J, Abeygunawardena N, Holloway E, Kapushesky M, Kemmeren P, Lara GG, Oezcimen A, Rocca-Serra P, Sansone S-A. ArrayExpress–a public repository for microarray gene expression data at the EBI. Nucleic Acids Research. 2003;31(1):68–71. doi: 10.1093/nar/gkg091.
5. Tseng GC, Ghosh D, Feingold E. Comprehensive literature review and statistical considerations for microarray meta-analysis. Nucleic Acids Research. 2012;40(9):3785–3799. doi: 10.1093/nar/gkr1265.
6. Ramasamy A, Mondry A, Holmes CC, Altman DG. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS Medicine. 2008;5(9):e184. doi: 10.1371/journal.pmed.0050184.
7. Manoli T, Gretz N, Gröne H-J, Kenzelmann M, Eils R, Brors B. Group testing for pathway analysis improves comparability of different microarray datasets. Bioinformatics. 2006;22(20):2500–2506. doi: 10.1093/bioinformatics/btl424.
8. Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas H, Hersch S, Hogarth P, Bouzou B, Jensen R, Krainc D. Genome-wide expression profiling of human blood reveals biomarkers for Huntington’s disease. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(31):11023–11028. doi: 10.1073/pnas.0504921102.
9. Friedman L. Why vote-count reviews don’t count. Biological Psychiatry. 2001;49(2):161–162.
10. Hedges LV, Olkin I. Vote-counting methods in research synthesis. Psychological Bulletin. 1980;88(2):359.
11. Shen K, Tseng GC. Meta-analysis for pathway enrichment analysis when combining multiple genomic studies. Bioinformatics. 2010;26(10):1316–1323. doi: 10.1093/bioinformatics/btq148.
12. Setlur SR, Royce TE, Sboner A, Mosquera J-M, Demichelis F, Hofer MD, Mertz KD, Gerstein M, Rubin MA. Integrative microarray analysis of pathways dysregulated in metastatic prostate cancer. Cancer Research. 2007;67(21):10296–10303. doi: 10.1158/0008-5472.CAN-07-2173.
13. Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM. Meta-analysis of microarrays interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Research. 2002;62(15):4427–4433.
14. Kaever A, Landesfeind M, Feussner K, Morgenstern B, Feussner I, Meinicke P. Meta-analysis of pathway enrichment: combining independent and dependent omics data sets. PloS One. 2014;9(2):e89297. doi: 10.1371/journal.pone.0089297.
15. Mitrea C, Taghavi Z, Bokanizad B, Hanoudi S, Tagett R, Donato M, Voichiţa C, Drǎghici S. Methods and approaches in the topology-based analysis of biological pathways. Frontiers in Physiology. 2013;4:278. doi: 10.3389/fphys.2013.00278.
16. Khatri P, Sirota M, Butte AJ. Ten years of pathway analysis: current approaches and outstanding challenges. PLoS Computational Biology. 2012;8(2):e1002375. doi: 10.1371/journal.pcbi.1002375.
17. Kotelnikova E, Shkrob MA, Pyatnitskiy MA, Ferlini A, Daraselia N. Novel approach to meta-analysis of microarray datasets reveals muscle remodeling-related drug targets and biomarkers in Duchenne muscular dystrophy. PLoS Computational Biology. 2012;8(2):e1002365. doi: 10.1371/journal.pcbi.1002365.
18. Huang DW, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research. 2009;37(1):1–13. doi: 10.1093/nar/gkn923.
19. Kanehisa M, Goto S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research. 2000 Jan;28(1):27–30. doi: 10.1093/nar/28.1.27.
20. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research. 1999;27(1):29–34. doi: 10.1093/nar/27.1.29.
21. Croft D, Mundo AF, Haw R, Milacic M, Weiser J, Wu G, Caudy M, Garapati P, Gillespie M, Kamdar MR, Jassal B, Jupe S, Matthews L, May B, Palatnik S, Rothfels K, Shamovsky V, Song H, Williams M, Birney E, Hermjakob H, Stein L, D’Eustachio P. The Reactome pathway knowledgebase. Nucleic Acids Research. 2014;42(D1):D472–D477. doi: 10.1093/nar/gkt1102.
22. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011;27(12):1739–1740. doi: 10.1093/bioinformatics/btr260.
23. Loughin TM. A systematic comparison of methods for combining p-values from independent tests. Computational Statistics & Data Analysis. 2004;47(3):467–485.
24. Fisher RA. Statistical methods for research workers. Edinburgh: Oliver & Boyd; 1925.
25. Edgington ES. An additive method for combining probability values from independent experiments. The Journal of Psychology. 1972;80(2):351–363.
26. Hall P. The distribution of means for samples of size n drawn from a population in which the variate takes values between 0 and 1, all such values being equally probable. Biometrika. 1927;19(3–4):240–244.
27. Irwin JO. On the frequency distribution of the means of samples from a population having any law of frequency with finite moments, with special reference to Pearson’s Type II. Biometrika. 1927;19(3–4):225–239.
28. Tippett LHC. The methods of statistics. London: Williams & Norgate; 1931.
29. Wilkinson B. A statistical consideration in psychological research. Psychological Bulletin. 1951;48(2):156. doi: 10.1037/h0059111.
30. Li J, Tseng GC. An adaptively weighted statistic for detecting differential gene expression when combining multiple transcriptomic studies. The Annals of Applied Statistics. 2011;5(2A):994–1019.
31. Choi H, Shen R, Chinnaiyan AM, Ghosh D. A latent variable approach for meta-analysis of gene expression data from multiple microarray experiments. BMC Bioinformatics. 2007;8(1):364. doi: 10.1186/1471-2105-8-364.
32. Shen R, Ghosh D, Chinnaiyan AM. Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data. BMC Genomics. 2004;5(1):94. doi: 10.1186/1471-2164-5-94.
33. Barton SJ, Crozier SR, Lillycrop KA, Godfrey KM, Inskip HM. Correction of unexpected distributions of P values from analysis of whole genome arrays by rectifying violation of statistical assumptions. BMC Genomics. 2013;14(1):161. doi: 10.1186/1471-2164-14-161.
34. Fodor AA, Tickle TL, Richardson C. Towards the uniform distribution of null P values on Affymetrix microarrays. Genome Biology. 2007;8(5):R69. doi: 10.1186/gb-2007-8-5-r69.
35. Bland M. Do baseline p-values follow a uniform distribution in randomised trials? PloS One. 2013;8(10):e76010. doi: 10.1371/journal.pone.0076010.
36. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America. 2005;102(43):15545–15550. doi: 10.1073/pnas.0506580102.
37. Efron B, Tibshirani R. On testing the significance of sets of genes. The Annals of Applied Statistics. 2007;1(1):107–129.
38. Tarca AL, Drǎghici S, Bhatti G, Romero R. Down-weighting overlapping genes improves gene set analysis. BMC Bioinformatics. 2012;13(1):136. doi: 10.1186/1471-2105-13-136.
39. Mootha VK, Lindgren CM, Eriksson K-F, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstråle M, Laurila E, Houstis N, Daly MJ, Patterson N, Mesirov JP, Golub TR, Tamayo P, Spiegelman B, Lander ES, Hirschhorn JN, Altshuler D, Groop LC. PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nature Genetics. 2003 Jul;34(3):267–273. doi: 10.1038/ng1180.
40. Khatri P, Drǎghici S, Ostermeier GC, Krawetz SA. Profiling gene expression using Onto-Express. Genomics. 2002;79(2):266–270. doi: 10.1006/geno.2002.6698.
41. Drǎghici S, Khatri P, Martins RP, Ostermeier GC, Krawetz SA. Global functional profiling of gene expression. Genomics. 2003;81(2):98–104. doi: 10.1016/s0888-7543(02)00021-6.
42. Beißbarth T, Speed TP. GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics. 2004 Jun;20:1464–1465. doi: 10.1093/bioinformatics/bth088.
43. Tarca AL, Drǎghici S, Khatri P, Hassan SS, Mittal P, Kim J-s, Kim CJ, Kusanovic JP, Romero R. A novel signaling pathway impact analysis. Bioinformatics. 2009;25(1):75–82. doi: 10.1093/bioinformatics/btn577.
44. Voichita C, Draghici S. ROntoTools: R Onto-Tools suite. R package. 2013. [Online]. Available: http://www.bioconductor.org.
45. Drǎghici S, Khatri P, Tarca AL, Amin K, Done A, Voichiţa C, Georgescu C, Romero R. A systems biology approach for pathway level analysis. Genome Research. 2007;17(10):1537–1545. doi: 10.1101/gr.6202607.
46. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences of the United States of America. 2003;100(16):9440–9445. doi: 10.1073/pnas.1530509100.
47. Edgington ES. A normal curve method for combining probability values from independent experiments. The Journal of Psychology. 1972;82(1):85–89.
48. Stouffer S, Suchman E, DeVinney L, Star S, Williams JRM. The American Soldier: Adjustment during army life. Vol. 1. Princeton: Princeton University Press; 1949.
49. Brown MB. A method for combining nonindependent, one-sided tests of significance. Biometrics. 1975:987–992.
50. Wang X, Kang DD, Shen K, Song C, Lu S, Chang L-C, Liao SG, Huo Z, Tang S, Ding Y, Kaminski N, Sibille E, Lin Y, Li J, Tseng GC. An R package suite for microarray meta-analysis in quality control, differentially expressed gene analysis and pathway enrichment detection. Bioinformatics. 2012;28(19):2534–2536. doi: 10.1093/bioinformatics/bts485.
51. Swerdlow RH. Brain aging, Alzheimer’s disease, and mitochondria. Biochimica et Biophysica Acta (BBA)-Molecular Basis of Disease. 2011;1812(12):1630–1639. doi: 10.1016/j.bbadis.2011.08.012.
52. Maruszak A, Żekanowski C. Mitochondrial dysfunction and Alzheimer’s disease. Progress in Neuro-Psychopharmacology and Biological Psychiatry. 2011;35(2):320–330. doi: 10.1016/j.pnpbp.2010.07.004.
53. Zhu X, Perry G, Smith MA, Wang X. Abnormal mitochondrial dynamics in the pathogenesis of Alzheimer’s disease. Journal of Alzheimer’s Disease. 2013;33:S253–S262. doi: 10.3233/JAD-2012-129005.
54. Querfurth HW, LaFerla FM. Mechanisms of disease. New England Journal of Medicine. 2010;362(4):329–344. doi: 10.1056/NEJMra0909142.
55. Donato M, Xu Z, Tomoiaga A, Granneman JG, MacKenzie RG, Bao R, Than NG, Westfall PH, Romero R, Drăghici S. Analysis and correction of crosstalk effects in pathway analysis. Genome Research. 2013;23(11):1885–1893. doi: 10.1101/gr.153551.112.
56. Brookes PS, Yoon Y, Robotham JL, Anders M, Sheu S-S. Calcium, ATP, and ROS: a mitochondrial love-hate triangle. American Journal of Physiology-Cell Physiology. 2004;287(4):C817–C833. doi: 10.1152/ajpcell.00139.2004.
57. Gosset WS. The Probable Error of a Mean. Biometrika. 1908;6:1–25.
58. Pearson E, Hartley H. Biometrika tables for statisticians. Biometrika Trust; 1976.
59. Wilcoxon F. Individual comparisons by ranking methods. Biometrics. 1945;1(6):80–83.
60. Wilcoxon F, Katti S, Wilcox RA. Critical values and probability levels for the Wilcoxon rank sum test and the Wilcoxon signed rank test. Selected tables in mathematical statistics. 1970;1:171–259.
61. Hollander M, Wolfe DA, Chicken E. Nonparametric statistical methods. Vol. 751. John Wiley & Sons; 2013.

Supplementary Materials

DANUBE_PIEEE_Suppl: NIHMS854489-supplement-DANUBE_PIEEE_Suppl.pdf (2 MB, PDF)
