Table 1.
Network inference methods.
ID | Synopsis | Reference
---|---|---
**Regression:** Transcription factors are selected by target gene-specific (1) sparse linear regression and (2) data resampling approaches. | |
1 | Trustful Inference of Gene REgulation using Stability Selection (TIGRESS): (1) Lasso; (2) the regularization parameter selects five transcription factors per target gene in each bootstrap sample. | 33^a
2 | (1) Steady-state and time-series data are combined by group lasso; (2) bootstrapping. | 34^a
3 | Combination of lasso and Bayesian linear regression models learned using Reversible Jump Markov Chain Monte Carlo simulations. | 35^a
4 | (1) Lasso; (2) bootstrapping. | 36
5 | (1) Lasso; (2) area under the stability selection curve. | 36
6 | Application of the lasso toolbox GENLAB using standard parameters. | 37
7 | Lasso models are combined by the maximum regularization parameter selecting a given edge for the first time. | 36^a
8 | Linear regression determines the contribution of transcription factors to the expression of target genes. | —^a,b
**Mutual Information:** Edges are (1) ranked based on variants of mutual information and (2) filtered for causal relationships. | |
1 | Context likelihood of relatedness (CLR): (1) Spline estimation of mutual information; (2) the likelihood of each mutual information score is computed based on its local network context. | 11^a,b
2 | (1) Mutual information is computed from discretized expression values. | 38^a,b
3 | Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE): (1) Kernel estimation of mutual information; (2) the data processing inequality is used to identify direct interactions. | 9^a,b
4 | (1) Fast kernel-based estimation of mutual information; (2) Bayesian Local Causal Discovery (BLCD) and Markov blanket (HITON-PC) algorithms to identify direct interactions. | 39^a
5 | (1) Mutual information and Pearson’s correlation are combined; (2) BLCD and HITON-PC algorithms. | 39^a
**Correlation:** Edges are ranked based on variants of correlation. | |
1 | Absolute value of Pearson’s correlation coefficient. | 38
2 | Signed value of Pearson’s correlation coefficient. | 38^a,b
3 | Signed value of Spearman’s correlation coefficient. | 38^a,b
**Bayesian Networks:** Posterior probabilities are optimized by different heuristic searches. | |
1 | Simulated annealing (catnet R package, http://cran.r-project.org/web/packages/catnet); aggregation of three runs. | —
2 | Simulated annealing (catnet R package, http://cran.r-project.org/web/packages/catnet). | —
3 | Max-Min Parents and Children algorithm (MMPC); bootstrapped datasets. | 40
4 | Markov blanket algorithm (HITON-PC); bootstrapped datasets. | 41
5 | Markov boundary induction algorithm (TIE*); bootstrapped datasets. | 42
6 | Transcription factor perturbation data and time series are modeled using dynamic Bayesian networks (Infer.NET toolbox, http://research.microsoft.com/infernet). | —^a
**Other Approaches:** Network inference by heterogeneous and novel methods. | |
1 | GENIE3: A random forest is trained to predict target gene expression. Putative transcription factors are selected as tree nodes if they consistently reduce the variance of the target. | 19^a
2 | Co-dependencies between transcription factors and target genes are detected by the non-linear correlation coefficient η² (two-way ANOVA). Transcription factor perturbation data are up-weighted. | 20^a
3 | Transcription factors are selected that maximize the conditional entropy for target genes, which are represented as Boolean vectors with probabilities to avoid discretization. | 43^a
4 | Transcription factors are preselected from transcription factor perturbation data or by Pearson’s correlation and then tested by iterative Bayesian Model Averaging (BMA). | 44
5 | A Gaussian noise model is used to estimate whether the expression of a target gene changes in transcription factor perturbation measurements. | 45
6 | After scaling, target genes are clustered by Pearson’s correlation. A neural network is trained (genetic algorithm) and parameterized (back-propagation). | 46^a
7 | Data are discretized by Gaussian mixture models and clustering (Ckmeans); interactions are detected by generalized logical network modeling (χ² test). | 47^a
8 | The χ² test is applied to evaluate the probability of a shift in transcription factor and target gene expression in transcription factor perturbation experiments. | 47^a
**Meta Predictors:** (1) Multiple inference approaches are applied and (2) aggregate scores are computed. | |
1 | (1) Z-scores for target genes in transcription factor knockout data, time-lagged CLR for time series, and linear ordinary differential equation models constrained by lasso (Inferelator); (2) resampling approach. | 48^a
2 | (1) Pearson’s correlation, mutual information, and CLR; (2) rank average. | —
3 | (1) Target gene responses in transcription factor knockout data are calculated, and full-order and partial correlation and transcription factor–target co-deviation analyses are applied; (2) weighted average with weights trained on simulated data. | —^a
4 | (1) CLR filtered by negative Pearson’s correlation, least angle regression (LARS) of time series, and transcription factor perturbation data; (2) combination by z-scores. | 49
5 | (1) Pearson’s correlation, differential expression (limma), and time series analysis (maSigPro); (2) naïve Bayes. | —^a
Methods have been manually categorized based on participant-supplied descriptions. Within each class, methods are sorted by overall performance (see Figure 2a). Note that generic references have been used if more specific ones were not available.
^a Detailed method description included in Supplementary Note 10.
^b Off-the-shelf algorithm applied by challenge organizers.
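Several of the regression entries above (methods 4 and 5 in particular) pair the lasso with bootstrapping: a transcription factor is kept for a target gene only if it receives a nonzero weight in a large fraction of resampled datasets. A minimal numpy sketch of that idea, using a hand-rolled coordinate-descent lasso; the penalty `lam`, the number of bootstraps, and all variable names are illustrative choices, not settings used by any challenge team:

```python
import numpy as np

def soft_threshold(z, g):
    return np.sign(z) * max(abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_sweeps=100):
    """Lasso via cyclic coordinate descent; assumes X columns are roughly standardized."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            # partial residual with predictor j removed from the fit
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r / n
            beta[j] = soft_threshold(rho, lam) / (X[:, j] @ X[:, j] / n)
    return beta

def stability_selection(X, y, lam=0.2, n_boot=30, seed=0):
    """Fraction of bootstrap resamples in which each TF gets a nonzero weight."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # sample rows with replacement
        counts += lasso_cd(X[idx], y[idx], lam) != 0
    return counts / n_boot
```

Regulating TFs then show selection frequencies near 1 while noise predictors are rarely chosen; thresholding this frequency (or, as in method 5, integrating it along the regularization path) yields the edge ranking.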
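The mutual-information entries can be sketched similarly: MI is estimated from discretized expression values (as in method 2 of that class), and a CLR-style background correction (method 1) z-scores each edge against the MI score distributions of the two genes it connects. This is an assumed simplification of CLR, using equal-frequency binning rather than the spline estimator of the original; bin counts and names are illustrative:

```python
import numpy as np

def discretize(x, bins=5):
    """Equal-frequency binning of a 1-D expression profile."""
    edges = np.quantile(x, np.linspace(0, 1, bins + 1)[1:-1])
    return np.digitize(x, edges)

def mutual_information(x, y, bins=5):
    """MI between two profiles, estimated from their joint histogram."""
    joint = np.zeros((bins, bins))
    for a, b in zip(discretize(x, bins), discretize(y, bins)):
        joint[a, b] += 1.0
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px * py)[nz])))

def clr_scores(expr):
    """CLR-style correction: z-score each MI value against the score
    distributions of both genes, clip negatives, combine."""
    g = expr.shape[0]
    mi = np.zeros((g, g))
    for i in range(g):
        for j in range(i + 1, g):
            mi[i, j] = mi[j, i] = mutual_information(expr[i], expr[j])
    mean = mi.mean(axis=1, keepdims=True)
    std = mi.std(axis=1, keepdims=True) + 1e-12
    z = np.clip((mi - mean) / std, 0.0, None)
    return np.sqrt(z ** 2 + z.T ** 2)
```

Ranking edges by `clr_scores` rather than raw MI discounts genes whose profiles score highly against everything, which is the point of the local-network-context step.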
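The correlation class is the simplest baseline in the table: every TF–target edge is scored by Pearson’s correlation (absolute or signed) and the edges are sorted. A small sketch; the matrix layout (genes in rows, samples in columns) is an assumption:

```python
import numpy as np

def rank_edges_by_correlation(expr_tfs, expr_targets, signed=False):
    """Return (tf, target, score) triples sorted from strongest to weakest.
    expr_tfs: (n_tf, samples); expr_targets: (n_target, samples)."""
    n_tf, n_tg = expr_tfs.shape[0], expr_targets.shape[0]
    # TF-vs-target block of the full correlation matrix
    c = np.corrcoef(np.vstack([expr_tfs, expr_targets]))[:n_tf, n_tf:]
    score = c if signed else np.abs(c)
    flat = np.argsort(score, axis=None)[::-1]
    return [(int(i // n_tg), int(i % n_tg), float(score.flat[i])) for i in flat]
```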
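Methods that exploit transcription factor perturbation data (method 5 of the "other approaches" class, and the z-score component of meta predictor 1) ask whether a target’s expression after a TF knockout is implausible under its wild-type distribution. Under a Gaussian noise model this reduces to a z-score per gene; the array shapes and names here are assumptions, not the published procedures:

```python
import numpy as np

def perturbation_zscores(wildtype, knockout):
    """z-score each gene's knockout expression against its wild-type
    replicate distribution; a large |z| suggests the knocked-out TF
    regulates that gene.
    wildtype: (replicates, genes); knockout: (genes,)."""
    mean = wildtype.mean(axis=0)
    std = wildtype.std(axis=0, ddof=1) + 1e-12  # guard against zero variance
    return (knockout - mean) / std
```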
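Finally, the simplest aggregation used by the meta predictors (method 2: rank average) can be sketched directly: each base method’s edge scores are converted to ranks, and the ranks are averaged so that no single method’s score scale dominates the combination:

```python
import numpy as np

def rank_average(score_matrices):
    """Aggregate several TF-by-target score matrices by averaging
    edge ranks (higher score -> higher rank)."""
    ranks = []
    for s in score_matrices:
        flat = s.ravel()
        r = np.empty_like(flat)
        r[np.argsort(flat)] = np.arange(flat.size)  # rank 0 = weakest edge
        ranks.append(r.reshape(s.shape))
    return np.mean(ranks, axis=0)
```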