. 2019 Aug 28;10:627. doi: 10.3389/fgene.2019.00627

Table 1.

A high-level comparison of the multitable analysis methods discussed in this review. The purpose of this table is to give rules of thumb that can guide practical application, where choices invariably depend on the scale and structure of the data, the goals of the analysis, the expected number of future workflow applications, and the availability of programming and computation time.

| Property | Algorithms | Consequence |
|---|---|---|
| Analytical solution | Concat. PCA, CCA, CoIA, MFA, PTA, Statico/Costatis | Methods with analytical solutions generally run much faster than those that require iterative updates, optimization, or Monte Carlo sampling. However, they tend to be restricted to more classical settings. |
| Require covariance estimate | Concat. PCA, CCA, CoIA, MFA, PTA, Statico/Costatis | Methods that require estimates of covariance matrices cannot be applied to data with more variables than samples, and become unstable in high-dimensional settings. |
| Sparsity | SPLS, Graph-Fused Lasso | Encouraging sparsity on scores or loadings can result in more interpretable results for high-dimensional data sets. These methods provide automatic variable selection in the multitable analysis problem. |
| Tuning parameters | Sparsity: Graph-Fused Lasso, PMD, SPLS; Number of Factors: PCA-IV, Red. Rank Regression, Mixed-Membership CCA; Prior Parameters: Mixed-Membership CCA, Bayesian Multitask Regression | Methods with many tuning parameters are often more expressive than those without any, since the parameters make it possible to adapt to different degrees of model complexity. However, in the absence of automatic tuning strategies, these methods are typically more difficult to use effectively. |
| Probabilistic | Mixed-Membership CCA, Bayesian Multitask Regression | Probabilistic techniques provide estimates of uncertainty, along with representations of cross-table covariation. This comes at the cost of more involved computation and difficulty in assessing convergence. |
| Not Normal or Nonlinear | CCpNA, Mixed-Membership CCA, Bayesian Multitask Regression | When data are not normal (and are difficult to transform to normality) or there are sources of nonlinear covariation across tables, it can be beneficial to directly model this structure. |
| >2 Tables | Concat. PCA, CCA, MFA, PMD | Methods that allow more than two tables are applicable in a wider range of multitable problems. Note that these are a subset of the cross-table symmetric methods. |
| Cross-Table Symmetry | Concat. PCA, CCA, CoIA, Statico/Costatis, MFA, PMD | Cross-table symmetry refers to the idea that some methods do not need a supervised or multitask setup, where one table contains the response variables and the other contains the predictors. The results of these methods do not change when the two tables are swapped in the method input. |
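
The "Require covariance estimate" row can be illustrated with a small numerical sketch. It is not taken from the review; it assumes simulated Gaussian data, and the helper name `sample_covariance_rank` is hypothetical. The point it is meant to show is that, once there are more variables than samples, the sample covariance matrix is rank deficient, so methods that need to invert it (as classical CCA does) are not well defined without some form of regularization.

```python
import numpy as np


def sample_covariance_rank(n_samples, n_variables, seed=0):
    """Return (rank, n_variables) for the sample covariance of simulated data.

    Hypothetical helper for illustration only: the data are i.i.d. standard
    Gaussian draws, not a real multitable data set.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_variables))
    # rowvar=False treats columns as variables, giving a
    # (n_variables x n_variables) covariance matrix.
    S = np.cov(X, rowvar=False)
    return int(np.linalg.matrix_rank(S)), n_variables


# More samples than variables: the covariance estimate is full rank.
rank_tall, p_tall = sample_covariance_rank(n_samples=100, n_variables=10)

# More variables than samples: the rank is at most n_samples - 1
# (one degree of freedom is lost to centering), so S cannot be inverted.
rank_wide, p_wide = sample_covariance_rank(n_samples=10, n_variables=100)

print(f"n=100, p={p_tall}: rank {rank_tall}")
print(f"n=10,  p={p_wide}: rank {rank_wide}")
```

In the second case the rank falls well short of the number of variables, which is one reason the sparse and regularized methods in the table may be preferable in high-dimensional settings.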