Table 3.
A summary of commonly applied multivariate data analysis approaches in metabolomics studies. Abbreviations: PCA, principal component analysis; PLS-DA, partial least squares discriminant analysis; OPLS-DA, orthogonal partial least squares discriminant analysis; HCA, hierarchical cluster analysis; RF, Random Forest.
| Technique | Unsupervised / Supervised | Characteristics |
|---|---|---|
| PCA | Unsupervised | Exploratory technique that reduces dimensionality by projecting the data onto orthogonal components of maximal variance. Useful for identifying differences between observations, including the variable differences and covariances that underlie them |
| PLS-DA | Supervised | Achieves maximum separation between groups of observations by rotating PCA components. Useful for identifying which variables drive class separation |
| OPLS-DA | Supervised | Removes systematic variation that is not correlated with class membership, which may improve interpretability but not predictive performance |
| HCA | Unsupervised | Exploratory tool for visualizing groupings of observations, represented as a tree (dendrogram) that shows the similarity between observations |
| RF | Supervised or unsupervised | A learning algorithm that uses an ensemble of decision trees to assign class membership to observations |
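
As an illustration of the PCA entry in Table 3, the following minimal Python sketch mean-centres a small intensity matrix, computes its covariance matrix, and extracts the first principal component. The intensities, the control/treated grouping, and all function names are hypothetical and are not taken from any study. Power iteration is used here only to keep the example dependency-free; established packages typically rely on singular value decomposition instead.

```python
import math

# Hypothetical peak intensities: rows are samples, columns are three metabolites.
control = [[1.0, 5.0, 3.0], [1.2, 4.8, 3.1], [0.9, 5.1, 2.9]]
treated = [[4.0, 2.0, 3.0], [4.2, 1.9, 3.2], [3.9, 2.2, 2.8]]
samples = control + treated


def mean_center(rows):
    """Subtract each variable's (column's) mean from its values."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centred data (variables in columns)."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def leading_eigenvector(cov, iterations=200):
    """Approximate the dominant eigenvector of cov by power iteration."""
    p = len(cov)
    # Asymmetric start vector, so it is unlikely to be orthogonal to the answer.
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


centered = mean_center(samples)
cov = covariance(centered)
pc1 = leading_eigenvector(cov)  # loadings of the first principal component

# Scores: projection of each mean-centred sample onto PC1.
scores = [sum(x * w for x, w in zip(row, pc1)) for row in centered]

# Fraction of total variance captured by PC1 (Rayleigh quotient over the trace).
p = len(cov)
eigenvalue = sum(pc1[i] * cov[i][j] * pc1[j] for i in range(p) for j in range(p))
explained = eigenvalue / sum(cov[i][i] for i in range(p))

labels = ["control"] * len(control) + ["treated"] * len(treated)
for label, score in zip(labels, scores):
    print(f"{label}\tPC1 score = {score:+.2f}")
print(f"Variance explained by PC1: {explained:.1%}")
```

In this toy data set the two groups fall on opposite sides of zero along the first component although the class labels are never passed to the calculation, which is the sense in which PCA is unsupervised. The sign of a component is arbitrary, so only the relative positions of the scores are meaningful.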