Author manuscript; available in PMC: 2018 Aug 7.
Published in final edited form as: Expert Rev Mol Diagn. 2018 Feb 16;18(3):219–226. doi: 10.1080/14737159.2018.1439380

Table 1.

Summary of machine learning approaches that show promise for addressing the problems posed by trait heterogeneity. The Approaches column lists example algorithms in each method category. Strengths and Limitations are described with respect to applications on EHR-derived data. The Biomedical Applications column lists traits for which subgroups have been identified using the respective method.

| Method Category | Approaches | Strengths | Limitations | Biomedical Applications |
| --- | --- | --- | --- | --- |
| Cluster analysis | Hierarchical, k-means | Wide range of applications; easy interpretation | Not robust to high-dimensional data or large datasets; most approaches restricted to one data type; some approaches require the number of clusters to be specified | COPD[11,37], Fibromyalgia[39], Tinnitus[40], Diabetes[41], Obesity[42,43] |
| Topological approaches | TDA, manifold learning algorithms | Able to handle high-dimensional and noisy data; does not require knowledge of the number of clusters; sensitive to global and local structure | Optimization of free parameters; computational cost; correct application requires deep knowledge of topological methods | T2D[9], Breast cancer[53], Attention deficit[52] |
| Dimensionality reduction | Linear (PCA), Non-linear (MDS, t-SNE, Isomap, LLE) | Able to handle high-dimensional, noisy data; does not require knowledge of the number of clusters | Optimization of free parameters; many methods are non-parametric and do not provide information on how dimensionality was reduced; projection loss; result inconsistency; computational cost | COPD[37,46], Changes during anesthesia[54], Temporal lobe epilepsy[55] |
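
The cluster-analysis limitation that some approaches require the number of clusters can be illustrated with a minimal k-means sketch. This is not code from any of the cited studies: the function name `kmeans`, its parameters, and the six two-feature toy records in `points` are hypothetical and chosen only for illustration. The sketch uses plain Lloyd iterations with the Python standard library, and the caller has to supply `k` before any subgroup can be found.

```python
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Illustrative Lloyd's algorithm; the caller must choose k in advance."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Hypothetical toy data: two well-separated groups of two-feature records.
points = [
    (1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
]

# k is an input, not something the algorithm discovers.
labels, centroids = kmeans(points, k=2)
print(labels)
```

Calling `kmeans(points, k=3)` on the same data would still return a partition, which is why the choice of `k` normally has to be justified separately. The topological and dimensionality-reduction rows of the table list not needing this choice as a strength.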