Table 1. Overview of common machine learning algorithms, their acronyms and variants, and key references. Minimal usage sketches for most of these methods are given after the table.
Algorithm | Acronyms/Variations | Description | Advantages | Disadvantages | References |
---|---|---|---|---|---|
Regressions | LASSO, Ridge, Elastic Net | Fit a simple function's parameters to minimize the sum of squared distances to the observed data | | | Tibshirani14, Kennedy13, Zou & Hastie15 |
Support Vector Machines | SVM, SVR | Use support vectors to identify decision boundaries in the data | | | Cortes & Vapnik16 |
k-nearest neighbors | k-NN | Classify new data based on the labels of surrounding neighbors | | | Cover & Hart17 |
k-means clustering | k-means | Assign each point in the dataset to one of k clusters so as to minimize the within-cluster variance relative to the cluster's centroid | | | Lloyd18 |
Principal Component Analysis | PCA, POD, SVD, EVD, KLT | Change coordinates to an orthonormal basis that maximizes the variance of the data along the new coordinates | | | Bishop8, Wold et al.157 |
Decision Trees and Random Forests | RF, AdaBoost, XGBoost | Build flowchart-like decision trees whose questions are learned from the data, then ensemble multiple trees to form a random forest | | | Breiman20 |
Artificial Neural Networks | (A)NN, CNN, RNN, DNN, GAN | Directed, weighted acyclic graph of neurons arranged in layers, using a propagation function to transmit information | | | Chollet7, Bishop8 |
Naïve Bayes | N/A | Classification using Bayes' theorem, assuming independence between features to model the class-conditional probability | | | Bishop8 |
Linear Discriminant Analysis | LDA, NDA | Find a linear combination of features that separates the input data into classes | | | Duda et al.158 |
Gaussian Mixture Model | GMM | Assume the data follow a linear combination of Gaussian distributions whose parameters are estimated from the data | | | Bishop8 |
Spectral Clustering | N/A | Use eigenvalue decomposition to cluster based on the similarity matrix whose entries A_ij express the degree of similarity between points i and j | | | Ng et al.159 |
Mean Shift | N/A | A centroid-based clustering method using an iterative approach to search through neighborhoods of points and locate the modes of a density function | | | Comaniciu & Meer160 |
Isomap | N/A | Non-linear dimensionality reduction using an isometric mapping (a distance-preserving transformation between metric spaces) | | | Tenenbaum et al.161 |
Locally Linear Embedding | LLE, HLLE, MLLE | Non-linear dimensionality reduction that reconstructs each data point from linear combinations of its projected neighbors | | | Roweis & Saul162 |
Diffusion Maps | N/A | Non-linear feature extraction and dimensionality reduction in which distances between points are defined in terms of probabilities of diffusion | | | Coifman et al.163 |
t-distributed Stochastic Neighbor Embedding | t-SNE | Data visualization tool that defines the similarity between two points as the conditional probability that one would pick the other as its neighbor if neighbors were picked according to a Student t-distribution centered at the first point | | | Roweis & Hinton164 |
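Most of the methods in Table 1 are implemented in standard libraries such as scikit-learn. As a minimal illustration of the regressions in the first row, the following sketch fits LASSO, Ridge, and Elastic Net models; the synthetic dataset, the regularization strengths (`alpha`, `l1_ratio`), and all variable names are illustrative assumptions rather than settings taken from the cited references.

```python
# Minimal sketch: regularized linear regressions with scikit-learn.
# Dataset and hyperparameters are illustrative placeholders.
import numpy as np
from sklearn.linear_model import Lasso, Ridge, ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))           # 200 samples, 10 features
w_true = np.zeros(10)
w_true[:3] = [2.0, -1.0, 0.5]            # only 3 features are informative
y = X @ w_true + 0.1 * rng.normal(size=200)

for model in (Lasso(alpha=0.1), Ridge(alpha=1.0),
              ElasticNet(alpha=0.1, l1_ratio=0.5)):
    model.fit(X, y)
    # LASSO and Elastic Net tend to zero out the uninformative coefficients.
    print(type(model).__name__, np.round(model.coef_, 2))
```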
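A similar sketch for the SVM and k-NN classifiers, assuming a toy two-class dataset; the RBF kernel, the regularization constant, and the number of neighbors are illustrative choices, not recommendations.

```python
# Minimal sketch: SVM and k-NN classification on a toy two-moons dataset.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)      # kernel SVM
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print("SVM accuracy:", svm.score(X_test, y_test))
print("k-NN accuracy:", knn.score(X_test, y_test))
```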
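For the tree-based methods, a single decision tree can be compared directly with the ensembles built on top of it; XGBoost ships as a separate package and is not shown here. The dataset and ensemble sizes below are illustrative.

```python
# Minimal sketch: a single decision tree versus tree ensembles.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for clf in (DecisionTreeClassifier(random_state=0),
            RandomForestClassifier(n_estimators=100, random_state=0),
            AdaBoostClassifier(n_estimators=100, random_state=0)):
    clf.fit(X_train, y_train)
    # Ensembles typically outperform the single tree on held-out data.
    print(type(clf).__name__, clf.score(X_test, y_test))
```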
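For artificial neural networks, the sketch below covers only the plain feed-forward case via scikit-learn's multi-layer perceptron; CNN, RNN, and GAN variants would typically be built with a dedicated deep learning framework such as Keras (Chollet7). Layer sizes and iteration counts are illustrative.

```python
# Minimal sketch: a small feed-forward neural network (multi-layer perceptron).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)      # 1797 samples, 64 features
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nn = MLPClassifier(hidden_layer_sizes=(64, 32),  # two hidden layers
                   max_iter=500, random_state=0)
nn.fit(X_train, y_train)
print("test accuracy:", nn.score(X_test, y_test))
```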
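Naïve Bayes and linear discriminant analysis can be sketched on the same toy data; the Gaussian variant of naïve Bayes is assumed here, and the two-component LDA projection is an illustrative choice.

```python
# Minimal sketch: Gaussian naive Bayes and linear discriminant analysis.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

nb = GaussianNB().fit(X, y)              # class-conditional Gaussians, independent features
lda = LinearDiscriminantAnalysis(n_components=2).fit(X, y)
X_proj = lda.transform(X)                # data projected onto 2 discriminant axes
print("naive Bayes training accuracy:", nb.score(X, y))
```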
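The four clustering approaches in the table (k-means, Gaussian mixtures, spectral clustering, and mean shift) can all be run on the same toy data; the number of clusters and the blob dataset are illustrative assumptions.

```python
# Minimal sketch: four clustering approaches from Table 1 on the same toy data.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, SpectralClustering, MeanShift
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

labels_km = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
labels_gmm = GaussianMixture(n_components=3, random_state=0).fit_predict(X)
labels_sc = SpectralClustering(n_clusters=3, random_state=0).fit_predict(X)
labels_ms = MeanShift().fit_predict(X)   # number of clusters inferred from the data
print("mean shift found", len(set(labels_ms)), "clusters")
```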
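The linear and non-linear dimensionality reduction methods (PCA, Isomap, LLE, and t-SNE) share a common interface in scikit-learn; the digits dataset and the two-component targets below are illustrative.

```python
# Minimal sketch: linear and nonlinear dimensionality reduction to 2 components.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, LocallyLinearEmbedding, TSNE

X, _ = load_digits(return_X_y=True)      # 1797 samples, 64 features

X_pca = PCA(n_components=2).fit_transform(X)                      # linear
X_iso = Isomap(n_components=2).fit_transform(X)                   # geodesic distances
X_lle = LocallyLinearEmbedding(n_components=2,
                               n_neighbors=10).fit_transform(X)   # local reconstructions
X_tsne = TSNE(n_components=2, random_state=0).fit_transform(X)    # visualization
```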
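Diffusion maps have no scikit-learn implementation, so the following is a bare-bones sketch of the idea, assuming a Gaussian kernel with a hand-picked bandwidth `eps` and a single diffusion step; it is an illustration of the construction, not a tuned implementation of Coifman et al.163.

```python
# Minimal sketch of a diffusion map, assuming a Gaussian kernel with
# bandwidth `eps`; illustrative only, not a tuned implementation.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh

def diffusion_map(X, eps=1.0, n_components=2):
    D2 = squareform(pdist(X, "sqeuclidean"))   # pairwise squared distances
    K = np.exp(-D2 / eps)                      # Gaussian affinity matrix
    d = K.sum(axis=1)                          # degree of each point
    # Eigendecompose the symmetric normalization D^{-1/2} K D^{-1/2}, which
    # shares its eigenvalues with the Markov transition matrix P = D^{-1} K.
    A = K / np.sqrt(np.outer(d, d))
    vals, vecs = eigh(A)                       # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][1:n_components + 1]  # skip the trivial eigenpair
    psi = vecs[:, idx] / np.sqrt(d)[:, None]   # right eigenvectors of P
    return psi * vals[idx]                     # diffusion coordinates at t = 1

X = np.random.default_rng(0).normal(size=(100, 3))
Y = diffusion_map(X, eps=2.0)                  # 100 x 2 diffusion coordinates
```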