TABLE 1.

Algorithm | Example | Description
--- | --- | ---
Supervised learning algorithm | |
Regularised regression | LASSO | An extension of classic regression algorithms in which a penalty is applied to the fitted model to limit its complexity and reduce the risk of overfitting |
Tree-based model | Classification and regression trees, random forest, gradient boosted trees (XGBoost) | Based on decision trees (a decision support tool in which a sequence of “if-then-else” splits is derived by iteratively separating data into groups based on the relationship between attributes and outcomes) |
Support vector machine | Linear, hinge loss, radial basis function kernel | Represents data in a multidimensional feature space and fits a “hyperplane” that best separates data based on outcomes of interest |
KNN | KNN | Represents data in a multidimensional feature space and uses local information from the training observations closest to a new data point to predict its outcome |
Neural network | Deep neural networks, ANNs | Nonlinear algorithms built using multiple layers of nodes that extract features from the data and perform combinations that best predict outcomes |
Unsupervised learning algorithm | |
Dimensionality reduction algorithms | Principal component analysis, linear discriminant analysis | Exploits inherent structure to transform data from a high-dimensional space into a low-dimensional space that retains some meaningful attributes of the original data |
LCA | LCA | Identifies hidden population subgroups (latent classes) in the data. Used for datasets with complex constructs involving multiple behaviours. The probability of class membership is estimated indirectly from patterns in the data |
Cluster analysis | K-means, hierarchical cluster analysis | Uses inherent structures in the data to best organise data into subgroups of maximum commonality based on some distance measure between features |
Reinforcement learning algorithm | |
Reinforcement learning | Markov decision process and Q learning | Provides tools to optimise sequences of decisions for the best outcomes or to maximise rewards. Learns by trial and error: an action that results in a positive outcome (reward) is reinforced, and vice versa. The algorithm can improve performance in scenarios where a learning system can choose to repeat known decisions (exploitation) or make novel decisions in the expectation of even greater rewards (exploration) |
KNN: K-nearest neighbour; LCA: latent class analysis; LASSO: least absolute shrinkage and selection operator; ANN: artificial neural network.
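The KNN row above can be illustrated with a minimal from-scratch sketch (pure Python, hypothetical toy data; not any particular library's implementation): the outcome for a new data point is taken by majority vote among its k nearest training observations.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among the k
    training points closest to it in Euclidean distance."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    k_labels = [label for _, label in distances[:k]]
    return Counter(k_labels).most_common(1)[0][0]

# Toy 2-D data: two well-separated groups (illustrative only).
X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(X, y, (2.0, 2.0)))  # → A (nearest neighbours are all "A")
print(knn_predict(X, y, (8.0, 9.0)))  # → B (nearest neighbours are all "B")
```

The choice of k and of the distance measure both shape the decision boundary; Euclidean distance is used here only because it matches the "multidimensional feature space" framing in the table.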
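The cluster-analysis row can likewise be sketched as a minimal K-means loop (Lloyd's algorithm), alternating between assigning points to the nearest centroid and moving each centroid to the mean of its cluster. The toy data and the deterministic initialisation are illustrative assumptions, not part of the review.

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's algorithm for K-means: repeatedly assign each point to its
    nearest centroid, then move each centroid to the mean of its cluster.
    (Real implementations typically use random restarts; the first k points
    are used here as a deterministic, illustrative initialisation.)"""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's old centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Toy 2-D data: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(data, k=2)
print(sorted(len(c) for c in clusters))  # → [3, 3]
```

Euclidean distance stands in for the "distance measure between features" mentioned in the table; hierarchical cluster analysis would instead merge or split groups without fixing k in advance.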
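The reinforcement-learning row mentions Q learning together with the exploitation/exploration trade-off; a minimal tabular sketch on a hypothetical one-dimensional corridor environment (all names, rewards, and parameters are illustrative assumptions) shows both ideas.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy 1-D corridor: states 0..n_states-1,
    actions 0 (step left) and 1 (step right); reward 1 for reaching the
    final state. Q[s][a] estimates the long-run reward of action a in state s."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Exploration vs exploitation: random action with probability
            # epsilon, otherwise the currently best-valued action.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 1 if Q[s][1] >= Q[s][0] else 0
            s_next = s + 1 if a == 1 else max(0, s - 1)
            reward = 1.0 if s_next == n_states - 1 else 0.0
            # Trial-and-error update toward reward plus discounted future value.
            Q[s][a] += alpha * (reward + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

Q = q_learning()
# After training, the greedy policy steps right in every non-terminal state.
print(all(Q[s][1] > Q[s][0] for s in range(4)))
```

Positive outcomes (reaching the rewarded state) reinforce the actions that led there, while the epsilon-random moves supply the exploration the table describes.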