. 2024 Sep 23;53(9):afae201. doi: 10.1093/ageing/afae201

Table 1.

Popular machine learning algorithms: algorithm, learning type and basic working principle. For details, see [10, 30].

| Algorithm | Learning type | Basic working principle |
| --- | --- | --- |
| Regularised Linear Regression | Supervised | Extension of linear regression that includes a regularisation penalty to prevent overfitting. Finds a linear relationship between input variables and a continuous output variable |
| Regularised Logistic Regression | Supervised | Extension of logistic regression that includes a regularisation penalty to prevent overfitting. Estimates probabilities using a logistic function, often used for binary classification |
| Regularised Cox Proportional Hazards Model | Supervised | Extension of Cox regression that includes a regularisation penalty term. Used in survival analysis to model the time until an event occurs, focusing on the relationship between survival time and one or more predictors and including censored data. Many machine learning algorithms have a version which allows modelling time until an event occurs |
| Decision Trees | Supervised | Splits data into branches to form a tree structure, making decisions based on features |
| Random Forest | Supervised | Ensemble of Decision Trees, used for classification and regression, improving accuracy and reducing overfitting |
| XGBoost | Supervised | A highly optimised machine learning library known for its speed and performance. It combines decision trees (‘weak learners’) sequentially to develop stronger learners to improve predictions. This is done by training each weak learner on the errors of the preceding one, targeting areas of poor model performance |
| Support Vector Machines | Supervised | An algorithm that helps to separate data points into distinct categories by finding the best dividing line (or plane in more complex multidimensional situations) between different sets of data points |
| K-Nearest Neighbours | Supervised | Classifies data based on the majority vote of its ‘k’ nearest neighbours in the feature space |
| K-Means Clustering | Unsupervised | Partitions data into ‘k’ distinct clusters based on feature similarity |
| Hierarchical Clustering | Unsupervised | Creates a tree of clusters by iteratively grouping data points based on their similarity |
| Self-Organising Maps | Unsupervised | Neural network based, used for dimensionality reduction and visualisation, organising high-dimensional data into a low-dimensional map |
| Principal Component Analysis | Unsupervised | Reduces data dimensionality by transforming to a new set of variables (principal components) |
| Neural Networks | Supervised/Unsupervised | Composed of interconnected nodes or neurons, mimicking the human brain, used in complex pattern recognition |
| Deep Learning | Supervised/Unsupervised/Reinforcement | Utilises multilayered neural networks to analyse large and complex datasets, excelling in complex tasks like image and speech recognition and natural language processing. Usually not recommended for small data sets |
| Q-Learning | Reinforcement | A model-free reinforcement learning algorithm that seeks to learn a policy, which tells an agent what action to take under what circumstances |
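
To make the distinction between the supervised and unsupervised learning types in Table 1 concrete, the following is a minimal sketch in plain Python of two of the listed algorithms: K-Nearest Neighbours (majority vote among labelled neighbours) and K-Means Clustering (grouping unlabelled points by similarity). The data points, the labels "low" and "high", and the function names `knn_predict` and `kmeans` are made up for illustration and do not come from the article; the K-Means initialisation (the first `k` points) is a simplifying assumption, whereas practical implementations typically use more robust initialisation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: label `query` by majority vote of its k nearest labelled points."""
    # Indices of the k training points closest to the query (Euclidean distance).
    nearest = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def kmeans(points, k=2, n_iter=10):
    """Unsupervised: partition unlabelled points into k clusters (Lloyd's algorithm)."""
    # Assumption: naive initialisation using the first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return assignments, centroids


# Hypothetical toy data: two features per observation, two well-separated groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.9, 1.1), (4.8, 5.1)]
labels = ["low", "high", "low", "high", "low", "high"]

# Supervised use: the labels are needed to classify a new observation.
print(knn_predict(points, labels, query=(1.1, 1.0), k=3))

# Unsupervised use: the labels are ignored; structure is inferred from the data alone.
assignments, centroids = kmeans(points, k=2)
print(assignments, centroids)
```

The key design difference is visible in the function signatures: `knn_predict` requires `train_labels`, whereas `kmeans` receives only the feature values and returns cluster assignments that carry no inherent meaning until interpreted.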