2024 Sep 23;53(9):afae201. doi: 10.1093/ageing/afae201

Table 1.

Popular machine learning algorithms: algorithm, learning type and basic working principle. For details see [10, 30].

Algorithm	Learning type	Basic working principle
Regularised Linear Regression	Supervised	Extension of linear regression that adds a regularisation penalty to prevent overfitting. Finds a linear relationship between input variables and a continuous output variable
Regularised Logistic Regression	Supervised	Extension of logistic regression that adds a regularisation penalty to prevent overfitting. Estimates probabilities using a logistic function; often used for binary classification
Regularised Cox Proportional Hazards Model	Supervised	Extension of Cox regression that adds a regularisation penalty term. Used in survival analysis to model the time until an event occurs, relating survival time to one or more predictors while accommodating censored data. Many machine learning algorithms have a variant for modelling time until an event occurs
Decision Trees	Supervised	Splits data into branches to form a tree structure, making decisions based on features
Random Forest	Supervised	Ensemble of decision trees used for classification and regression; combining many trees improves accuracy and reduces overfitting relative to a single tree
XGBoost	Supervised	A highly optimised gradient boosting library known for its speed and performance. It combines decision trees (‘weak learners’) sequentially to build a stronger learner: each weak learner is trained on the errors of the preceding ones, targeting areas of poor model performance
Support Vector Machines	Supervised	Separates data points into distinct categories by finding the best dividing line (or plane in more complex multidimensional situations) between different sets of data points
K-Nearest Neighbours	Supervised	Classifies data based on the majority vote of its ‘k’ nearest neighbours in the feature space
K-Means Clustering	Unsupervised	Partitions data into ‘k’ distinct clusters based on feature similarity
Hierarchical Clustering	Unsupervised	Creates a tree of clusters by iteratively grouping data points based on their similarity
Self-Organising Maps	Unsupervised	Neural network based, used for dimensionality reduction and visualisation, organising high-dimensional data into a low-dimensional map
Principal Component Analysis	Unsupervised	Reduces data dimensionality by transforming the data into a new, smaller set of uncorrelated variables (principal components)
Neural Networks	Supervised/Unsupervised	Composed of interconnected nodes (‘neurons’) loosely inspired by the human brain; used for complex pattern recognition
Deep Learning	Supervised/Unsupervised/Reinforcement	Uses multilayered neural networks to analyse large and complex datasets, excelling at tasks such as image and speech recognition and natural language processing. Usually not recommended for small datasets
Q-Learning	Reinforcement	A model-free reinforcement learning algorithm that seeks to learn a policy, which tells an agent what action to take under what circumstances
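
The sequential principle described in the XGBoost row (each weak learner is trained on the errors of the preceding ones) can be illustrated with a minimal sketch. This is not XGBoost's implementation, which additionally uses regularisation and second-order gradient information. It is plain gradient boosting for squared error with one-split trees (‘stumps’) on a single feature, and the function names, toy data, number of rounds and learning rate are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: boosting one-split trees ("stumps") on residuals.
# All names, data and settings below are hypothetical, not taken from the article.

def mse(y, predictions):
    """Mean squared error between observed and predicted values."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions)) / len(y)

def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) of the least-squares split."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # skip the max so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean

def boost(x, y, n_rounds=20, learning_rate=0.5):
    """Fit stumps sequentially, each one on the current residuals (errors)."""
    base = sum(y) / len(y)               # start from the mean outcome
    predictions = [base] * len(y)
    stumps, history = [], [mse(y, predictions)]
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)  # weak learner targets what is still wrong
        stumps.append(stump)
        predictions = [pi + learning_rate * predict_stump(stump, xi)
                       for xi, pi in zip(x, predictions)]
        history.append(mse(y, predictions))
    return base, stumps, history

def predict(base, stumps, xi, learning_rate=0.5):
    """Sum the base value and the shrunken contribution of every stump."""
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)

# Toy data: the outcome jumps from about 1 to about 3 between x = 4 and x = 5.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
base, stumps, history = boost(x, y)
print("training MSE before and after boosting:", history[0], history[-1])
```

The learning rate shrinks each stump's contribution so that no single weak learner dominates. On real data the number of rounds and the learning rate would be chosen by validation on held-out data rather than fixed in advance.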