Table 1.
Algorithms | Objectives | References |
---|---|---|
Linear regression | Used to estimate real values (the dependent variable, such as predicted FEV1) from continuous independent variables (such as height, age, etc.) [46]. It is termed simple with one independent variable and multiple with more than one | Harrell FE Jr. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer International; 2015 |
Logistic regression^a | A classification algorithm used to estimate discrete values (yes/no, true/false, etc.) [46]. Since it predicts the probability of developing a certain condition, its output values lie between 0 and 1 | Harrell FE Jr. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer International; 2015 |
Support vector machine (SVM) | A classification algorithm in which each data item is plotted as a point whose coordinates are its feature values; the points nearest the separating boundary are known as support vectors [47]. Data are split by a line (called the classifier) equidistant from the closest point of each group | Steinwart I, Christmann A. Support vector machines. Springer New York; 2008 |
Naïve Bayes | A classification algorithm that assumes independence between predictors [48]. It is simple, useful for large datasets, and known to outperform more sophisticated classification methods | Mitchell T. Machine learning. McGraw-Hill Education; 1997 |
Decision tree | A supervised learning algorithm mostly used for classification problems with both categorical and continuous dependent variables [49]. It splits the population into homogeneous sets based on the most significant attributes (independent variables) | Sheppard C. Tree-based machine learning algorithms: decision trees, random forests, and boosting. CreateSpace; 2017 |
Random forest^a | An ensemble of decision trees. Each tree classifies a new object on the basis of its attributes ("votes"), and the forest chooses the classification with the most votes. Combining trees improves the accuracy of the model [50] | Smith C, Koning M. Decision trees and random forests: a visual introduction for beginners. Blue Windmill Media; 2017 |
Gradient boosting machines (GBM)^a | Have high predictive power and are used on large datasets. By combining many weak learners into a strong one, gradient boosting works well with scientific data. XGBoost offers high predictive power and accuracy, and its regularized boosting helps reduce overfitting [51]. LightGBM is a faster implementation that uses tree-based algorithms [52] | Wade C, Glynn K. Hands-on gradient boosting with XGBoost and scikit-learn: perform accessible machine learning and extreme gradient boosting with Python. Packt; 2020. Ke G, Meng Q, Finley T, et al. LightGBM: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 2017;30:3149–3157 |
Recurrent neural networks (RNN)^a | Designed to process sequential data such as time series. Have been used to predict hospital readmissions [53] | Staudemeyer RC, Rothstein Morris E. Understanding LSTM – a tutorial into long short-term memory recurrent neural networks. arXiv:1909.09586 [preprint]; 2019 |
Long short-term memory (LSTM)^a | A type of RNN designed to handle long-term dependencies in sequential data [53]. Has been used to predict disease progression | Staudemeyer RC, Rothstein Morris E. Understanding LSTM – a tutorial into long short-term memory recurrent neural networks. arXiv:1909.09586 [preprint]; 2019 |
K-nearest neighbors (KNN) | Can be used for classification or regression problems. In classification, it stores all cases and classifies new cases by a majority vote of their k nearest neighbors, measured by a distance function [54] | Mucherino A, Papajorgji PJ, Pardalos PM. Data mining in agriculture. Springer Science & Business Media; 2009 |
K-means clustering | An unsupervised algorithm that solves clustering problems by picking k points known as centroids; each data point forms a cluster with its closest centroid [55]. The sum of the squared differences between a centroid and the data points within its cluster can be used to choose the value of k | Patel S. K-means clustering algorithm: implementation and critical analysis. Scholars; 2019 |
Dimensionality reduction | Identifies the most significant variables in vast datasets where the data are unstructured or highly detailed [56] | Ghojogh B, Crowley M, Karray F, Ghodsi A. Elements of dimensionality reduction and manifold learning. Springer Nature; 2023 |
^a Most commonly used to predict medical events
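
As a rough illustration of the supervised algorithms in Table 1 (linear and logistic regression, SVM, naïve Bayes, decision tree, random forest, gradient boosting, and KNN), the following is a minimal Python sketch assuming scikit-learn is available. The synthetic dataset and all variable names are hypothetical stand-ins for clinical data, not examples from the cited references.

```python
# Minimal sketch (assumptions: scikit-learn installed; synthetic data stand in
# for clinical variables such as age, height, or FEV1 -- not real patient data).
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier

# Regression: estimate a continuous outcome from continuous covariates.
X_reg, y_reg = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)
reg = LinearRegression().fit(X_reg, y_reg)
print(f"Linear regression R^2: {reg.score(X_reg, y_reg):.2f}")

# Classification: estimate a discrete outcome (e.g., event yes/no).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classifiers = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient boosting": GradientBoostingClassifier(random_state=0),
    "KNN (k=5)": KNeighborsClassifier(n_neighbors=5),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: test accuracy = {clf.score(X_test, y_test):.2f}")
```

Note that `predict_proba` on the fitted logistic regression returns probabilities between 0 and 1, matching the description in the table.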
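
The unsupervised rows can be sketched the same way. Below, k-means clustering and dimensionality reduction on synthetic data; PCA is used as one common reduction technique, chosen here for illustration rather than taken from reference [56]. The `inertia_` attribute reported by scikit-learn is the within-cluster sum of squares that the k-means row describes for choosing k.

```python
# Minimal sketch (assumptions: scikit-learn installed; synthetic data).
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, _ = make_blobs(n_samples=300, centers=4, n_features=10, random_state=0)

# K-means: each point joins the cluster of its nearest centroid. inertia_ is
# the within-cluster sum of squared distances, which tends to flatten out once
# k matches the data's structure (the "elbow" heuristic).
for k in (2, 3, 4, 5):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(f"k={k}: within-cluster sum of squares = {km.inertia_:.1f}")

# Dimensionality reduction: keep the components that explain the most variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```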
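
For the RNN and LSTM rows, a minimal PyTorch sketch of a sequence classifier, assuming PyTorch is installed. The random tensors stand in for a clinical time series (e.g., repeated measurements before a possible readmission), and `LSTMClassifier` with its dimensions is a hypothetical choice, not an architecture from reference [53].

```python
# Minimal sketch (assumptions: PyTorch installed; random tensors stand in for
# a clinical time series -- an illustration, not a validated model).
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """LSTM over a sequence of visits, followed by a binary prediction head."""
    def __init__(self, n_features: int, hidden_size: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                        # x: (batch, seq_len, n_features)
        _, (h_n, _) = self.lstm(x)               # h_n: (1, batch, hidden_size)
        return torch.sigmoid(self.head(h_n[-1]))  # probability in (0, 1)

model = LSTMClassifier(n_features=6)
batch = torch.randn(8, 20, 6)   # 8 patients, 20 time steps, 6 features each
print(model(batch).shape)       # torch.Size([8, 1])
```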