| Algorithm | Abbreviation | Task | Description |
| --- | --- | --- | --- |
| Artificial Neural Network | ANN, NN | Classification, regression, and clustering | Any set of algorithms modeled on human brain neuronal connections |
| Active Shape Model | ASM | Segmentation | Model-based method to compare an image reference model with the image of interest |
| Bayesian Bagging (Bootstrap AGGregatING) | BB | Classification and regression | Bayesian analog of the original bootstrap: bootstrap samples of the data are taken, the model is fit to each sample, and the predictions are averaged over all of the fitted models to get the bagged prediction |
| Boosting | – | Classification and regression | Generic algorithm rather than a specific model: boosting starts from a weak model (e.g., a regression or a shallow decision tree) and then iteratively improves it (see the boosting sketch below the table) |
| Bootstrap aggregating (bagging) | – | Classification and regression | Meta-algorithm designed to improve the stability and accuracy of ML algorithms used in statistical classification and regression; it also reduces variance and helps to avoid overfitting. Although usually applied to decision tree methods, it can be used with any type of method (see the bagging sketch below the table) |
| Classification and Regression Tree | CART | Classification and regression | Predictive model that predicts an outcome variable value based on other values. A CART output is a decision tree where each fork is a split on a predictor variable and each end node contains a prediction for the outcome variable (see the decision tree sketch below the table) |
| Convolutional Neural Network | CNN, NN | Classification, regression, and clustering | Ordinary NN that implements convolution (a mathematical operation on two functions producing a third function that expresses how the shape of the first is modified by the second) in at least one of its layers; most commonly, the inputs are images |
| C4.5 | – | Classification | Algorithm used to generate a decision tree. The decision trees generated by C4.5 can be used for classification, and for this reason the algorithm is often referred to as a statistical classifier |
| Decision tree | DT | Classification and regression | Algorithm containing conditional control statements organized in a flowchart-like structure, also called a tree-like model. Paths from the root to the leaves represent classification rules; each leaf holds a class label, while each internal node represents a decision based on the computation of the attributes |
| Decision stump | DS | Classification and regression | Model consisting of a one-level decision tree, i.e., a tree with an internal node (the root) immediately connected to the terminal nodes (its leaves). A DS makes a prediction based on the value of just a single input feature; decision stumps are sometimes also called 1-rules (see the boosting sketch below the table) |
| Fully Convolutional Neural Network | FCNN | Classification, regression, and clustering | Deep learning model based on the traditional CNN model in which all the learnable layers are convolutional, so it does not have any fully connected layer |
| Incremental Association Markov Blanket | IAMB | Feature selection | Feature selection method that incrementally identifies the Markov blanket of the target variable, i.e., the minimal set of features that renders all other features conditionally independent of the target |
| Least Absolute Shrinkage and Selection Operator | LASSO | Feature selection | Regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the statistical model (see the LASSO sketch below the table) |
| Likelihood-Fuzzy Analysis | LFA | Classification | Method for translating statistical information from labeled data into a fuzzy classification system, by means of semantically interpretable fuzzy partitions and if–then rules, that offers good confidence measures in terms of class probabilities and good interpretability of the fuzzy classification model |
| Linear discriminant analysis | LDA | Classification | Method used to find a linear combination of features that characterizes or separates two or more classes of objects or events |
| Logistic regression | LR | Classification | Statistical model that uses a logistic function to model a binary dependent variable (see the logistic regression sketch below the table) |
| k-Nearest Neighbors | k-NN | Classification and regression | Non-parametric algorithm that classifies data points based on their similarity (also called distance or proximity) to the objects (feature vectors) contained in a collection of known objects (the vector space or feature space) (see the k-NN sketch below the table) |
| Multivariate Adaptive Regression Splines | MARS | Regression | Non-parametric regression technique; an extension of linear models that automatically models nonlinearities and interactions between variables |
| Minimum Redundancy Maximum Relevance | MRMR | Feature selection | Supervised feature selection algorithm that requires both the input features and the output class labels of the data. Using these, MRMR attempts to find the set of features that associates best with the output class labels while minimizing the redundancy between the selected features |
| Naive Bayes | NB | Classification | Applies Bayes' theorem to calculate the probability that a hypothesis is true, assuming prior knowledge and a strong (hence "naive") assumption of independence between the features |
| Partial least squares regression and principal component regression | PLSR and PCR | Regression | Both methods model a response variable when there are a large number of predictor variables and those predictors are highly correlated. Both construct new predictor variables, known as components, as linear combinations of the original predictor variables. PCR creates components to explain the observed variability in the predictor variables without considering the response variable at all, whereas PLSR does take the response variable into account and therefore often leads to models that fit the response variable with fewer components |
| Principal component analysis | PCA | Clustering | Captures the maximum variance in the data in a new coordinate system, whose axes are called "principal components," in order to reduce data dimensionality, ease data exploration, and reduce computational cost (see the PCA sketch below the table) |
| Penalized logistic regression | PLR | Classification | Imposes a penalty on the logistic model for having too many variables; this results in shrinking the coefficients of the less contributive variables toward zero, which is also known as regularization |
| Random forest / Random forest classification | RF, RFC | Classification and regression | Operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees (see the random forest sketch below the table) |
| Relief | – | Feature selection | Algorithm that takes a filter-method approach to feature selection and is notably sensitive to feature interactions. Relief calculates a score for each feature, which can then be used to rank features and select the top-scoring ones |
| Random survival forest | RSF | Survival | Non-parametric method for ensemble estimation constructed by bagging classification trees for survival data; it has been proposed as an alternative method for better survival prediction and variable selection |
| Rescorla–Wagner model | RW | Classification and clustering | Model of classical conditioning in which learning is conceptualized in terms of associations between conditioned and unconditioned stimuli |
| Stochastic/Gradient Boosting | – | Classification and regression | ML technique that produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees |
| Support Vector Classifier | SVC | Classification | The objective of a linear SVC is to fit the provided data and return a "best-fit" hyperplane that divides, or categorizes, the data |
| Support vector machine | SVM | Classification and regression | Based on the idea of finding the hyperplane that best separates the classes, positioned using the support vectors (the training points closest to it). SVM performs best in binary classification problems, although it can also be used for multiclass classification problems (see the SVM sketch below the table) |
| U-Net architecture | – | Segmentation | CNN developed for biomedical image segmentation. The main idea is to supplement the usual contracting network with successive layers in which pooling operations are replaced by upsampling operators, so that these layers increase the resolution of the output; a successive convolutional layer can then learn to assemble a precise output based on this information |
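The short Python sketches below illustrate a few of the entries above using scikit-learn. In every sketch the dataset, hyperparameters, and random seeds are illustrative assumptions, not details taken from the table. First, bootstrap aggregating (bagging):

```python
# Minimal bagging sketch (illustrative only): dataset, estimator count,
# and all hyperparameters are assumptions, not taken from the table.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 50 base learners (decision trees by default) is fit on a
# bootstrap sample of the training data; predictions are aggregated by vote.
bagging = BaggingClassifier(n_estimators=50, random_state=0)
bagging.fit(X_train, y_train)
print("bagging accuracy:", bagging.score(X_test, y_test))
```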
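A minimal boosting sketch built from decision stumps; it assumes scikit-learn's AdaBoostClassifier, whose default weak learner is a one-level decision tree, i.e., a decision stump:

```python
# Boosting sketch using decision stumps (illustrative assumptions throughout).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Each new stump focuses on the examples the previous stumps misclassified;
# the ensemble of weighted stumps forms the improved model.
boosted_stumps = AdaBoostClassifier(n_estimators=100, random_state=1)
boosted_stumps.fit(X_train, y_train)
print("boosted stumps accuracy:", boosted_stumps.score(X_test, y_test))
```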
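A minimal CART-style decision tree sketch; printing the fitted tree shows the flowchart-like structure of forks (splits on predictor variables) and leaves (outcome predictions):

```python
# Decision tree sketch (dataset and depth are assumptions).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Each internal fork splits on one predictor; each leaf holds a class prediction.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)
print(export_text(tree))  # prints the flowchart-like rule structure
```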
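A minimal k-NN sketch; the value of k and the dataset are assumptions:

```python
# k-NN sketch: classify a point by the majority label of its k closest
# neighbors in feature space.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # Euclidean distance by default
knn.fit(X_train, y_train)  # "training" just stores the labeled feature vectors
print("k-NN accuracy:", knn.score(X_test, y_test))
```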
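A minimal LASSO sketch showing how the L1 penalty drives some coefficients exactly to zero, which is what makes it usable for feature selection; the regularization strength and the synthetic data are assumptions:

```python
# LASSO sketch: nonzero coefficients identify the "selected" features.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=5.0, random_state=0)

lasso = Lasso(alpha=1.0)  # alpha controls the strength of the L1 penalty
lasso.fit(X, y)

# Features whose coefficients were not shrunk to zero are the selected ones.
selected = np.flatnonzero(lasso.coef_)
print("selected feature indices:", selected)
```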
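A minimal logistic regression sketch; the model passes a linear combination of the features through the logistic function sigma(z) = 1 / (1 + exp(-z)) to produce a probability for the binary outcome:

```python
# Logistic regression sketch (synthetic data is an assumption).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=5, random_state=0)

lr = LogisticRegression()
lr.fit(X, y)
# predict_proba applies the logistic function to the linear predictor.
print("P(class 1) for first sample:", lr.predict_proba(X[:1])[0, 1])
```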
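A minimal PCA sketch reducing four features to two principal components; the number of retained components is an assumption:

```python
# PCA sketch: project the data onto the axes of maximum variance
# (the principal components) to reduce dimensionality.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)  # 4 features -> 2 principal components
print("variance explained per component:", pca.explained_variance_ratio_)
```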
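A minimal random forest sketch; the number of trees is an assumption:

```python
# Random forest sketch: many decision trees are grown on bootstrap samples
# with random feature subsets, and their votes are combined (mode of the
# predicted classes for classification).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

rf = RandomForestClassifier(n_estimators=100, random_state=2)
rf.fit(X_train, y_train)
print("random forest accuracy:", rf.score(X_test, y_test))
```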
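A minimal linear SVM/SVC sketch; with a linear kernel the fitted hyperplane w.x + b = 0 and the support vectors can be inspected directly. The kernel choice and data are assumptions:

```python
# Linear SVM sketch: fit a maximum-margin separating hyperplane; the
# training points nearest the hyperplane are the support vectors.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           random_state=3)

svm = SVC(kernel="linear")
svm.fit(X, y)
print("hyperplane normal w:", svm.coef_[0], "bias b:", svm.intercept_[0])
print("support vectors per class:", svm.n_support_)
```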