PLoS One. 2018 Sep 20;13(9):e0204161. doi: 10.1371/journal.pone.0204161

Table 1. Algorithms.

Algorithm	Type	Selected characteristics	Interpretability
Logistic regression (generalized linear model, GLM)	Classification	Most commonly used model in the medical literature. Models linear relationships; requires uncorrelated features	++++
Logistic regression with elastic net regularization (GLMNET)	Classification	Adaptation of logistic regression to handle correlated features (as well as high-dimensional datasets). Among correlated features, some will be dropped entirely from the model, even if predictive	+++
Support Vector Machine (SVM)	Classification	Popular ML tool in biomedical research. Offers competitive performance across multiple datasets, but poor interpretability	+
Classification and Regression Trees (CART)	Classification	Builds an intuitive decision tree for easy patient stratification. Automatically models feature interactions	++++
Tree-Structured Boosting (MediBoost)	Classification	Same structure as CART (builds a single decision tree), but with improved accuracy by considering weighted versions of all cases at each split	++++
Random Forest (RF)	Classification	Best out-of-the-box performance with no tuning. Variable importance suggests features that contribute to prediction after considering interactions, but no directionality or explicit interactions shown	++
Gradient Boosting Machine (GBM)	Classification	Best overall performance on structured data across real-world applications. Variable importance similar to RF	++
Penalized Cox regression (Adaptive Elastic Net)	Survival Analysis	Allows Cox survival analysis with high-dimensional, correlated data and building of clinically interpretable nomograms. As in classification, among highly correlated features, some may be dropped from the model, even if predictive	+++
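
To illustrate how a CART-style tree can stratify patients, the sketch below searches one numeric feature for the threshold that minimizes weighted Gini impurity, the standard CART criterion for classification splits. It is a minimal illustration only, not the implementation used in the paper; the function names (`gini`, `best_split`) and the toy `ages`/`outcome` data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    If no split improves on the unsplit node, threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)  # midpoint between neighbouring values
    return best


# Hypothetical toy data: one feature (age) and a binary outcome.
ages = [34, 41, 47, 52, 58, 63, 69, 75]
outcome = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, impurity = best_split(ages, outcome)
print(threshold, impurity)
```

A full tree would apply the same search recursively to every feature within each resulting subgroup, which is how feature interactions are picked up automatically.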