Table 2.
Machine learning algorithms and their main features.
| ML Algorithm | Basic Idea | Features |
|---|---|---|
| NB | Probabilistic classifier | Cannot handle missing data, stable performance [25]. |
| SVM | Hyperplane optimization | Highly accurate models, less likely to suffer from overfitting, used for prediction and classification tasks. |
| DT | Tree-structured model | Robust, well suited to categorical data, easy to interpret. |
| RF | DT ensemble method | Effective for highly complex problems, best for high-dimensional data sets, can handle missing data and imbalanced data sets. |
| AdaBoost | Ensemble algorithm | Improves the performance of individual weak classifiers, sensitive to noise. |
| KNN | Classifies a point by the labels of its nearest neighbors under a distance metric | Choice of distance metric strongly affects performance; known as a lazy learner, since it performs no computation until presented with a testing data point. |
| GBDT | Ensemble tree induction that builds a model minimizing a loss function | Highly flexible [31]. |
| LR | Predicts the probability that a given data point belongs to a certain class | Computationally simple, can handle continuous numerical values, cannot model non-linear relationships. |
| ANN | Inspired by networks of biological neurons | Highly accurate, difficult to interpret (black-box models), requires a large number of parameters. |
| ET | Ensemble tree induction | Good performance, easy to implement, less computational time, fewer optimization parameters [32]. |
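To make the "lazy learner" behavior noted for KNN in the table concrete, the sketch below is a minimal pure-Python nearest-neighbor classifier. It is an illustrative example, not from the surveyed works: no model is fit in advance, and all computation (distance to every training point) happens only when a query arrives. The function names `euclidean` and `knn_predict` and the toy data are assumptions for illustration.

```python
from collections import Counter
import math

def euclidean(a, b):
    # The distance metric; as the table notes, this choice
    # strongly affects KNN's performance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3, metric=euclidean):
    # "Lazy" learning: nothing is precomputed; every prediction
    # ranks all training points by distance to the query.
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: metric(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D data: two well-separated clusters labeled "A" and "B".
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(X, y, (0.5, 0.5), k=3))  # → A
print(knn_predict(X, y, (5.5, 5.5), k=3))  # → B
```

Swapping `metric` for, say, Manhattan distance changes only one argument, which makes the metric-sensitivity trade-off from the table easy to experiment with.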