| Method | Advantages | Disadvantages |
| --- | --- | --- |
| SVM [16–19, 25, 29, 31] | By working through a kernel function, SVM avoids the complexity of the high-dimensional feature space | Hard to apply to large training sets, and choosing a suitable kernel function is difficult |
| RF [16, 19, 25–27, 31] | Can handle very high-dimensional data without feature selection | May overfit on some noisy classification or regression problems |
| KNN [31] | Training time complexity is lower than that of SVM; compared with naive Bayes (NB), it makes no assumptions about the data, achieves high accuracy, and is insensitive to outliers | Prediction requires a large amount of computation; when the classes are imbalanced, accuracy on rare classes is low |
| NB [16, 32] | Performs well on small-scale data, and the algorithm is relatively simple | The classification is decided from the posterior probability, which is determined by the prior and the data, so the decision carries a certain error rate |
| LDA [16, 26] | Works well when the class-discriminative information lies in the means rather than the variances | Not suitable for dimensionality reduction of samples from non-Gaussian distributions and may overfit |
| XGBoost [31] | A regularization term in the loss function prevents overfitting; parallel computation makes training efficient; memory-optimized implementation | Because trees are grown level by level, many nodes with low split gain are split unnecessarily, which can incur needless overhead |
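The regularization noted in the XGBoost row refers to the penalty term added to the training objective. A sketch of the standard regularized objective (the notation below is not taken from the cited references):

$$
\mathcal{L} = \sum_{i} l\bigl(\hat{y}_i, y_i\bigr) + \sum_{k} \Omega(f_k),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert w \rVert^{2},
$$

where $l$ is the training loss, $f_k$ the $k$-th tree, $T$ the number of leaves, and $w$ the leaf weights; the $\gamma$ and $\lambda$ terms penalize complex trees and large leaf weights, which is how overfitting is controlled.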
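To make the comparison in the table concrete, the following minimal sketch instantiates the six classifiers side by side. It assumes the scikit-learn and xgboost Python packages, which are not named in the cited works, and all dataset sizes and hyperparameter values are illustrative placeholders rather than settings from those studies.

```python
# Minimal sketch (assumed libraries: scikit-learn, xgboost); parameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from xgboost import XGBClassifier

# Synthetic data standing in for any real dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    # Kernel choice is the key (and hardest) SVM decision noted in the table.
    "SVM": SVC(kernel="rbf", C=1.0),
    # RF copes with high-dimensional inputs without explicit feature selection.
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    # KNN has almost no training cost, but every prediction scans the training set.
    "KNN": KNeighborsClassifier(n_neighbors=5),
    # NB is simple and effective on small-scale data.
    "NB": GaussianNB(),
    # LDA assumes roughly Gaussian classes that differ in their means.
    "LDA": LinearDiscriminantAnalysis(),
    # XGBoost: reg_lambda (L2 penalty) and gamma (minimum split gain) are the
    # overfitting controls mentioned in the table; n_jobs enables parallelism.
    "XGBoost": XGBClassifier(n_estimators=200, reg_lambda=1.0, gamma=0.0,
                             n_jobs=-1, eval_metric="logloss"),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```

Such a side-by-side run only illustrates how the models are set up; the relative strengths and weaknesses in the table depend on the data (dimensionality, noise, class balance) rather than on any single benchmark.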