Table 2. Each algorithm's strengths and challenges.
| Category | Algorithm | Advantages | Disadvantages |
|---|---|---|---|
| Linear Models | Logistic Regression [40] | Simple to implement and interpret. Efficient to train. Good for binary classification. | Assumes a linear relationship between variables. Not suitable for complex relationships. |
| Linear Models | Support Vector Machine (SVM) [42] | Effective in high-dimensional spaces. Memory efficient. Versatile with kernel functions. | Requires careful parameter tuning. Not suitable for large datasets. |
| Linear Models | SGD Classifier [47] | Efficient for large-scale problems. Easy to implement and offers many opportunities for code tuning. | Sensitive to feature scaling. Requires a number of hyperparameters. |
| Tree-Based Models | Decision Tree Classifier [43] | Easy to interpret and visualize. Can handle both numerical and categorical data. | Prone to overfitting. Can become unstable with small variations in the data. |
| Tree-Based Models | Random Forest Classifier [44] | Handles overfitting well. Works well on large datasets. Provides feature importances. | Can be slow to predict. Complex and difficult to interpret. |
| Tree-Based Models | AdaBoost Classifier [45] | Improves classification accuracy. Flexible to combine with any learning algorithm. | Sensitive to noisy data and outliers. Can overfit on very complex datasets. |
| Tree-Based Models | Gradient Boosting [46] | Highly effective and flexible. Can optimize different loss functions. | Prone to overfitting without proper tuning. Time-consuming to train. |
| Instance-Based Models | K-Nearest Neighbors (KNN) [41] | Makes no assumptions about the data. Simple and effective. Adaptable to any type of data. | Computationally expensive. Performance depends on the number of dimensions. |
| Probabilistic Models | GaussianNB [48] | Works well with high-dimensional data. Simple and fast. | Assumes that features are independent; performance can suffer when this assumption is violated. |
| Neural Network Model | MLP Classifier | Capable of modeling complex non-linear relationships. Works well with large datasets. | Requires significant computational resources and can be prone to overfitting without proper regularization. |
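The trade-offs in Table 2 can be explored empirically. The sketch below fits each listed algorithm on the same synthetic binary-classification task and reports test accuracy; it assumes the scikit-learn implementations of these classifiers and illustrative parameter choices (dataset size, `max_iter`, `test_size`), not the configuration used in this study.

```python
# Sketch: compare the classifiers from Table 2 on a synthetic binary task.
# Assumes scikit-learn; dataset and hyperparameters are illustrative only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic binary classification data (500 samples, 20 features).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# One instance per algorithm in Table 2, using default settings where possible.
classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "SGD Classifier": SGDClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "GaussianNB": GaussianNB(),
    "MLP Classifier": MLPClassifier(max_iter=1000, random_state=0),
}

# Fit each model and report held-out accuracy.
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: {acc:.3f}")
```

On a single small dataset such as this, accuracies alone should not be read as a ranking; the table's caveats (e.g. SVM tuning sensitivity, MLP regularization needs) only become visible across varied data sizes and hyperparameter settings.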