Table 2. Summary of the ML techniques.
| Category | Algorithms | Concept | Advantages | Limitations |
|---|---|---|---|---|
| Supervised learning | Linear Regression | Predicts continuous output based on input features | Easy to implement | Assumes a linear relationship between features and target |
| | Logistic Regression | Predicts binary or multi-class outcomes using a logistic function | Easy to implement and interpretable results | Assumes linear decision boundaries |
| | Decision Trees | Creates a tree-like structure to make predictions | Intuitive and easy to interpret, fast computation, and captures non-linear relationships | Prone to overfitting |
| | Random Forest | Ensemble of decision trees to improve prediction accuracy | Reduces overfitting compared to individual trees, and effectively handles noisy and missing data | Computationally expensive during training and slower computation |
| | Gradient Boosting | Boosts weak learners (usually decision trees) sequentially | High prediction accuracy | Sensitive to noisy data and outliers |
| | Support Vector Machine | Finds the optimal hyperplane for binary/multi-class classification | Effective in high-dimensional spaces | Requires proper selection of kernel functions |
| Unsupervised learning | K-Means Clustering | Groups data into clusters based on similarities | Simple and easy to understand | Requires a pre-determined number of clusters (K) |
| | Hierarchical Clustering | Creates a tree-like hierarchy of clusters based on data similarities | No need to specify the number of clusters beforehand | Sensitive to noise and outliers |
| | Principal Component Analysis | Reduces dimensionality while preserving variance | Efficient for large feature spaces | Information loss due to dimensionality reduction |
| Deep learning (DL) | ANN | A set of interconnected artificial neurons that process input data | Suitable for complex tasks like image recognition | Prone to overfitting, especially with small datasets |
| | DNN | Fully connected NN with more than one hidden layer | Can learn complex features and patterns | Longer training time, especially for deep architectures |
| | CNN | Multi-layer NNs with convolutional layers connected to local regions of the previous layer | Highly effective in image and video analysis | Requires significant computational resources for training |
| | RNN | Multi-layer NNs with recurrent connections that process sequential data, trained using back-propagation through time | Can handle sequential data and suitable for time-series and NLP | Can suffer from vanishing gradient problems, computationally expensive to train, and difficult to parallelize |
| Reinforcement learning (RL) | | Trains agents to make decisions in an environment to maximize rewards | Useful in sequential decision-making tasks, suitable for highly complex data, maximizes reward-driven behavior, and ensures a decent minimum performance standard | Not preferable for simple problems, high sample complexity and training time, highly dependent on reward-function quality, and difficult to debug and interpret |
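
As a concrete point of reference, the short sketch below shows how several of the techniques summarized in Table 2 are typically invoked in practice. It is only an illustration: the scikit-learn library, the synthetic dataset, and the hyper-parameters are assumptions made here for demonstration and are not prescribed by the table.

```python
# Illustrative sketch (not part of Table 2): fitting several of the tabulated
# techniques with scikit-learn on a synthetic dataset. Library, data, and
# hyper-parameters are assumptions for demonstration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic binary-classification data for the supervised models.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

supervised = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Support Vector Machine": SVC(kernel="rbf"),
}
for name, model in supervised.items():
    model.fit(X_train, y_train)                      # learn from labelled data
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")

# Unsupervised examples: K-Means ignores the labels, PCA compresses the features.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
X_2d = PCA(n_components=2).fit_transform(X)          # 20 features -> 2 components
print("K-Means cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(2)])
print("PCA output shape:", X_2d.shape)
```

The deep-learning and reinforcement-learning rows (ANN/DNN/CNN/RNN and RL) are omitted from the sketch because they typically rely on separate frameworks and longer training loops than the classical techniques shown here.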