Table 2. Summary of the ML techniques.
| Category | Algorithms | Concept | Advantages | Limitations |
|---|---|---|---|---|
| Supervised learning | Linear Regression | Predicts continuous output based on input features | Easy to implement | Assumes a linear relationship between features and target |
| | Logistic Regression | Predicts binary or multi-class outcomes using a logistic function | Easy to implement and interpretable results | Assumes linear decision boundaries |
| | Decision Trees | Creates a tree-like structure to make predictions | Intuitive and easy to interpret, fast computation, and captures non-linear relationships | Prone to overfitting |
| | Random Forest | Ensemble of decision trees to improve prediction accuracy | Reduces overfitting compared to individual trees, and effectively handles noisy and missing data | Computationally expensive during training and slower computation |
| | Gradient Boosting | Boosts weak learners (usually decision trees) sequentially | High prediction accuracy | Sensitive to noisy data and outliers |
| | Support Vector Machine | Finds the optimal hyperplane for binary/multi-class classification | Effective in high-dimensional spaces | Requires proper selection of kernel functions |
| Unsupervised learning | K-Means Clustering | Groups data into clusters based on similarities | Simple and easy to understand | Requires a pre-determined number of clusters (K) |
| | Hierarchical Clustering | Creates a tree-like hierarchy of clusters based on data similarities | No need to specify the number of clusters beforehand | Sensitive to noise and outliers |
| | Principal Component Analysis | Reduces dimensionality while preserving variance | Efficient for large feature spaces | Information loss due to dimensionality reduction |
| Deep learning (DL) | ANN | A set of interconnected artificial neurons that process input data | Suitable for complex tasks like image recognition | Prone to overfitting, especially with small datasets |
| | DNN | Fully connected NN with more than one hidden layer | Can learn complex features and patterns | Longer training time, especially for deep architectures |
| | CNN | Multi-layer NNs with convolutional layers connected to local regions of the previous layer | Highly effective in image and video analysis | Requires significant computational resources for training |
| | RNN | Multi-layer NNs with recurrent connections that process sequential data, trained using back-propagation through time | Can handle sequential data and suitable for time-series and NLP | Can suffer from vanishing gradient problems, computationally expensive to train, and difficult to parallelize |
| Reinforcement learning (RL) | | Trains agents to make decisions in an environment to maximize rewards | Useful in sequential decision-making tasks, suitable for highly complex data, maximizes reward-driven behavior, and ensures a decent minimum performance standard | Not preferable for simple problems, high sample complexity and training time, highly dependent on reward-function quality, and difficult to debug and interpret |
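
As a concrete point of reference, the short sketch below shows how several of the techniques summarized in Table 2 are typically invoked in practice. It is only an illustration: the scikit-learn library, the synthetic dataset, and the hyper-parameters are assumptions made here for demonstration and are not prescribed by the table.

```python
# Illustrative sketch (not part of Table 2): fitting several of the tabulated
# techniques with scikit-learn on a synthetic dataset. Library, data, and
# hyper-parameters are assumptions for demonstration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic binary-classification data for the supervised models.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

supervised = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Support Vector Machine": SVC(kernel="rbf"),
}
for name, model in supervised.items():
    model.fit(X_train, y_train)                      # learn from labelled data
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")

# Unsupervised examples: K-Means ignores the labels, PCA compresses the features.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
X_2d = PCA(n_components=2).fit_transform(X)          # 20 features -> 2 components
print("K-Means cluster sizes:", [int((kmeans.labels_ == k).sum()) for k in range(2)])
print("PCA output shape:", X_2d.shape)
```

The deep-learning and reinforcement-learning rows (ANN/DNN/CNN/RNN and RL) are omitted from the sketch because they typically rely on separate frameworks and longer training loops than the classical techniques shown here.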