Front Plant Sci. 2024 Mar 13;15:1356260. doi: 10.3389/fpls.2024.1356260

Table 3.

ML supervised classification algorithms.

SVM (Mokhtar et al., 2015)
Description: The SVM technique is used mostly for classification and regression. SVMs classify non-linear data efficiently by means of the kernel trick: input data are implicitly transformed into a higher-dimensional feature space in which the classes become linearly separable.
Advantages:
• Mitigates the risk of errors
• Well suited to non-linear dependencies
• Reliable models
• Better prediction
Limitations:
• Slow training progress
• Lower model interpretability
• Black-box behavior
• Hard to handle mixed data
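The kernel trick described above can be sketched as follows (a minimal illustration, assuming scikit-learn is available; the dataset and parameters are chosen for demonstration only):

```python
# Sketch: linear vs. RBF-kernel SVM on a non-linearly separable problem.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric circles: no straight line can separate the classes.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear SVM performs poorly on this data...
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)

# ...but the RBF kernel implicitly maps inputs into a higher-dimensional
# feature space where a linear separator exists (the "kernel trick").
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)
```

The kernel never computes the high-dimensional coordinates explicitly; it only evaluates inner products between mapped points, which keeps the transformation tractable.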
RF (Govardhan and Veena, 2019)
Description: RF is a proprietary designation for an ensemble of decision trees. Novel instances are categorized by evaluating their attributes in every tree and aggregating the per-tree classifications through a voting process; in the end, the class with the most votes is chosen as the output classification.
Advantages:
• More robust against overfitting
• Demands less fine-tuning
Limitations:
• More trees can lead to slow prediction
• Not suitable for categorical variables with many levels
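The voting process described above can be sketched with scikit-learn's implementation (an assumption of this example; the iris dataset and tree count are illustrative):

```python
# Sketch: Random Forest as an ensemble of voting decision trees.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Each of the 100 trees classifies an instance independently; the
# majority class across trees becomes the ensemble's prediction.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
pred = rf.predict(X[:1])

# Averaged per-tree class probabilities, effectively the vote shares
# for the first sample (they sum to 1 across classes).
votes = rf.predict_proba(X[:1])
```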
DT (Attri et al., 2023)
Description: DTs, expressed as predictor functions, traverse a tree structure from the root to a leaf node to predict an instance's label. DT is a popular ML method that uses branching to show the likely outcomes of a choice: each branch stands for one possible result of a test, and each leaf node represents an individual outcome.
Advantages:
• Simpler to comprehend
• Fast and accurate
• Handles missing values efficiently
• Better performance on larger data
Limitations:
• Higher error risk for complex trees
• Highly sensitive to outliers
• Prone to overfitting
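The root-to-leaf branching described above can be made visible with a small sketch (assuming scikit-learn; `max_depth=3` is an illustrative guard against the overfitting noted in the limitations):

```python
# Sketch: decision tree whose root-to-leaf paths are readable rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# Limiting depth keeps the tree simple and reduces overfitting risk.
dt = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# export_text renders the branching structure as nested if/else rules,
# one root-to-leaf path per predicted class label.
rules = export_text(dt)
acc = dt.score(X, y)
```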
ANN (Kumar et al., 2022)
Description: ANNs are parallel distributed processing systems that resemble the structure and behavior of the human brain. They are made up of neurons. ANNs are feedforward networks that fine-tune bias and weight parameters via learning techniques. The activation function is crucial in an ANN because it determines which neurons generate output.
Advantages:
• Reliable prediction
• Can handle correlated inputs
• Integrates combinations of inputs
Limitations:
• Vulnerable to outliers
• Prone to irrelevant attributes
• Struggles to handle complex datasets
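A feedforward network of the kind described above can be sketched with scikit-learn's multilayer perceptron (an assumption of this example; the hidden-layer size and iteration budget are illustrative):

```python
# Sketch: feedforward ANN tuning weights and biases by backpropagation.
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)  # ANNs are sensitive to feature scale

# One hidden layer of 16 neurons; the ReLU activation decides which
# neurons produce output for a given input.
ann = MLPClassifier(hidden_layer_sizes=(16,), activation="relu",
                    max_iter=2000, random_state=0).fit(X, y)
acc = ann.score(X, y)
```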
KNN (K and Rao, 2019)
Description: KNN is a non-parametric technique used in pattern recognition and regression analysis. It is a lazy learning method that relies on local approximation of functions and defers computation until the classification phase. KNN is a simple and effective classifier; it gives greater weight to data items that are close together.
Advantages:
• Resource efficient
• Capable of handling complex datasets
• Cost efficient
• Can handle outliers
Limitations:
• Expensive to converge on larger data
• Ineffective on complex data
• Inefficient for noisy data
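The lazy-learning behavior described above can be sketched as follows (assuming scikit-learn; the neighbor count and train/test split are illustrative):

```python
# Sketch: KNN defers all real work until classification time.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# fit() merely stores the training data ("lazy" learning); distance
# computations happen inside predict()/score(). weights="distance"
# gives closer neighbors greater influence on the vote.
knn = KNeighborsClassifier(n_neighbors=5, weights="distance").fit(Xtr, ytr)
acc = knn.score(Xte, yte)
```

Because every prediction scans the stored training set, the cost of classification grows with the data size, which is the "expensive to converge on larger data" limitation noted above.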