. 2024 Oct 29;9:100913. doi: 10.1016/j.crfs.2024.100913

Table 2.

Random Forest (RF), Support Vector Machines (SVM), and Naïve Bayes (NB) classification model parameters for cross-validation and external validation in authenticating extra virgin olive oil.

| Pre-processing | Model | Seven-Class Optimal Parameters | Seven-Class ACC.cv | Seven-Class ACC.p | Three-Class Optimal Parameters | Three-Class ACC.cv | Three-Class ACC.p | Two-Class Optimal Parameters | Two-Class ACC.cv | Two-Class Sens.cv | Two-Class Prec.cv | Two-Class Spec.cv | Two-Class F1.cv | Two-Class ACC.p | Two-Class Sens.p | Two-Class Prec.p | Two-Class Spec.p | Two-Class F1.p | Two-Class MCC.p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unprocessed | RF | mt = 18, nt = 500 | 81.6 | 64.2 | mt = 43, nt = 500 | 99.1 | 97.7 | mt = 55, nt = 500 | 99.4 | 99.7 | 99.8 | 95.6 | 99.7 | 99.2 | 99.1 | 100 | 100 | 99.6 | 0.94 |
| SG smoothing | RF | mt = 43, nt = 500 | 81.2 | 64.0 | mt = 43, nt = 500 | 99.2 | 97.7 | mt = 72, nt = 500 | 99.5 | 99.6 | 99.8 | 96.0 | 99.7 | 99.2 | 99.1 | 100 | 100 | 99.6 | 0.94 |
| SG+1st deriv. | RF | mt = 14, nt = 500 | 92.8 | 82.1 | mt = 14, nt = 500 | 99.9 | 99.8 | mt = 7, nt = 500 | 99.8 | 100 | 99.9 | 95.7 | 99.9 | 99.9 | 99.7 | 100 | 100 | 99.8 | 0.98 |
| SG+2nd deriv. | RF | mt = 14, nt = 500 | 93.0 | 80.2 | mt = 14, nt = 500 | 99.9 | 99.8 | mt = 7, nt = 500 | 99.8 | 100 | 99.8 | 96.7 | 99.9 | 100 | 100 | 100 | 100 | 100 | 1.00 |
| SNV | RF | mt = 43, nt = 500 | 90.3 | 71.5 | mt = 14, nt = 500 | 99.7 | 97.4 | mt = 7, nt = 500 | 99.7 | 99.9 | 99.8 | 94.9 | 99.8 | 98.0 | 99.4 | 98.4 | 78.6 | 99.0 | 0.84 |
| SNV + SG smoothing | RF | mt = 14, nt = 500 | 90.6 | 70.7 | mt = 14, nt = 500 | 99.7 | 97.4 | mt = 7, nt = 500 | 99.7 | 99.9 | 99.8 | 95.0 | 99.8 | 98.0 | 99.5 | 98.4 | 78.6 | 99.0 | 0.84 |
| SNV + SG+1st deriv. | RF | mt = 14, nt = 500 | 94.7 | 76.4 | mt = 14, nt = 500 | 100 | 100 | mt = 7, nt = 500 | 99.9 | 100 | 99.9 | 97.3 | 99.9 | 100 | 100 | 100 | 100 | 100 | 1.00 |
| SNV + SG+2nd deriv. | RF | mt = 14, nt = 500 | 96.0 | 86.0 | mt = 14, nt = 500 | 100 | 100 | mt = 7, nt = 500 | 99.9 | 100 | 99.9 | 97.1 | 99.9 | 100 | 100 | 100 | 100 | 100 | 1.00 |
| MSC | RF | mt = 43, nt = 500 | 90.8 | 71.2 | mt = 43, nt = 500 | 99.8 | 98.9 | mt = 7, nt = 500 | 99.7 | 99.9 | 99.8 | 95.0 | 99.9 | 98.5 | 99.0 | 98.0 | 71.4 | 98.4 | 0.88 |
| MSC + SG smoothing | RF | mt = 43, nt = 500 | 90.8 | 71.4 | mt = 14, nt = 500 | 99.8 | 97.4 | mt = 7, nt = 500 | 99.7 | 99.9 | 99.7 | 94.7 | 99.9 | 99.2 | 99.3 | 99.8 | 97.8 | 98.7 | 0.90 |
| MSC + SG+1st deriv. | RF | mt = 14, nt = 500 | 92.8 | 82.1 | mt = 14, nt = 500 | 99.9 | 99.8 | mt = 7, nt = 500 | 99.8 | 100 | 99.8 | 95.7 | 99.9 | 99.7 | 99.7 | 100 | 100 | 99.8 | 0.98 |
| MSC + SG+2nd deriv. | RF | mt = 14, nt = 500 | 96.4 | 86.2 | mt = 43, nt = 100 | 99.9 | 99.8 | mt = 7, nt = 500 | 99.8 | 100 | 99.8 | 96.2 | 99.9 | 100 | 100 | 100 | 100 | 100 | 1.00 |
| Unprocessed | SVM | C = 5, σ = 0.01 | 55.2 | 55.6 | C = 10, σ = 0.01 | 99.7 | 99.2 | C = 5, σ = 0.01 | 99.5 | 99.6 | 99.9 | 97.7 | 99.7 | 99.5 | 99.8 | 99.7 | 95.2 | 99.7 | 0.96 |
| SG smoothing | SVM | C = 10, σ = 0.01 | 55.3 | 55.6 | C = 10, σ = 0.01 | 99.7 | 99.2 | C = 5, σ = 0.01 | 99.4 | 99.5 | 99.9 | 97.8 | 99.7 | 99.4 | 99.7 | 99.7 | 95.2 | 99.7 | 0.95 |
| SG+1st deriv. | SVM | C = 5, σ = 0.01 | 60.7 | 56.6 | C = 0.05, σ = 0.01 | 99.6 | 99.4 | C = 0.5, σ = 0.01 | 99.9 | 100 | 99.8 | 97.0 | 99.9 | 98.1 | 100 | 100 | 100 | 100 | 1.00 |
| SG+2nd deriv. | SVM | C = 5, σ = 0.01 | 60.8 | 56.7 | C = 0.1, σ = 0.01 | 99.7 | 99.5 | C = 0.5, σ = 0.01 | 99.9 | 100 | 99.8 | 97.1 | 99.9 | 98.0 | 100 | 100 | 100 | 100 | 1.00 |
| SNV | SVM | C = 10, σ = 0.01 | 58.6 | 50.7 | C = 0.5, σ = 0.01 | 99.9 | 97.6 | C = 0.5, σ = 0.01 | 99.8 | 100 | 99.8 | 97.0 | 99.9 | 97.6 | 100 | 97.4 | 64.3 | 98.7 | 0.79 |
| SNV + SG smoothing | SVM | C = 0.1, σ = 0.01 | 57.7 | 63.1 | C = 0.5, σ = 0.01 | 99.8 | 97.6 | C = 0.5, σ = 0.01 | 99.8 | 100 | 99.8 | 97.0 | 99.9 | 97.6 | 100 | 97.4 | 64.3 | 98.7 | 0.79 |
| SNV + SG+1st deriv. | SVM | C = 0.05, σ = 0.01 | 61.3 | 55.9 | C = 0.05, σ = 0.01 | 99.7 | 98.2 | C = 0.1, σ = 0.01 | 99.7 | 100 | 99.9 | 97.6 | 99.8 | 100 | 100 | 100 | 100 | 100 | 1.00 |
| SNV + SG+2nd deriv. | SVM | C = 0.05, σ = 0.01 | 63.1 | 54.0 | C = 0.5, σ = 0.01 | 99.9 | 99.0 | C = 0.05, σ = 0.01 | 99.9 | 100 | 99.9 | 99.9 | 99.9 | 100 | 100 | 100 | 100 | 100 | 1.00 |
| MSC | SVM | C = 10, σ = 0.01 | 58.6 | 50.7 | C = 0.5, σ = 0.01 | 99.9 | 97.6 | C = 0.5, σ = 0.01 | 99.8 | 99.9 | 99.8 | 97.0 | 99.9 | 97.6 | 100 | 97.4 | 64.3 | 98.7 | 0.79 |
| MSC + SG smoothing | SVM | C = 0.1, σ = 0.01 | 57.8 | 63.3 | C = 0.5, σ = 0.01 | 99.9 | 97.6 | C = 0.5, σ = 0.01 | 99.8 | 99.9 | 99.8 | 97.0 | 99.9 | 97.6 | 100 | 97.4 | 64.3 | 98.7 | 0.79 |
| MSC + SG+1st deriv. | SVM | C = 5, σ = 0.01 | 60.7 | 55.9 | C = 0.05, σ = 0.01 | 99.6 | 99.2 | C = 0.5, σ = 0.01 | 99.9 | 100 | 99.8 | 97.0 | 99.9 | 100 | 100 | 100 | 100 | 100 | 1.00 |
| MSC + SG+2nd deriv. | SVM | C = 1, σ = 0.01 | 61.3 | 59.5 | C = 0.5, σ = 0.01 | 99.9 | 99.0 | C = 0.5, σ = 0.01 | 99.9 | 100 | 99.9 | 98.2 | 99.9 | 99.0 | 100 | 99.0 | 85.7 | 99.5 | 0.95 |
| Unprocessed | NB | lc = 0.1, ad = 0.0 | 51.0 | 48.6 | lc = 0.1, ad = 0.0 | 97.4 | 94.3 | lc = 0.1, ad = 0.0 | 96.6 | 96.5 | 99.9 | 97.2 | 98.2 | 94.1 | 93.7 | 100 | 100 | 96.8 | 0.71 |
| SG smoothing | NB | lc = 0.1, ad = 0.0 | 51.1 | 48.8 | lc = 0.1, ad = 0.0 | 97.4 | 94.3 | lc = 0.1, ad = 0.0 | 96.5 | 97.7 | 99.9 | 97.7 | 98.2 | 94.1 | 93.1 | 100 | 100 | 96.8 | 0.71 |
| SG+1st deriv. | NB | lc = 0.1, ad = 1.0 | 72.0 | 68.3 | lc = 0.1, ad = 0.0 | 99.4 | 98.9 | lc = 0.1, ad = 0.0 | 99.3 | 99.3 | 99.0 | 98.0 | 99.6 | 96.8 | 96.6 | 100 | 100 | 99.4 | 0.92 |
| SG+2nd deriv. | NB | lc = 0.1, ad = 1.0 | 72.2 | 68.2 | lc = 0.1, ad = 0.0 | 99.4 | 98.9 | lc = 0.1, ad = 0.0 | 99.2 | 99.3 | 99.9 | 97.8 | 99.6 | 98.9 | 98.8 | 100 | 100 | 99.4 | 0.92 |
| SNV | NB | lc = 0.1, ad = 0.0 | 65.6 | 60.5 | lc = 0.1, ad = 0.0 | 98.7 | 94.1 | lc = 0.1, ad = 0.0 | 98.5 | 98.4 | 100 | 99.9 | 99.2 | 95.0 | 95.3 | 99.3 | 90.5 | 97.2 | 0.70 |
| SNV + SG smoothing | NB | lc = 0.1, ad = 0.0 | 65.7 | 60.5 | lc = 0.1, ad = 0.0 | 98.7 | 94.1 | lc = 0.1, ad = 0.0 | 98.5 | 98.4 | 100 | 99.9 | 99.2 | 95.0 | 95.3 | 99.2 | 90.5 | 97.2 | 0.70 |
| SNV + SG+1st deriv. | NB | lc = 0.1, ad = 0.0 | 80.2 | 70.5 | lc = 0.1, ad = 0.0 | 99.8 | 99.5 | lc = 0.1, ad = 0.0 | 99.3 | 99.4 | 99.9 | 98.0 | 99.6 | 99.2 | 99.3 | 99.3 | 97.6 | 99.6 | 0.94 |
| SNV + SG+2nd deriv. | NB | lc = 0.1, ad = 0.0 | 84.3 | 82.4 | lc = 0.1, ad = 0.0 | 99.7 | 97.6 | lc = 0.1, ad = 0.0 | 99.9 | 100 | 99.9 | 98.0 | 99.9 | 97.7 | 100 | 97.6 | 66.7 | 98.8 | 0.81 |
| MSC | NB | lc = 0.1, ad = 0.0 | 65.6 | 60.8 | lc = 0.1, ad = 0.0 | 98.7 | 94.1 | lc = 0.1, ad = 0.0 | 98.4 | 98.4 | 100 | 99.9 | 99.2 | 95.0 | 95.2 | 99.3 | 90.5 | 97.2 | 0.70 |
| MSC + SG smoothing | NB | lc = 0.1, ad = 0.0 | 65.7 | 60.7 | lc = 0.1, ad = 0.0 | 98.7 | 94.4 | lc = 0.1, ad = 0.0 | 98.4 | 98.4 | 100 | 100 | 99.2 | 95.0 | 95.2 | 99.2 | 90.5 | 97.2 | 0.70 |
| MSC + SG+1st deriv. | NB | lc = 0.1, ad = 1.0 | 72.0 | 68.2 | lc = 0.1, ad = 0.0 | 99.3 | 98.9 | lc = 0.1, ad = 0.0 | 99.3 | 99.3 | 99.9 | 98.0 | 99.6 | 98.9 | 98.8 | 100 | 100 | 99.4 | 0.92 |
| MSC + SG+2nd deriv. | NB | lc = 0.1, ad = 0.0 | 84.0 | 82.1 | lc = 0.1, ad = 0.0 | 99.5 | 99.2 | lc = 0.1, ad = 0.0 | 99.5 | 99.6 | 99.9 | 98.5 | 99.8 | 99.3 | 99.8 | 99.5 | 92.9 | 99.7 | 0.95 |

The metric values for the trained models are classification parameters averaged over 10-fold cross-validation repeated ten times. ACC.cv = Accuracy, Sens.cv = Sensitivity, Prec.cv = Precision, Spec.cv = Specificity, and F1.cv = F1 Score for cross-validation. ACC.p = Accuracy, Sens.p = Sensitivity, Prec.p = Precision, Spec.p = Specificity, F1.p = F1 Score, and MCC.p = Matthews correlation coefficient for the external validation set (test set).

SNV = Standard Normal Variate; MSC = Multiplicative Scatter Correction; SG = Savitzky-Golay smoothing; 1st deriv. = first derivative; 2nd deriv. = second derivative. mt = mtry, the number of features randomly sampled at each split of a decision tree within the Random Forest, optimized using cross-validation and out-of-bag error; nt = ntree, the total number of decision trees in the Random Forest ensemble, chosen by model tuning and cross-validation. C = cost parameter and σ = parameter of the Gaussian radial basis function kernel for the SVM model. lc = laplace and ad = adjust, the tuning parameters for the Naïve Bayes model.

The Seven-Class system involves seven groups: extra-virgin olive oil (EVOO), hazelnut oil (HZO), olive pomace oil (POO), refined olive oil (ROO), EVOO + HZO, EVOO + POO, and EVOO + ROO. The Three-Class system categorizes oils into three groups: authentic extra-virgin olive oil, edible oil adulterant (100%), or adulterated olive oil (1–40%). The Two-Class system is a binary classification distinguishing between pure EVOO and adulterated olive oil (1–100% adulteration).
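As an illustration of how the binary-model metrics named in this footnote relate to one another, the following minimal Python sketch computes accuracy, sensitivity, precision, specificity, F1 score, and the Matthews correlation coefficient from the four counts of a 2×2 confusion matrix using their standard definitions. It is not the authors' code: the function name `binary_metrics`, the choice of which class counts as "positive", and the example counts are all assumptions made for illustration. The function returns fractions in the range 0 to 1 (MCC in -1 to 1), whereas the table reports all metrics except MCC as percentages.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive-class samples predicted correctly/incorrectly.
    tn/fp: negative-class samples predicted correctly/incorrectly.
    Which class is "positive" is an assumption left to the caller.
    """

    def ratio(num: float, den: float) -> float:
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    precision = ratio(tp, tp + fp)     # positive predictive value
    specificity = ratio(tn, tn + fp)   # true negative rate
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    for name, value in binary_metrics(tp=50, fp=10, tn=30, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it can fall well below accuracy when the two classes are very unequal in size and the smaller class is often misclassified, which is one way to read table rows that pair a high ACC.p with a lower Spec.p and MCC.p.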