2024 Oct 29;9:100913. doi: 10.1016/j.crfs.2024.100913

Table 2.

Random Forest (RF), Support Vector Machines (SVM), and Naïve Bayes (NB) classification model parameters for cross-validation and external validation in authenticating extra virgin olive oil.

Pre-processing	Model	Seven-Class Models			Three-Class Models			Two-Class/Binary Models
Pre-processing	Model	Optimal Parameters	ACC.cv	ACC.p	Optimal Parameters	ACC.cv	ACC.p	Optimal Parameters	ACC.cv	Sens. cv	Prec. cv	Spec. cv	F1.cv	ACC.p	Sens. p	Prec. p	Spec. p	F1.p	MCC.p
Unprocessed	RF	mt = 18, nt = 500	81.6	64.2	mt = 43,nt = 500	99.1	97.7	mt = 55,nt = 500	99.4	99.7	99.8	95.6	99.7	99.2	99.1	100	100	99.6	0.94
SG smoothing	RF	mt = 43, nt = 500	81.2	64.0	mt = 43,nt = 500	99.2	97.7	mt = 72,nt = 500	99.5	99.6	99.8	96.0	99.7	99.2	99.1	100	100	99.6	0.94
SG+1st deriv.	RF	mt = 14, nt = 500	92.8	82.1	mt = 14, nt = 500	99.9	99.8	mt = 7,nt = 500	99.8	100	99.9	95.7	99.9	99.9	99.7	100	100	99.8	0.98
SG+2nd deriv.	RF	mt = 14, nt = 500	93.0	80.2	mt = 14, nt = 500	99.9	99.8	mt = 7,nt = 500	99.8	100	99.8	96.7	99.9	100	100	100	100	100	1.00
SNV	RF	mt = 43, nt = 500	90.3	71.5	mt = 14, nt = 500	99.7	97.4	mt = 7,nt = 500	99.7	99.9	99.8	94.9	99.8	98.0	99.4	98.4	78.6	99.0	0.84
SNV + SG Smoothing	RF	mt = 14, nt = 500	90.6	70.7	mt = 14, nt = 500	99.7	97.4	mt = 7, nt = 500	99.7	99.9	99.8	95.0	99.8	98.0	99.5	98.4	78.6	99.0	0.84
SNV + SG+1st deriv.	RF	mt = 14, nt = 500	94.7	76.4	mt = 14,nt = 500	100	100	mt = 7,nt = 500	99.9	100	99.9	97.3	99.9	100	100	100	100	100	1.00
SNV + SG+2nd deriv.	RF	mt = 14, nt = 500	96.0	86.0	mt = 14,nt = 500	100	100	mt = 7,nt = 500	99.9	100	99.9	97.1	99.9	100	100	100	100	100	1.00
MSC	RF	mt = 43, nt = 500	90.8	71.2	mt = 43,nt = 500	99.8	98.9	mt = 7,nt = 500	99.7	99.9	99.8	95.0	99.9	98.5	99.0	98.0	71.4	98.4	0.88
MSC + SG Smoothing	RF	mt = 43, nt = 500	90.8	71.4	mt = 14,nt = 500	99.8	97.4	mt = 7,nt = 500	99.7	99.9	99.7	94.7	99.9	99.2	99.3	99.8	97.8	98.7	0.90
MSC + SG+1st deriv.	RF	mt = 14, nt = 500	92.8	82.1	mt = 14,nt = 500	99.9	99.8	mt = 7,nt = 500	99.8	100	99.8	95.7	99.9	99.7	99.7	100	100	99.8	0.98
MSC + SG+2nd deriv.	RF	mt = 14, nt = 500	96.4	86.2	mt = 43, nt = 100	99.9	99.8	mt = 7, nt = 500	99.8	100	99.8	96.2	99.9	100	100	100	100	100	1.00
Unprocessed	SVM	C = 5, σ = 0.01	55.2	55.6	C = 10, σ = 0.01	99.7	99.2	C = 5, σ = 0.01	99.5	99.6	99.9	97.7	99.7	99.5	99.8	99.7	95.2	99.7	0.96
SG smoothing	SVM	C = 10, σ = 0.01	55.3	55.6	C = 10, σ = 0.01	99.7	99.2	C = 5, σ = 0.01	99.4	99.5	99.9	97.8	99.7	99.4	99.7	99.7	95.2	99.7	0.95
SG+1st deriv.	SVM	C = 5, σ = 0.01	60.7	56.6	C = 0.05, σ = 0.01	99.6	99.4	C = 0.5, σ = 0.01	99.9	100	99.8	97.0	99.9	98.1	100	100	100	100	1.00
SG+2nd deriv.	SVM	C = 5, σ = 0.01	60.8	56.7	C = 0.1, σ = 0.01	99.7	99.5	C = 0.5, σ = 0.01	99.9	100	99.8	97.1	99.9	98.0	100	100	100	100	1.00
SNV	SVM	C = 10, σ = 0.01	58.6	50.7	C = 0.5, σ = 0.01	99.9	97.6	C = 0.5, σ = 0.01	99.8	100	99.8	97.0	99.9	97.6	100	97.4	64.3	98.7	0.79
SNV + SG Smoothing	SVM	C = 0.1, σ = 0.01	57.7	63.1	C = 0.5, σ = 0.01	99.8	97.6	C = 0.5, σ = 0.01	99.8	100	99.8	97.0	99.9	97.6	100	97.4	64.3	98.7	0.79
SNV + SG+1st deriv.	SVM	C = 0.05, σ = 0.01	61.3	55.9	C = 0.05, σ = 0.01	99.7	98.2	C = 0.1, σ = 0.01	99.7	100	99.9	97.6	99.8	100	100	100	100	100	1.00
SNV + SG+2nd deriv.	SVM	C = 0.05, σ = 0.01	63.1	54.0	C = 0.5, σ = 0.01	99.9	99.0	C = 0.05, σ = 0.01	99.9	100	99.9	99.9	99.9	100	100	100	100	100	1.00
MSC	SVM	C = 10, σ = 0.01	58.6	50.7	C = 0.5, σ = 0.01	99.9	97.6	C = 0.5, σ = 0.01	99.8	99.9	99.8	97.0	99.9	97.6	100	97.4	64.3	98.7	0.79
MSC + SG Smoothing	SVM	C = 0.1, σ = 0.01	57.8	63.3	C = 0.5, σ = 0.01	99.9	97.6	C = 0.5, σ = 0.01	99.8	99.9	99.8	97.0	99.9	97.6	100	97.4	64.3	98.7	0.79
MSC + SG+1st deriv.	SVM	C = 5, σ = 0.01	60.7	55.9	C = 0.05, σ = 0.01	99.6	99.2	C = 0.5, σ = 0.01	99.9	100	99.8	97.0	99.9	100	100	100	100	100	1.00
MSC + SG+2nd deriv.	SVM	C = 1, σ = 0.01	61.3	59.5	C = 0.5, σ = 0.01	99.9	99.0	C = 0.5, σ = 0.01	99.9	100	99.9	98.2	99.9	99.0	100	99.0	85.7	99.5	0.95
Unprocessed	NB	lc = 0.1, ad = 0.0	51.0	48.6	lc = 0.1, ad = 0.0	97.4	94.3	lc = 0.1, ad = 0.0	96.6	96.5	99.9	97.2	98.2	94.1	93.7	100	100	96.8	0.71
SG smoothing	NB	lc = 0.1, ad = 0.0	51.1	48.8	lc = 0.1, ad = 0.0	97.4	94.3	lc = 0.1, ad = 0.0	96.5	97.7	99.9	97.7	98.2	94.1	93.1	100	100	96.8	0.71
SG+1st deriv.	NB	lc = 0.1, ad = 1.0	72.0	68.3	lc = 0.1, ad = 0.0	99.4	98.9	lc = 0.1, ad = 0.0	99.3	99.3	99.0	98.0	99.6	96.8	96.6	100	100	99.4	0.92
SG+2nd deriv.	NB	lc = 0.1, ad = 1.0	72.2	68.2	lc = 0.1, ad = 0.0	99.4	98.9	lc = 0.1, ad = 0.0	99.2	99.3	99.9	97.8	99.6	98.9	98.8	100	100	99.4	0.92
SNV	NB	lc = 0.1, ad = 0.0	65.6	60.5	lc = 0.1, ad = 0.0	98.7	94.1	lc = 0.1, ad = 0.0	98.5	98.4	100	99.9	99.2	95.0	95.3	99.3	90.5	97.2	0.70
SNV + SG Smoothing	NB	lc = 0.1, ad = 0.0	65.7	60.5	lc = 0.1, ad = 0.0	98.7	94.1	lc = 0.1, ad = 0.0	98.5	98.4	100	99.9	99.2	95.0	95.3	99.2	90.5	97.2	0.70
SNV + SG+1st deriv.	NB	lc = 0.1, ad = 0.0	80.2	70.5	lc = 0.1, ad = 0.0	99.8	99.5	lc = 0.1, ad = 0.0	99.3	99.4	99.9	98.0	99.6	99.2	99.3	99.3	97.6	99.6	0.94
SNV + SG+2nd deriv.	NB	lc = 0.1, ad = 0.0	84.3	82.4	lc = 0.1, ad = 0.0	99.7	97.6	lc = 0.1, ad = 0.0	99.9	100	99.9	98.0	99.9	97.7	100	97.6	66.7	98.8	0.81
MSC	NB	lc = 0.1, ad = 0.0	65.6	60.8	lc = 0.1, ad = 0.0	98.7	94.1	lc = 0.1, ad = 0.0	98.4	98.4	100	99.9	99.2	95.0	95.2	99.3	90.5	97.2	0.70
MSC + SG Smoothing	NB	lc = 0.1, ad = 0.0	65.7	60.7	lc = 0.1, ad = 0.0	98.7	94.4	lc = 0.1, ad = 0.0	98.4	98.4	100	100	99.2	95.0	95.2	99.2	90.5	97.2	0.70
MSC + SG+1st deriv.	NB	lc = 0.1, ad = 1.0	72.0	68.2	lc = 0.1, ad = 0.0	99.3	98.9	lc = 0.1, ad = 0.0	99.3	99.3	99.9	98.0	99.6	98.9	98.8	100	100	99.4	0.92
MSC + SG+2nd deriv.	NB	lc = 0.1, ad = 0.0	84.0	82.1	lc = 0.1, ad = 0.0	99.5	99.2	lc = 0.1, ad = 0.0	99.5	99.6	99.9	98.5	99.8	99.3	99.8	99.5	92.9	99.7	0.95

The metric values for the trained models are averages over 10-fold cross-validation repeated ten times. ACC.cv = accuracy, Sens.cv = sensitivity, Prec.cv = precision, Spec.cv = specificity, and F1.cv = F1 score for cross-validation. ACC.p = accuracy, Sens.p = sensitivity, Prec.p = precision, Spec.p = specificity, F1.p = F1 score, and MCC.p = Matthews correlation coefficient for the external validation set (test set). Values other than MCC.p are percentages. SNV = Standard Normal Variate; MSC = Multiplicative Scatter Correction; SG = Savitzky-Golay smoothing; 1st deriv. = first derivative; 2nd deriv. = second derivative. mt = mtry, the number of features randomly sampled at each split of a decision tree in the Random Forest, optimized using cross-validation and out-of-bag error; nt = ntree, the total number of decision trees in the Random Forest ensemble, chosen through model tuning and cross-validation. C = cost parameter and σ = width parameter of the Gaussian radial basis function kernel for the SVM model. lc = laplace and ad = adjust, the tuning parameters for the Naïve Bayes model. The Seven-Class system involves seven groups: extra-virgin olive oil (EVOO), hazelnut oil (HZO), olive pomace oil (POO), refined olive oil (ROO), EVOO + HZO, EVOO + POO, and EVOO + ROO. The Three-Class system categorizes oils into three groups: authentic extra-virgin olive oil, edible oil adulterant (100%), or adulterated olive oil (1–40%). The Two-Class system is a binary classification distinguishing between pure EVOO and adulterated olive oil (1–100% adulteration).
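As an illustration of how the binary-model metrics above relate to one another, the following minimal Python sketch computes accuracy, sensitivity, precision, specificity, F1 score, and MCC from the four cells of a confusion matrix using their standard definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; which class is treated as "positive" is likewise an assumption of the sketch. The study's own software implementation is not shown in this table.

```python
from math import sqrt


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives. All results are fractions
    (multiply by 100 for percentages), except MCC, which lies in [-1, 1].
    """
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    sensitivity = _safe_div(tp, tp + fn)   # recall of the positive class
    precision = _safe_div(tp, tp + fp)
    specificity = _safe_div(tn, tn + fp)   # recall of the negative class
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    mcc = _safe_div(
        tp * tn - fp * fn,
        sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical, imbalanced test set: 190 positives and 14 negatives,
# with 5 of the negatives misclassified as positive.
example = binary_metrics(tp=190, fp=5, tn=9, fn=0)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With imbalanced classes such as these, accuracy and F1 stay high even when a sizeable share of the minority class is misclassified, whereas specificity and MCC drop noticeably. This is one reason to read the MCC.p column alongside ACC.p when comparing the binary models.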