2017 Dec 13;14(4):20170056. doi: 10.1515/jib-2017-0056

Table 4:

Performance values (RMSE and R²) obtained for the different machine learning models trained on a fusion of UV-visible spectrophotometry and CIELAB data.

Input data: UV-visible (400–500 nm) + CIELAB data

| Model | TCC spectrophotometry RMSE | TCC spectrophotometry R² | TCC HPLC RMSE | TCC HPLC R² | trans-β-carotene RMSE | trans-β-carotene R² |
|---|---|---|---|---|---|---|
| Partial least squares (simpls) | 3.706 | 0.8746 | 6.082 | 0.5032 | 4.685 | 0.2545 |
| Support vector machines (e1071) | 3.875 | 0.8887 | 6.379 | 0.5477 | 4.353 | 0.3551 |
| Partial least squares (widekernelpls) | 4.017 | 0.9247 | 6.031 | 0.4448 | 4.592 | 0.2804 |
| Random forest | 3.758 | 0.9444 | 7.114 | 0.3527 | 6.101 | 0.1621 |
| Elastic net | 3.775 | 0.9179 | 6.160 | 0.6031 | 4.450 | 0.3256 |
| Partial least squares (pls) | 3.682 | 0.8931 | 6.187 | 0.4834 | 4.652 | 0.2530 |
| Ridge regression (w/FS) | 3.570 | 0.9298 | 5.981 | 0.5781 | 4.730 | 0.3430 |
| Ridge regression | 4.839 | 0.8510 | 8.469 | 0.4723 | 5.627 | 0.2783 |
| Support vector machines (kernlab) | 4.612 | 0.8800 | 6.299 | 0.5249 | 4.436 | 0.4241 |
| Partial least squares (kernelpls) | 3.804 | 0.8312 | 6.010 | 0.5263 | 4.681 | 0.2745 |
| Linear regression (w/Stepwise selection) | 4.718 | 0.7973 | 8.052 | 0.5295 | 4.909 | 0.2406 |
| Linear regression (w/Forward selection) | 4.829 | 0.8743 | 8.279 | 0.4734 | 4.860 | 0.2492 |
| Linear regression (w/Backwards selection) | 4.479 | 0.8020 | 6.385 | 0.5419 | 5.179 | 0.2966 |
| K-Nearest neighbors | 6.412 | 0.6320 | 7.355 | 0.2622 | 4.996 | 0.1359 |
| Lasso | 4.983 | 0.8076 | 18.784 | 0.2545 | 13.821 | 0.1487 |
| Conditional inference random forest | 6.663 | 0.7671 | 6.531 | 0.5158 | 4.645 | 0.3540 |
| Conditional inference tree | 7.566 | 0.6697 | 6.923 | 0.4351 | 4.870 | 0.2706 |
| Decision trees | 8.021 | 0.6997 | 7.789 | 0.3427 | 5.221 | 0.2181 |

The total carotenoid content (TCC) determined by spectrophotometry (Lambert-Beer formula), the TCC determined by HPLC, and the total content of trans-β-carotene (the most abundant carotene in cassava roots) were used as the response variables to be predicted. The parentheses indicate the package-specific method chosen for the simulation, except for the linear regression models. For each response variable, the best performance values are shown in bold.
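
The best entry per response variable can be recovered directly from the tabulated numbers. The following is a minimal Python sketch, assuming that "best" means the lowest RMSE and the highest R² (the note above does not state the criterion explicitly). The names `TABLE4`, `RESPONSES`, and `best_per_response` are illustrative and do not come from the original study; the numbers are transcribed from Table 4.

```python
# Illustrative sketch: find the best value per response variable in Table 4.
# Assumption: "best" = lowest RMSE and highest R2 (not stated in the source).

RESPONSES = ("TCC spectrophotometry", "TCC HPLC", "trans-beta-carotene")

# Each tuple holds (RMSE, R2) for the three responses, in the order above.
TABLE4 = {
    "Partial least squares (simpls)": (3.706, 0.8746, 6.082, 0.5032, 4.685, 0.2545),
    "Support vector machines (e1071)": (3.875, 0.8887, 6.379, 0.5477, 4.353, 0.3551),
    "Partial least squares (widekernelpls)": (4.017, 0.9247, 6.031, 0.4448, 4.592, 0.2804),
    "Random forest": (3.758, 0.9444, 7.114, 0.3527, 6.101, 0.1621),
    "Elastic net": (3.775, 0.9179, 6.160, 0.6031, 4.450, 0.3256),
    "Partial least squares (pls)": (3.682, 0.8931, 6.187, 0.4834, 4.652, 0.2530),
    "Ridge regression (w/FS)": (3.570, 0.9298, 5.981, 0.5781, 4.730, 0.3430),
    "Ridge regression": (4.839, 0.8510, 8.469, 0.4723, 5.627, 0.2783),
    "Support vector machines (kernlab)": (4.612, 0.8800, 6.299, 0.5249, 4.436, 0.4241),
    "Partial least squares (kernelpls)": (3.804, 0.8312, 6.010, 0.5263, 4.681, 0.2745),
    "Linear regression (w/Stepwise selection)": (4.718, 0.7973, 8.052, 0.5295, 4.909, 0.2406),
    "Linear regression (w/Forward selection)": (4.829, 0.8743, 8.279, 0.4734, 4.860, 0.2492),
    "Linear regression (w/Backwards selection)": (4.479, 0.8020, 6.385, 0.5419, 5.179, 0.2966),
    "K-Nearest neighbors": (6.412, 0.6320, 7.355, 0.2622, 4.996, 0.1359),
    "Lasso": (4.983, 0.8076, 18.784, 0.2545, 13.821, 0.1487),
    "Conditional inference random forest": (6.663, 0.7671, 6.531, 0.5158, 4.645, 0.3540),
    "Conditional inference tree": (7.566, 0.6697, 6.923, 0.4351, 4.870, 0.2706),
    "Decision trees": (8.021, 0.6997, 7.789, 0.3427, 5.221, 0.2181),
}


def best_per_response(table, responses=RESPONSES):
    """Return {response: {"RMSE": (model, value), "R2": (model, value)}}."""
    best = {}
    for i, response in enumerate(responses):
        rmse_col = 2 * i      # RMSE columns sit at even indices
        r2_col = 2 * i + 1    # R2 columns sit at odd indices
        rmse_model = min(table, key=lambda m: table[m][rmse_col])
        r2_model = max(table, key=lambda m: table[m][r2_col])
        best[response] = {
            "RMSE": (rmse_model, table[rmse_model][rmse_col]),
            "R2": (r2_model, table[r2_model][r2_col]),
        }
    return best


if __name__ == "__main__":
    for response, metrics in best_per_response(TABLE4).items():
        for metric, (model, value) in metrics.items():
            print(f"{response:24s} best {metric:4s}: {value:.4f}  {model}")
```

Under this assumption, the lowest RMSE and the highest R² do not always belong to the same model: for TCC by spectrophotometry, ridge regression (w/FS) has the lowest RMSE (3.570), while random forest has the highest R² (0.9444).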