Table 4:
UV-visible (400–500 nm) + CIELAB data | TCC Spectrophotometry | | TCC HPLC | | trans-β-carotene | |
---|---|---|---|---|---|---|
Model | RMSE | R² | RMSE | R² | RMSE | R² |
Partial least squares (simpls) | 3.706 | 0.8746 | 6.082 | 0.5032 | 4.685 | 0.2545 |
Support vector machines (e1071) | 3.875 | 0.8887 | 6.379 | 0.5477 | 4.353 | 0.3551 |
Partial least squares (widekernelpls) | 4.017 | 0.9247 | 6.031 | 0.4448 | 4.592 | 0.2804 |
Random forest | 3.758 | 0.9444 | 7.114 | 0.3527 | 6.101 | 0.1621 |
Elastic net | 3.775 | 0.9179 | 6.160 | 0.6031 | 4.450 | 0.3256 |
Partial least squares (pls) | 3.682 | 0.8931 | 6.187 | 0.4834 | 4.652 | 0.2530 |
Ridge regression (w/FS) | 3.570 | 0.9298 | 5.981 | 0.5781 | 4.730 | 0.3430 |
Ridge regression | 4.839 | 0.8510 | 8.469 | 0.4723 | 5.627 | 0.2783 |
Support vector machines (kernlab) | 4.612 | 0.8800 | 6.299 | 0.5249 | 4.436 | 0.4241 |
Partial least squares (kernelpls) | 3.804 | 0.8312 | 6.010 | 0.5263 | 4.681 | 0.2745 |
Linear regression (w/Stepwise selection) | 4.718 | 0.7973 | 8.052 | 0.5295 | 4.909 | 0.2406 |
Linear regression (w/Forward selection) | 4.829 | 0.8743 | 8.279 | 0.4734 | 4.860 | 0.2492 |
Linear regression (w/Backwards selection) | 4.479 | 0.8020 | 6.385 | 0.5419 | 5.179 | 0.2966 |
K-Nearest neighbors | 6.412 | 0.6320 | 7.355 | 0.2622 | 4.996 | 0.1359 |
Lasso | 4.983 | 0.8076 | 18.784 | 0.2545 | 13.821 | 0.1487 |
Conditional inference random forest | 6.663 | 0.7671 | 6.531 | 0.5158 | 4.645 | 0.3540 |
Conditional inference tree | 7.566 | 0.6697 | 6.923 | 0.4351 | 4.870 | 0.2706 |
Decision trees | 8.021 | 0.6997 | 7.789 | 0.3427 | 5.221 | 0.2181 |
The total carotenoid content (TCC) determined by spectrophotometry (Lambert-Beer formula), the TCC determined by HPLC, and the total content of trans-β-carotene (the most abundant carotene in cassava roots) were used as response variables. The names in parentheses indicate the package-specific method chosen for the simulation, with the exception of the linear regression models. For each response variable, the best performance values are shown in bold.
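
For readers who want to reproduce the two metrics reported in the table, the following is a minimal sketch of how RMSE and R² can be computed from observed and predicted values. It assumes that R² is the conventional coefficient of determination (1 − SS_res/SS_tot); the table does not state which definition was used, and some modeling packages report the squared correlation between observed and predicted values instead. The function names and the sample numbers are illustrative only and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: this is the conventional definition; the source table
    does not say which R² definition was applied.
    """
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical carotenoid contents, for illustration only.
    observed = [2.0, 4.0, 6.0]
    predicted = [3.0, 4.0, 5.0]
    print(f"RMSE = {rmse(observed, predicted):.4f}")
    print(f"R2   = {r_squared(observed, predicted):.4f}")
```

A lower RMSE and a higher R² both indicate a better fit, which is why the two metrics can rank the models in the table differently.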