Table 3.
Performance metrics of all models used to predict food group consumption and daily diet quality. This table presents the performance of four machine learning models (stochastic gradient boosted decision trees [SGBDT], random forests [RF], hurdle SGBDT, and hurdle RF) applied to predict intake (serves) of individual food groups and daily diet quality scores. Outcomes include fruit, vegetable, grain, meat and alternatives, dairy and alternatives, discretionary foods, and an overall daily diet quality measure. For each outcome, the model with the lowest MAE (best performance) is marked with an asterisk (*), and the best-performing values of the other metrics are shown in bold. Lower RMSE and MAE values and higher R² values indicate better model performance.
| Outcome | Model | RMSE¹ | MAE² | R² |
|---|---|---|---|---|
| Fruits | SGBDT | 0.63 | 0.35 | 0.049 |
| | RF | 0.56 | 0.30* | 0.418 |
| | Hurdle SGBDT | 0.71 | 0.59 | 0.023 |
| | Hurdle RF | 0.69 | 0.44 | 0.598 |
| Vegetables | SGBDT | 1.57 | 0.94 | 0.099 |
| | RF | 1.33 | 0.75* | 0.463 |
| | Hurdle SGBDT | 1.12 | 0.99 | 0.031 |
| | Hurdle RF | 1.54 | 1.00 | 0.517 |
| Grains | SGBDT | 1.29 | 0.96 | 0.096 |
| | RF | 1.03 | 0.72 | 0.496 |
| | Hurdle SGBDT | 0.94 | 0.85 | 0.021 |
| | Hurdle RF | 0.82 | 0.55* | 0.616 |
| Meat and alternatives | SGBDT | 0.80 | 0.52 | 0.097 |
| | RF | 0.72 | 0.45 | 0.439 |
| | Hurdle SGBDT | 0.69 | 0.59 | 0.022 |
| | Hurdle RF | 0.62 | 0.40* | 0.632 |
| Dairy and alternatives | SGBDT | 0.49 | 0.33 | 0.055 |
| | RF | 0.43 | 0.28* | 0.441 |
| | Hurdle SGBDT | 0.57 | 0.50 | 0.017 |
| | Hurdle RF | 0.44 | 0.30 | 0.562 |
| Discretionary foods | SGBDT | 1.38 | 0.79 | 0.129 |
| | RF | 1.12 | 0.59* | 0.552 |
| | Hurdle SGBDT | 0.83 | 0.68 | 0.027 |
| | Hurdle RF | 1.23 | 0.72 | 0.620 |
| Daily diet quality | SGBDT | 14.57 | 11.96 | 0.113 |
| | RF | 14.41 | 11.86 | 0.119 |
*Best-performing model (lowest MAE). Bold: best-performing metrics.
¹RMSE: Root Mean Squared Error.
²MAE: Mean Absolute Error.
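
For readers unfamiliar with the three metrics reported in the table, the following is a minimal sketch of how they are commonly computed. It is an illustration only, not the authors' pipeline: the function names and the `observed`/`predicted` values are hypothetical, and R² is assumed here to be the coefficient of determination (1 minus the ratio of residual to total sum of squares). The source does not state which definition was used, and some toolkits instead report the squared correlation between observed and predicted values, which can give different numbers.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more heavily."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average size of the error, in serves."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed definition, see lead-in)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted daily serves for four participants.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 1.5, 3.0]

print(rmse(observed, predicted))
print(mae(observed, predicted))
print(r_squared(observed, predicted))
```

Because RMSE squares each error before averaging, a model can rank differently on RMSE and MAE for the same outcome, which is why the table identifies the best model by a single stated criterion (lowest MAE).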