Table 2.
Prevalence of achieving and not achieving dietary recommendations and accuracy of decision trees to predict this, using data mining techniques on the nutritional intake of 4156 individuals (2967 individuals for fruit and vegetables) from the UK National Diet and Nutrition Survey (2008–12)
| | Fruit & vegetables | Free sugars | Sodium | Fat | Saturated fat |
|---|---|---|---|---|---|
| No. achieving recommendation without oversampling | 656 | 1472 | 2524 | 1045 | 795 |
| % | 22·1 | 35·4 | 60·7 | 25·1 | 19·1 |
| SMOTE oversampling % (class oversampled: achieving recommendation, yes/no)* | 252 % (yes) | 85 % (yes) | 54 % (no) | 197 % (yes) | 322 % (yes) |
| No. achieving recommendation after oversampling | 2309* | 2679 | 2524 | 3103 | 3354 |
| No. not achieving recommendation after oversampling | 2311* | 2684 | 2513 | 3111 | 3361 |
| Decision tree with the best trade-off between accuracy and number of predictor variables | | | | | |
| Overall accuracy (%) | 83·1 | 76·5 | 75·9 | 72·4 | 79·7 |
| Sensitivity (%) | 82·5 | 76·1 | 81·9 | 66·3 | 75·8 |
| Specificity (%) | 83·8 | 76·9 | 69·8 | 78·4 | 83·6 |
| No. of predictor variables | 11 | 28 | 28 | 33 | 28 |
| % of all relevant food/nutrient (g) accounted for by predictor variables | 21·0† | 31·2 | 13·4 | 13·0 | 27·4 |
| Most accurate decision tree | | | | | |
| Overall accuracy (%) | 83·6 | 77·0 | 76·1 | 72·9 | 81·7 |
| Sensitivity (%) | 83·9 | 75·7 | 80·7 | 69·3 | 81·4 |
| Specificity (%) | 83·3 | 78·3 | 71·5 | 76·4 | 81·9 |
| No. of predictor variables | 50 | 64 | 49 | 123 | 156 |
| % of all relevant food/nutrient (g) accounted for by predictor variables | 30·8† | 38·6 | 25·4 | 29·5 | 42·7 |
SMOTE, Synthetic Minority Over-sampling TEchnique.
*After oversampling using the SMOTE method (see online supplementary material).
†Percentage of all fruit and vegetables (g) recorded, not just those contributing to 5-a-day portions (specifically, fruit juice can contribute a maximum of only one 5-a-day portion).
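The arithmetic linking the rows of Table 2 can be sketched as follows. This is a minimal consistency check, not the authors' procedure. It assumes that the minority class grows to n × (1 + %/100), truncated to a whole number, that the (yes)/(no) flag names the class that was oversampled, and that sensitivity refers to individuals achieving the recommendation. The function names `smote_size` and `weighted_accuracy` are hypothetical, and decimal points replace the table's mid-dots.

```python
# Figures transcribed from Table 2 (trade-off decision tree rows).
# "minority" is the size of the oversampled class before SMOTE; for sodium it is
# derived as 4156 - 2524 = 1632 (individuals NOT achieving the recommendation).
COLUMNS = {
    "Fruit & vegetables": dict(minority=656, pct=252, yes_after=2309, no_after=2311,
                               sens=82.5, spec=83.8, acc=83.1),
    "Free sugars": dict(minority=1472, pct=85, yes_after=2679, no_after=2684,
                        sens=76.1, spec=76.9, acc=76.5),
    "Sodium": dict(minority=1632, pct=54, yes_after=2524, no_after=2513,
                   sens=81.9, spec=69.8, acc=75.9),
    "Fat": dict(minority=1045, pct=197, yes_after=3103, no_after=3111,
                sens=66.3, spec=78.4, acc=72.4),
    "Saturated fat": dict(minority=795, pct=322, yes_after=3354, no_after=3361,
                          sens=75.8, spec=83.6, acc=79.7),
}


def smote_size(n_minority: int, pct: int) -> int:
    """Assumed sizing rule: original cases plus pct % synthetic cases, truncated."""
    return n_minority * (100 + pct) // 100


def weighted_accuracy(sens: float, spec: float, n_yes: int, n_no: int) -> float:
    """Overall accuracy (%) implied by sensitivity, specificity and class sizes."""
    return (sens * n_yes + spec * n_no) / (n_yes + n_no)


if __name__ == "__main__":
    for name, c in COLUMNS.items():
        implied_n = smote_size(c["minority"], c["pct"])
        implied_acc = weighted_accuracy(c["sens"], c["spec"],
                                        c["yes_after"], c["no_after"])
        print(f"{name}: implied oversampled class size = {implied_n}, "
              f"implied accuracy = {implied_acc:.1f} % (reported {c['acc']} %)")
```

Under these assumptions the implied class sizes match the table for fruit and vegetables, sodium, fat and saturated fat, and the implied accuracies agree with the reported values to within rounding of the inputs. Free sugars is the one column the assumed sizing rule does not reproduce: 85 % gives 2723 rather than the 2679 reported.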