TABLE 3.
Summary of methodological studies evaluating dietary instruments for measuring in-school consumption (n = 22 studies)¹
| Study, year (ref), country | Dietary instrument | Reference method | Sample characteristics | Results² | Key findings³ |
| School meal-specific recalls⁴ (n = 15 studies) | |||||
| Baxter et al., 2002 (31), United States | Breakfast and lunch recall (school meals), same day (evening), no parental assistance | MOs | 104 children (10 y old) | Omission rate: 41%; Intrusion rate: 24%; Absolute difference: 0.24 svg; Arithmetic difference: −0.08 svg; 89% agreement across raters (within 0.25 svg) for MOs | Poor accuracy for types of foods reported. However, children were able to report amounts consumed for matched foods with acceptable accuracy. Good interrater reliability for in-person meal observations. |
| Baxter et al., 2003 (37), United States | Breakfast and lunch recall (school meals), same day (evening), in person and via telephone, no parental assistance, multiple-pass protocol | MOs | 69 children (10 y old) | For in-person and telephone recalls, respectively: Omission rates: 34% and 32%; Intrusion rates: 19% and 16%; Absolute difference: 0.28 and 0.19 svg; Arithmetic difference: −0.09 and 0 svg; 93% agreement across raters (within 0.25 svg) for MOs | Poor accuracy for types of foods reported, regardless of telephone vs. in-person interview. However, children were able to report amounts consumed for matched foods with acceptable accuracy. Good interrater reliability for the reference method. |
| Baxter et al., 2003 (32), United States | Breakfast and lunch recall (school meals), next morning, no parental assistance, multiple-pass protocol (4 passes) | MOs | 121 children (10 y old) | For reverse and forward recalls, respectively: Omission rates: 57% and 56%; Intrusion rates: 32% and 39%; Absolute difference: 0.23 and 0.24 svg; Arithmetic difference: −0.08 and −0.08 svg; 90% agreement across raters (within 0.25 svg) for MOs | Poor accuracy for foods reported, regardless of the recall order. However, children were accurate at reporting amounts for matched foods. Good interrater reliability for the reference method. |
| Baxter et al., 2009 (38), United States | Breakfast and lunch recall (school meals), same and next day (different retention intervals), no parental assistance | MOs | 374 children (10 y old) | For lunch recalls (ranges across retention intervals): Omission rates: 28–55%; Intrusion rates: 16–49%; Energy report rates: 78–88%; Good interrater reliability reported for MOs, but percentage agreement not reported by authors | Poor accuracy for types of foods reported (both on same day and next day). Poor accuracy for energy reported when recall occurred on the next day but acceptable accuracy when recall occurred on the same day. |
| Baxter et al., 2010 (39), United States | Breakfast and lunch recall (school meals), different retention intervals, no parental assistance | MOs | 374 children (10 y old) | Mean report rates for all interview conditions combined: Energy report rates: 85%; Protein report rates: 105%; Carbohydrate report rates: 86%; Fat report rates: 97%; 94–97% agreement across raters for MOs | Acceptable accuracy for energy, protein, carbohydrate, and fat report rates; report rates varied by nutrient. Retention period did not affect energy, protein, carbohydrate, or fat report rates. Good interrater reliability for the reference method. |
| Baxter et al., 2000 (33), United States | Lunch recall (school meals), morning of the next day, no parental assistance, with and without prompting | MOs | 96 children (mean 7.2 and 10.1 y old) | Next day, no prompting: Total inaccuracy⁵: 2.7 svg (grade 1); Total inaccuracy: 1.7 svg (grade 4). Next day, with prompting: Total inaccuracy: 2.6 svg (grade 1); Total inaccuracy: 1.8 svg (grade 4) | Accuracy described as poor for both grades, but accuracy was better (lower total inaccuracy) for grade 4 than for grade 1 students. |
| Baxter et al., 1997 (36), United States | Lunch recall (school meals), same day and next day, no parental assistance | MOs | 260 children (10 y old) | Rates for same day and next day, respectively: Omission rates: 16% and 32%; Intrusion rates: 5% and 13%; Absolute difference: 0.10 and 0.14 svg; Arithmetic difference: −0.02 and −0.01 svg; No interaction effects found by sex or ethnicity | Poor accuracy for types of foods reported, but omission and intrusion rates improved when recall was conducted on the same day as the lunch meal. Children were able to accurately report amounts consumed for matched foods (small absolute and arithmetic differences). |
| Biltoft-Jensen et al., 2013 (40), Denmark | Lunch recall (school meals), same day (evening), web-based software (WebDASC), parental assistance | Weighed FRs and DP method | 81 children (8–11 y old) | Omission rate: 3%; Intrusion rate: 14% | Acceptable accuracy for types of foods reported. |
| Biltoft-Jensen et al., 2015 (41), Denmark | Lunch recall (school meals), same day (evening), web-based software (WebDASC), parental assistance | Weighed FRs and DP method | 193 children (8–11 y old) | Omission rate: 9%; Intrusion rate: 6% | Acceptable accuracy for types of foods reported. |
| Guinn et al., 2010 (42), United States | School meal recall, various retention periods, no parental assistance | MOs | 327 children (10 y old) | Mean energy report rate: 88%; Report rates for energy decreased with increasing points on the social desirability scale and BMI percentile categories; >90% interrater agreement for the reference method | Acceptable accuracy for energy report rates. Accuracy was inversely associated with social desirability bias and BMI. Good interrater reliability for the reference method. |
| Hunsberger et al., 2013 (43), Sweden | School meal recall, previous day, RD administered using a web-based dietary recall software | Weighed FRs | 25 children (6–8 y old) | Overall match rate: 90% (range: 67–100%). Difference in child-reported energy and observed energy: 7 ± 50 kcal (P = 0.49); strong correlation (r = 0.92; P < 0.001) between children’s recall and MOs | Acceptable accuracy for reporting individual foods and for estimating energy intake of the group. |
| Lyng et al., 2013 (44), Denmark | Lunch recall (home-packed lunches), same day (early afternoon), no parental assistance | DP method | 114 children (11 y old) | For girls and boys, respectively: Match rates: 78% and 74%; Omission rates: 22% and 26%; Intrusion rates: 18% and 24% | Poor accuracy for reported food items. |
| Medin, 2015 (45), Norway | Lunch recall (school meals), web-based self-administered recall, same day, with parental assistance | MOs | 117 children (8–9 y old) | Match rate: 73%; Omission rate: 27%; Mean intrusion rate: 19%; Higher parental education associated with better accuracy (77% vs. 52% match rate; P = 0.008); 92% agreement across raters for MOs | Poor accuracy for reported foods. Good interrater reliability for the reference method. |
| Paxton et al., 2011 (46), United States | Lunch recall (school meals), paper-and-pencil questionnaire, same day, no parental assistance | MOs | 18 children (8–10 y old) | Mean omission rate: 6%; Mean intrusion rate: 10%; Absolute difference: 0.06 svg; Arithmetic difference: 0.01 svg; 85% agreement between raters for MOs | Acceptable accuracy for both reported food items and amounts of foods reported. Good interrater reliability for the reference method. |
| Warren et al., 2003 (47), United Kingdom | Lunch recall (school meals and home-packed lunches), same day (2 h after meal), no parental assistance | MOs | 303 children (5–7 y old) | Home-packed and school lunches, respectively: Match rates: 70% and 58%; Intrusion rates: 22% and 8%; Prompting significantly increased match rates, from 66% to 80% for the whole sample; Interrater reliability not reported for MOs | Poor accuracy for reported food items, regardless of lunch type. Nondirective prompts increased recall accuracy. |
| Estimated FRs (n = 1 study) | |||||
| Domel et al., 1994 (48), United States | Estimated lunchtime FRs, no parental assistance | MOs | 24 students (9–10 y old) | With the daily monitoring approach (students were prompted by data collectors to complete their FRs), Pearson r ranged from 0.16 to 0.85 for different meal components (mean r = 0.66). With the weekly monitoring approach, Pearson r values dropped: range −0.21 to 0.69 (mean r = 0.25); 90% agreement between raters for MOs | Overall school meal accuracy was acceptable only for children who were monitored on a daily basis to complete their FRs (but not for children in the weekly monitoring group). Accuracy also varied considerably depending on the meal component. Good interrater reliability for MOs. |
| FFQs (n = 1 study) | |||||
| Neuhouser et al., 2009 (49), United States | 19-item beverage and snack FFQ (recall period: past week) | Estimated 4-d FRs (administered 1 wk before FFQ) | 46 children (mean age 12.7 y old) | Pearson r = 0.71, 0.70, and 0.69 for beverages, snacks, and total fruit and vegetables, respectively; Test-retest reliability r > 0.7 for 19 items | Acceptable accuracy. Acceptable test-retest reliability. |
| In-person meal observations (n = 1 study) | |||||
| Richter et al., 2012 (50), Canada | In-person meal observation (by nutrition students) | Weighed FRs | 32 children (elementary school, no age provided) | Raters accurately identified 86% of premeasured lunches (within 0.25 svg) and over- or underreported amounts for the other meal items in 14% of the lunches; Majority of ICCs >0.8 | Acceptable accuracy. Good interrater reliability. |
| DP methods (n = 2 studies) | |||||
| Sabinsky et al., 2013 (51), Denmark | DP for home-packed lunches | Weighed FRs | 191 children (7–13 y old) | Spearman r range: 0.89–0.97 for amounts of meal items. No statistical difference between the test and reference instruments in amounts of fish, fat, starch, whole grains, or overall lunch meal quality index scores. Bland-Altman analyses suggest negligible bias (mean bias for fruit: −4.27 g, LOA −29.4 to 20.8 g; mean bias for vegetables: −6.19 g, LOA −34.5 to 22.2 g). κ coefficients: 0.59–0.82 | Acceptable accuracy. Bland-Altman plots suggest a tendency for the DP method to underestimate fruit and vegetable consumption. Acceptable interrater reliability. |
| Taylor et al., 2014 (52), United States | DP for school meals | Weighed FRs | 958 children (8–10 y old) | Pearson r used to assess correlations between amounts estimated from DP and from weighed FRs for various lunch items. Correlation coefficient range: 0.59–0.98; all r values >0.8 except for leafy greens (r = 0.59) and lasagna (r = 0.62). Mean fruit and vegetable consumption with the use of photography (97 g) was within 1 g of the reference method and not significantly different from it (P = 0.56). LOA for individual-tray fruit and vegetable consumption were −32.9 to 31.3 g. 96% agreement across raters; mean ICC was 0.92 | Acceptable accuracy. DP was accurate at estimating amounts eaten at the group level. There was no evidence of bias from Bland-Altman analyses. Good interrater reliability. |
| SFC (n = 2 studies) | |||||
| Kremer et al., 2006 (53), Australia | SFC | Weighed FRs | 106 children (5–12 y old) | Relative accuracy for energy measured by using mean difference between test and reference instrument: Mean difference 15 kJ (95% CI: −107, 138 kJ) (P > 0.05). κ coefficient: 0.51 | The SFC provides acceptable accuracy to measure energy intake for the group. Interrater reliability was poor. |
| Mitchell et al., 2010 (54), Australia | SFC, home-packed lunches, SFC + meal observations, and SFC + DP method | No reference measure⁶ | 176 children (5–8 y old) | Results shown are ICC ranges for different lunch items. By using the SFC with meal observations: ICCs = 0.78–1 (intrarater reliability); ICCs = 0.50–0.95 (interrater reliability) (majority of ICCs >0.7 for all meal items except for noodles and leftovers). By using the SFC with DP: ICCs = 0.57–0.98 (intrarater reliability); ICCs = 0.34–0.92 (interrater reliability) (majority of ICCs >0.7 for all meal items except for leftovers) | The SFC has good intrarater reliability and acceptable interrater reliability for the majority of meal items. |
¹ DP, digital photography; FR, food record; ICC, intraclass correlation coefficient; LOA, limits of agreement; MO, meal observation; ref, reference; SFC, school food checklist; svg, serving; WebDASC, Web-based Dietary Assessment Software for Children.
² Results presented are group means unless specified otherwise.
³ To provide an overall rating for relative accuracy and reliability (when relevant), we used cutoffs for measures of relative accuracy. Those measures of accuracy and reliability are presented in Table 1.
⁴ Unless specified, the school meal recall method was interviewer-administered.
⁵ Total inaccuracy combines the errors from misreporting meal items (omissions and intrusions) with the error in amounts reported for matched items (33). It is calculated as (absolute difference between amounts reported and observed eaten for each match × statistical weight) + (each omitted amount × statistical weight) + (each intruded amount × statistical weight), summed over all items for a given meal for each child. There is no upper limit for total inaccuracy, and this measure can be sensitive to the number and types of meal components. For example, a meal with multiple small entrées may result in greater total inaccuracy values than a meal with only 1 entrée. Because its values may vary with the meal context, total inaccuracy is not appropriate for comparing the accuracy of different instruments, and it was not selected as a criterion for comparing accuracy between types of instruments.
⁶ The study goal was to test intra- and interrater reliability for estimating energy intake from lunches by using the SFC.
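
The total inaccuracy calculation described in the footnote on that measure can be illustrated with a minimal Python sketch. The `MealItem` structure, the function name, the food items, and the statistical weights below are all hypothetical and chosen only for illustration; the actual weights used in the cited study (33) are not given in this table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MealItem:
    name: str
    reported: Optional[float]  # servings the child reported; None if not reported
    observed: Optional[float]  # servings observed eaten; None if not observed
    weight: float = 1.0        # hypothetical statistical weight for the item


def total_inaccuracy(items: List[MealItem]) -> float:
    """Sum weighted errors over all items of one meal for one child (in servings)."""
    total = 0.0
    for item in items:
        if item.reported is not None and item.observed is not None:
            # Match: weighted absolute difference between reported and observed amounts
            total += abs(item.reported - item.observed) * item.weight
        elif item.observed is not None:
            # Omission: observed eaten but not reported, so the whole amount is error
            total += item.observed * item.weight
        elif item.reported is not None:
            # Intrusion: reported but not observed eaten, so the whole amount is error
            total += item.reported * item.weight
    return total


# Illustrative meal with one match, one omission, and one intrusion
meal = [
    MealItem("pizza", reported=1.0, observed=0.75, weight=2.0),
    MealItem("milk", reported=None, observed=1.0),
    MealItem("cookie", reported=0.5, observed=None),
]
print(total_inaccuracy(meal))  # 0.5 + 1.0 + 0.5 = 2.0
```

Every term in the sum is nonnegative (assuming nonnegative weights), so each additional meal component can only add to the total. This is consistent with the footnote's point that the measure has no upper limit and depends on the number and types of items in the meal.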