2017 Jan 11;8(1):63–79. doi: 10.3945/an.116.013144

TABLE 3.

Summary of methodological studies evaluating dietary instruments for measuring in-school consumption (n = 22 studies)¹

Study, year (ref), country	Dietary instrument	Reference method	Sample characteristics	Results²	Key findings³
School meal specific recalls⁴ (n = 15 studies)
Baxter et al., 2002 (31), United States	Breakfast and lunch recall (school meals), same day (evening), no parental assistance	MOs	104 children (10 y old)	Omission rate: 41% Intrusion rate: 24% Absolute difference: 0.24 svg Arithmetic difference: −0.08 svg 89% agreement across raters for within 0.25 svg for MOs	Poor accuracy for types of foods reported. However, children were able to report amounts consumed for matched foods with acceptable accuracy. Good interrater reliability for in-person meal observations.
Baxter et al., 2003 (37), United States	Breakfast and lunch recall (school meals), same day (evening) in person and via telephone, no parental assistance, multiple-pass protocol	MOs	69 children (10 y old)	For in-person and via telephone recalls, respectively: Omission rates: 34% and 32% Intrusion rates: 19% and 16% Absolute difference: 0.28 and 0.19 svg Arithmetic difference: −0.09 and 0 svg 93% agreement across raters for within 0.25 svg for MOs	Poor accuracy for types of foods reported, regardless of telephone vs. in-person interview. However, children were able to report amounts consumed for matched foods with acceptable accuracy. Good interrater reliability for the reference method.
Baxter et al., 2003 (32), United States	Breakfast and lunch recall (school meals), next morning, no parental assistance, multiple-pass protocol (4 passes)	MOs	121 children (10 y old)	For reverse and forward recalls, respectively: Omission rates: 57% and 56% Intrusion rates: 32% and 39% Absolute difference: 0.23 and 0.24 svg Arithmetic difference: −0.08 and −0.08 svg 90% agreement across raters for within 0.25 svg for MOs	Poor accuracy for foods reported, regardless of the recall order. However, children were accurate at reporting amounts for matched foods. Good interrater reliability for the reference method.
Baxter et al., 2009 (38), United States	Breakfast and lunch recall (school meals), same and next day (different retention intervals), no parental assistance	MOs	374 children (10 y old)	For lunch recalls (rates for different time retention periods, ranges): Omission rates: 28–55% Intrusion rates: 16–49% Energy report rates: 78–88% Reported good interrater reliability for MOs but percentage agreement not reported by authors	Poor accuracy for types of foods reported (both on same day and next day). Poor accuracy for energy reported when recall occurred on the next day but acceptable accuracy when recall occurred on the same day.
Baxter et al., 2010 (39), United States	Breakfast and lunch recall (school meals), different retention intervals, no parental assistance	MOs	374 children (10 y old)	Mean report rates for all interview conditions combined: Energy report rates: 85% Protein report rates: 105% Carbohydrates report rates: 86% Fat report rates: 97% 94–97% agreement across raters for MOs	Acceptable accuracy for energy, protein, carbohydrates, and fat report rates; report rates varied by nutrient. Retention period did not affect energy, protein, carbohydrate, or fat report rates. Good interrater reliability for reference method.
Baxter et al., 2000 (33), United States	Lunch recall (school meals), morning of the next day, no parental assistance, with and without prompting	MOs	96 children (mean 7.2 and 10.1 y old)	Next day, no prompting: Total inaccuracy⁵: 2.7 svg (grade 1) Total inaccuracy: 1.7 svg (grade 4) Next day, with prompting: Total inaccuracy: 2.6 svg (grade 1) Total inaccuracy: 1.8 svg (grade 4)	Accuracy described as poor for both grades, but accuracy measures were higher for grade 4 than for grade 1 students.
Baxter et al., 1997 (36), United States	Lunch recall (school meals), same day and next day, no parental assistance	MOs	260 children (10 y old)	Rates for same day and next day, respectively: Omission rates: 16%, 32% Intrusion rates: 5%, 13% Absolute difference: 0.10 and 0.14 svg Arithmetic difference: −0.02 and −0.01 svg No interaction effects found by sex or ethnicity	Poor accuracy for types of foods reported, but omission and intrusion rates improved when recall was conducted on the same day as the lunch meal. Children were able to accurately report amounts consumed for matched foods (small absolute and arithmetic differences).
Biltoft-Jensen et al., 2013 (40), Denmark	Lunch recall (school meals), same day (evening), web-based software (WebDASC), parental assistance	Weighed FRs and DP method	81 children (8–11 y old)	Omission rate: 3% Intrusion rate: 14%	Acceptable accuracy for types of foods reported.
Biltoft-Jensen et al., 2015 (41), Denmark	Lunch recall (school meals), same day (evening), web-based software (WebDASC), parental assistance	Weighed FRs and DP method	193 children (8–11 y old)	Omission rate: 9% Intrusion rate: 6%	Acceptable accuracy for types of foods reported.
Guinn et al., 2010 (42), United States	School meal recall, various retention periods, no parental assistance	MOs	327 children (10 y old)	Mean energy report rate: 88%. Report rates for energy decreased with increasing points on the social desirability scale and BMI percentile categories. >90% interrater agreement for the reference method	Acceptable accuracy for energy report rates. Accuracy was inversely associated with social desirability bias and BMI. Good interrater reliability for the reference method.
Hunsberger et al., 2013 (43), Sweden	School meal recall, previous day, RD administered using a web-based dietary recall software	Weighed FRs	25 children (6–8 y old)	Overall match rate: 90% (range: 67–100%). Difference in child-reported energy and observed energy: 7 ± 50 kcal (P = 0.49); strong correlation (r = 0.92; P < 0.001) between children’s recall and MOs	Acceptable accuracy for reporting individual foods and for estimating energy intake of the group.
Lyng et al., 2013 (44), Denmark	Lunch recall (home-packed lunches), same day (early afternoon), no parental assistance	DP method	114 children (11 y old)	For girls and boys, respectively: Match rates: 78%, 74% Omission rates: 22%, 26% Intrusion rates: 18%, 24%	Poor accuracy for reported food items.
Medin, 2015 (45), Norway	Lunch recall (school meals), web-based self-administered recall, same day, with parental assistance	MOs	117 children (8–9 y old)	Match rate: 73% Omission rate: 27% Mean intrusion rate: 19% Higher parental education associated with better accuracy (77% vs. 52% match rate) (P = 0.008) 92% agreement across raters for MOs	Poor accuracy for reported foods. Good interrater reliability for the reference method.
Paxton et al., 2011 (46), United States	Lunch recall (school meals), paper-and-pencil questionnaire, same day, no parental assistance	MOs	18 children (8–10 y old)	6% mean omission rate 10% mean intrusion rate Absolute difference: 0.06 svg Arithmetic difference: 0.01 svg 85% agreement between raters for MOs	Acceptable accuracy for both reported food items and for amounts of foods reported. Good interrater reliability for reference method.
Warren et al., 2003 (47), United Kingdom	Lunch recall (school meals and home-packed lunches), same day (2 h after meal), no parental assistance	MOs	303 children (5–7 y old)	Home-packed and school lunches, respectively: Match rates: 70%, 58% Intrusion rates: 22%, 8% Prompting significantly increased match rates, which increased from 66% to 80% for the whole sample Interrater reliability not reported for MOs	Poor accuracy for reported food items, regardless of lunch type. Nondirective prompts increased recall accuracy.
Estimated FRs (n = 1 study)
Domel et al., 1994 (48), United States	Estimated lunchtime FRs, no parental assistance	MOs	24 students (9–10 y old)	Using a daily monitoring approach (students were prompted by data collectors to complete their FRs), Pearson r ranged from 0.16 to 0.85 for different meal components (mean r = 0.66). However, using the weekly monitoring approach, Pearson r values dropped: range −0.21 to 0.69 (mean r = 0.25) 90% agreement between raters for MOs	Overall school meal accuracy was acceptable only for children who were monitored on a daily basis to complete their FRs (but not among children in the weekly monitoring group). Accuracy also varied considerably depending on the meal component. Good interrater reliability for MOs.
FFQs (n = 1 study)
Neuhouser et al., 2009 (49), United States	19-item beverage and snack FFQ (recall period: past week)	Estimated 4-d FRs (administered 1 wk before FFQ)	46 children (mean age 12.7 y old)	Pearson r = 0.71, 0.70, and 0.69 for beverages, snacks and total fruit and vegetables, respectively Test-retest reliability r > 0.7 for 19 items	Acceptable accuracy. Acceptable test-retest reliability.
In-person meal observations (n = 1 study)
Richter et al., 2012 (50), Canada	In-person meal observation (by nutrition students)	Weighed FRs	32 children (elementary school, no age provided)	Raters accurately identified 86% of premeasured lunches (within 0.25 svg) and over- or underreported amounts for the other meal items in 14% of the lunches Majority of ICCs >0.8	Acceptable accuracy. Good interrater reliability.
DP methods (n = 2 studies)
Sabinsky et al., 2013 (51), Denmark	DP for home-packed lunches	Weighed FRs	191 children (7–13 y old)	Spearman r range: 0.89–0.97 for amounts of meal items. No statistical difference between amounts of fish, fat, starch, whole grains, and overall lunch meal quality index scores between the test and reference instrument Bland-Altman analyses suggest negligible bias (mean bias for fruit: −4.27 g; LOA −29.4 to 20.8 g) (mean bias for vegetables −6.19 g; LOA −34.5 to 22.2 g). κ coefficients: 0.59–0.82	Acceptable accuracy. Bland-Altman plots suggest a tendency for the DP method to underestimate fruit and vegetable consumption. Acceptable interrater reliability.
Taylor et al., 2014 (52), United States	DP for school meals	Weighed FRs	958 children (8–10 y old)	Pearson r used to assess correlations between amounts estimated from DP vs. from weighed FRs for various lunch items. Correlation coefficient range: 0.59–0.98, all r values >0.8 except for leafy greens (r = 0.59) and lasagna (r = 0.62) Mean fruit and vegetable consumption with the use of photography (97 g) was within 1 g of reference method and not significantly different from reference method (P = 0.56). LOA for individual-tray fruit and vegetable consumption were −32.9 to 31.3 g. 96% agreement across raters; mean ICC was 0.92	Acceptable accuracy. DP was accurate at estimating amounts eaten at the group level. There was no evidence of bias from Bland-Altman analyses. Good interrater reliability.
SFC (n = 2 studies)
Kremer et al., 2006 (53), Australia	SFC	Weighed FRs	106 children (5–12 y old)	Relative accuracy for energy measured by using mean difference between test and reference instrument: Mean difference 15 kJ (95% CI: −107, 138 kJ) (P > 0.05). κ coefficient: 0.51	The SFC provides acceptable accuracy to measure energy intake for the group. Interrater reliability was poor.
Mitchell et al., 2010 (54), Australia	SFC, home-packed lunches, SFC + meal observations, and SFC + DP method	No reference measure⁶	176 children (5–8 y old)	Results shown are ICC ranges for different lunch items By using the SFC with meal observations: ICCs = 0.78–1 (intrarater reliability) ICCs = 0.50–0.95 (interrater reliability) (majority ICCs >0.7 for all meal items except for noodles and leftovers) By using the SFC with DP: ICCs = 0.57–0.98 (intrarater reliability) ICCs = 0.34–0.92 (interrater reliability) (majority ICCs >0.7 for all meal items except for leftovers)	The SFC has good intrarater reliability; acceptable interrater reliability for majority of meal items.

¹

DP, digital photography; FR, food record; ICC, intraclass correlation coefficient; LOA, limits of agreement; MO, meal observation; ref, reference; SFC, school food checklist; svg, serving; WebDASC, Web-based Dietary Assessment Software for Children.

²

Results presented are group means unless specified otherwise.

³

To provide an overall rating for relative accuracy and reliability (when relevant), we used cutoffs for measures of relative accuracy. Those measures of accuracy and reliability are presented in Table 1.

⁴

Unless specified, the school meal recall method was interviewer-administered.

⁵

Total inaccuracy combines errors in the meal items reported (omissions and intrusions) with errors in the amounts reported for matched items (33). It is calculated as (absolute difference between amounts reported and observed eaten for each match × statistical weight) + (each omitted amount × statistical weight) + (each intruded amount × statistical weight), summed over all items for a given meal for each child. There is no upper limit for total inaccuracy, and the measure can be sensitive to the number and types of meal components. For example, a meal with multiple small entrées may result in greater total inaccuracy values than a meal with only 1 entrée. Because its values may vary with the meal context, total inaccuracy is not appropriate for comparing the accuracy of different instruments and was not selected as a criterion for comparing accuracy between types of instruments.
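The calculation described in this footnote can be sketched in Python as follows. This is a minimal illustration, not the implementation used in (33): the `MealItem` structure, the function name, and the example statistical weights and serving amounts are all hypothetical, because the footnote does not give the actual weights.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MealItem:
    """One meal item for one child (hypothetical structure for illustration)."""
    name: str
    reported: Optional[float]  # servings reported; None if the child did not report the item
    observed: Optional[float]  # servings observed eaten; None if the item was not observed
    weight: float = 1.0        # statistical weight (assumed value; not given in the footnote)


def total_inaccuracy(items: List[MealItem]) -> float:
    """Sum weighted match, omission, and intrusion errors over all items of a meal."""
    total = 0.0
    for item in items:
        if item.reported is not None and item.observed is not None:
            # Match: absolute difference between reported and observed amounts
            total += abs(item.reported - item.observed) * item.weight
        elif item.observed is not None:
            # Omission: observed eaten but not reported
            total += item.observed * item.weight
        elif item.reported is not None:
            # Intrusion: reported but not observed eaten
            total += item.reported * item.weight
    return total


# Made-up example meal (amounts in servings)
meal = [
    MealItem("entree", reported=1.0, observed=0.75, weight=2.0),  # match
    MealItem("milk", reported=None, observed=1.0),                # omission
    MealItem("cookie", reported=0.5, observed=None),              # intrusion
]
print(total_inaccuracy(meal))
```

Every additional item can only add to the sum, which illustrates why the measure has no upper limit and depends on the number of meal components.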

⁶

The study goal was to test intra- and interrater reliability for estimating energy intake from lunches by using the SFC.