The accuracy of the Goldberg method for classifying misreporters of energy intake on a food frequency questionnaire and 24-hour recalls: Comparison with doubly labeled water

Janet A Tooze; Susan M Krebs-Smith; Richard P Troiano; Amy F Subar

doi:10.1038/ejcn.2011.198

Author manuscript; available in PMC: 2012 Nov 1.

Published in final edited form as: Eur J Clin Nutr. 2011 Nov 30;66(5):569–576. doi: 10.1038/ejcn.2011.198


PMCID: PMC3319469 NIHMSID: NIHMS334580 PMID: 22127332

Abstract

Background/Objectives

Adults often misreport dietary intake; the magnitude varies by the methods used to assess diet and classify participants. The objective was to quantify the accuracy of the Goldberg method for categorizing misreporters on a food frequency questionnaire (FFQ) and two 24-hour recalls (24HR).

Subjects/Methods

We compared the Goldberg method, which uses an equation to predict total energy expenditure (TEE), to a criterion method that uses doubly labeled water (DLW), in a study of 451 men and women. Underreporting was classified using recommended cutpoints and calculated values. Sensitivity and specificity, positive and negative predictive value (PPV and NPV), and the area under the receiver operating characteristic curve (AUC) were calculated. Predictive models of underreporting were contrasted for the Goldberg and DLW methods.

Results

AUC were 0.974 and 0.972 on the FFQ, and 0.961 and 0.938 on the 24HR for men and women, respectively. The sensitivity of the Goldberg method was higher for the FFQ (92%) than the 24HR (50%); specificity was higher for the 24HR (99%) than the FFQ (88%); PPV was high for the 24HR (92%) and FFQ (88%). Simulation studies indicate attenuation in odds ratio estimates and reduction of power in predictive models.

Conclusions

Although use of the Goldberg method may lead to bias and reduction in power in predictive models of underreporting, the method has high predictive value for both the FFQ and the 24HR. Thus, in the absence of objective measures of TEE or physical activity, the Goldberg method is a reasonable approach to characterizing underreporting.

Keywords: Diet, Diet Surveys, Energy Intake, Statistical Bias, Questionnaires/standards, Research Design

INTRODUCTION

It is well accepted that adults misreport their dietary intake on self-administered tools, most often in the direction of underreporting energy intake. In many studies of underreporting, participants are classified as underreporters (UR) or acceptable reporters (AR), the prevalence of underreporting is estimated, and personal characteristics are related to reporting status (Macdiarmid and Blundell, 1998; Hill and Davies, 2001; Livingstone and Black, 2003). Other studies have proposed excluding UR from analyses to reduce the effects of measurement error on relationships between diet and obesity or other health outcomes; exclusion of UR often leads to different conclusions than when they are included (Drummond et al., 1998; Huang et al., 2005).

The Goldberg approach is commonly used to identify misreporters (Goldberg et al., 1991; Black et al., 1991; Black, 2000a). However, because of the assumptions and formulas used to estimate TEE, it may be prone to misclassification, which could lead to bias in studies using this method. One way to test this is to use an unbiased estimate of TEE, such as that estimated from doubly labeled water (TEE_DLW), to examine how well the Goldberg method classifies underreporters. Two studies have used TEE_DLW to examine the sensitivity and specificity of the Goldberg method for categorizing misreporting of reported energy intake (rEI) (Livingstone et al., 2003; Black, 2000b). Both studies reported that approximately 50% of the participants categorized as UR using TEE_DLW (UR_DLW) also were categorized as UR by the Goldberg method (UR_GB). More than 98% of the participants identified by DLW as AR also were identified as AR_GB. However, these analyses were based on small studies (Livingstone et al., 2003) or a pooled analysis of multiple small studies treated as a large study (Black, 2000b). Both studies used food diaries to estimate rEI; we are not aware of studies using other methods of dietary assessment.

This paper uses a large DLW sample from the Observing Protein and Energy Nutrition (OPEN) Study to compare the Goldberg method for categorizing misreporting to estimates using TEE_DLW. Two different dietary assessment instruments are used to estimate rEI: a food frequency questionnaire (FFQ) and two 24-hour recalls (24HR). The purpose of this paper is to compare the classification of UR using the Goldberg method to UR classified using TEE_DLW.

SUBJECTS AND METHODS

Study Population

The OPEN Study is described in detail elsewhere (Subar et al., 2003). The primary goal of the study was to describe the measurement error structure of an FFQ and 24HRs. Participants were 484 men and women aged 40-69 y recruited from a random sample of 5,000 households in the Washington, D.C. metropolitan area. Fifty-eight percent of eligible participants agreed to participate in the study; only 2 participants dropped out during the course of the study. The National Cancer Institute’s (NCI) Special Studies Institutional Review Board approved the protocol. Participants completed three clinic visits over a period of approximately 3 months between September 1999 and March 2000.

Energy Intake

Participants completed an FFQ and a 24HR twice, approximately 3 months apart. The FFQ was the NCI Diet History Questionnaire (http://riskfactor.cancer.gov/DHQ), which was validated in previous research for this population (Subar et al., 2001). In this analysis, reported energy intakes from the first FFQ were used. Trained interviewers administered the 24HR using a standardized five-pass method developed by the U.S. Department of Agriculture (Conway et al., 2003; Conway et al., 2004; Moshfegh et al., 2008). The 24HR data were analyzed using the Food Intake Analysis System (version 3.99). The average of the two 24HRs was used because averaging is a commonly used, albeit naive, practice to decrease within-person variation in the estimated usual intake.

Energy Expenditure

The DLW measurement for the OPEN Study is described in detail elsewhere (Trabulsi et al., 2003). A five-urine specimen protocol was used (Schoeller, 1992). TEE_DLW was calculated according to Racette et al. (1994) using the modified Weir equation with a respiratory quotient of 0.86. Thirty-three TEE_DLW measures were excluded for the following reasons: unacceptable internal agreement (n=2), failure to isotopically equilibrate on dosing day (n=10), isotopic dilution space ratios outside the range of 1.00–1.08 (n=6), lack of tracer in the final urine specimen due to high water turnover (n=5), or missing specimens (n=10), resulting in 451 participants who were used in this analysis. Twenty-five participants were dosed with DLW a second time approximately 2 weeks after the first to obtain within-person variation of TEE_DLW. Weight was measured at all visits under standardized conditions. Height was measured at visit 1. Basal metabolic rate (BMR) was calculated from weight, height, and age using the equation developed by Schofield for adults (1985).

Additional Measures

At Visit 1, participants completed the Physical Activity Questionnaire from the National Health and Nutrition Examination Survey (NHANES) 1999-2000. At Visit 2, approximately 2 weeks later, participants completed a Health Questionnaire that contained the Fear of Negative Evaluation Scale (Leary, 1983) and questions regarding Stunkard-Sorensen body silhouettes (Stunkard et al., 1982). At Visit 3 (approximately 3 months after Visit 1), participants completed the Three-Factor Eating Questionnaire (Stunkard and Messick, 1985), the Marlowe-Crowne Social Desirability Questionnaire (Crowne and Marlowe, 1960; Strahan and Gerbasi, 1972; Fischer and Fick, 1993), and questions about dieting/weight loss.

Classification of Misreporters

In the DLW method and the Goldberg method, participants are classified as UR, AR, or overreporters using the ratio of rEI to TEE. In the Goldberg method, TEE_GB is calculated as the product of BMR and physical activity level (PAL). A constant value is assumed for PAL, and therefore the ratio of rEI:TEE may be expressed in terms of multiples of rEI to BMR. Because of skewness observed in the distribution of energy intake, the natural log transformation of the ratio is used in both methods. In both the DLW method and the Goldberg method, a 95% confidence interval is created about the log of the ratio, and individuals who fall outside of the confidence interval are classified as under- or overreporters.

For the Goldberg method, values for variation in rEI, BMR, and PAL as suggested by Black (2000b) were applied to classify misreporting. PAL was assumed to be 1.55. In secondary analyses, we classified UR using a different assumption for variability on the FFQ; in particular, we used the coefficient of variation from the OPEN study to estimate within-person variation for one day of measurement (Supplementary Material).
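As an illustration of how the Goldberg cutpoints arise, the following Python sketch computes lower and upper rEI:BMR limits. The formula and the coefficients of variation (23% for within-person rEI, 8.5% for BMR, 15% for PAL) are assumptions drawn from the general Goldberg/Black literature rather than values stated in the text above, and the function and parameter names are hypothetical. Under these assumptions, with PAL = 1.55, the limits come out close to those listed in Table 1 for the FFQ (1.10, 2.19) and for the average of two 24HRs (0.96, 2.49).

```python
import math

def goldberg_cutpoints(pal=1.55, days=math.inf, n=1,
                       cv_rei=23.0, cv_bmr=8.5, cv_pal=15.0, sd_limit=2.0):
    """Return (lower, upper) rEI:BMR limits for classifying individuals.

    All default CVs (in percent) are assumed values, not taken from this paper.
    days: number of days of intake data (infinite for an FFQ under the
          standard assumption, so the rEI term drops out).
    n:    number of individuals evaluated together (1 for individual-level).
    """
    # Pooled coefficient of variation; the rEI term shrinks as days grows.
    s = math.sqrt(cv_rei ** 2 / days + cv_bmr ** 2 + cv_pal ** 2)
    # Half-width of the interval on the natural-log scale.
    half_width = sd_limit * (s / 100.0) / math.sqrt(n)
    return pal * math.exp(-half_width), pal * math.exp(half_width)

ffq_lower, ffq_upper = goldberg_cutpoints(days=math.inf)
r24_lower, r24_upper = goldberg_cutpoints(days=2)
print(f"FFQ:  {ffq_lower:.2f} to {ffq_upper:.2f}")
print(f"24HR: {r24_lower:.2f} to {r24_upper:.2f}")
```

Dividing the lower limits by PAL expresses them as rEI:TEE ratios (about 0.71 for the FFQ and 0.62 for the 24HR), which is the scale used for the Goldberg cutpoints in the figure captions.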

Statistical Analysis

Sensitivity and Specificity Analyses

Because DLW is an objective biomarker of TEE, and, therefore, a marker of energy intake under energy balance, the classification of reporting status using rEI:TEE_DLW was the “gold standard” in our analyses. Due to the small numbers of participants classified as overreporters, this group was excluded from the sensitivity and specificity analyses. Sensitivity was calculated as the proportion of UR_GB among UR_DLW. Specificity was calculated as the proportion of AR_GB among AR_DLW. Positive predictive value (PPV), the probability of being an UR if classified as one by the Goldberg method, and negative predictive value (NPV), the probability of being an AR if classified as one by the Goldberg method, were calculated. We also used the area under the receiver operating characteristic curve (AUC) to quantify the classification accuracy of the Goldberg method. An AUC over 0.9 indicates outstanding discrimination (Hosmer and Lemeshow, 2000).
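The definitions above can be sketched in a few lines of Python. The function name and argument names are hypothetical; the counts in the usage example are the FFQ counts for men from Table 1, with overreporters excluded as described.

```python
def classification_metrics(ur_dlw_ur_gb, ur_dlw_ar_gb, ar_dlw_ur_gb, ar_dlw_ar_gb):
    """Compute accuracy measures for the Goldberg method against DLW.

    Arguments are counts from the 2x2 table of DLW status by Goldberg status,
    with overreporters excluded.
    """
    tp = ur_dlw_ur_gb  # true UR classified as UR by Goldberg
    fn = ur_dlw_ar_gb  # true UR classified as AR by Goldberg
    fp = ar_dlw_ur_gb  # true AR classified as UR by Goldberg
    tn = ar_dlw_ar_gb  # true AR classified as AR by Goldberg
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# FFQ, men (Table 1): 112 UR/UR, 9 UR/AR, 14 AR/UR, 99 AR/AR
ffq_men = classification_metrics(112, 9, 14, 99)
for name, value in ffq_men.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts, the sensitivity and specificity agree with the 92.6% and 87.6% reported for men on the FFQ.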

Cutpoint and TEE Analyses

The differences between the Goldberg method and the DLW method for classifying underreporters are due to two factors: the estimate of TEE (from the Goldberg formula or from DLW) and the cutpoints used (Figures 1 and 2). If TEE_GB=TEE_DLW, the ratio of rEI:TEE is equivalent, and the two methods agree. Even if TEE_GB differs from TEE_DLW, the two methods will provide the same classification if the participants are below (or above) both of the cutpoints for the two methods. We estimated whether differences between the methods were due to differences in the cutpoint (participants were between the cutpoints for the two methods) or due to TEE (TEE would lead to discrepancies even with the same cutpoints). We compared TEE_GB to TEE_DLW using a Wilcoxon signed rank test and by calculating the correlation (Supplementary Material).
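One possible reading of this attribution rule is sketched below in Python. It is an interpretation, not the exact procedure used in the analysis: a disagreement is attributed to the cutpoint if applying the DLW cutpoint to the Goldberg ratio would restore agreement, and to TEE otherwise. The function name and the example values are hypothetical.

```python
def attribute_disagreement(rei, tee_dlw, tee_gb, cut_dlw, cut_gb):
    """Label one participant as 'agree', 'cutpoint', or 'TEE'.

    rei, tee_dlw, tee_gb share the same energy units; cut_dlw and cut_gb are
    lower cutpoints on the rEI:TEE scale.
    """
    ur_dlw = rei / tee_dlw < cut_dlw
    ur_gb = rei / tee_gb < cut_gb
    if ur_dlw == ur_gb:
        return "agree"
    # Would the two methods agree if both used the same (DLW) cutpoint?
    ur_gb_same_cut = rei / tee_gb < cut_dlw
    return "cutpoint" if ur_gb_same_cut == ur_dlw else "TEE"

# Hypothetical participants, using the FFQ cutpoints quoted for men (0.68, 0.71)
print(attribute_disagreement(7000, 10000, 10000, 0.68, 0.71))  # between cutpoints
print(attribute_disagreement(6000, 10000, 8000, 0.68, 0.71))   # TEE estimates differ
```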

Figure 1. Men: Ratio of reported energy intake (rEI) on a food frequency questionnaire to total energy expenditure (TEE), as estimated by doubly labeled water (DLW, illustrated with filled circles) or the Goldberg method (circles, triangles, and squares) by participant, ranked by ratio from DLW value. Only the participants classified as underreporters by either method (FFQ: n = 136) are shown in the figure; for clarity the first 60 men (who showed agreement) are excluded from the plot. Open circles indicate that the Goldberg method classification agrees with DLW classification; triangles indicate that the difference between the two methods is due to differences in the cutpoints; and squares indicate that the differences are due to estimation of TEE. The dashed line represents the cutpoint from the Goldberg method (0.71), and the solid line represents the cutpoint from DLW (0.68).

Figure 2. Men: Ratio of reported energy intake (rEI) on the average of two 24-hour recalls to total energy expenditure (TEE), as estimated by doubly labeled water (DLW, illustrated with filled circles) or the Goldberg method (circles, triangles, and squares) by participant, ranked by ratio from DLW value. Only the participants classified as underreporters by either method (24HR: n = 53) are shown in the figure. Open circles indicate that the Goldberg method classification agrees with DLW classification; triangles indicate that the difference between the two methods is due to differences in the cutpoints; and squares indicate that the differences are due to estimation of TEE. The dashed line represents the cutpoint from the Goldberg method (0.62), and the solid line represents the cutpoint from DLW (0.71).

Analysis of Implications of Using the Goldberg Method Compared to the DLW Method

To study the implications of using TEE_GB to classify UR in studies of characteristics of URs, we modeled the probability of being an UR_GB using variables previously identified as statistically significant predictors of UR_DLW in the OPEN Study (Tooze et al., 2004). The variables include: education (men, 24HR), BMI (men, FFQ; women and men, 24HR), percent of energy from fat (women, FFQ and 24HR), number of eating occasions (men, FFQ and 24HR), variability in number of meals (women, 24HR), whether the participant has ever lost 10 lbs or more (women, FFQ), times dieted (men, 24HR), fear of negative evaluation (women, FFQ), activity level compared to others (men, FFQ), usual activity (women, 24HR), social desirability (women and men, 24HR), and restraint (men, 24HR).

We also did a simulation study to quantify the effects of varying sensitivities and specificities in this type of model. BMI (mean=27.9, SD=5.3) was simulated based on the participants in the OPEN study for 300 datasets with 500 individuals each. True UR status was simulated from BMI with 49% probability of classification as a true UR, and a 35% increase in the odds of being an UR for each 5-unit increase in BMI, based on the observed relationship in the OPEN study for the FFQ (Tooze et al., 2004). Finally, the UR_GB status was simulated for five different combinations of sensitivity and specificity, and the relationship between UR_GB and BMI was modeled in a logistic regression for each dataset. All analyses were performed using SAS software (v 9, Cary, NC).
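A simulation of this kind can be sketched in pure Python as follows. This is not the SAS code used for the analysis; it is a minimal illustration that assumes normally distributed BMI, a 49% probability of true underreporting at the mean BMI, and averaging of the estimated log odds ratios across datasets. All function names are hypothetical.

```python
import math
import random

def fit_logistic_slope(x, y, iterations=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x; returns the slope b1."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p          # score for the intercept
            g1 += (yi - p) * xi   # score for the slope
            h00 += w              # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b1

def simulate(sensitivity, specificity, n_datasets=300, n=500,
             true_or=1.35, seed=1):
    """Return (OR using true status, OR using misclassified status) per 5 BMI units."""
    rng = random.Random(seed)
    beta = math.log(true_or)
    intercept = math.log(0.49 / 0.51)  # assumed: 49% UR at the mean BMI
    slopes_true, slopes_gb = [], []
    for _ in range(n_datasets):
        x, y_true, y_gb = [], [], []
        for _ in range(n):
            xi = (rng.gauss(27.9, 5.3) - 27.9) / 5.0  # centered, 5-unit scale
            p = 1.0 / (1.0 + math.exp(-(intercept + beta * xi)))
            ur = 1 if rng.random() < p else 0
            if ur:
                gb = 1 if rng.random() < sensitivity else 0
            else:
                gb = 0 if rng.random() < specificity else 1
            x.append(xi)
            y_true.append(ur)
            y_gb.append(gb)
        slopes_true.append(fit_logistic_slope(x, y_true))
        slopes_gb.append(fit_logistic_slope(x, y_gb))
    mean = lambda values: sum(values) / len(values)
    return math.exp(mean(slopes_true)), math.exp(mean(slopes_gb))

if __name__ == "__main__":
    or_true, or_gb = simulate(sensitivity=0.45, specificity=0.99)
    print(f"OR with true status: {or_true:.2f}; with misclassified status: {or_gb:.2f}")
```

Under these assumptions, the odds ratio estimated from the misclassified status is expected to be closer to one than the odds ratio estimated from the true status, which is the attenuation the simulation is designed to quantify.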

RESULTS

As reported previously (Subar et al., 2003), using TEE_DLW, 21% of men and 22% of women were categorized as underreporters on the 24HR, and 50% of men and 49% of women were categorized as underreporters on the FFQ. Using TEE_GB and the standard cutpoints recommended by Black (8), 10% of men and 13% of women were categorized as underreporters on the 24HR, and 52% of men and 51% of women were categorized as underreporters on the FFQ. The AUC analysis indicated outstanding discrimination for men and women for both instruments using UR_GB. The AUC were 0.974 and 0.972 for the FFQ, and 0.961 and 0.938 for the 24HR, for women and men, respectively.

For the FFQ, sensitivity of the Goldberg method for identifying UR was 92.6% for men and 92.1% for women; specificity was 87.6% for both men and women (Table 1). The PPV was 88% and NPV was 92% for both men and women. When we assumed that the reported energy intake was based on a single measurement rather than an infinite number of days, as the Goldberg method commonly assumes (Black, 2000b), and used the estimate of within-person variation from the OPEN Study in the formula, sensitivity was lower (71.9% for men, 62.4% for women), and specificity increased (100% for men, 99% for women). Sensitivity for the 24HR was 45.1% for men and 54.3% for women; specificity was 98.9% for men and 98.7% for women. The PPV was 92% for men and women; the NPV was 86% for men and 88% for women.

Table 1.

Sensitivity and Specificity of Goldberg Method for the Food Frequency Questionnaire and 24-hour recall in the Observing Protein and Energy Nutrition Study

Classification by TEE_GB within each classification by TEE_DLW (n = 451)

Instrument	Sex	Lower cutpoint¹	Upper cutpoint¹	UR_DLW: UR_GB (n)	UR_DLW: AR_GB (n)	AR_DLW: UR_GB (n)	AR_DLW: AR_GB (n)	AR_DLW: OR_GB (n)	OR_DLW: AR_GB (n)	OR_DLW: OR_GB (n)	Misclassified (%)	Sensitivity (%)	Specificity (%)
FFQ	M	1.10	2.19	112	9	14	99	4	0	6	11.1	92.6	87.6
FFQ	F	1.10	2.19	93	8	12	85	4	0	4	11.6	92.1	87.6
24HR	M	0.96	2.49	23	28	2	188	0	2	2	13.1	45.1	98.9
24HR	F	0.96	2.49	25	21	2	156	0	0	2	11.1	54.3	98.7


TEE_DLW = total energy expenditure as estimated by doubly labeled water; TEE_GB = total energy expenditure as estimated by Goldberg method; UR=underreporters; AR=acceptable reporters; OR=overreporters; FFQ=food frequency questionnaire; 24HR=24-hour recall

¹ Cutpoint is the ratio of reported energy intake to basal metabolic rate.

For sensitivity for the FFQ, in both men and women, 100% (9/9 for men and 8/8 for women) of misclassification was due to differences in the estimate of TEE (Figure 1). For specificity on the FFQ, for men, 64% (9/14) of misclassification was due to differences in the estimate of TEE, and 36% (5/14) was due to differences in the cutpoints; for women, 25% (3/12) was due to TEE, and 75% (9/12) due to the cutpoints. For sensitivity for the 24HR, in men, 61% (17/28) of misclassification was due to differences in the estimate of TEE, and 39% (11/28) due to differences in the cutpoint (Figure 2); for women 76% (16/21) were misclassified due to differences in the estimate of TEE, and 24% (5/21) were misclassified due to differences in the cutpoint. For specificity on the 24HR, 100% (2/2) of misclassification was due to differences in the estimate of TEE for men and women.

A plot of sensitivity and specificity by rEI:BMR cutpoint may be used to pick the “optimal” choice for a cutpoint to identify UR_GB. For the FFQ, the curves crossed at rEI:BMR of 1.09 for women and 1.07 for men (Figure 3). For the 24HR, the curves crossed at 1.16 for women and 1.19 for men (Figure 4).
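The crossing point of the two curves can also be located numerically. The Python sketch below scans candidate cutpoints and returns the one at which sensitivity and specificity are closest; the function name and the toy data are hypothetical and are not OPEN Study values.

```python
def crossing_cutpoint(ratios, is_ur_true, candidates):
    """Find the rEI:BMR cutpoint where sensitivity and specificity are closest.

    ratios:      rEI:BMR for each participant
    is_ur_true:  True if the participant is an underreporter by the criterion
    candidates:  cutpoints to evaluate; values below a cutpoint are called UR
    Returns (cutpoint, sensitivity, specificity) for the first best candidate.
    """
    n_ur = sum(1 for u in is_ur_true if u)
    n_ar = len(is_ur_true) - n_ur
    best = None
    for c in candidates:
        tp = sum(1 for r, u in zip(ratios, is_ur_true) if u and r < c)
        tn = sum(1 for r, u in zip(ratios, is_ur_true) if not u and r >= c)
        sens, spec = tp / n_ur, tn / n_ar
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, c, sens, spec)
    return best[1:]

# Toy data: four true underreporters followed by four acceptable reporters
toy_ratios = [0.8, 0.9, 1.0, 1.2, 1.1, 1.3, 1.5, 1.7]
toy_status = [True, True, True, True, False, False, False, False]
grid = [round(0.70 + 0.05 * i, 2) for i in range(23)]  # 0.70 to 1.80
cut, sens, spec = crossing_cutpoint(toy_ratios, toy_status, grid)
print(cut, sens, spec)
```

Choosing the crossing point weights sensitivity and specificity equally; an analyst who cares more about one of them would pick a different point on the curves.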

Figure 3. Plot of sensitivity and specificity by rEI:BMR cutpoint for the food frequency questionnaire for women. The vertical line indicates the cutpoint from the standard Goldberg method using the values suggested by Black (2000a) (cutpoint = 1.10).

Figure 4. Plot of sensitivity and specificity by rEI:BMR cutpoint for the 24-hour recall for women. The vertical line indicates the cutpoint from the standard Goldberg method using the values suggested by Black (2000a) (cutpoint = 0.96).

The median expenditure was 11775 kJ and 9558 kJ for TEE_DLW and 11730 kJ and 8986 kJ for TEE_GB for men and women, respectively. The Wilcoxon signed rank test indicated significant within-person differences between the two methods. The correlation of TEE_GB with TEE_DLW was 0.71 for men and 0.68 for women.

To better understand the implications of using the Goldberg method for classifying UR_GB, we compared results of relating underreporting status to personal characteristics using models previously published for UR_DLW. In these models, all of the odds ratio (OR) estimates for women were closer to one than in the model of UR_DLW (results not shown). For men, the associations with BMI and the number of eating occasions were stronger in the UR_GB model; the association with reported activity level compared to others was weaker in the model of UR_GB. The models of UR_GB on the 24HR in women showed a stronger relationship with BMI than the UR_DLW model (results not shown). However, the relationships with fear of negative evaluation and social desirability were not as strong in the UR_GB model as in the UR_DLW model. For men on the 24HR, UR_GB was not as strongly related to BMI, education, or restraint as in the UR_DLW model. The results of the simulation study to investigate the effects of using UR_GB rather than true UR status in studies of predicting underreporters indicated that low sensitivity and/or low specificity can affect both bias and power (Table 2). In particular, the parameter estimates for the predictor variable were attenuated by 19.5-37.5%, and power was reduced from 93% to 48-79%.

Table 2.

Results of simulation to investigate the effect of sensitivity and specificity on estimating the relationship of a variable to underreporting status.¹

Sensitivity (%)	Specificity (%)	True effect²	Estimated effect²	Relative bias (%)	Power using true underreporting status (%)	Power using Goldberg method (%)
45	99	1.35	1.21	37.5	93	48
55	99	1.35	1.22	33.5	93	52
65	88	1.35	1.18	46.2	93	47
93	88	1.35	1.28	19.5	93	79
93	78	1.35	1.24	27.7	93	71


¹ 300 datasets of size 500 were simulated for each sensitivity/specificity combination. The predictor variable was simulated to be BMI with mean 27.9 (SD=5.3).

² Odds ratio for a 5-unit change in BMI.

DISCUSSION

This analysis explored the utility of the Goldberg method for classifying UR on FFQ and 24HR in a large DLW study, under the assumption that the DLW analysis reflects true UR status. Overall, the Goldberg method provided excellent discrimination between UR and AR. Sensitivity for the 24HR was similar to the estimates from other studies that used food records to assess rEI (Livingstone et al., 2003; Black, 2000b), approximately 50% for both genders combined using the standard Goldberg method. However, it can be argued that PPV has greater utility than sensitivity for evaluating the Goldberg method. The sensitivity indicates that only half of the true UR were classified as UR_GB, whereas the high specificity indicates that very few true AR were classified as UR_GB. This resulted in a high PPV (92%); i.e., among those classified as UR_GB, most really are UR. Conversely, too many UR were classified as AR_GB; this leads to a reduced probability that those classified as AR_GB really are AR, i.e., the NPV (87%) is lower than the PPV. However, the NPV is still relatively high because the prevalence of AR is approximately three-quarters of the population. In contrast, sensitivity for the FFQ was higher than for the 24HR, at 92% for the standard Goldberg method. The FFQ had the opposite tendency of the 24HR; true AR were classified as UR_GB. This resulted in a PPV (88%) that was lower than the NPV (92%).
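The dependence of PPV and NPV on prevalence can be made explicit with Bayes' rule. The Python sketch below uses rounded, illustrative inputs (sensitivity 50%, specificity 99%, and UR prevalence 22% for the 24HR; sensitivity 92%, specificity 88%, and UR prevalence 50% for the FFQ); the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and UR prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

ppv_24hr, npv_24hr = predictive_values(0.50, 0.99, 0.22)
ppv_ffq, npv_ffq = predictive_values(0.92, 0.88, 0.50)
print(f"24HR: PPV {ppv_24hr:.2f}, NPV {npv_24hr:.2f}")
print(f"FFQ:  PPV {ppv_ffq:.2f}, NPV {npv_ffq:.2f}")
```

With these rounded inputs the values fall close to the reported PPV and NPV for each instrument, which illustrates why a method with only about 50% sensitivity can still have a high PPV when specificity is high and a high NPV when most participants are AR.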

Two sources of variation that may lead to differences in classification were explored: the relationship between TEE_DLW and TEE_GB, and the cutpoints used, which are determined by the within-person variation in rEI and TEE. The higher sensitivity for the FFQ using the standard Goldberg method compared to the 24HR is primarily due to the differences in the assumptions about the variation in the two dietary assessment methods, and subsequent calculated cutpoints. It is not surprising that a 24HR would have more within-person variation than an FFQ, due to day-to-day variation in intake. However, the Goldberg method makes an important assumption about the FFQ that is not made for the 24HR; it assumes that because an FFQ queries usual intake, the number of days it assesses is infinite, thereby eliminating the FFQ term for variability from the equation and tightening the cutpoints for the FFQ (Supplementary Material). When the actual coefficient of variation for the FFQ was used in the formula, the sensitivity dropped dramatically, to 67% overall. (Sensitivity still remained higher than the 24HR due to less within-person variation on the FFQ compared to the 24HR.)

Differences in the estimates of TEE accounted for much of the discrepancy in classification of UR_GB and UR_DLW. We attempted to improve estimation of TEE_GB using PAL estimated from a physical activity questionnaire in the OPEN study. Although this approach often led to estimates of TEE that were closer to TEE_DLW than TEE_GB, the correlation between TEE_GB and TEE_DLW was lower after adjusting for PAL. Due to large differences that have been reported between self-report physical activity and accelerometry (Troiano et al., 2008) and concerns about expenditure-related bias, it is not clear that self-reported estimates of PAL provide better estimates of TEE_GB. However, the use of PAL estimates may be promising if objective measures of physical activity are available.

The optimal cutpoints for maximizing both sensitivity and specificity for the FFQ were 1.09 for women and 1.07 for men, similar to the Goldberg method cutpoint of 1.10. For the 24HR, the optimal cutpoints were 1.16 for women and 1.19 for men, which vary from the Goldberg cutpoint (0.96). Higher cutpoints may be needed to classify underreporters when a 24HR is used, depending on the analyst’s desire to maximize sensitivity, specificity, or both.

This study has limitations that warrant mention. Although DLW provides an unbiased estimate of TEE, the technique still has estimation error. Therefore, the classification of misreporters using DLW is not truly a “gold standard.” However, the within-person variation in this study for DLW was small, so the effect of measurement error in DLW is expected to be minimal. It is also important to note that the participants in the OPEN study were predominantly white and well-educated middle-aged adults. Their levels of UR and the association of UR with personal characteristics may differ from those in minority populations, those with lower levels of education, and older or younger persons. However, although the personal characteristics identified may vary in other populations, the loss of power and attenuation of effect size demonstrated in this study would be expected to occur in studies of these populations.

Another important consideration in interpreting the results of studies of underreporting is the recognition that what is termed “underreporting” comprises different sources of error. By definition, underreporting represents systematic error, as opposed to day-to-day variation and other random sources of error. However, systematic error may be additive systematic error, intake-related systematic error, or a combination. In a previous analysis of this data set, Kipnis et al. (2003) identified significant intake-related bias in the FFQ and 24HR, which comprises the underreporting error described in this manuscript. This type of error leads to bias in estimating diet-disease relationships.

Analysis of this large DLW study has demonstrated that using the Goldberg method with recommended cutpoints may misclassify reporting status for some individuals. When evaluating these classification measures, the question that is of most interest for a particular analysis should be considered. Analysts may want to consider cutpoints other than those commonly used for classifying UR status, depending on whether interest is in maximizing correct classification of UR or of AR. Compared to doubly labeled water, use of the Goldberg method may lead to loss of power and biased estimates of the association of UR with personal characteristics in predictive models of underreporting status, and any analysis of UR_GB should be interpreted in light of this. Although the sensitivity for the 24HR was low, the PPV was still high, indicating that, among those classified as UR_GB, most were true UR, and NPV was also high; these measures were also high for the FFQ. Thus, in the absence of objective measures of TEE or physical activity, the Goldberg method appears to be a reasonable technique to classify UR.

Supplementary Material

NIHMS334580-supplement-1.docx (25.5 KB, docx)

Acknowledgments

We thank Kristen Beavers, Sharon Kirkpatrick, and Anne Rodgers for helpful suggestions on the manuscript. We would also like to acknowledge the contribution of Arthur Schatzkin to the conception and conduct of the OPEN study, and his contributions to this manuscript.

This work was supported by a contract from the National Cancer Institute (263-MQ-612378).

Footnotes

Supplementary information is available.

CONFLICT OF INTEREST The authors declare no conflict of interest.

References

1. Black AE, Goldberg GR, Jebb SA, Livingstone MB, Cole TJ, Prentice AM. Critical evaluation of energy intake data using fundamental principles of energy physiology: 2. Evaluating the results of published surveys. Eur J Clin Nutr. 1991;45:583–599.
2. Black AE. Critical evaluation of energy intake using the Goldberg cut-off for energy intake:basal metabolic rate. A practical guide to its calculation, use and limitations. Int J Obes Relat Metab Disord. 2000a;24:1119–1130. doi: 10.1038/sj.ijo.0801376.
3. Black AE. The sensitivity and specificity of the Goldberg cut-off for EI:BMR for identifying diet reports of poor validity. Eur J Clin Nutr. 2000b;54:395–404. doi: 10.1038/sj.ejcn.1600971.
4. Conway JM, Ingwersen LA, Vinyard BT, Moshfegh AJ. Effectiveness of the US Department of Agriculture 5-step multiple-pass method in assessing food intake in obese and nonobese women. Am J Clin Nutr. 2003;77:1171–1178. doi: 10.1093/ajcn/77.5.1171.
5. Conway JM, Ingwersen LA, Moshfegh AJ. Accuracy of dietary recall using the USDA five-step multiple-pass method in men: an observational validation study. J Am Diet Assoc. 2004;104:595–603. doi: 10.1016/j.jada.2004.01.007.
6. Crowne DP, Marlowe D. A new scale of social desirability independent of psychopathology. J Consult Psychol. 1960;24:349–354. doi: 10.1037/h0047358.
7. Drummond SE, Crombie NE, Cursiter MC, Kirk TR. Evidence that eating frequency is inversely related to body weight status in male, but not female, non-obese adults reporting valid dietary intakes. Int J Obes Relat Metab Disord. 1998;22:105–112. doi: 10.1038/sj.ijo.0800552.
8. Fischer DG, Fick C. Measuring Social Desirability - Short Forms of the Marlowe-Crowne Social Desirability Scale. Educ Psychol Meas. 1993;53:417–424.
9. Food and Nutrition Board, Institute of Medicine. Dietary reference intakes for energy, carbohydrate, fiber, fat, fatty acids, cholesterol, protein, and amino acids. Vol. 225. National Academies Press; Washington, DC, USA: Sep 5, 2002. 7-29-2005.
10. Goldberg GR, Black AE, Jebb SA, Cole TJ, Murgatroyd PR, Coward WA, et al. Critical evaluation of energy intake data using fundamental principles of energy physiology: 1. Derivation of cut-off limits to identify under-recording. Eur J Clin Nutr. 1991;45:569–581.
11. Hill RJ, Davies PS. The validity of self-reported energy intake as determined using the doubly labelled water technique. Br J Nutr. 2001;85:415–430. doi: 10.1079/bjn2000281.
12. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. John Wiley & Sons Inc.; New York, NY, USA: 2000.
13. Huang TTK, Roberts SB, Howarth NC, McCrory MA. Effect of screening out implausible energy intake reports on relationships between diet and BMI. Obesity Res. 2005;13:1205–1217. doi: 10.1038/oby.2005.143.
14. Kipnis V, Subar AF, Midthune D, Freedman LS, Ballard-Barbash R, Troiano R, et al. The structure of dietary measurement error: results of the OPEN biomarker study. Am J Epidemiol. 2003;158:14–21. doi: 10.1093/aje/kwg091.
15. Leary MR. A Brief Version of the Fear of Negative Evaluation Scale. Pers Soc Psychol Bull. 1983;9:371–375.
16. Livingstone MB, Black AE. Markers of the validity of reported energy intake. J Nutr. 2003;133(Suppl 3):895S–920S. doi: 10.1093/jn/133.3.895S.
17. Livingstone MB, Robson PJ, Black AE, Coward WA, Wallace JM, McKinley MC, et al. An evaluation of the sensitivity and specificity of energy expenditure measured by heart rate and the Goldberg cut-off for energy intake: basal metabolic rate for identifying mis-reporting of energy intake by adults and children: a retrospective analysis. Eur J Clin Nutr. 2003;57:455–463. doi: 10.1038/sj.ejcn.1601563.
18. Macdiarmid J, Blundell J. Assessing dietary intake: who, what and why of underreporting. Nutr Res Rev. 1998;11:231–253. doi: 10.1079/NRR19980017.
19. Moshfegh AJ, Rhodes DG, Baer DJ, Murayi T, Clemens JC, Rumpler WV, et al. The US Department of Agriculture Automated Multiple-Pass Method reduces bias in the collection of energy intakes. Am J Clin Nutr. 2008;88:324–332. doi: 10.1093/ajcn/88.2.324.
20. Racette SB, Schoeller DA, Luke AH, Shay K, Hnilicka J, Kushner RF. Relative dilution spaces of 2H- and 18O-labeled water in humans. Am J Physiol. 1994;267:E585–E590. doi: 10.1152/ajpendo.1994.267.4.E585.
21. Schofield WN. Predicting basal metabolic rate, new standards and review of previous work. Hum Nutr Clin Nutr. 1985;39(Suppl 1):5–41.
22. Strahan R, Gerbasi K. Short, homogeneous versions of the Marlowe-Crowne social desirability scale. J Clin Psychol. 1972;28:191–193.
23. Stunkard AJ, Sorensen T, Schulsinger F. Use of the Danish Adoption Register for the Study of Obesity and Thinness. Res Publ Assoc Res Nerv Ment Dis. 1982;60:115–120.
24. Stunkard AJ, Messick S. The three-factor eating questionnaire to measure dietary restraint, disinhibition and hunger. J Psychosom Res. 1985;29:71–83. doi: 10.1016/0022-3999(85)90010-8.
25. Schoeller DA. Isotope Dilution Methods. In: Björntorp P, Brodoff BN, editors. Obesity. J.B. Lippincott Co; New York, NY, USA: 1992. pp. 80–88.
26. Subar AF, Kipnis V, Troiano RP, Midthune D, Schoeller DA, Bingham S, et al. Using intake biomarkers to evaluate the extent of dietary misreporting in a large sample of adults: the OPEN study. Am J Epidemiol. 2003;158:1–13. doi: 10.1093/aje/kwg092.
27. Subar AF, Thompson FE, Kipnis V, Midthune D, Hurwitz P, McNutt S, et al. Comparative validation of the Block, Willett, and National Cancer Institute food frequency questionnaires: the Eating at America’s Table Study. Am J Epidemiol. 2001;154:1089–1099. doi: 10.1093/aje/154.12.1089.
28.Trabulsi J, Troiano RP, Subar AF, Sharbaugh C, Kipnis V, Schatzkin A, et al. Precision of the doubly labeled water method in a large-scale application: evaluation of a streamlined-dosing protocol in the Observing Protein and Energy Nutrition (OPEN) study. Eur J Clin Nutr. 2003;57:1370–1377. doi: 10.1038/sj.ejcn.1601698. [DOI] [PubMed] [Google Scholar]
29.Tooze JA, Subar AF, Thompson FE, Troiano R, Schatzkin A, Kipnis V. Psychosocial predictors of energy underreporting in a large doubly labeled water study. Am J Clin Nutr. 2004;79:795–804. doi: 10.1093/ajcn/79.5.795. [DOI] [PubMed] [Google Scholar]
30.Troiano RP, Berrigan D, Dodd KW, Mâsse LC, Tilert T, McDowell M. Physical activity in the United States measured by accelerometer. Med Sci Sports Exerc. 2008;40:181–188. doi: 10.1249/mss.0b013e31815a51b3. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

NIHMS334580-supplement-1.docx (25.5 KB, docx)
