American Journal of Epidemiology. 2015 Feb 5;181(4):225–233. doi: 10.1093/aje/kwu308

Comparison of Methods to Account for Implausible Reporting of Energy Intake in Epidemiologic Studies

Jinnie J Rhee *, Laura Sampson, Eunyoung Cho, Michael D Hughes, Frank B Hu, Walter C Willett
PMCID: PMC4325679  PMID: 25656533

Abstract

In a recent article in the American Journal of Epidemiology by Mendez et al. (Am J Epidemiol. 2011;173(4):448–458), the use of alternative approaches to the exclusion of implausible energy intakes led to significantly different cross-sectional associations between diet and body mass index (BMI), whereas the use of simpler recommended criteria (<500 and >3,500 kcal/day) yielded no meaningful change. However, these findings might have been due to exclusions made based on weight, a primary determinant of BMI. Using data from 52,110 women in the Nurses' Health Study (1990), we reproduced the cross-sectional findings of Mendez et al. and compared the results from the recommended method with those from 2 weight-dependent alternative methods (the Goldberg method and predicted total energy expenditure method). The same 3 exclusion criteria were then used to examine dietary variables prospectively in relation to change in BMI, which is not a direct function of attained weight. We found similar associations using the 3 methods. In a separate cross-sectional analysis using biomarkers of dietary factors, we found similar correlations for intakes of fatty acids (n = 439) and carotenoids and retinol (n = 1,293) using the 3 methods for exclusions. These results do not support the general conclusion that use of exclusion criteria based on the alternative methods might confer an advantage over the recommended exclusion method.

Keywords: biomarkers, body mass index, diet, energy intake, implausible reporting, selection bias


Editor's note: An invited commentary on this article appears on page 234, and the authors' response appears on page 237.

Implausible reporting, particularly underreporting, is a widely recognized limitation of dietary assessment methods regardless of their type, and it is often influenced by age, sex, and other individual characteristics, including body mass index (BMI) (1–5). Obese persons tend to underestimate their total energy intakes and underreport intakes of foods that are deemed unhealthy or socially undesirable, such as foods that are high in fat and refined carbohydrates (1, 6). As such, misreporting can have an important impact on studies that aim to investigate associations between diet and obesity or disease outcomes.

Persons who report implausible energy intakes (hereafter referred to as implausible reporters) can be identified by comparing their reported energy intakes (REIs) with energy intake estimates derived using objective methods of measurement, such as use of doubly labeled water; however, such methods are not feasible or practical for large population-based studies (7). In their place, indirect methods for identifying participants who under- or overreport their dietary intakes have been proposed (1, 2). The Goldberg method (8) uses predicted basal metabolic rates (BMR) and the ratio of reported energy intake to BMR to estimate the amount of energy available for activity. The REI:BMR ratio is then compared with physical activity level (PAL) (see Web Appendix 1, available at http://aje.oxfordjournals.org/, for details). If the ratio differs from PAL by more than the specified standard deviation cutoff limits in that PAL category, the REI is determined to be implausible.
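The ratio comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure given in Web Appendix 1: the function name `classify_by_ratio` and the multipliers `low_factor` and `high_factor`, which stand in for the standard deviation cutoff limits around PAL, are hypothetical placeholders.

```python
def classify_by_ratio(rei_kcal, bmr_kcal, pal, low_factor=0.7, high_factor=1.4):
    """Classify a reported energy intake (REI) from its REI:BMR ratio.

    low_factor and high_factor are hypothetical stand-ins for the
    standard deviation cutoff limits around the physical activity
    level (PAL); the real limits are defined in Web Appendix 1.
    """
    if bmr_kcal <= 0:
        raise ValueError("BMR must be positive")
    ratio = rei_kcal / bmr_kcal
    if ratio < pal * low_factor:
        return "under"      # reported intake implausibly low for this PAL
    if ratio > pal * high_factor:
        return "over"       # reported intake implausibly high for this PAL
    return "plausible"


# Illustrative values only: BMR of 1,400 kcal/day and PAL of 1.55.
for rei in (1200, 2100, 3500):
    print(rei, classify_by_ratio(rei, 1400, 1.55))
```

Because BMR is predicted from body weight, the classification of a given reported intake shifts with the participant's weight, which is the feature of this approach examined later in the article.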

Another alternative method, known as the predicted total energy expenditure (pTEE) method, relies on prediction equations for energy expenditure derived from doubly labeled water studies (9). Similar to the Goldberg method, this method uses REI:pTEE ratios and standard deviation cutoffs to identify implausible reporters (1, 10). Most epidemiologic studies have excluded participants with implausible energy intakes using cutoffs for plausible energy intakes, allowing for some inevitable under- and overreporting (recommended method). This recommended method is simpler and more straightforward in that it does not require any extra mathematical calculations. In the Nurses' Health Study (NHS), these cutoffs are defined as less than 500 kcal/day and greater than 3,500 kcal/day. Mendez et al. (1) found that excluding participants with extreme energy intakes based on recommended cutoffs yielded regression coefficients that were similar to those from models without exclusions, whereas using the Goldberg and pTEE methods to identify and exclude under- and overreporters yielded substantially different associations. The authors concluded that alternative methods yielded more valid diet-BMI associations than did the recommended method (1), and they suggested that findings from previous nutritional epidemiologic studies based on the recommended method might not be valid. However, both the Goldberg and pTEE methods depend on body weight–dependent equations to estimate energy requirements, and the outcome variable in their analyses was BMI, which is also largely a function of body weight. As such, because both the exclusion criteria for the primary exposure and the outcome were indirectly based on body weight, this could have produced spurious associations between the dietary factors under investigation and BMI that resulted from selection bias.
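By contrast, the recommended method needs only the two fixed cutoffs quoted above. A minimal sketch follows; the function name `classify_recommended` is hypothetical, and the cutoffs are the NHS values of 500 and 3,500 kcal/day, with intakes exactly at a cutoff treated as plausible.

```python
LOWER_KCAL = 500    # intakes below this are treated as implausibly low
UPPER_KCAL = 3500   # intakes above this are treated as implausibly high


def classify_recommended(rei_kcal):
    """Classify a reported energy intake using fixed kcal/day cutoffs."""
    if rei_kcal < LOWER_KCAL:
        return "under"
    if rei_kcal > UPPER_KCAL:
        return "over"
    return "plausible"


print([classify_recommended(x) for x in (450, 1800, 3600)])
```

Unlike the ratio-based methods, nothing in this rule depends on body weight.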

The main aim of the present study was to examine the potential effects of using different exclusion criteria for under- and overreporting, as defined by Mendez et al. (1), on associations between dietary exposures and outcomes that were not primarily functions of body weight. First, we performed a cross-sectional analysis to determine if we could replicate the findings of Mendez et al. This was important to exclude the possibility that subsequent findings were simply due to underlying differences in the data structure. We then performed a 4-year prospective analysis of changes in intakes of fat, vegetables, fruits, and sweets and desserts and change in BMI, which is not strongly correlated with attained BMI, among plausible reporters identified by the recommended, Goldberg, and pTEE methods. We also investigated whether using the alternative methods to exclude implausible reporters would strengthen the correlations of energy-adjusted intakes of specific fatty acids, carotenoids, and retinol with their corresponding biomarkers by reducing measurement error.

METHODS

Study population

The NHS is a prospective cohort study established in 1976, and it has been used to examine associations of diet and lifestyle factors with incidence of chronic diseases (11). It consists of 121,700 registered female nurses who were 30–55 years of age at the time of enrollment. In response to the first questionnaire, participants provided information on their medical histories and other lifestyle and health-related risk factors for cancer and cardiovascular disease (12). Subsequently, questionnaires have been administered every 2 years to update this information and identify new health outcomes. The study was approved by the institutional review boards of the Brigham and Women's Hospital and the Harvard School of Public Health.

Assessment of diet, physical activity, and BMI

Diet was first assessed in 1980 using a semiquantitative food frequency questionnaire, and dietary information has been updated approximately every 4 years thereafter (6). For each food item, a standard unit or portion size was specified. There were 9 possible responses that ranged from “never” to “6 or more times per day.” After converting the response to each food item question to average daily intake for each participant, nutrient intakes were calculated by multiplying the frequency of consumption of each food by the nutrient composition in the standard portion size of that food and then summing up the nutrient intakes from all relevant food items. The reproducibility and validity of these food frequency questionnaires have been evaluated in detail elsewhere (13–15). For example, the correlations of food frequency questionnaires with multiple dietary records ranged from 0.45 to 0.68 for total and specific types of fat (16), from 0.40 to 0.89 for various fruits and vegetables (14), and from 0.41 to 0.79 for sweets and desserts (14). For the present analysis, we used the 1990–1994 follow-up interval. Of the 80,332 women for whom dietary data at baseline in 1990 were available, we excluded 1,839 with missing data or more than 70 food items with no information.
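The nutrient calculation described above is a frequency-weighted sum. The sketch below shows the arithmetic only; the food names, daily frequencies, and fat contents are invented for illustration and are not NHS food composition data, and `daily_nutrient_intake` is a hypothetical helper name.

```python
def daily_nutrient_intake(servings_per_day, nutrient_per_serving):
    """Sum frequency x nutrient content over all reported foods.

    servings_per_day: food -> average daily frequency of consumption
    nutrient_per_serving: food -> nutrient amount in one standard portion
    """
    return sum(
        frequency * nutrient_per_serving[food]
        for food, frequency in servings_per_day.items()
    )


# Hypothetical participant and hypothetical fat contents (g per portion).
servings = {"whole milk": 1.0, "butter": 0.5}
fat_grams = {"whole milk": 8.0, "butter": 4.0}
print(daily_nutrient_intake(servings, fat_grams))  # 1.0*8.0 + 0.5*4.0 = 10.0
```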

Physical activity was assessed using a questionnaire about specific activities. Participants were asked the amount of time spent walking and hiking; jogging; running; bicycling; lap swimming; playing tennis, squash, or racquetball; doing calisthenics or aerobic dance or using exercise machines; performing other vigorous activities, such as lawn mowing; and performing low-intensity exercise, such as yoga and stretching (17). From this information, each activity was assigned a metabolic equivalent task (MET) value, for which 1 MET was approximately equivalent to the energy expended while sitting quietly, and the weekly energy expenditures in MET hours were subsequently computed for each activity by multiplying the MET value by the time spent performing it (18). Walking was assigned a MET value that corresponded to the reported walking pace. The activity questionnaire has been validated previously (19).
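The MET-hour computation can be illustrated as follows. The MET values and weekly hours in the example are assumptions chosen for easy arithmetic, not the values assigned in the NHS, and `weekly_met_hours` is a hypothetical name.

```python
def weekly_met_hours(activities):
    """Total weekly energy expenditure in MET-hours.

    activities: iterable of (met_value, hours_per_week) pairs, one per
    reported activity.
    """
    return sum(met_value * hours for met_value, hours in activities)


# Illustrative inputs: walking at 3.0 METs for 2.5 h/week and
# jogging at 7.0 METs for 1.0 h/week.
print(weekly_met_hours([(3.0, 2.5), (7.0, 1.0)]))  # 7.5 + 7.0 = 14.5
```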

Body weight was self-reported through the biennial questionnaire, with high validity. The correlation between self-reported weights and measured weights was 0.96, with a mean difference of 1.5 kg (20). Participants were also asked to report their height, and BMI was calculated as weight in kilograms divided by the square of height in meters. The study protocol was approved by the institutional review boards of Brigham and Women's Hospital and Harvard School of Public Health.
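For reference, the BMI formula stated above translates directly into code. The weight and height in the example are illustrative, not study data.

```python
def body_mass_index(weight_kg, height_m):
    """BMI = weight in kilograms divided by the square of height in meters."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Illustrative participant: 68 kg and 1.65 m.
print(round(body_mass_index(68.0, 1.65), 1))
```

Weight enters this formula directly, so any exclusion rule that is itself a function of weight is not independent of BMI.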

Prospective analysis

We prospectively examined the influence of under- and overreporting on the association between changes in intakes of fat, vegetables, fruits, and sweets and desserts and change in BMI over a 4-year period from 1990 to 1994. Changes in nutrient and dietary intakes were computed by taking the difference between the measurements from 1990 and 1994. For the present study, we used the NHS physical activity data from the 1988 and 1992 questionnaires to calculate PAL and carried forward values from the 1988 questionnaire to replace values missing from the 1992 questionnaire. If data were missing from the 1988 questionnaire, we only used the physical activity data from 1992. We excluded participants with missing physical activity data in both 1988 and 1992 (n = 516); women with missing BMI at baseline in 1990 (n = 106); and participants with obesity, prior diagnosis of diabetes, cancer, or cardiovascular, pulmonary, renal, or liver disease at baseline and those who were over 65 years of age because of possible confounding by age-related loss of lean muscle mass (n = 22,870). We also excluded women who were diagnosed with these medical conditions before 1998 to account for possible effects of preclinical disease on weight, which reduced the original sample to 52,110 women.
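The missing-data rule for physical activity and the change scores can be sketched as below. Two points are assumptions rather than statements from the text: the direction of the difference (1994 minus 1990, so that positive values mean increased intake) and the helper names `fill_1992_activity` and `change_score`. How the 1988 and 1992 values were combined into PAL when both were available is not described here and is left out of the sketch.

```python
def fill_1992_activity(value_1988, value_1992):
    """Return the 1992 activity value, carrying 1988 forward if 1992 is missing.

    None marks a missing value. A result of None means both waves are
    missing, in which case the participant would be excluded.
    """
    if value_1992 is not None:
        return value_1992
    return value_1988


def change_score(value_1990, value_1994):
    """Change over follow-up, assumed here to be 1994 minus 1990."""
    return value_1994 - value_1990


print(fill_1992_activity(10.0, None))   # 1988 value carried forward
print(fill_1992_activity(None, None))   # None: participant excluded
print(change_score(3.0, 4.5))           # 1.5 more servings/day in 1994
```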

Biomarker analysis

We performed a cross-sectional analysis with biomarker data collected in 1990. All women in the analysis were NHS participants who were included in nested case-control studies of the association of fatty acids (measured in erythrocytes and plasma) with coronary heart disease or of carotenoids (measured in plasma) with breast cancer. Because these are biomarkers of intake, stronger correlations with intake should presumably indicate greater validity. Both studies used blood that was drawn between 1989 and 1990 and stored in liquid nitrogen; the details of the studies have been published previously (21, 22). All study participants were free of cancer and cardiovascular disease at the time their blood was drawn. The study of fatty acids and coronary heart disease consisted of 327 controls and 166 cases in whom nonfatal myocardial infarction or coronary heart disease death were newly diagnosed between the time of blood draw and June 1996 (21). Controls were selected from the nondiseased participants and matched for age, smoking status, and fasting status at blood draw. The study of carotenoids and breast cancer included women who returned a blood sample and had incident invasive or in situ breast cancer that was diagnosed by June 1, 1998 (22). Women who had no prior cancer diagnosis except for nonmelanoma skin cancer were randomly selected as controls and were matched to cases on birth year, menopausal status, postmenopausal hormone use, and time of day, month, and fasting status at the time of blood draw, leaving 969 matched pairs with data on plasma carotenoids and retinol available for analysis. Both cases and controls were considered for the final analysis because controls were free of disease at the time of blood collection.

We used dietary data from 1990 when examining the associations between dietary fatty acids, carotenoids, and retinol and their corresponding biomarkers. We excluded women with missing dietary data and limited the carotenoid analysis to women who were not current smokers (n = 1,540) because an earlier study showed that the correlation between dietary and plasma carotene levels was lower in smokers compared with nonsmokers despite only a slight difference in dietary intake of carotenoids (23). Physical activity data were assessed from the 1988 and 1992 questionnaires. After exclusions, 439 participants were included in the final analysis of fatty acids and 1,293 in the analyses of carotenoids and retinol.

Statistical analysis

In addition to the recommended method, we used 2 other alternative methods, the Goldberg and pTEE methods, to classify under- and overreporters. Detailed descriptions are provided in Web Appendix 1.

Cross-sectional analysis

To replicate the analysis of Mendez et al. (1), we conducted a cross-sectional analysis using baseline data from 1990. We examined the potential effect of under- and overreporting on the associations of intakes of total fat, vegetables, fruits, and sweets and desserts with BMI using the recommended, Goldberg, and pTEE methods. Using a multivariate linear regression model, we adjusted for age, smoking, alcohol intake, physical activity, and other dietary factors to estimate β coefficients and their 95% confidence intervals.

Prospective analysis

We used multivariate linear regression models to examine the relationship between change in diet and change in BMI over a 4-year period from 1990 to 1994, taking into account changes in confounding variables during the same period. To minimize missing data for covariates, we used values carried forward from previous study waves to account for missing continuous variables and used missing indicator variables for categorical variables. We adjusted for age and changes in physical activity, smoking status, alcohol intake, and depending on the model, consumption of dietary variables other than the main exposure.

Biomarker analysis

When assessing the impact of adjustment for misreporting of dietary intake on the relationships of fatty acid and carotenoid intakes with their corresponding biomarkers, we excluded implausible reporters, as classified by the recommended, Goldberg, and pTEE methods. We log-transformed dietary and biomarker data to improve normality and used the residual method to adjust dietary fatty acids, carotenoids, and retinol intakes for total energy intake by regressing nutrient intakes on total energy intake derived from self-reported food frequency questionnaires. We computed correlation coefficients between energy-adjusted fatty acid intakes and corresponding plasma and red blood cell fatty acids, and between energy-adjusted intakes of carotenoids and retinol and their plasma levels. Plasma carotenoids and retinol intakes were adjusted for serum cholesterol because they were positively associated with total cholesterol level (P < 0.05) (data not shown).
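A minimal sketch of the residual method described above, written in plain Python rather than SAS, is given here. It log-transforms the inputs, regresses log nutrient intake on log energy intake, and correlates the residuals with the log biomarker. All function names and data values are hypothetical, and the sketch omits the serum cholesterol adjustment applied to the carotenoid and retinol analyses.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def ols_residuals(y, x):
    """Residuals from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def energy_adjusted_correlation(nutrient, energy, biomarker):
    """Correlate energy-adjusted (residual) log intake with log biomarker."""
    log_nutrient = [math.log(v) for v in nutrient]
    log_energy = [math.log(v) for v in energy]
    log_biomarker = [math.log(v) for v in biomarker]
    adjusted = ols_residuals(log_nutrient, log_energy)
    return pearson(adjusted, log_biomarker)


# Hypothetical data for six participants.
nutrient = [10, 14, 9, 20, 16, 12]
energy = [1500, 2000, 1400, 2600, 2100, 1800]
biomarker = [1.1, 1.3, 1.0, 1.6, 1.5, 1.2]
print(energy_adjusted_correlation(nutrient, energy, biomarker))
```

The residuals are uncorrelated with total energy intake by construction, so the resulting correlation reflects variation in nutrient intake that is independent of how much energy was reported.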

All P values reported are 2-sided. SAS statistical software, version 9.2 (SAS Institute, Inc., Cary, North Carolina) was used for all statistical analyses.

RESULTS

Cross-sectional and prospective analysis

Baseline characteristics of underreporters, plausible reporters, and overreporters, as classified by the recommended and 2 alternative methods, are shown in Table 1. Based on the recommended method, 99.0% of all women in the study were identified as plausible reporters, whereas 68.6% and 66.2% of them were classified as plausible reporters by the Goldberg and pTEE methods, respectively. Across all 3 methods, underreporters had higher mean BMI than did plausible and overreporters. Overreporters classified by all methods had significantly higher mean intakes of fat, vegetables, fruits, and sweets and desserts than did underreporters and plausible reporters. Although statistically significant, the differences in intake of fat as a percentage of energy were quantitatively small across reporting groups for all 3 methods. The mean REI, BMR, PAL, REI:BMR ratio, pTEE, and REI:pTEE ratio for underreporters, plausible reporters, and overreporters, as classified by different exclusion methods, are shown in Web Table 1.

Table 1.

Means and Standard Deviations for Baseline Characteristics of Underreporters, Plausible Reporters, and Overreporters as Classified Using the Recommended and Alternative Methods, Nurses' Health Study, United States, 1990^a

Each method was applied to the same sample: Recommended (n = 52,110), Goldberg (n = 52,110), pTEE (n = 52,110).

| Variable | Recommended: Under (n = 78) | Recommended: Plausible (n = 51,563) | Recommended: Over (n = 469) | Goldberg: Under (n = 11,716) | Goldberg: Plausible (n = 35,754) | Goldberg: Over (n = 4,640) | pTEE: Under (n = 10,580) | pTEE: Plausible (n = 34,506) | pTEE: Over (n = 7,024) |
|---|---|---|---|---|---|---|---|---|---|
| Sample^b | 0.1 | 99.0 | 0.9 | 22.4 | 68.6 | 8.9 | 20.3 | 66.2 | 13.5 |
| Age, years^c | 56.0 (6.2) | 54.8 (6.3) | 54.5 (6.3) | 54.9 (6.1) | 54.7 (6.3) | 55.4 (6.6)^d | 54.7 (6.2) | 54.8 (6.3) | 55.1 (6.3)^d |
| Body mass index^e | 25.5 (4.1) | 24.9 (3.7) | 24.2 (4.1)^d | 25.5 (3.8) | 24.8 (3.7) | 24.0 (3.7)^d | 25.5 (3.8) | 24.8 (3.7) | 24.2 (3.7)^d |
| Physical activity level, MET-hours/week | 13.3 (20.7) | 15.9 (21.9) | 19.8 (27.2)^d | 20.5 (29.9) | 14.9 (19.3) | 12.5 (14.0)^d | 16.1 (23.2) | 15.9 (21.2) | 16.2 (23.2)^d |
| Alcohol, drinks/day | 0.09 (0.20) | 0.43 (0.77) | 0.59 (1.11)^d | 0.32 (0.59) | 0.45 (0.78) | 0.55 (1.00)^d | 0.30 (0.56) | 0.45 (0.77) | 0.55 (0.98)^d |
| Smoking^b | 16.7 | 16.9 | 19.2 | 17.7 | 16.4 | 18.5^d | 19.3 | 16.0 | 17.5^d |
| Daily dietary intake | | | | | | | | | |
| Energy, MJ/day | 1.79 (0.29) | 7.30 (2.14) | 17.8 (5.9)^d | 4.93 (1.11) | 7.61 (1.53) | 11.9 (3.0)^d | 4.63 (0.85) | 7.40 (1.26) | 11.5 (2.6)^d |
| Fat, g/day | 14.6 (4.9) | 61.1 (22.3) | 156 (59)^d | 40.2 (11.8) | 63.7 (17.8) | 104 (33)^d | 38.4 (10.5) | 61.8 (16.1) | 98.1 (30.4)^d |
| Fat, % energy | 30.6 (9.0) | 31.4 (5.9) | 33.1 (7.2)^d | 30.9 (6.2) | 31.5 (5.8) | 32.6 (6.1)^d | 31.2 (6.3) | 31.4 (5.8) | 32.2 (6.2)^d |
| Vegetables, servings/day | 1.10 (1.05) | 3.71 (2.04) | 7.98 (6.86)^d | 2.91 (1.64) | 3.81 (1.97) | 5.31 (3.50)^d | 2.69 (1.50) | 3.74 (1.87) | 5.30 (3.23)^d |
| Fruits, servings/day | 0.50 (0.50) | 1.53 (1.11) | 3.22 (3.35)^d | 1.17 (0.85) | 1.59 (1.09) | 2.19 (1.84)^d | 1.06 (0.77) | 1.56 (1.03) | 2.21 (1.75)^d |
| Sweets and desserts, servings/day | 0.22 (0.24) | 1.26 (1.23) | 4.35 (3.83)^d | 0.68 (0.65) | 1.31 (1.16) | 2.63 (2.23)^d | 0.65 (0.63) | 1.25 (1.10) | 2.42 (2.05)^d |

Abbreviations: MET, metabolic equivalent; pTEE, predicted total energy expenditure.

a Values are standardized to the age distribution of the analytic population.

b Values are percents.

c Value is not adjusted for age.

d P < 0.05 for differences by reporting group using nonparametric analysis of variance or χ² test.

e Weight (kg)/height (m)².
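Table 1 reports energy intake on the megajoule scale, whereas the recommended cutoffs are stated in kilocalories. The short conversion below, which uses the standard factor of 4.184 kJ per kcal, makes the two comparable; the helper name `mj_to_kcal` is hypothetical.

```python
KCAL_PER_MJ = 1000 / 4.184  # 1 kcal = 4.184 kJ, so 1 MJ is about 239 kcal


def mj_to_kcal(megajoules):
    """Convert an energy intake from MJ/day to kcal/day."""
    return megajoules * KCAL_PER_MJ


# Mean intakes of the 3 groups under the recommended method (Table 1).
for label, mj in (("under", 1.79), ("plausible", 7.30), ("over", 17.8)):
    print(label, round(mj_to_kcal(mj)))
# under 428, plausible 1745, over 4254
```

The converted means for the under- and overreporting groups fall below 500 kcal/day and above 3,500 kcal/day, respectively, as the cutoffs require.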

Findings from the cross-sectional analysis showed associations between diet and BMI that were similar to those observed in the study by Mendez et al. (1) (Table 2). Using the recommended method made little difference compared with the original model that included all study participants, and the β estimates did not change much. In contrast, using the Goldberg and pTEE methods to exclude under- and overreporters changed the associations between the dietary factors examined and BMI with the exception of fat intake. These associations were reversed in direction after exclusions were made using the alternative methods. For example, we observed positive associations of intakes of vegetables and fruits (highest tertile) with BMI in the original model but found inverse associations after making exclusions using the Goldberg method (for vegetable intake, β = −0.74, 95% confidence interval (CI): −0.84, −0.64; for fruit intake, β = −0.31, 95% CI: −0.42, −0.21) and pTEE method (for vegetable intake, β = −0.76, 95% CI: −0.86, −0.66; for fruit intake, β = −0.36, 95% CI: −0.46, −0.25). In the original model, intake of sweets and desserts (highest tertile) was inversely associated with BMI (β = −0.18, 95% CI: −0.26, −0.10), but the associations became positive after making exclusions using the Goldberg (β = 0.29, 95% CI: 0.20, 0.39) and pTEE (β = 0.33, 95% CI: 0.23, 0.43) methods. Adjustment for total energy intake made little difference in the main findings with the exception of the relationship between intake of fruits and BMI, which became slightly attenuated.

Table 2.

Associations Between Dietary Factors and Body Mass Index^a Among Plausible Reporters of Energy Intake as Classified Using the Recommended and Alternative Methods, Nurses' Health Study, United States, 1990^b

| Category of Dietary Intake | All Participants (n = 52,110), β | All Participants, 95% CI | Recommended Method (n = 51,563), β | Recommended Method, 95% CI | Goldberg Method (n = 35,754), β | Goldberg Method, 95% CI | pTEE Method (n = 34,506), β | pTEE Method, 95% CI |
|---|---|---|---|---|---|---|---|---|
| Fat, % energy | 0.071 | 0.065, 0.077 | 0.072 | 0.066, 0.078 | 0.078 | 0.071, 0.085 | 0.078 | 0.070, 0.085 |
| Vegetables, tertile 2 | 0.21 | 0.13, 0.28 | 0.20 | 0.12, 0.27 | −0.41 | −0.50, −0.32 | −0.42 | −0.52, −0.33 |
| Vegetables, tertile 3 | 0.49 | 0.41, 0.57 | 0.48 | 0.39, 0.56 | −0.74 | −0.84, −0.64 | −0.76 | −0.86, −0.66 |
| Fruits, tertile 2 | 0.03 | −0.05, 0.11 | 0.03 | −0.05, 0.11 | −0.14 | −0.24, −0.05 | −0.14 | −0.24, −0.04 |
| Fruits, tertile 3 | 0.09 | 0.001, 0.18 | 0.08 | −0.01, 0.17 | −0.31 | −0.42, −0.21 | −0.36 | −0.46, −0.25 |
| Sweets and desserts, tertile 2 | −0.19 | −0.27, −0.11 | −0.20 | −0.28, −0.13 | 0.09 | 0.003, 0.18 | 0.09 | 0.002, 0.18 |
| Sweets and desserts, tertile 3 | −0.18 | −0.26, −0.10 | −0.20 | −0.28, −0.12 | 0.29 | 0.20, 0.39 | 0.33 | 0.23, 0.43 |

Abbreviations: CI, confidence interval; pTEE, predicted total energy expenditure.

a Weight (kg)/height (m)².

b Associations are expressed as β coefficients, and the multivariate model was adjusted for age, smoking, alcohol intake, physical activity levels, and dietary factors of interest other than the primary exposure.

The associations between changes in intake of various dietary factors and change in BMI were similar across all 3 methods in the prospective analysis (Table 3). When the models were restricted to plausible reporters identified by the recommended method, positive associations with change in BMI were seen for increased intakes of fat (0.016 per percentage of energy) and sweets and desserts (0.069 per serving per day), whereas negative associations with change in BMI were observed for increased intakes of vegetables (−0.019 per serving per day) and fruits (−0.042 per serving per day) (P < 0.05 for all). The magnitude and direction of these changes in BMI associated with increased intakes of fat, vegetables, fruits, and sweets and desserts were similar to those observed for plausible reporters identified using the Goldberg and pTEE methods. Adjustment for total energy intake did not lead to meaningfully different results.

Table 3.

Associations Between Change in Dietary Factors and Change in Body Mass Index^a Among Plausible Reporters of Energy Intake as Classified Using the Recommended and Alternative Methods, Nurses' Health Study, United States, 1990–1994^b

| Category of Dietary Intake | All Participants (n = 52,110), β^c | All Participants, 95% CI | Recommended Method (n = 51,563), β^c | Recommended Method, 95% CI | Goldberg Method (n = 35,754), β^c | Goldberg Method, 95% CI | pTEE Method (n = 34,506), β^c | pTEE Method, 95% CI |
|---|---|---|---|---|---|---|---|---|
| Fat, % energy | 0.016 | 0.014, 0.017 | 0.016 | 0.014, 0.017 | 0.016 | 0.014, 0.017 | 0.015 | 0.014, 0.017 |
| Vegetables, servings/day | −0.016 | −0.020, −0.012 | −0.019 | −0.024, −0.015 | −0.025 | −0.030, −0.020 | −0.020 | −0.026, −0.015 |
| Fruits, servings/day | −0.037 | −0.044, −0.029 | −0.042 | −0.050, −0.034 | −0.050 | −0.060, −0.041 | −0.053 | −0.063, −0.043 |
| Sweets and desserts, servings/day | 0.064 | 0.058, 0.071 | 0.069 | 0.062, 0.076 | 0.068 | 0.060, 0.077 | 0.071 | 0.062, 0.080 |

Abbreviations: CI, confidence interval; pTEE, predicted total energy expenditure.

a Weight (kg)/height (m)².

b The multivariate model was adjusted for age and change in covariates such as smoking behavior, alcohol intake, physical activity level, and dietary factors of interest other than the primary exposure.

c β coefficients represent change in body mass index associated with increased intakes of dietary factors in percentage of energy for fat intake and per serving units per day for intakes of vegetables, fruits, and sweets and desserts within a 4-year period.

Biomarker analysis

The mean REI, BMR, REI:BMR ratio, PAL, pTEE, and REI:pTEE ratio for women in the biomarker analysis classified as underreporters, plausible reporters, and overreporters using the recommended, Goldberg, and pTEE methods are shown in Web Table 2. A higher percentage of women were excluded using the Goldberg and pTEE exclusion criteria compared with the recommended cutoff criteria, and underreporters had lower REI:BMR and REI:pTEE ratios than did plausible and overreporters across all 3 methods in both the fatty acid and carotenoid samples.

Exclusion of implausible reporters using the 2 alternative methods did not meaningfully change the relationships of dietary intake of fatty acids, carotenoids, and retinol with their corresponding biomarkers compared with the recommended method of excluding women with daily caloric intakes of less than 500 kcal/day or more than 3,500 kcal/day (Table 4). Correlations between energy-adjusted intakes of fatty acids, carotenoids, and retinol and their respective biomarkers were similar across all 3 methods for exclusions, and the quantitative differences between correlation coefficients were minimal.

Table 4.

Correlation Coefficients of Energy-Adjusted Dietary Intakes of Fatty Acids, Carotenoids, and Retinol and Their Respective Biomarkers in Women With Plausible Reported Energy Intakes as Classified Using the Recommended and Alternative Methods, Nurses' Health Study, United States, 1990

| Nutrient Variable | Recommended | Goldberg | pTEE |
|---|---|---|---|
| Fatty acids^a | | | |
| Trans fatty acid, plasma | 0.26 | 0.24 | 0.24 |
| Trans fatty acid, red blood cell | 0.30 | 0.31 | 0.34 |
| Linoleic acid, plasma | 0.22 | 0.29 | 0.26 |
| Linoleic acid, red blood cell | 0.21 | 0.29 | 0.23 |
| DHA, plasma | 0.43 | 0.42 | 0.40 |
| DHA, red blood cell | 0.48 | 0.49 | 0.49 |
| ALA, plasma | 0.17 | 0.21 | 0.16 |
| ALA, red blood cell | 0.16 | 0.20 | 0.17 |
| Carotenoids^b,c | | | |
| α-carotene | 0.27 | 0.29 | 0.27 |
| β-carotene | 0.23 | 0.25 | 0.26 |
| β-cryptoxanthin | 0.24 | 0.25 | 0.25 |
| Lycopene | 0.23 | 0.24 | 0.27 |
| Lutein/Zeaxanthin | 0.13 | 0.18 | 0.18 |
| Retinol^b,c | 0.08 | 0.11 | 0.10 |

Abbreviations: ALA, α-linolenic acid; DHA, docosahexaenoic acid; pTEE, predicted total energy expenditure.

a For fatty acids, n = 419 for the recommended method, n = 296 for the Goldberg method, and n = 279 for the pTEE method.

b Plasma carotenoids and retinol intakes were adjusted for serum cholesterol level.

c For carotenoids and retinol, n = 1,279 for the recommended method, n = 919 for the Goldberg method, and n = 900 for the pTEE method.

DISCUSSION

Misreporting of energy and nutrient intakes can distort true associations between diet and health outcomes. In a cross-sectional analysis, Mendez et al. found that associations between various dietary factors and BMI in plausible reporters identified by weight-dependent prediction equation–based alternative methods differed greatly from the results obtained by the recommended method (excluding those with reported energy intakes <500 and >3,500 kcal/day), raising questions about the validity of the use of the recommended cutoff criteria (1). However, in their analysis, selection bias could have been introduced because the definitions used for both exclusion and the outcome, BMI, were functions of body weight. In the present study, we examined these same relationships cross-sectionally to replicate the earlier findings (1) and then prospectively, with BMI change as the outcome, to minimize correlation with attained BMI and thereby limit selection bias. We also compared correlations of energy-adjusted intakes of various fatty acids, carotenoids, and retinol with their biomarkers among plausible reporters identified using the recommended method with correlations among plausible reporters identified using the 2 alternative methods. In both the prospective analysis with BMI change as the outcome and the cross-sectional analysis with biomarkers of intake as the outcome, the choice of exclusion criteria had, in general, little effect on the observed associations. We conclude that the findings of Mendez et al. (1), in which associations strongly depended on the exclusion criteria, were likely due in part to selection bias and that large effects of exclusion criteria were likely unique to analyses using BMI as an outcome.

Based on the Goldberg and pTEE methods, a higher percentage of participants were identified as underreporters than as overreporters, and underreporters were more likely to weigh more than overreporters. As such, women with higher BMIs and lower reported energy intakes were more likely to be excluded from the study. This was due to the way the exclusion criteria (REI:BMR and REI:pTEE ratios) were defined: Weight was used to calculate the denominators in REI:BMR and REI:pTEE, resulting in ratios that were functions of body weight. Because the numerator of BMI is body weight itself, defining the exclusion criteria in this way would elicit an inverse relation between implausible reporting and BMI. The resulting selection is strongly related to the exposure, because energy intake is associated with intake of almost all specific aspects of diet, and to the outcome, which is also a function of body weight, thus creating selection bias. The association between diet and BMI among those selected for analysis is most likely to be different from the association among the eligible, which constitutes a primary definition of selection bias (24). We found these same issues when we examined the association between diet and attained BMI cross-sectionally. Using change in BMI as the outcome would temper the effects of such selection bias because it would be independent of attained BMI, and this is evident in our findings from the prospective analysis.
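The selection mechanism described above can be illustrated with a toy simulation. It does not use or reproduce the study data. Reported energy intake is drawn independently of body weight, so no association with BMI exists in the full simulated sample, and participants are then excluded on an REI:BMR ratio whose denominator depends on weight. Every numeric choice (the distributions, the linear BMR equation, the fixed height, and the ratio limits of 1.0 and 1.7) is an assumption made only for this sketch.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def simulate(n=20000, seed=1, lower=1.0, upper=1.7):
    """Return (r_full, r_kept, kept_fraction) for the REI-BMI correlation."""
    rng = random.Random(seed)
    height_m = 1.64  # fixed height, so BMI varies only through weight
    full, kept = [], []
    for _ in range(n):
        weight_kg = rng.gauss(65.0, 10.0)
        rei_kcal = rng.gauss(1750.0, 500.0)   # independent of weight
        bmr_kcal = 10.0 * weight_kg + 700.0   # hypothetical BMR equation
        bmi = weight_kg / height_m ** 2
        full.append((rei_kcal, bmi))
        # Weight-dependent exclusion rule: keep only "plausible" ratios.
        if lower <= rei_kcal / bmr_kcal <= upper:
            kept.append((rei_kcal, bmi))

    def corr(pairs):
        return pearson([p[0] for p in pairs], [p[1] for p in pairs])

    return corr(full), corr(kept), len(kept) / n


r_full, r_kept, kept_fraction = simulate()
print(f"all participants: r = {r_full:.3f}")
print(f"after exclusion:  r = {r_kept:.3f} ({kept_fraction:.0%} retained)")
```

In the full simulated sample the correlation is close to zero, whereas among the retained participants a clear correlation between reported intake and BMI appears, even though none was built in. Replacing BMI with an outcome that does not depend on weight would remove this artifact in the simulation.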

In the present study, the prevalence of underreporting that was estimated using the 2 alternative methods was similar to prevalences reported in earlier studies (25–29) and consistent with the degree of underreporting seen in studies using doubly labeled water (30–33). The prevalence of overreporting estimated using the Goldberg and pTEE methods was slightly higher than previously reported levels (25). Compared with the Goldberg and pTEE methods, the recommended method estimated the prevalence of under- and overreporters to be lower. Our findings suggest that excluding a higher percentage of implausible reporters using the 2 alternative methods does not provide a major advantage in detecting diet-BMI associations that are different from associations estimated using the recommended method. In the biomarker analysis, excluding a large number of misreporters using the 2 alternative methods had minimal impact on reducing misclassification of dietary intakes, and the largest difference we observed in the magnitude of correlation coefficients was only 0.07.

Mendez et al. used the revised Goldberg method to identify under- and overreporters in addition to the Goldberg and pTEE methods and found that although the revised Goldberg and pTEE methods yielded concordant results, adjustments made according to the revised Goldberg method led to stronger diet-obesity associations compared with the Goldberg method (1). In the present study, we compared the Goldberg and pTEE methods with the simpler recommended method. Previous studies have shown that the Schofield equations used in the Goldberg method tend to overestimate BMR in obese participants, and the revised Goldberg method based on alternative BMR equations is a better option in the obese population (1, 34–36). Because we excluded women with obesity at baseline and the previous study (1) has shown that the revised Goldberg and pTEE methods yield similar results, we do not expect our main findings to change with the use of the revised Goldberg method. Though not examined in this study, another type of exclusion method that can be found in the literature is based on the Box-Cox transformation to normality (37, 38). This method takes into account skewed data and transforms extreme energy intake outliers that might bias parameter estimation using the Box-Cox power transformation to normality. Transformed values that fall either below the 25th percentile of the distribution of transformed reported energy intake minus 2 interquartile ranges or above the 75th percentile plus 2 interquartile ranges are subsequently removed as outliers (37).
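The Box-Cox exclusion rule described above can be sketched as follows. This is an illustration under assumptions: the transformation parameter `lam` is supplied by the caller (in practice it would be estimated from the data), the quartiles use the default method of Python's `statistics.quantiles`, and the function names and intake values are hypothetical.

```python
import math
import statistics


def box_cox(value, lam):
    """Box-Cox power transformation; lam = 0 gives the natural log."""
    if value <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1) / lam


def exclude_outliers(energy_kcal, lam=0.0, k=2.0):
    """Keep intakes whose transformed value lies within
    [Q1 - k*IQR, Q3 + k*IQR], with k = 2 as in the rule above."""
    transformed = [box_cox(v, lam) for v in energy_kcal]
    q1, _, q3 = statistics.quantiles(transformed, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v, t in zip(energy_kcal, transformed) if low <= t <= high]


# Hypothetical reported intakes (kcal/day) with one extreme value.
intakes = [1200, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2400, 15000]
print(exclude_outliers(intakes))  # the 15,000 kcal/day report is dropped
```

Like the fixed-cutoff rule, this criterion depends only on the distribution of reported intake and not on body weight.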

A limitation of our study is the lack of comparison of self-reported energy intake with an objective measure of energy expenditure, such as that provided by doubly labeled water, which is impractical in large epidemiologic studies. Another limitation is the absence of an objective measure of body weight, which may have implications for the calculation of BMR and pTEE. However, the validity of self-reported weight has been investigated, and the correlation between reported and direct measures is high (r = 0.96) in this population (20). The physical activity questionnaire used in the NHS included a section on recreational or leisure-time physical activity during the past year (39), but it lacked the ability to capture fine motor movements and physical activity related to work or household activities. However, walking is the most prevalent physical activity among older adults who are in the same age range as our study participants (40), so missing data on these other activities should not substantially affect the validity of our data. Also, our study findings are based on comparisons made among women only. Mendez et al. (1) reported that accounting for implausible reporting in analyses of men showed associations similar to those observed in analyses of women. Although further comparison of different exclusion methods may be warranted in men, we do not expect our conclusions to change substantially in an exclusively male study population. The aforementioned methodological issue of selection bias should apply to all epidemiologic analyses that examine weight-dependent outcomes, such as BMI, regardless of the sex of the participants. The strengths of our study include the large sample size and validated dietary and physical activity questionnaires. These validated activity questionnaires are useful in large epidemiologic studies when more objective yet impractical and costly measures, such as heart rate or accelerometer monitoring, are not readily accessible (41).

The present study suggests that there is little benefit in using weight-based prediction equations to exclude implausible reporters when assessing associations between diet and health-related outcomes. The findings of this study also suggest caution in the use of exclusion criteria based on weight-dependent prediction equations in studies of diet in relation to weight-dependent outcomes, such as BMI.

Supplementary Material

Web Material

ACKNOWLEDGMENTS

Author affiliations: Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts (Jinnie J. Rhee, Frank B. Hu, Walter C. Willett); Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts (Jinnie J. Rhee, Laura Sampson, Frank B. Hu, Walter C. Willett); Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts (Jinnie J. Rhee, Eunyoung Cho, Frank B. Hu, Walter C. Willett); Department of Medicine, Division of Aging, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts (Jinnie J. Rhee); Department of Medicine, Division of Nephrology, Stanford University School of Medicine, Stanford, California (Jinnie J. Rhee); Department of Dermatology, The Warren Alpert Medical School of Brown University (Eunyoung Cho); and Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts (Michael D. Hughes).

This work was supported by the National Institutes of Health (grants 5T32AG000158-23, P01 CA87969, R01 CA49449, R01 HL088521, and 5T32DK007357-29). In addition, for activities related to the Nurses' Health Studies, we have received modest additional resources from the Alcoholic Beverage Medical Research Foundation; the American Cancer Society; Amgen; the California Prune Board; the Centers for Disease Control and Prevention; the Ellison Medical Foundation; the Florida Citrus Growers; the Glaucoma Medical Research Foundation; Hoffmann-LaRoche; Kellogg's; Lederle; the Massachusetts Department of Public Health; Mission Pharmacal; the National Dairy Council; Rhone Poulenc Rorer; the Robert Wood Johnson Foundation; Sandoz; the US Department of Defense; the US Department of Agriculture; the Wallace Genetics Fund; Wyeth-Ayerst; and private contributions.

Conflict of interest: none declared.

REFERENCES

  • 1. Mendez MA, Popkin BM, Buckland G, et al. Alternative methods of accounting for underreporting and overreporting when measuring dietary intake-obesity relations. Am J Epidemiol. 2011;173(4):448–458. doi: 10.1093/aje/kwq380.
  • 2. Livingstone MB, Black AE. Markers of the validity of reported energy intake. J Nutr. 2003;133(suppl 3):895S–920S. doi: 10.1093/jn/133.3.895S.
  • 3. Institute of Medicine. Dietary Reference Intakes for Energy, Carbohydrate, Fiber, Fat, Fatty Acids, Cholesterol, Protein, and Amino Acids (Macronutrients). Washington, DC: The National Academies Press; 2005.
  • 4. Schatzkin A, Kipnis V, Carroll RJ, et al. A comparison of a food frequency questionnaire with a 24-hour recall for use in an epidemiological cohort study: results from the biomarker-based Observing Protein and Energy Nutrition (OPEN) Study. Int J Epidemiol. 2003;32(6):1054–1062. doi: 10.1093/ije/dyg264.
  • 5. Subar AF, Kipnis V, Troiano RP, et al. Using intake biomarkers to evaluate the extent of dietary misreporting in a large sample of adults: the OPEN Study. Am J Epidemiol. 2003;158(1):1–13. doi: 10.1093/aje/kwg092.
  • 6. Willett W. Nutritional Epidemiology. 3rd ed. New York, NY: Oxford University Press; 2012.
  • 7. Samuel-Hodge CD, Fernandez LM, Henríquez-Roldán CF, et al. A comparison of self-reported energy intake with total energy expenditure estimated by accelerometer and basal metabolic rate in African-American women with type 2 diabetes. Diabetes Care. 2004;27(3):663–669. doi: 10.2337/diacare.27.3.663.
  • 8. Goldberg GR, Black AE, Jebb SA, et al. Critical evaluation of energy intake data using fundamental principles of energy physiology: 1. Derivation of cut-off limits to identify under-recording. Eur J Clin Nutr. 1991;45(12):569–581.
  • 9. Tooze JA, Schoeller DA, Subar AF, et al. Total daily energy expenditure among middle-aged men and women: the OPEN Study. Am J Clin Nutr. 2007;86(2):382–387. doi: 10.1093/ajcn/86.2.382.
  • 10. Huang TT, Roberts SB, Howarth NC, et al. Effect of screening out implausible energy intake reports on relationships between diet and BMI. Obes Res. 2005;13(7):1205–1217. doi: 10.1038/oby.2005.143.
  • 11. Colditz GA, Hankinson SE. The Nurses’ Health Study: lifestyle and health among women. Nat Rev Cancer. 2005;5(5):388–396. doi: 10.1038/nrc1608.
  • 12. Salmerón J, Manson JE, Stampfer MJ, et al. Dietary fiber, glycemic load, and risk of non-insulin-dependent diabetes mellitus in women. JAMA. 1997;277(6):472–477. doi: 10.1001/jama.1997.03540300040031.
  • 13. Feskanich D, Rimm EB, Giovannucci EL, et al. Reproducibility and validity of food intake measurements from a semiquantitative food frequency questionnaire. J Am Diet Assoc. 1993;93(7):790–796. doi: 10.1016/0002-8223(93)91754-e.
  • 14. Willett WC, Sampson L, Stampfer MJ, et al. Reproducibility and validity of a semiquantitative food frequency questionnaire. Am J Epidemiol. 1985;122(1):51–65. doi: 10.1093/oxfordjournals.aje.a114086.
  • 15. Salvini S, Hunter DJ, Sampson L, et al. Food-based validation of a dietary questionnaire: the effects of week-to-week variation in food consumption. Int J Epidemiol. 1989;18(4):858–867. doi: 10.1093/ije/18.4.858.
  • 16. Salmerón J, Hu FB, Manson JE, et al. Dietary fat intake and risk of type 2 diabetes in women. Am J Clin Nutr. 2001;73(6):1019–1026. doi: 10.1093/ajcn/73.6.1019.
  • 17. Hu FB, Sigal RJ, Rich-Edwards JW, et al. Walking compared with vigorous physical activity and risk of type 2 diabetes in women: a prospective study. JAMA. 1999;282(15):1433–1439. doi: 10.1001/jama.282.15.1433.
  • 18. Ainsworth BE, Haskell WL, Leon AS, et al. Compendium of physical activities: classification of energy costs of human physical activities. Med Sci Sports Exerc. 1993;25(1):71–80. doi: 10.1249/00005768-199301000-00011.
  • 19. Wolf AM, Hunter DJ, Colditz GA, et al. Reproducibility and validity of a self-administered physical activity questionnaire. Int J Epidemiol. 1994;23(5):991–999. doi: 10.1093/ije/23.5.991.
  • 20. Willett W, Stampfer MJ, Bain C, et al. Cigarette smoking, relative weight, and menopause. Am J Epidemiol. 1983;117(6):651–658. doi: 10.1093/oxfordjournals.aje.a113598.
  • 21. Sun Q, Ma J, Campos H, et al. A prospective study of trans fatty acids in erythrocytes and risk of coronary heart disease. Circulation. 2007;115(14):1858–1865. doi: 10.1161/CIRCULATIONAHA.106.679985.
  • 22. Tamimi RM, Hankinson SE, Campos H, et al. Plasma carotenoids, retinol, and tocopherols and risk of breast cancer. Am J Epidemiol. 2005;161(2):153–160. doi: 10.1093/aje/kwi030.
  • 23. Stryker WS, Kaplan LA, Stein EA, et al. The relation of diet, cigarette smoking, and alcohol consumption to plasma beta-carotene and alpha-tocopherol levels. Am J Epidemiol. 1988;127(2):283–296. doi: 10.1093/oxfordjournals.aje.a114804.
  • 24. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology. 2004;15(5):615–625. doi: 10.1097/01.ede.0000135174.63482.43.
  • 25. Mendez MA, Wynter S, Wilks R, et al. Under- and overreporting of energy is related to obesity, lifestyle factors and food group intakes in Jamaican adults. Public Health Nutr. 2004;7(1):9–19. doi: 10.1079/phn2003508.
  • 26. Johansson L, Solvoll K, Bjørneboe GE, et al. Under- and overreporting of energy intake related to weight status and lifestyle in a nationwide sample. Am J Clin Nutr. 1998;68(2):266–274. doi: 10.1093/ajcn/68.2.266.
  • 27. Mennen LI, Jackson M, Cade J, et al. Underreporting of energy intake in four populations of African origin. Int J Obes Relat Metab Disord. 2000;24(7):882–887. doi: 10.1038/sj.ijo.0801246.
  • 28. Samaras K, Kelly PJ, Campbell LV. Dietary underreporting is prevalent in middle-aged British women and is not related to adiposity (percentage body fat). Int J Obes Relat Metab Disord. 1999;23(8):881–888. doi: 10.1038/sj.ijo.0800967.
  • 29. Horner NK, Patterson RE, Neuhouser ML, et al. Participant characteristics associated with errors in self-reported energy intake from the Women's Health Initiative food-frequency questionnaire. Am J Clin Nutr. 2002;76(4):766–773. doi: 10.1093/ajcn/76.4.766.
  • 30. Johnson RK, Goran MI, Poehlman ET. Correlates of over- and underreporting of energy intake in healthy older men and women. Am J Clin Nutr. 1994;59(6):1286–1290. doi: 10.1093/ajcn/59.6.1286.
  • 31. Schoeller DA, Bandini LG, Dietz WH. Inaccuracies in self-reported intake identified by comparison with the doubly labelled water method. Can J Physiol Pharmacol. 1990;68(7):941–949. doi: 10.1139/y90-143.
  • 32. Sawaya AL, Tucker K, Tsay R, et al. Evaluation of four methods for determining energy intake in young and older women: comparison with doubly labeled water measurements of total energy expenditure. Am J Clin Nutr. 1996;63(4):491–499. doi: 10.1093/ajcn/63.4.491.
  • 33. Martin LJ, Su W, Jones PJ, et al. Comparison of energy intakes determined by food records and doubly labeled water in women participating in a dietary-intervention trial. Am J Clin Nutr. 1996;63(4):483–490. doi: 10.1093/ajcn/63.4.483.
  • 34. Horgan GW, Stubbs J. Predicting basal metabolic rate in the obese is difficult. Eur J Clin Nutr. 2003;57(2):335–340. doi: 10.1038/sj.ejcn.1601542.
  • 35. Alfonzo-González G, Doucet E, Alméras N, et al. Estimation of daily energy needs with the FAO/WHO/UNU 1985 procedures in adults: comparison to whole-body indirect calorimetry measurements. Eur J Clin Nutr. 2004;58(8):1125–1131. doi: 10.1038/sj.ejcn.1601940.
  • 36. Frankenfield D, Roth-Yousey L, Compher C. Comparison of predictive equations for resting metabolic rate in healthy nonobese and obese adults: a systematic review. J Am Diet Assoc. 2005;105(5):775–789. doi: 10.1016/j.jada.2005.02.005.
  • 37. Thompson FE, Kipnis V, Midthune D, et al. Performance of a food-frequency questionnaire in the US NIH-AARP (National Institutes of Health-American Association of Retired Persons) Diet and Health Study. Public Health Nutr. 2008;11(2):183–195. doi: 10.1017/S1368980007000419.
  • 38. Box GE, Cox DR. An analysis of transformations. J R Stat Soc Series B Stat Methodol. 1964;26(2):211–252.
  • 39. Weuve J, Kang JH, Manson JE, et al. Physical activity, including walking, and cognitive function in older women. JAMA. 2004;292(12):1454–1461. doi: 10.1001/jama.292.12.1454.
  • 40. Yusuf HR, Croft JB, Giles WH, et al. Leisure-time physical activity among older adults. United States, 1990. Arch Intern Med. 1996;156(12):1321–1326.
  • 41. Hu F. Obesity Epidemiology. New York: Oxford University Press; 2008.


Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
