Author manuscript; available in PMC: 2012 Mar 23.
Published in final edited form as: Public Health Nutr. 2011 Apr 13;14(7):1212–1221. doi: 10.1017/S1368980011000632

Validating an FFQ for intake of episodically consumed foods: application to the National Institutes of Health–AARP Diet and Health Study

Douglas Midthune 1,*, Arthur Schatzkin 2, Amy F Subar 3, Frances E Thompson 3, Laurence S Freedman 4, Raymond J Carroll 5, Marina A Shumakovich 1, Victor Kipnis 1
PMCID: PMC3190597  NIHMSID: NIHMS307195  PMID: 21486523

Abstract

Objective

To develop a method to validate an FFQ for reported intake of episodically consumed foods when the reference instrument measures short-term intake, and to apply the method in a large prospective cohort.

Design

The FFQ was evaluated in a sub-study of cohort participants who, in addition to the questionnaire, were asked to complete two non-consecutive 24 h dietary recalls (24HR). FFQ-reported intakes of twenty-nine food groups were analysed using a two-part measurement error model that allows for nonconsumption on a given day, using 24HR as a reference instrument under the assumption that 24HR is unbiased for true intake at the individual level.

Setting

The National Institutes of Health–AARP Diet and Health Study, a cohort of 567 169 participants living in the USA and aged 50–71 years at baseline in 1995.

Subjects

A sub-study of the cohort consisting of 2055 participants.

Results

Estimated correlations of true and FFQ-reported energy-adjusted intakes were 0·5 or greater for most of the twenty-nine food groups evaluated, and estimated attenuation factors (a measure of bias in estimated diet–disease associations) were 0·4 or greater for most food groups.

Conclusions

The proposed methodology extends the class of foods and nutrients for which an FFQ can be evaluated in studies with short-term reference instruments. Although violations of the assumption that the 24HR is unbiased could be inflating some of the observed correlations and attenuation factors, results suggest that the FFQ is suitable for testing many, but not all, diet–disease hypotheses in a cohort of this size.

Keywords: Diet, Food, Epidemiological methods, Questionnaires, Validation studies


Most large prospective cohorts use an FFQ to measure dietary intake. It is well known that an FFQ has substantial measurement error that can affect the results of such studies, leading to bias and the loss of power to detect diet–disease relationships(1,2). In order to evaluate the measurement error in an FFQ, and to correct observed diet–disease relationships for bias due to measurement error, many cohort studies include calibration sub-studies in which another, less biased, dietary instrument is administered as a reference instrument. The reference instrument is usually a short-term instrument such as a 24 h dietary recall (24HR) or food record.

Methods for evaluating an FFQ’s ability to measure foods/nutrients that are consumed daily have been developed based on measurement error models that explicitly or implicitly assume that true usual intake and reported intake from the FFQ and reference instrument are all continuous variables(3–5). These methods have sometimes been used to evaluate ‘episodically consumed’ foods, or foods that are not consumed nearly every day by almost everyone in the population(6–8). This can be problematic if the reference instrument covers only a short time period, since short-term instruments may have a substantial proportion of subjects reporting zero intake of an episodically consumed food, violating the assumption that the reported intake is continuous.

Recently, a measurement error model for episodically consumed foods has been developed and used in dietary surveillance to estimate population distributions of usual intakes of such foods(9–11) and to correct for measurement error in diet–health relationships when the 24HR is the main dietary instrument(12). The model allows for nonconsumption on a given day by separating the probability to consume from the amount consumed on a consumption day using a two-part model(9). The model has also been extended to a ‘three-part’ model to estimate the joint distribution of intakes of an episodically consumed food and energy(10). In the present paper, we use these models to evaluate an FFQ’s ability to measure intake of episodically consumed foods when the reference instrument measures short-term intake. After fitting a model that describes the relationship between the short-term reference and the FFQ, we use Monte Carlo methods to estimate the relationship between true and FFQ-reported intakes.

In 1995, the National Institutes of Health (NIH) and the AARP, formerly the American Association of Retired Persons, initiated a large prospective cohort study called the NIH–AARP Diet and Health Study, which was designed to study relationships between diet and cancer. The study uses an FFQ to measure diet and includes a calibration sub-study of about 2000 subjects who in addition to the FFQ were administered two 24HR. Thompson et al.(13) evaluated the ability of the NIH–AARP FFQ to measure nutrient intake. In the present paper we assess the FFQ’s ability to measure intakes of twenty-nine food groups.

Methods

Study design

The design of the NIH–AARP Diet and Health Study is described in detail elsewhere(14). Briefly, a baseline questionnaire that included a 124-item FFQ was mailed to 3·5 million members of AARP in 1995–1996. A total of 617 119 men and women returned the questionnaire, and, after excluding some whose questionnaires were deemed to be of poor quality or who declined to participate, a cohort of 567 169 subjects was established. Age at baseline in the cohort ranged from 50 to 71 years.

Calibration sub-study participants were selected from the 46 970 subjects who had returned questionnaires as of January 1996. Subjects in the sub-study were asked to complete two non-consecutive unannounced 24HR administered over the telephone by trained interviewers. Of the 2795 individuals invited to participate in the sub-study, 2055 agreed and completed at least one 24HR (97% completed both). The two 24HR were separated in time, with 50% separated by at least 21 days and 75% separated by at least 14 days. In our analysis, we include 1942 subjects (984 men, 958 women) after excluding 113 subjects who subsequently dropped out of the cohort study, had pre-baseline reports of cancer or had death-only reports of cancer.

Study instruments

The FFQ used in the NIH–AARP study was an early version of the Diet History Questionnaire (DHQ) developed at the National Cancer Institute (NCI)(15). Frequency responses were asked for 124 food items; portion sizes for 116. An additional twenty-one questions asked about specific food choices and cooking practices. Databases from the US Department of Agriculture’s (USDA) Continuing Survey of Food Intakes of Individuals (CSFII) (1989–91, 1994–96) were used to develop a nutrient composition database for the FFQ(16). The MyPyramid Equivalents Database (MPED) version 1·0, developed by USDA(17), was used to obtain food group intakes in MPED servings consistent with 2005 Dietary Guidelines for Americans(18). The MPED disaggregates components of food mixtures into food groups (e.g. pepperoni pizza components are placed into grain, dairy, vegetable and meat food groups).

In the 24HR interviews, participants were asked to report all foods and beverages consumed on the day before the interview. Interviewers used a food probe list containing standardized probes specific to foods in over 100 food categories. Data were coded using the Food Intake Analysis System (FIAS) version 2·3, developed at the University of Texas; the same nutrient composition database is used for both FIAS and USDA’s CSFII. Data checks were performed on reports with extremely high values for fat, total energy and total fruit and vegetable intakes, and corrections were made when extreme values were due to coding errors.

Statistical analysis

We evaluate the FFQ in terms of its ability to detect diet–disease relationships in observational studies. Two important parameters for characterizing this ability are the correlation of true and FFQ-reported intakes and the attenuation factor. The correlation of true and FFQ-reported intakes is a measure of the statistical power to detect diet–disease relationships, while the attenuation factor for FFQ-reported intake is a measure of the bias in estimated relationships. Both parameters are functions of the joint distribution of true and FFQ-reported intakes. Although one cannot observe true usual intake in free-living populations, one can estimate its distribution and its relationship to the FFQ-reported intake using statistical models and appropriate reference instruments.

Statistical model for episodically consumed foods

The model for episodically consumed foods is described in detail in Kipnis et al.(12), who use the model to correct for measurement error when 24HR is the main dietary instrument. In the present application, FFQ is the main instrument and 24HR is used as a reference instrument.

For individual i, i = 1,…,n, let

  • Tij be the true intake of an episodically consumed food on day j

  • pi = P(Tij>0|i) be the true probability to consume on a given day

  • Ai = E(Tij|Tij > 0, i) be the true average amount consumed on a consumption day

  • Ti = E(Tij|i) = pi × Ai be the true usual intake of the episodically consumed food

  • Rij be the 24HR-reported intake of the episodically consumed food on day j

  • Qi be the FFQ-reported intake of the episodically consumed food.
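The identity Ti = pi × Ai in the list above follows from the law of total expectation, because intake on a non-consumption day is zero. A short worked step, using only the quantities defined above:

```latex
\begin{align*}
T_i = E(T_{ij} \mid i)
  &= P(T_{ij} > 0 \mid i)\, E(T_{ij} \mid T_{ij} > 0,\, i)
     + P(T_{ij} = 0 \mid i) \cdot 0 \\
  &= p_i \times A_i .
\end{align*}
```

As a purely illustrative example (the numbers are hypothetical), a person who eats fish on one day in five (pi = 0·2) and averages 4 oz on those days has a usual intake of 0·2 × 4 = 0·8 oz/d.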

We assume that an individual’s 24HR-reported intake Rij is an unbiased estimate of true usual intake Ti. In particular, we assume that the probability to report consumption is equal to the true probability to consume, pi, and that the average reported amount on a consumption day is equal to the true average amount consumed on a consumption day, Ai. Then the mean of Rij equals pi × Ai = Ti. We note that this is a strong assumption that may not be exactly true, although it is generally believed that a 24HR is less biased than an FFQ (see Discussion section for more on this).

We also assume that, after appropriate transformations, the relationship between the FFQ-reported intake and the probability to consume can be described by a logistic regression model and that the relationship between the FFQ-reported intake and the amount consumed on a consumption day can be described by a linear regression model. The resulting two-part model can be written as:

$$\operatorname{logit}(p_i) = \beta_{10} + \beta_{11} \times Q_i^* + U_{1i} \tag{1}$$

and

$$(R_{ij}^* \mid R_{ij} > 0) = \beta_{20} + \beta_{21} \times Q_i^* + U_{2i} + \varepsilon_{2ij}, \tag{2}$$

where $\beta_{k0}$ and $\beta_{k1}$ ($k = 1, 2$) are the intercept and slope in the logistic or linear regression; $U_{1i}$ and $U_{2i}$ are person-specific random effects that have a bivariate normal distribution with mean zero, variances $\sigma^2_{U_1}$ and $\sigma^2_{U_2}$, and correlation $\rho_{U_1,U_2}$; and $\varepsilon_{2ij}$ is within-person random error that is normally distributed with mean zero and variance $\sigma^2_{\varepsilon_2}$ and is independent of $(U_{1i}, U_{2i})$. We include random effects $U_{1i}$ and $U_{2i}$ to allow for individual variations in probability and amount that are not explained by the FFQ. Variables $Q_i^*$ and $R_{ij}^*$ are Box–Cox transformations of $Q_i$ and $R_{ij}$ to scales on which they are approximately normal(19) (see Appendix 1 for details).
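To make the data-generating mechanism in equations (1) and (2) concrete, the following Python sketch simulates 24HR reports for a set of subjects. It is an illustration only, not the SAS implementation used in the study: the function name `simulate_recalls`, the parameter values and the use of a natural-log transformation in place of a fitted Box–Cox transformation are all assumptions made for the example.

```python
import numpy as np


def simulate_recalls(q_star, beta1, beta2, sd_u1, sd_u2, rho_u, sd_eps,
                     n_days=2, seed=0):
    """Simulate 24HR intakes under the two-part model, equations (1) and (2).

    Illustrative assumption: R* = log(R), i.e. a Box-Cox transformation with
    parameter 0, so that amounts are recovered with exp().
    """
    rng = np.random.default_rng(seed)
    q_star = np.asarray(q_star, dtype=float)
    n = q_star.size

    # Person-specific random effects (U1, U2): bivariate normal, mean zero.
    cov_u = np.array([[sd_u1 ** 2, rho_u * sd_u1 * sd_u2],
                      [rho_u * sd_u1 * sd_u2, sd_u2 ** 2]])
    u = rng.multivariate_normal([0.0, 0.0], cov_u, size=n)

    # Equation (1): probability to consume on a given day.
    p = 1.0 / (1.0 + np.exp(-(beta1[0] + beta1[1] * q_star + u[:, 0])))

    # Equation (2): transformed amount reported on a consumption day.
    eps = rng.normal(0.0, sd_eps, size=(n, n_days))
    amount_star = (beta2[0] + beta2[1] * q_star + u[:, 1])[:, None] + eps

    # A day is a consumption day with probability p; otherwise intake is zero.
    consumed = rng.random((n, n_days)) < p[:, None]
    recalls = np.where(consumed, np.exp(amount_star), 0.0)
    return p, recalls


# Hypothetical parameter values, chosen only to give a plausible mix of zeros.
q_star = np.random.default_rng(42).normal(0.0, 1.0, size=2000)
p, recalls = simulate_recalls(q_star, beta1=(-0.5, 0.6), beta2=(0.2, 0.4),
                              sd_u1=0.7, sd_u2=0.4, rho_u=0.3, sd_eps=0.6)
print("share of zero recalls:", np.mean(recalls == 0.0))
```

The simulated data have the same shape as the calibration data: many exact zeros on individual days, with positive amounts that are right-skewed on the original scale.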

Equations (1) and (2) define a non-linear mixed-effects model that can be fit using the NLMIXED procedure in SAS to obtain maximum likelihood estimates of the model parameters $\beta_{10}$, $\beta_{11}$, $\beta_{20}$, $\beta_{21}$, $\sigma^2_{U_1}$, $\sigma^2_{U_2}$, $\rho_{U_1,U_2}$ and $\sigma^2_{\varepsilon_2}$. For foods that are consumed every day, the model simplifies to:

$$R_{ij}^* = \beta_{30} + \beta_{31} \times Q_i^* + U_{3i} + \varepsilon_{3ij}. \tag{3}$$

Under the model assumptions, true usual intake Ti can be written as a function of Qi, U1i and U2i (or U3i), and one can estimate relationships between true and FFQ-reported intakes by generating a Monte Carlo distribution of Ti and Qi (see Appendix 2 for details).

Note that under the model for episodically consumed foods, intake on a given day (Tij and Rij) can be zero, but usual intake (Ti) is assumed to be greater than zero (although it can be arbitrarily small). There may be foods which some people never consume (e.g. alcohol). Kipnis et al.(12) describe an extension of the present model that allows Ti to be zero; with only two 24HR per person, however, it is difficult in practice to distinguish never consumers from infrequent consumers.

A SAS macro that calls the NLMIXED procedure to fit the model for episodically consumed foods (equations (1) and (2)) or foods consumed every day (equation (3)) is available online(20). Prior to fitting the model, we removed outliers of Qi* and positive Rij* for each food group, where outliers were defined to be values that fell below the 25th percentile of the distribution of the variable minus two interquartile ranges or above the 75th percentile plus two interquartile ranges. The average number of outliers removed for Qi* was 2 (men) and 4 (women), and the average number removed for Rij* was 4 (men) and 3 (women).
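The outlier rule described above (values more than two interquartile ranges below the 25th percentile or above the 75th percentile) can be sketched in Python as follows. The function name and the toy data are hypothetical, and NumPy's default percentile interpolation may differ slightly from the percentile definition used in the SAS analysis.

```python
import numpy as np


def iqr_outlier_mask(x, k=2.0):
    """Flag values below Q1 - k*IQR or above Q3 + k*IQR (k = 2 in the text)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


# Toy transformed intakes: nine ordinary values and one extreme value.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 40.0])
mask = iqr_outlier_mask(values)
kept = values[~mask]
print("removed:", values[mask], "kept:", kept.size)
```

In the analysis the rule is applied to the transformed variables (Qi* and positive Rij*), separately for each food group, which is why the transformation to approximate normality precedes the screening.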

Correlation with true intake and attenuation factor

The attenuation factor and correlation with true intake are measures of the bias and loss of power in diet–disease studies due to measurement error in the FFQ. We assume that measurement error is non-differential with respect to disease; that is, that reported intake Qi contributes no additional information about disease risk beyond that provided by true intake Ti. Suppose the true diet–disease relationship follows a logistic model:

$$\operatorname{logit}(r_i) = \alpha_0 + \alpha_1 \times T_i^*, \tag{4}$$

where ri is the probability of disease given true usual intake Ti, α0 and α1 are the intercept and slope in the logistic regression, and Ti* is a Box–Cox transformation of Ti to a scale on which it is approximately normal. The logistic regression model does not require covariates to have any particular distribution. In practice, however, covariates with skewed distributions are often transformed to make extreme values less influential.

We want to estimate the bias in the estimation of log odds ratio $\alpha_1$ caused by using reported intake $Q_i^*$ rather than $T_i^*$ in equation (4). Since $T_i^*$ and $Q_i^*$ are transformed using different Box–Cox transformations, the interpretation of $\alpha_1$ depends on which variable is in the model. In order to make the interpretations comparable, we first standardize the transformed variables so that a unit change equals the change from the 10th to the 90th percentile of true intake $T_i$ on that scale. We can then interpret $\alpha_1$ as the log odds ratio comparing the 90th and 10th percentiles of true intake.

To a close approximation, fitting equation (4) using $Q_i^*$ rather than $T_i^*$ leads to estimating not the true risk parameter $\alpha_1$ but the product $\tilde{\alpha}_1 = \gamma_1 \alpha_1$, where $\gamma_1$ is the slope in the linear regression of $T_i^*$ v. $Q_i^*$(21). The value $\gamma_1$ is called the attenuation factor and is interpreted as the multiplicative bias in estimating log odds ratio $\alpha_1$ due to measurement error in $Q_i$.

The loss of statistical power due to using $Q_i^*$ rather than $T_i^*$ in equation (4) is related to the correlation between $T_i^*$ and $Q_i^*$, which we will call $\rho_{TQ}$. If a study would need a sample size of $n$ to attain a desired power using $T_i^*$ to measure intake, then the study would need a sample size of $\tilde{n} = n/\rho_{TQ}^2$ to attain the same power using $Q_i^*$(22). For both the correlation with true intake and the attenuation factor, one represents the ideal value. A correlation of one means no loss of power, while an attenuation factor of one means no bias in estimated risk. In a univariate diet–disease model, the attenuation factor is usually between zero and one, indicating that the estimated log odds ratio is biased towards zero, or attenuated.
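The sample-size relationship above is simple enough to compute directly. The following Python sketch (the function name and the baseline of 1000 subjects are hypothetical) shows how quickly the required sample size grows as the correlation with true intake falls; with a correlation of 0·5, for instance, the required sample size quadruples.

```python
import math


def inflated_sample_size(n_true, rho_tq):
    """Sample size needed with FFQ-reported intake: n / rho_TQ**2, rounded up."""
    if not 0.0 < abs(rho_tq) <= 1.0:
        raise ValueError("rho_tq must be non-zero and at most 1 in absolute value")
    return math.ceil(n_true / rho_tq ** 2)


# Hypothetical baseline: 1000 subjects would suffice if true intake were known.
for rho in (1.0, 0.7, 0.5):
    print(rho, inflated_sample_size(1000, rho))
```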

One can estimate $\gamma_1$ and $\rho_{TQ}$ by generating a Monte Carlo distribution of $T_i^*$ and $Q_i^*$, based on the models described in the previous section. Under the model assumptions, the Monte Carlo distribution will be approximately the same as the distribution in the real population, so that estimates based on the Monte Carlo distribution will be approximately unbiased (see Appendix 2 for details).
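A minimal Python sketch of this Monte Carlo step is given below. It is not the procedure of Appendix 2, which is not reproduced here, and it rests on several simplifying assumptions that are ours: transformed FFQ intake is drawn from a normal distribution rather than from the observed data, a log transformation stands in for the fitted Box–Cox transformations of both the 24HR amount and true intake, and all parameter values are hypothetical. With a common transformation, the percentile-based standardization described above rescales both variables by the same constant, so the regression slope is unchanged and the step is omitted; with different transformations it would be needed.

```python
import numpy as np


def monte_carlo_rho_gamma(beta1, beta2, sd_u1, sd_u2, rho_u, sd_eps,
                          q_mean=0.0, q_sd=1.0, n_draws=100_000, seed=1):
    """Monte Carlo estimates of corr(T*, Q*) and the slope of T* on Q*."""
    rng = np.random.default_rng(seed)

    # Assumption: Q* is normal; in practice it could be resampled from data.
    q_star = rng.normal(q_mean, q_sd, size=n_draws)
    cov_u = np.array([[sd_u1 ** 2, rho_u * sd_u1 * sd_u2],
                      [rho_u * sd_u1 * sd_u2, sd_u2 ** 2]])
    u = rng.multivariate_normal([0.0, 0.0], cov_u, size=n_draws)

    # True probability to consume, from equation (1).
    p = 1.0 / (1.0 + np.exp(-(beta1[0] + beta1[1] * q_star + u[:, 0])))

    # True mean amount on a consumption day: the back-transformed mean of
    # equation (2), averaging over within-person error under the log
    # assumption, where E[exp(eps)] = exp(sd_eps**2 / 2).
    amount = np.exp(beta2[0] + beta2[1] * q_star + u[:, 1] + 0.5 * sd_eps ** 2)

    t_star = np.log(p * amount)  # transformed true usual intake T = p * A
    rho = np.corrcoef(t_star, q_star)[0, 1]
    gamma = np.cov(t_star, q_star)[0, 1] / np.var(q_star, ddof=1)
    return rho, gamma


rho_tq, gamma_1 = monte_carlo_rho_gamma(beta1=(-0.5, 0.6), beta2=(0.2, 0.4),
                                        sd_u1=0.7, sd_u2=0.4, rho_u=0.3,
                                        sd_eps=0.6)
print(f"rho_TQ ~ {rho_tq:.2f}, gamma_1 ~ {gamma_1:.2f}")
```

The random effects enter true intake but not the FFQ, which is what keeps the simulated correlation below one.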

Energy-adjusted intake

Researchers are often interested in ‘energy-adjusted’ diet–disease relationships; that is, relationships between food intake and disease when total energy intake is held constant(23). One popular energy-adjustment method is the ‘residual’ method, in which one first calculates the residual in the regression of food v. energy intake (after transforming both to approximate normality) and then relates residual intake to disease(23). For simplicity, we refer to residual intake as ‘energy-adjusted’ intake.
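The residual method can be sketched in a few lines of Python. The example below uses simulated, already-transformed food and energy intakes; the function name, the variable names and the numbers are hypothetical and serve only to show the mechanics.

```python
import numpy as np


def residual_energy_adjust(food_star, energy_star):
    """Residuals from the linear regression of transformed food on energy."""
    food_star = np.asarray(food_star, dtype=float)
    energy_star = np.asarray(energy_star, dtype=float)
    # np.polyfit returns coefficients from the highest degree down.
    slope, intercept = np.polyfit(energy_star, food_star, 1)
    return food_star - (intercept + slope * energy_star)


# Simulated transformed intakes for 500 hypothetical subjects.
rng = np.random.default_rng(7)
energy_star = rng.normal(7.6, 0.3, size=500)
food_star = 0.8 * energy_star + rng.normal(0.0, 0.5, size=500)

residual = residual_energy_adjust(food_star, energy_star)
print("corr(residual, energy):", np.corrcoef(residual, energy_star)[0, 1])
```

By construction the residuals are uncorrelated with energy intake in the sample, which is the sense in which energy is "held constant".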

To evaluate FFQ-reported energy-adjusted intake, we fit the three-part food and energy model described in Freedman et al.(10) and generate Monte Carlo distributions of true and FFQ-reported food and energy intakes. We then calculate true and reported residual intakes from the Monte Carlo distributions and use them to estimate the correlation with truth and the attenuation factor for residual intake (see Appendix 3 for details).

Results

Table 1 shows the percentage of subjects in the calibration sub-study having zero intake on the 24HR or FFQ for thirty-two food groups. The food groups range from those that are rarely consumed to those that are consumed almost every day. For example, 98% of men and women reported zero intake of organ meat on both 24HR, while 99% reported non-zero intake of total grains on both 24HR.

Table 1.

Percentage of subjects having zero intakes of MPED food groups on 24HR or FFQ; NIH–AARP Diet and Health Study

Columns, left to right: MPED food group; then for men (n 984*), % with zero intake on both 24HR, % with non-zero intake on both 24HR and % with zero intake on FFQ; then the same three columns for women (n 958*).

Milk Group
    Cheese 24·2 34·8 0·6 25·1 30·5 0·8
    Milk 4·7 79·7 0·0 5·6 76·6 0·0
    Yoghurt 91·2 2·3 62·1 85·9 4·2 40·5
    Total dairy 1·3 90·6 0·0 2·4 88·6 0·0
Grain Group
    Non-whole grains 0·2 98·8 0·0 0·0 99·1 0·0
    Whole grains 15·1 53·7 0·1 16·0 52·3 0·1
    Total grains 0·1 99·4 0·0 0·0 99·4 0·0
Fruit Group
    Citrus, melon, berry 10·2 65·3 0·0 5·9 71·4 0·2
    Other fruit 14·6 58·4 0·2 14·7 57·3 0·1
    Total fruit 2·8 84·1 0·0 1·9 86·1 0·0
Vegetable Group
    Dark green vegetables 60·7 7·5 2·0 55·7 9·2 0·9
    Orange vegetables 33·8 22·0 0·1 32·2 25·4 0·2
    Potatoes 28·3 27·1 0·1 32·8 20·7 0·0
    Other starchy vegetables 52·8 7·5 0·3 56·4 8·2 0·4
    Tomatoes 11·1 48·4 0·0 14·7 45·6 0·3
    Other vegetables 1·5 87·4 0·0 1·1 80·9 0·0
    Total vegetables 0·4 95·7 0·0 0·2 93·4 0·0
Legumes 72·7 4·5 3·3 76·9 2·1 7·8
Meat Group
    Red meat 22·1 40·8 0·0 27·1 30·5 0·2
    Poultry 38·3 19·7 0·2 34·0 21·5 0·0
    Fish (high omega) 74·0 4·1 3·2 76·0 3·3 2·5
    Fish (low omega) 64·1 6·7 0·2 69·2 4·8 0·4
    Franks, luncheon meat 51·4 14·3 0·4 61·0 7·1 1·4
    Organ meat 98·0 0·0 51·6 98·3 0·0 59·4
    Meat, poultry & fish 1·5 90·8 0·0 1·8 85·9 0·0
    Eggs 20·0 42·7 0·2 20·8 35·6 0·1
    Nuts & seeds 49·3 18·7 0·4 53·1 12·9 0·5
    Soya 59·1 8·3 94·3 57·8 6·9 95·5
Alcoholic beverages 54·7 23·0 23·6 65·1 16·8 30·4
Added sugars 0·1 99·4 0·0 0·0 99·1 0·0
Discretionary fat (oil) 2·0 86·0 0·0 1·6 81·6 0·0
Discretionary fat (solid) 0·0 100·0 0·0 0·0 100·0 0·0

MPED, MyPyramid Equivalents Database; 24HR, 24 h dietary recall; NIH, National Institutes of Health.

* Percentages for the 24HR are based on the 953 men and 926 women who completed two 24HR.

Table 2 presents sample means for reported intakes of the thirty-two food groups. The means include both zero and non-zero amounts. In men, FFQ-reported intake tended to be less than 24HR-reported intake, while in women it tended to be greater. For men, the FFQ mean was at least 20% smaller than the 24HR mean for twelve food groups, and at least 20% larger for six food groups. For women, the FFQ mean was at least 20% smaller than the 24HR mean for five food groups, and at least 20% larger for ten food groups.

Table 2.

Mean reported MPED food group intakes on 24HR and FFQ, with standard errors; NIH–AARP Diet and Health Study

Columns, left to right: MPED food group (unit); then for men (n 984*), 24HR mean, 24HR SE, FFQ mean and FFQ SE; then the same four columns for women (n 958*).

Milk Group (cup equivalents)
    Cheese 0·51 0·02 0·25 0·01 0·38 0·02 0·18 0·01
    Milk 1·04 0·03 1·17 0·04 0·85 0·03 1·02 0·04
    Yoghurt 0·04 0·01 0·05 0·01 0·07 0·01 0·09 0·01
    Total dairy 1·60 0·04 1·47 0·04 1·30 0·03 1·30 0·04
Grain Group (oz. equivalents)
    Non-whole grains 6·56 0·11 4·74 0·07 4·57 0·07 3·76 0·06
    Whole grains 1·13 0·04 1·18 0·03 0·83 0·03 0·89 0·02
    Total grains 7·69 0·12 5·92 0·09 5·41 0·08 4·65 0·08
Fruit Group (cup equivalents)
    Citrus, melon, berry 0·77 0·03 0·91 0·03 0·72 0·02 0·86 0·03
    Other fruit 0·90 0·04 1·19 0·04 0·68 0·02 1·14 0·03
    Total fruit 1·67 0·05 2·10 0·06 1·40 0·03 2·00 0·05
Vegetable Group (cup equivalents)
    Dark green vegetables 0·15 0·01 0·22 0·01 0·15 0·01 0·28 0·01
    Orange vegetables 0·14 0·01 0·17 0·01 0·13 0·01 0·18 0·01
    Potatoes 0·51 0·02 0·42 0·01 0·33 0·01 0·34 0·01
    Other starchy vegetables 0·14 0·01 0·18 0·01 0·10 0·01 0·15 0·00
    Tomatoes 0·34 0·01 0·38 0·01 0·26 0·01 0·33 0·01
    Other vegetables 1·02 0·03 0·62 0·02 0·85 0·02 0·66 0·02
    Total vegetables 2·30 0·05 1·99 0·04 1·82 0·04 1·94 0·04
Legumes (cup equivalents) 0·10 0·01 0·13 0·01 0·06 0·01 0·08 0·00
Meat Group (oz. lean meat equivalents)
    Red meat 2·33 0·08 1·92 0·05 1·41 0·05 1·21 0·03
    Poultry 1·41 0·06 1·01 0·03 1·24 0·05 0·95 0·03
    Fish (high omega) 0·32 0·03 0·19 0·01 0·18 0·02 0·15 0·01
    Fish (low omega) 0·81 0·06 0·53 0·02 0·45 0·03 0·43 0·02
    Franks, luncheon meat 0·66 0·03 0·72 0·02 0·36 0·02 0·40 0·02
    Organ meat 0·04 0·01 0·03 0·00 0·03 0·01 0·02 0·00
    Meat, poultry & fish 5·57 0·10 4·39 0·09 3·67 0·07 3·15 0·07
    Eggs 0·46 0·02 0·35 0·01 0·30 0·01 0·25 0·01
    Nuts & seeds 0·63 0·05 0·60 0·03 0·34 0·03 0·32 0·02
    Soya 0·05 0·01 0·001 0·00 0·03 0·01 0·001 0·00
Alcoholic beverages (drinks) 0·82 0·05 1·10 0·09 0·45 0·03 0·56 0·06
Added sugars (teaspoons) 16·66 0·41 12·75 0·37 12·08 0·28 9·83 0·31
Discretionary fat (oil) (g) 17·70 0·57 17·72 0·39 12·69 0·40 15·82 0·37
Discretionary fat (solid) (g) 45·71 0·84 37·08 0·73 32·00 0·59 27·06 0·53

MPED, MyPyramid Equivalents Database; 24HR, 24 h dietary recall; NIH, National Institutes of Health.

* Means for the 24HR are based on the 953 men and 926 women who completed two 24HR.

FFQ mean at least 20% smaller than 24HR mean.

FFQ mean at least 20% larger than 24HR mean.

Table 3 presents estimated correlations of true and FFQ-reported intakes and attenuation factors for twenty-nine food groups. Three food groups (yoghurt, organ meat, soya) are not included because they are too rarely consumed to obtain stable estimates. Results are presented for both unadjusted and energy-adjusted (residual) intakes. For the five most commonly consumed food groups (non-whole grains, total grains, total vegetables, added sugars, discretionary fat (solid)), estimates were obtained using the method for foods consumed every day, described in the Methods section. For the rest of the food groups, estimates were obtained using the method for episodically consumed foods. After energy adjustment, most food groups had correlations with true intake greater than 0·5; the food groups with lowest correlations after energy adjustment were legumes (0·34 for women), potatoes (0·35 for women), discretionary fat (oil) (0·38 for women, 0·43 for men) and low omega fish (0·42 for men). Attenuation factors were generally greater than 0·4, although several food groups had lower values; the food groups with lowest attenuation factors after energy adjustment were discretionary fat (oil) (0·18 for women), potatoes (0·23 for women), legumes (0·28 for women) and other starchy vegetables (0·29 for women).

Table 3.

Estimates of the correlation of true and FFQ-reported food intakes (ρQT) and the attenuation factor (λ) for FFQ-reported food intake, with standard errors; NIH–AARP Diet and Health Study

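Columns, left to right: MPED food group; model (unadjusted or energy-adjusted); then for men (n 984), ρQT, SE, λ and SE; then the same four columns for women (n 958). The food group name appears only on the unadjusted row and also applies to the energy-adjusted row beneath it.
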
Cheese Unadjusted 0·59 0·06 0·55 0·05 0·42 0·08 0·35 0·06
Energy-adjusted 0·63 0·08 0·58 0·05 0·48 0·10 0·42 0·06
Milk Unadjusted 0·68 0·03 0·53 0·03 0·67 0·03 0·50 0·03
Energy-adjusted 0·70 0·03 0·53 0·03 0·73 0·03 0·52 0·03
Total dairy Unadjusted 0·61 0·04 0·44 0·03 0·58 0·03 0·42 0·03
Energy-adjusted 0·63 0·04 0·44 0·03 0·73 0·03 0·52 0·03
Whole grains Unadjusted 0·59 0·04 0·55 0·04 0·48 0·04 0·48 0·05
Energy-adjusted 0·65 0·04 0·61 0·04 0·53 0·05 0·52 0·05
Non-whole grains Unadjusted 0·34 0·04 0·24 0·03 0·39 0·05 0·25 0·03
Energy-adjusted 0·50 0·05 0·37 0·04 0·47 0·06 0·31 0·04
Total grains Unadjusted 0·35 0·04 0·24 0·03 0·39 0·05 0·23 0·03
Energy-adjusted 0·54 0·05 0·39 0·03 0·53 0·05 0·31 0·03
Citrus, melon, berry Unadjusted 0·64 0·03 0·54 0·03 0·57 0·03 0·43 0·03
Energy-adjusted 0·70 0·03 0·59 0·03 0·62 0·03 0·46 0·03
Other fruit Unadjusted 0·70 0·03 0·68 0·04 0·60 0·03 0·52 0·03
Energy-adjusted 0·74 0·03 0·71 0·04 0·64 0·04 0·55 0·03
Total fruit Unadjusted 0·70 0·02 0·61 0·03 0·58 0·03 0·46 0·03
Energy-adjusted 0·76 0·03 0·66 0·03 0·65 0·04 0·51 0·03
Dark green vegetables Unadjusted 0·75 0·10 0·57 0·06 0·52 0·07 0·50 0·05
Energy-adjusted 0·78 0·10 0·59 0·06 0·58 0·08 0·56 0·06
Orange vegetables Unadjusted 0·62 0·10 0·41 0·05 0·57 0·07 0·48 0·05
Energy-adjusted 0·71 0·10 0·50 0·05 0·62 0·07 0·54 0·05
Potatoes Unadjusted 0·58 0·12 0·37 0·05 0·38 0·07 0·27 0·05
Energy-adjusted 0·60 0·14 0·40 0·06 0·35 0·11 0·23 0·05
Other starchy vegetables Unadjusted * * 0·50 0·19 0·26 0·07
Energy-adjusted * * 0·56 0·18 0·29 0·07
Tomatoes Unadjusted 0·44 0·11 0·29 0·05 0·60 0·09 0·42 0·05
Energy-adjusted 0·54 0·11 0·39 0·06 0·64 0·10 0·45 0·05
Other vegetables Unadjusted 0·46 0·05 0·37 0·04 0·44 0·05 0·34 0·05
Energy-adjusted 0·50 0·06 0·43 0·05 0·54 0·06 0·44 0·05
Total vegetables Unadjusted 0·46 0·04 0·32 0·03 0·42 0·05 0·32 0·04
Energy-adjusted 0·55 0·05 0·43 0·04 0·52 0·05 0·44 0·04
Legumes Unadjusted 0·44 0·08 0·44 0·08 0·40 0·23 0·27 0·07
Energy-adjusted 0·50 0·09 0·48 0·08 0·34 0·20 0·28 0·07
Fish high omega Unadjusted 0·48 0·11 0·59 0·10 0·46 0·17 0·45 0·12
Energy-adjusted 0·55 0·13 0·66 0·11 0·60 0·15 0·56 0·12
Fish low omega Unadjusted 0·39 0·09 0·42 0·09 0·74 0·14 0·47 0·08
Energy-adjusted 0·42 0·09 0·47 0·09 0·71 0·13 0·50 0·09
Red meat Unadjusted 0·56 0·05 0·50 0·04 0·83 0·10 0·47 0·04
Energy-adjusted 0·54 0·05 0·55 0·05 0·84 0·09 0·52 0·05
Poultry Unadjusted 0·47 0·11 0·34 0·05 0·39 0·09 0·25 0·04
Energy-adjusted 0·53 0·10 0·42 0·05 0·46 0·09 0·33 0·05
Franks, luncheon meat Unadjusted 0·61 0·07 0·54 0·05 0·55 0·13 0·36 0·05
Energy-adjusted 0·64 0·07 0·60 0·05 0·67 0·15 0·39 0·06
Meat, poultry & fish Unadjusted 0·44 0·04 0·27 0·03 0·45 0·06 0·23 0·03
Energy-adjusted 0·44 0·06 0·31 0·05 0·53 0·05 0·33 0·03
Eggs Unadjusted 0·70 0·06 0·78 0·05 0·54 0·11 0·53 0·07
Energy-adjusted 0·69 0·05 0·81 0·05 0·55 0·11 0·56 0·07
Nuts & seeds Unadjusted 0·54 0·04 0·58 0·05 0·48 0·07 0·48 0·06
Energy-adjusted 0·54 0·06 0·56 0·06 0·60 0·10 0·50 0·07
Alcoholic beverages Unadjusted 0·80 0·03 0·67 0·04 0·81 0·03 0·68 0·04
Energy-adjusted 0·82 0·03 0·70 0·03 0·81 0·03 0·65 0·04
Added sugars Unadjusted 0·55 0·03 0·41 0·03 0·46 0·04 0·39 0·03
Energy-adjusted 0·63 0·03 0·44 0·03 0·58 0·03 0·43 0·03
Discretionary fat (oil) Unadjusted 0·46 0·05 0·34 0·04 0·30 0·07 0·19 0·04
Energy-adjusted 0·43 0·05 0·34 0·04 0·38 0·15 0·18 0·04
Discretionary fat (solid) Unadjusted 0·65 0·04 0·47 0·03 0·50 0·04 0·36 0·03
Energy-adjusted 0·76 0·03 0·61 0·03 0·64 0·04 0·49 0·03

NIH, National Institutes of Health; MPED, MyPyramid Equivalents Database.

* For Other starchy vegetables in men, the measurement error model failed to converge.

Table 4 shows the number of incident cancers in the NIH–AARP cohort by gender and cancer type during the follow-up period, 1995 to 2003(24). Table 4 also shows for each cancer type the study’s power to detect an odds ratio of 1·5 using FFQ-reported intake if ρTQ=1 (no loss of power due to measurement error) and if ρTQ=0·5. The odds ratio compares the 90th to 10th percentile of true intake in a univariate diet–disease model (see Appendix 4 for details). For common cancer types such as prostate, breast, lung and colorectal, the power to detect the association is at least 85% when ρTQ = 0·5. For less common types such as myeloid leukaemia, thyroid and liver, the power is less than 30%.

Table 4.

Number of incident cancers in the NIH–AARP Diet and Health Study 1995–2003; power to detect an odds ratio of 1·5* using FFQ-reported intake if the correlation of true and FFQ-reported intakes ρTQ = 1 and if ρTQ = 0·5

Columns, left to right: type of cancer; then for men (n 262 642), no. of cases, power if ρQT = 1 and power if ρQT = 0·5; then the same three columns for women (n 183 535).

Prostate 15 949 1·00 1·00 –
Breast –     5478 1·00 1·00
Endometrial –     1041 1·00 0·72
Ovarian –     475 0·93 0·41
Lung 3769 1·00 1·00 2288 1·00 0·97
Colorectal 3031 1·00 0·99 1457 1·00 0·86
Melanoma 1485 1·00 0·86 543 0·96 0·45
Bladder 1246 1·00 0·80 235 0·68 0·23
Non-Hodgkin’s lymphoma 1114 1·00 0·75 605 0·97 0·50
Head and neck 939 1·00 0·68 300 0·78 0·28
Kidney 857 1·00 0·64 322 0·81 0·29
Pancreatic 601 0·97 0·49 348 0·84 0·31
Stomach 440 0·91 0·38 127 0·43 0·14
Oesophagus 425 0·90 0·37 76 0·28 0·10
Brain 356 0·85 0·32 146 0·48 0·16
Myeloma 331 0·82 0·30 157 0·51 0·17
Myeloid leukaemia 288 0·77 0·27 119 0·41 0·14
Liver 238 0·69 0·23 72 0·27 0·10
Thyroid 153 0·50 0·16 176 0·56 0·18
* Odds ratio comparing the 90th to 10th percentile of true intake in a univariate diet–disease model.
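The way power in Table 4 falls with the number of cases and with ρTQ can be illustrated with a simple normal approximation. The Python sketch below is not the calculation of Appendix 4, which is not reproduced here; it assumes a rare disease, a normally distributed transformed intake, a two-sided 5% test, and a standard error of the log odds ratio per standard deviation of intake of roughly one over the square root of the number of cases. Measurement error is represented by multiplying the standardized effect by ρTQ, consistent with the sample-size relationship n/ρTQ². The function name is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_power(n_cases, odds_ratio=1.5, rho_tq=1.0, alpha=0.05):
    """Approximate two-sided power to detect a 90th v. 10th percentile OR."""
    # Distance between the 10th and 90th percentiles of a normal variable,
    # expressed in standard deviations (about 2.56).
    spread_sd = STD_NORMAL.inv_cdf(0.90) - STD_NORMAL.inv_cdf(0.10)
    log_or_per_sd = log(odds_ratio) / spread_sd

    # Assumed rare-disease approximation: SE(log OR per SD) ~ 1 / sqrt(cases).
    z_effect = rho_tq * log_or_per_sd * sqrt(n_cases)
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return STD_NORMAL.cdf(z_effect - z_crit) + STD_NORMAL.cdf(-z_effect - z_crit)


# Example: 475 cases, the number of ovarian cancers in Table 4.
print(round(approx_power(475), 2), round(approx_power(475, rho_tq=0.5), 2))
```

Under these assumptions the sketch gives about 0·93 and 0·41 for 475 cases, in line with the ovarian cancer row of Table 4, but the agreement should be read as a plausibility check on the approximation rather than as a description of the authors' method.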

Discussion

We have proposed a methodology to evaluate an FFQ’s ability to measure intake of episodically consumed foods and used it to evaluate the FFQ in the NIH–AARP Diet and Health study. The methodology uses a two-part model designed for such foods(9,12) and Monte Carlo methods to estimate the relationship between true and FFQ-reported intakes. In order to evaluate energy-adjusted intake of such foods, we use a three-part food and energy model(10).

The model for episodically consumed foods is designed for studies in which the reference instrument covers only a short time period and the probability of zero intake is substantial. In the NIH–AARP study, the reference instrument is the repeat application of a single 24HR. Some other studies use as reference the average of many (up to 28) days of 24HR or food records(25–27). In such studies, simpler measurement error models may be used. Such studies tend to be small (fewer than 200 subjects), however, and it is generally considered that study designs with more subjects and fewer days per subject are more efficient(28,29).

In epidemiological studies, the most important characteristics in determining the utility of an FFQ are the correlation of true and FFQ-reported intakes and the attenuation factor. We estimated these characteristics for twenty-nine food groups in the NIH–AARP calibration sub-study. After energy adjustment, correlations of true and FFQ-reported intakes were estimated to be 0·5 or greater, and attenuation factors 0·4 or greater, for most of the food groups, including some that are of particular interest to nutritional epidemiologists, such as whole grains, total fruit, total vegetables, red meat and alcoholic beverages.

A limitation of our analysis (and of most FFQ validation studies) is our reliance on 24HR (or similar self-report instrument) as a reference instrument. We have assumed that the 24HR provides unbiased estimates of food group intake. Recent studies using biomarkers as references, however, have shown that the 24HR is biased for energy, protein and energy-adjusted protein intake, and that these biases sometimes, but not always, lead to overestimation of correlations with true intake and attenuation factors when the 24HR is used as a reference instrument(21,30). While no such biomarkers are presently known for any food groups, it is not unreasonable to expect similar biases for at least some food groups. To the extent that this is so, our estimates of the correlations with true intake and attenuation factors could be biased and may overestimate the true parameters.

The two-part model used in the current analysis has been validated by computer simulations(9). In addition, graphical methods have been developed to assess the model’s goodness-of-fit to specific data(12). A comparison of Tables 1 and 3 indicates that the precision of the estimated correlations and attenuation factors is related to the frequency with which a food is consumed. The standard errors of the estimated correlations and attenuation factors for less frequently consumed food groups, such as legumes, fish and other starchy vegetables, tend to be larger than those for more frequently consumed food groups such as milk, whole grains and red meat, and, as we saw with other starchy vegetables in men, there is a possibility that the measurement error model will fail to converge if the food group is infrequently consumed. This is because there is less information about the amount consumed on consumption days when there are fewer consumption days in the data. In particular, if there are only a few subjects who have non-zero consumption on multiple days, then it is difficult to separate between- and within-person error (i.e. difficult to estimate the variances of U2i and ε2ij). To estimate infrequently consumed foods with more precision, it would be necessary to have a larger calibration sub-study.

A number of studies have validated FFQ for intakes of foods or food groups in American adults, including those described by Salvini et al.(25), Flagg et al.(7) and Millen et al.(8). Direct comparison with these studies is complicated by the fact that the food groups validated were generally not the same as in the present study and were not measured in MPED servings. Further, some studies, such as Salvini et al.(25), used food records rather than 24HR as reference instrument. To the extent that comparisons can be made, results of the present study are generally similar to the earlier studies. For example, Salvini et al.(25) reported energy-adjusted correlations for intake of fish, eggs (men) and tomatoes that were similar to those in Table 3, although the correlation for egg intake in women was somewhat higher in their study (0·77 compared to 0·55). Flagg et al.(7) reported energy-adjusted correlations for total grains, total vegetables and red meat that were similar to those in the present study.

The study most comparable to ours is an analysis of the Eating at America’s Table Study (EATS) reported by Millen et al.(8). In that analysis, the NCI’s DHQ was validated for food groups derived from the USDA Pyramid Servings Database(31), a database that is similar to MPED but based on earlier dietary guidelines. The DHQ is a later version of the FFQ used in the NIH–AARP study. In general, energy-adjusted correlations in EATS and the present study are similar, although there are some differences. For example, energy-adjusted correlations for total vegetables were 0·63 (men) and 0·66 (women) in EATS, compared with 0·55 (men) and 0·52 (women) in the present study. Possible explanations for these differences include the facts that the EATS sample comprised subjects aged 20–70 years, while the NIH–AARP sample was older (50–71 years), and that the EATS analysis did not use methods designed for episodically consumed foods.

As shown in Table 4, when the correlation of true and FFQ-reported intakes is at least 0·5, the NIH–AARP study will have at least 85% power to detect moderate diet–disease associations (odds ratios 1·5 or greater) for common cancer types such as prostate, breast, lung and colorectal. For less common types such as thyroid or liver, however, the power to detect such associations will be much lower. Similarly, when the attenuation factor is at least 0·4, moderate diet–disease associations may be substantially underestimated, but not to the point where they disappear altogether. For example, if the true odds ratio is 1·5 (α1 = log(1·5) in equation (4)) and the attenuation factor is 0·4, then the estimated odds ratio will have mean equal to about 1·5^{0·4} = 1·18. Moreover, when the attenuation factor is small, say less than 0·2, attempting to ‘deattenuate’ estimates will give unreliable results and is not advised. When the attenuation factor is at least 0·4, however, it is possible to deattenuate an estimated log odds ratio by dividing it by the attenuation factor, giving an approximately unbiased estimate(4). In summary, the levels of correlation and attenuation factor that we have estimated indicate that the NIH–AARP FFQ is suitable for estimating and testing many, but not all, diet–disease relationships in the NIH–AARP cohort.
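The attenuation arithmetic in the preceding paragraph can be checked with a short sketch. The function names are hypothetical and chosen for illustration only; the calculation assumes, as in the text, that the expected estimated log odds ratio is the true log odds ratio multiplied by the attenuation factor.

```python
import math


def attenuated_odds_ratio(true_or: float, attenuation: float) -> float:
    """Approximate mean estimated odds ratio: exp(attenuation * log(true OR))."""
    return math.exp(attenuation * math.log(true_or))


def deattenuated_odds_ratio(observed_or: float, attenuation: float) -> float:
    """Divide the observed log odds ratio by the attenuation factor."""
    if attenuation <= 0:
        raise ValueError("attenuation factor must be positive")
    return math.exp(math.log(observed_or) / attenuation)


# Worked example from the text: true odds ratio 1.5, attenuation factor 0.4.
observed = attenuated_odds_ratio(1.5, 0.4)
print(round(observed, 2))                                # 1.18
print(round(deattenuated_odds_ratio(observed, 0.4), 2))  # 1.5
```

The second call only inverts the first, so it recovers 1·5 exactly here; with real data the deattenuated estimate inherits the sampling error of both the observed odds ratio and the estimated attenuation factor.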

Acknowledgements

R.J.C.’s research was supported by a grant from the National Cancer Institute (CA57030). A.S. directed the study. A.S., A.F.S., F.E.T., L.S.F. and R.J.C. participated in the design of the study. D.M., L.S.F., R.J.C. and V.K. developed the statistical methodology and designed and reviewed the analysis. D.M. and M.A.S. carried out the analysis. D.M. had primary responsibility for writing the manuscript, but all authors contributed to its writing and revision.

Appendix 1

Box–Cox transformations

The Box–Cox transformation is defined as

$$g(x;\lambda)=\begin{cases}(x^{\lambda}-1)/\lambda & \text{if } \lambda>0\\ \log(x) & \text{if } \lambda=0\end{cases}\tag{5}$$

for some transformation parameter λ. We use Box–Cox transformations to transform Qi and positive Rij to approximate normality, defining Qi* = g(Qi; λQ) and Rij* = g(Rij; λR), and choosing λQ and λR so as to maximize the Shapiro–Wilk test statistic for normality for Qi and positive Rij. We also define Ti* = g(Ti; λT), choosing λT so as to minimize the Kolmogorov test statistic for normality for Ti in the Monte Carlo distribution (see Appendix 2).
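As an illustration of equation (5), the transformation and its inverse g⁻¹ (which appears in Appendix 2) might be sketched as follows. The function names are hypothetical, and the choice of λ by the Shapiro–Wilk or Kolmogorov statistic is not reproduced here.

```python
import math


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transformation g(x; lam) of a positive value x, as in equation (5)."""
    if x <= 0:
        raise ValueError("x must be positive")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(v: float, lam: float) -> float:
    """Inverse transformation g^{-1}(v; lam), mapping back to the original scale."""
    if lam == 0:
        return math.exp(v)
    base = lam * v + 1.0
    if base <= 0:
        raise ValueError("v is outside the range of the transformation")
    return base ** (1.0 / lam)


# Round trip on an arbitrary illustrative value.
v = box_cox(4.0, 0.5)            # (sqrt(4) - 1) / 0.5 = 2.0
print(v, inverse_box_cox(v, 0.5))
```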

Appendix 2

Monte Carlo distribution of true and FFQ-reported intakes

Under the assumptions that Rij is an unbiased estimate of Ti, and that the model defined by equations (1) and (2) is correct, one can write Ti as a function of Qi and random effects (U1i, U2i) as:

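$$T_i = H(\beta_{10}+\beta_{11}Q_i^*+U_{1i})\times E\{g^{-1}(\beta_{20}+\beta_{21}Q_i^*+U_{2i}+\varepsilon_{2ij};\lambda_R)\mid Q_i,U_{2i}\}\approx H(\beta_{10}+\beta_{11}Q_i^*+U_{1i})\times g^*(\beta_{20}+\beta_{21}Q_i^*+U_{2i};\lambda_R),\tag{6}$$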

where H(x) is the logistic function and g*(ν; λR) is a Taylor-series approximation of the expectation E{g−1(ν + ε2ij; λR)|ν},

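$$g^*(\nu;\lambda_R)=g^{-1}(\nu;\lambda_R)+\frac{1}{2}\,\sigma^2_{\varepsilon_2}\,\frac{\partial^2\{g^{-1}(\nu;\lambda_R)\}}{\partial\nu^2}.\tag{7}$$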

One can estimate relationships between true and FFQ-reported intakes by generating a Monte Carlo distribution of (Ti, Qi). For each individual i, generate random effects (U1i, U2i) having a joint normal distribution with variances $(\hat\sigma^2_{U_1}, \hat\sigma^2_{U_2})$ and correlation $\hat\rho_{U_1,U_2}$, and calculate Ti as in equation (6). Repeat this process m = 100 times for each individual, so that the resulting Monte Carlo distribution has n × m pseudo-individuals. Under the assumptions listed above, the Monte Carlo distribution will be approximately the same as the real distribution of (Ti, Qi), and one can use it to estimate the attenuation factor γ1 and correlation with true intake ρTQ, described in the main text. We estimate ρTQ as the sample correlation of Box–Cox transformed variables Ti* and Qi* in the Monte Carlo distribution. Similarly, we estimate γ1 as the slope in the regression of Ti* v. Qi*, after standardizing both variables so that a unit change on the transformed scale is equal to the change from the 10th to 90th percentile of true intake on that scale. Standard errors are estimated using a bootstrap method.
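The Monte Carlo procedure for unadjusted foods can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: all parameter values in `PARAMS` are hypothetical, the function names are invented for the example, and the correlation is computed on untransformed Ti rather than on Ti* with a fitted λT. The second derivative used in `g_star` follows from differentiating g⁻¹(ν; λ) = (λν + 1)^{1/λ} twice, which gives (1 − λ)(λν + 1)^{1/λ − 2}.

```python
import math
import random

# Hypothetical parameter estimates, chosen only for illustration.
PARAMS = {
    "b10": -0.5, "b11": 0.8,      # probability part of the two-part model
    "b20": 0.5, "b21": 0.5,       # amount part of the two-part model
    "var_u1": 1.0, "var_u2": 0.3, "rho_u": 0.4,
    "var_eps2": 0.5, "lam_r": 0.3,
}


def logistic(x):
    """Logistic function H(x)."""
    return 1.0 / (1.0 + math.exp(-x))


def g_star(v, lam, var_eps):
    """Equation (7): g^{-1}(v) plus half the error variance times its second derivative."""
    if lam == 0:
        return math.exp(v) * (1.0 + 0.5 * var_eps)
    base = lam * v + 1.0
    if base <= 0:
        return 0.0  # outside the domain of g^{-1}; treated as zero intake in this sketch
    inverse = base ** (1.0 / lam)
    second_derivative = (1.0 - lam) * base ** (1.0 / lam - 2.0)
    return inverse + 0.5 * var_eps * second_derivative


def simulate_true_intake(q_star, p, m=100, seed=0):
    """Return n*m pseudo-individuals (T_i, Q_i*) generated as in equation (6)."""
    rng = random.Random(seed)
    s1, s2, rho = math.sqrt(p["var_u1"]), math.sqrt(p["var_u2"]), p["rho_u"]
    draws = []
    for q in q_star:
        for _ in range(m):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            u1 = s1 * z1
            u2 = s2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)  # correlated with u1
            prob = logistic(p["b10"] + p["b11"] * q + u1)
            amount = g_star(p["b20"] + p["b21"] * q + u2, p["lam_r"], p["var_eps2"])
            draws.append((prob * amount, q))
    return draws


def pearson(xs, ys):
    """Sample Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


q_star = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical transformed FFQ values
draws = simulate_true_intake(q_star, PARAMS, m=100, seed=0)
rho_tq = pearson([t for t, _ in draws], [q for _, q in draws])
print(len(draws), round(rho_tq, 2))
```

In a full analysis the fitted model estimates would replace `PARAMS`, and the whole procedure would be repeated within bootstrap samples to obtain standard errors.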

The Monte Carlo method for energy-adjusted foods is similar to the method for unadjusted foods, except that we use the parameter estimates from the three-part food and energy model (see Appendix 3) to generate random effects (U1i, U2i, U3i) and create a Monte Carlo distribution of (TFi, TEi, QFi, QEi). We then calculate TRi as the residual in the regression of TFi* v. TEi*, and QRi as the residual in the regression of QFi* v. QEi*. Finally, we estimate ρTQ as the sample correlation of TRi and QRi, and γ1 as the slope in the regression of TRi v. QRi.

Appendix 3

Three-part food and energy model

For individual i, i = 1,…, n, let

  • TFi be the true usual intake of an episodically consumed food

  • TEi be the true usual intake of energy

  • RFij be the 24HR-reported intake of the episodically consumed food on day j

  • REij be the 24HR-reported intake of energy on day j

  • QFi be the FFQ-reported intake of the episodically consumed food

  • QEi be the FFQ-reported intake of energy.

Under the food and energy model, we assume that the 24HR is unbiased for true intake:

E(RFij|i)=TFi (8)

and

E(REij|i)=TEi, (9)

and that, after appropriate transformations, the relationship between 24HR and FFQ can be described by the following three-part non-linear mixed-effects model:

logit(pi)=β10+β11×QFi*+β12×QEi*+U1i, (10)
(RFij*|RFij>0)=β20+β21×QFi*+β22×QEi*+U2i+ε2ij (11)

and

REij*=β30+β31×QFi*+β32×QEi*+U3i+ε3ij, (12)

where pi is the probability that RFij > 0, logit(p) = log{p/(1 − p)} is the inverse of the logistic distribution function, random effects (U1i, U2i, U3i) have a joint normal distribution with mean zero, within-person random errors (ε2ij, ε3ij) have a joint normal distribution with mean zero, and within-person errors (ε2ij, ε3ij) are independent of random effects (U1i, U2i, U3i). Variables QFi*, QEi*, RFij* and REij* are Box–Cox transformations of QFi, QEi, RFij and REij to scales on which they are approximately normal (see Appendix 1).

Appendix 4

Estimating power in a univariate diet–disease model

Suppose we are fitting the diet–disease model, equation (4), using Qi to measure intake and there are D cases of disease in the cohort. Kaaks et al.(22) show that the power to detect α1 at significance level γ is approximately:

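$$\text{Power}\approx\Phi\left(|\alpha_1|/\sqrt{\operatorname{var}(\hat\alpha_1)}-z_{\gamma/2}\right)\approx\Phi\left(|\alpha_1|\,\rho_{TQ}\,\sigma_{T^*}\sqrt{D}-z_{\gamma/2}\right),\tag{13}$$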

where σT* is the standard deviation of Ti*, Φ(z) is the standard normal distribution function and z_{γ/2} = Φ^{−1}(1 − γ/2). The power to test the hypothesis that the odds ratio comparing the 90th to 10th percentile of true intake is equal to 1·5, i.e. that 2·56σT* × α1 = log(1·5), is then:

$$\text{Power}\approx\Phi\left(\{\log(1{\cdot}5)/2{\cdot}56\}\,\rho_{TQ}\sqrt{D}-z_{\gamma/2}\right).\tag{14}$$
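Equation (14) is straightforward to evaluate numerically. The sketch below uses only the Python standard library; the function name and the case counts in the example are hypothetical and are not taken from Table 4.

```python
import math
from statistics import NormalDist


def power_90th_vs_10th(rho_tq: float, n_cases: int,
                       odds_ratio: float = 1.5, sig_level: float = 0.05) -> float:
    """Approximate power from equation (14).

    rho_tq     correlation of true and FFQ-reported intake
    n_cases    number of disease cases D in the cohort
    odds_ratio odds ratio comparing the 90th to 10th percentile of true intake
    sig_level  two-sided significance level (gamma in the text)
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - sig_level / 2.0)
    # 2.56 standard deviations separate the 10th and 90th percentiles of a normal variable.
    effect_per_sd = math.log(odds_ratio) / 2.56
    return std_normal.cdf(effect_per_sd * rho_tq * math.sqrt(n_cases) - z)


# Illustrative case counts only.
for cases in (100, 500, 1500):
    print(cases, round(power_90th_vs_10th(0.5, cases), 2))
```

Because the power depends on ρTQ and D only through ρTQ√D, halving the correlation requires four times as many cases to reach the same power.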

Footnotes

None of the authors have any conflict of interest.

References

  • 1. Freudenheim JL, Marshall JR. The problem of profound mismeasurement and the power of epidemiologic studies of diet and cancer. Nutr Cancer. 1988;11:243–250. doi: 10.1080/01635588809513994.
  • 2. Freedman LS, Schatzkin A, Wax Y. The impact of dietary measurement error on planning a sample size required in a cohort study. Am J Epidemiol. 1990;132:1185–1195. doi: 10.1093/oxfordjournals.aje.a115762.
  • 3. Beaton GH, Milner J, Corey P, et al. Sources of variance in 24-hour dietary recall data; implications for nutritional study design and interpretation. Am J Clin Nutr. 1979;32:2546–2559. doi: 10.1093/ajcn/32.12.2546.
  • 4. Rosner B, Spiegelman D, Willett WC. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am J Epidemiol. 1990;132:734–744. doi: 10.1093/oxfordjournals.aje.a115715.
  • 5. Freedman LS, Carroll RJ, Wax Y. Estimating the relation between dietary intake obtained from a food frequency questionnaire and true average intake. Am J Epidemiol. 1991;134:310–320. doi: 10.1093/oxfordjournals.aje.a116086.
  • 6. Feskanich D, Rimm EB, Giovannucci EL, et al. Reproducibility and validity of food intake measurements from a semiquantitative food frequency questionnaire. J Am Diet Assoc. 1993;93:790–796. doi: 10.1016/0002-8223(93)91754-e.
  • 7. Flagg EW, Coates RJ, Calle EE, et al. Validation of the American Cancer Society Cancer Prevention Study II Nutritional Survey Cohort food frequency questionnaire. Epidemiology. 2000;11:462–468. doi: 10.1097/00001648-200007000-00017.
  • 8. Millen AE, Midthune D, Thompson FE, et al. The National Cancer Institute Diet History Questionnaire: validation of pyramid food servings. Am J Epidemiol. 2006;163:279–288. doi: 10.1093/aje/kwj031.
  • 9. Tooze JA, Midthune D, Dodd KW, et al. A new statistical method for estimating the usual intake of episodically consumed foods with application to their distribution. J Am Diet Assoc. 2006;106:1575–1587. doi: 10.1016/j.jada.2006.07.003.
  • 10. Freedman LS, Guenther PM, Krebs-Smith SM, et al. A population’s distribution of Healthy Eating Index-2005 component scores can be estimated when more than one 24-hour recall is available. J Nutr. 2010;140:1529–1534. doi: 10.3945/jn.110.124594.
  • 11. Marriott BP, Olsho L, Hadden L, et al. Intake of added sugars and selected nutrients in the United States, National Health and Nutrition Examination Survey (NHANES) 2003–2006. Crit Rev Food Sci Nutr. 2010;50:228–258. doi: 10.1080/10408391003626223.
  • 12. Kipnis V, Midthune D, Buckman DW, et al. Modeling data with excess zeros and measurement error: application to evaluating relationships between episodically consumed foods and health outcomes. Biometrics. 2009;65:1003–1010. doi: 10.1111/j.1541-0420.2009.01223.x.
  • 13. Thompson FE, Kipnis V, Midthune D, et al. Performance of a food-frequency questionnaire in the US NIH–AARP (National Institutes of Health–American Association of Retired Persons) Diet and Health Study. Public Health Nutr. 2008;11:183–195. doi: 10.1017/S1368980007000419.
  • 14. Schatzkin A, Subar AF, Thompson FE, et al. Design and serendipity in establishing a large cohort with wide dietary intake distributions: the National Institutes of Health–American Association of Retired Persons Diet and Health Study. Am J Epidemiol. 2001;154:1119–1125. doi: 10.1093/aje/154.12.1119.
  • 15. US National Cancer Institute, Division of Cancer Control and Population Sciences, Applied Research Program. Risk Factor Monitoring and Methods homepage. 2010 http://www.riskfactor.cancer.gov.
  • 16. Subar AF, Midthune D, Kulldorff M, et al. Evaluation of alternative approaches to assign nutrient values to food groups in food frequency questionnaires. Am J Epidemiol. 2000;152:279–286. doi: 10.1093/aje/152.3.279.
  • 17. Friday JE, Bowman SA. MyPyramid Equivalents Database for USDA Survey Food Codes, 1994–2002 Version 1.0. Beltsville, MD: USDA, Agricultural Research Service, Community Nutrition Research Group; 2006. available at http://www.ars.usda.gov/ba/bhnrc/fsrg.
  • 18. US Department of Health and Human Services & US Department of Agriculture. Dietary Guidelines for Americans. Washington, DC: US Government Printing Office; 2005.
  • 19. Box GEP, Cox DR. An analysis of transformations. J R Stat Soc B. 1964;26:211–252.
  • 20. US National Cancer Institute, Division of Cancer Control and Population Sciences, Applied Research Program. Risk Factor Monitoring and Methods. Usual Dietary Intakes: Background. 2010 http://www.riskfactor.cancer.gov/diet/usualintakes.
  • 21. Kipnis V, Midthune D, Freedman LS, et al. Empirical evidence of correlated biases in dietary assessment instruments and its implications. Am J Epidemiol. 2001;153:394–403. doi: 10.1093/aje/153.4.394.
  • 22. Kaaks R, Riboli E, van Staveren W. Calibration of dietary intake measurements in prospective cohort studies. Am J Epidemiol. 1995;142:548–556. doi: 10.1093/oxfordjournals.aje.a117673.
  • 23. Willett WC, Howe GR, Kushi LH. Adjustment for total energy intake in epidemiologic studies. Am J Clin Nutr. 1997;65 Suppl. 4:1220S–1228S. doi: 10.1093/ajcn/65.4.1220S.
  • 24. George SM, Mayne ST, Leitzmann MF, et al. Dietary glycemic index, glycemic load, and risk of cancer: a prospective cohort study. Am J Epidemiol. 2009;169:462–472. doi: 10.1093/aje/kwn347.
  • 25. Salvini S, Hunter DJ, Sampson L, et al. Food-based validation of a dietary questionnaire: the effects of week-to-week variation in food consumption. Int J Epidemiol. 1989;18:858–867. doi: 10.1093/ije/18.4.858.
  • 26. Bohlscheid-Thomas S, Hoting I, Boeing H, et al. Reproducibility and relative validity of food group intake in a food frequency questionnaire developed for the German part of the EPIC project. European Prospective Investigation into Cancer and Nutrition. Int J Epidemiol. 1997;26 Suppl. 1:S59–S69. doi: 10.1093/ije/26.suppl_1.s59.
  • 27. Shu XO, Yang G, Jin F, et al. Validity and reproducibility of the food frequency questionnaire used in the Shanghai Women’s Health Study. Eur J Clin Nutr. 2004;58:17–23. doi: 10.1038/sj.ejcn.1601738.
  • 28. Rosner B, Willett WC. Interval estimates for correlation coefficients corrected for within-person variation: implications for study design and hypothesis testing. Am J Epidemiol. 1988;127:377–386. doi: 10.1093/oxfordjournals.aje.a114811.
  • 29. Kaaks R, Riboli E, van Staveren W. Sample size requirements for calibration studies of dietary intake measurements in prospective cohort investigations. Am J Epidemiol. 1995;142:557–565. doi: 10.1093/oxfordjournals.aje.a117674.
  • 30. Kipnis V, Subar AF, Midthune D, et al. Structure of dietary measurement error: results of the OPEN Biomarker Study. Am J Epidemiol. 2003;158:14–21. doi: 10.1093/aje/kwg091.
  • 31. Cook A, Friday JE. Pyramid Servings Database for USDA Survey Food Codes, Version 2.0. Washington, DC: Agricultural Research Service, US Department of Agriculture; 2004.
