American Journal of Epidemiology. 2015 May 20;181(12):996–1007. doi: 10.1093/aje/kwu468

Applying Recovery Biomarkers to Calibrate Self-Report Measures of Energy and Protein in the Hispanic Community Health Study/Study of Latinos

Yasmin Mossavar-Rahmani *, Pamela A Shaw, William W Wong, Daniela Sotres-Alvarez, Marc D Gellman, Linda Van Horn, Mark Stoutenberg, Martha L Daviglus, Judith Wylie-Rosett, Anna Maria Siega-Riz, Fang-Shu Ou, Ross L Prentice
PMCID: PMC4462334  PMID: 25995289

Abstract

We investigated measurement error in the self-reported diets of US Hispanics/Latinos, who are prone to obesity and related comorbidities, by background (Central American, Cuban, Dominican, Mexican, Puerto Rican, and South American) in 2010–2012. In 477 participants aged 18–74 years, doubly labeled water and urinary nitrogen were used as objective recovery biomarkers of energy and protein intakes. Self-report was captured from two 24-hour dietary recalls. All measures were repeated in a subsample of 98 individuals. We examined the bias of dietary recalls and their associations with participant characteristics using generalized estimating equations. Energy intake was underestimated by 25.3% (men, 21.8%; women, 27.3%), and protein intake was underestimated by 18.5% (men, 14.7%; women, 20.7%). Protein density was overestimated by 10.7% (men, 11.3%; women, 10.1%). Higher body mass index and Hispanic/Latino background were associated with underestimation of energy (P < 0.05). For protein intake, higher body mass index, older age, nonsmoking, Spanish speaking, and Hispanic/Latino background were associated with underestimation (P < 0.05). Systematic underreporting of energy and protein intakes and overreporting of protein density were found to vary significantly by Hispanic/Latino background. We developed calibration equations that correct for subject-specific error in reporting that can be used to reduce bias in diet-disease association studies.

Keywords: biological markers, calibration equations, dietary measurement error, Hispanics/Latinos, 24-hour dietary recall, nutrition assessment


A growing body of research suggests that systematic bias and variability in self-reported dietary data can distort important associations between diet and disease (1–7). Although measurement error may relate to cultural differences or differences in the methodology of surveys (8), these biases have been relatively unexplored in the Hispanic/Latino population in the United States. Assessing objective recovery biomarkers such as doubly labeled water (DLW) and urinary nitrogen in combination with participant characteristics such as age, body mass index (weight (kg)/height (m)²), and background (2) helps to account for the systematic and random measurement error of dietary self-report. Calibration equations using this strategy can improve assessment of diet-disease associations by adjusting for self-reported intake measurement error (5, 9). For example, calibrated, but not uncalibrated, energy was positively correlated with total and site-specific cancer incidence and coronary heart disease incidence in postmenopausal US women (1, 10). Calibrated, but not uncalibrated, protein intake was also associated with diabetes risk in the same population (6).

As part of the Study of Latinos: Nutrition and Physical Activity Assessment Study (SOLNAS), biomarker and self-report measures of diet were collected in a subsample of the multicenter Hispanic Community Health Study (HCHS)/Study of Latinos (SOL) cohort to create estimates of calibrated energy and protein consumption. We also explored whether Hispanic/Latino background (Central American, Cuban, Dominican, Mexican, Puerto Rican, and South American), in addition to other participant characteristics, influences the measurement error of dietary self-report.

METHODS

Study population

HCHS/SOL is a community-based cohort study of 16,415 self-identified Hispanic/Latino adults aged 18–74 years from randomly selected households at 4 US sites (Chicago, Illinois; Miami, Florida; Bronx, New York; San Diego, California) with baseline examination (2008–2011) and yearly telephone follow-up assessment. The goals of HCHS/SOL are to describe the prevalence of risk and protective factors for chronic conditions (e.g., cardiovascular disease, diabetes, and pulmonary disease) and to quantify all-cause mortality, fatal and nonfatal cardiovascular disease and pulmonary disease, and pulmonary disease exacerbation over time. The baseline clinical examination (11) included comprehensive biological, behavioral, and sociodemographic assessments. The sample design and cohort selection have been previously described (12, 13). In 2011–2012, a total of 485 HCHS/SOL participants were enrolled in SOLNAS. Enrollment targets at each site were set by specific categories for age, body mass index, and background to mirror the characteristics of the parent study. Study procedures were approved by the institutional review boards of all the sites and the coordinating/reading centers.

Subjects whose baseline HCHS/SOL study visit was within the allowable window were invited to participate (Figure 1). Participants were excluded for having any medical condition precluding participation, being pregnant or breastfeeding a child, weight instability (weight loss or gain of >15 pounds (>6.8 kg) in the past 4 weeks), taking medication for diabetes, or having extended travel plans during the study period. Of 1,360 participants who were invited and screened for eligibility, 342 (25.1%) declined, 176 (12.9%) were unable to be contacted, 227 (16.7%) were ineligible, and 603 (44.3%) agreed to participate, of whom 485 (35.7% of total invited) signed an informed consent form. Seven participants did not return for the second clinic visit, and 1 participant did not provide either biomarker, leaving 477 who completed the protocol. A subsample of 98 participants (20%) repeated the entire protocol approximately 6 months later to provide reliability information.

Figure 1.


Study of Latinos: Nutrition and Physical Activity Assessment Study (SOLNAS) procedures, 2010–2012. Invitation letter and telephone screening for SOLNAS occurred 12 months after the parent study visit for the San Diego site. DLW, doubly labeled water; GPAQ, Global Physical Activity Questionnaire; HCHS, Hispanic Community Health Study; SOL, Study of Latinos. Actical is an accelerometer that converts accelerations to a unit called “counts” over a given time period (1 minute) (Philips Respironics, Bend, Oregon).

Study protocol and procedures

The DLW recovery biomarker was used to assess total energy expenditure over approximately a 2-week period (14). Total energy expenditure provides an estimate of energy intake in weight-stable individuals. After a loading dose of water labeled with deuterium plus the stable isotope oxygen-18 (DLW mixture), the tracers rapidly equilibrate in body water. The deuterium is eliminated from the body as water, and its elimination rate is proportional to water turnover. The oxygen-18 is eliminated as water plus carbon dioxide, and its elimination rate is proportional to the sum of water and carbon dioxide production. The difference between these 2 elimination rates is therefore proportional to the production of carbon dioxide, the end product of energy metabolism, from which total energy expenditure is estimated (14).

The study protocol consisted of 2 clinic visits with in-home activities between visits (Figure 1). Participants arrived for the first visit after a 4-hour fast and provided a baseline urine specimen (pre-DLW spot urine sample). Participants then ingested a DLW mixture that provided 1.38 g of 10 atom percent ¹⁸O-labeled water and 0.086 g of 99.9% deuterium-labeled water per kilogram of body weight and provided in-clinic spot urine samples at 3 and 4 hours. Participants received a meal replacement beverage and additional fluids as necessary for post-DLW urine production. Participants aged ≥60 years provided a blood sample 3 hours after isotope dosing to allow adjustment for age-related postvoid urine retention (15). Twelve days later, at the second clinic visit, subjects provided 2 more timed spot urine samples and a 12-hour fasting blood draw; they also completed indirect calorimetry to assess the resting metabolic rate for a related study on physical activity. Twenty percent of the sample repeated the study in visits 3 and 4.

Testing for quality control using blinded duplicate spot urine samples for DLW was done on 5% (n = 27) of the cohort. The intraclass correlation coefficient between the blind duplicate samples was 0.98 (P < 0.001), and the coefficient of variation was 3.3%. Isotopes for the biospecimens were measured by mass spectrometry at the Gas-Isotope-Ratio Mass Spectrometry Laboratory, US Department of Agriculture/Agricultural Research Service Children's Nutrition Research Center, Baylor College of Medicine, Houston, Texas (16, 17). Total energy expenditure was calculated from the carbon dioxide production rate by using the modified Weir equation (18). For the calculation of total energy expenditure, the standard respiratory quotient or food quotient of 0.86 for populations consuming a Western diet, which is based on a high-fat diet, was used (19). For energy-related analyses, we further excluded 6 participants without the DLW recovery biomarker in the primary study, leaving 471 in the primary study and 96 in the reliability study (Web Figure 1, available at http://aje.oxfordjournals.org/).
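The final step described above, converting a carbon dioxide production rate into total energy expenditure with a fixed food quotient, can be sketched as follows. This is only an illustration under stated assumptions: it uses the commonly quoted abbreviated Weir coefficients (3.941 kcal per liter of oxygen and 1.106 kcal per liter of carbon dioxide), which may differ from the exact modified form in reference 18, and the function name and example input are hypothetical.

```python
def tee_from_co2(vco2_l_per_day: float, food_quotient: float = 0.86) -> float:
    """Estimate total energy expenditure (kcal/day) from CO2 production (L/day).

    Assumption: abbreviated Weir equation, EE = 3.941 * VO2 + 1.106 * VCO2,
    with oxygen consumption inferred as VO2 = VCO2 / food_quotient.
    The exact modified equation used in the study may differ.
    """
    if vco2_l_per_day < 0:
        raise ValueError("CO2 production cannot be negative")
    if food_quotient <= 0:
        raise ValueError("food quotient must be positive")
    vo2_l_per_day = vco2_l_per_day / food_quotient
    return 3.941 * vo2_l_per_day + 1.106 * vco2_l_per_day


# Hypothetical input: 450 L of CO2 per day at the food quotient of 0.86.
example_tee = tee_from_co2(450.0)
print(f"Estimated total energy expenditure: {example_tee:.0f} kcal/day")
```

Because the food quotient enters as a divisor, assuming a lower quotient for the same carbon dioxide production yields a higher energy expenditure estimate.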

Urinary nitrogen biomarker of protein intake

Urinary nitrogen serves as a recovery biomarker for protein intake with 81% of protein intake recovered in the urine. Protein intake (g/day) is calculated as 6.25 × (24-hour urinary nitrogen/0.81) (20). Urinary nitrogen was assayed by the Michigan State University laboratory using the Kjeldahl digestion method followed by a colorimetric measurement of nitrogen using a kit manufactured by Hach Company (Loveland, Colorado). The intraclass correlation coefficient for blinded quality control duplicate samples (10%) for urinary nitrogen was 0.99, and the coefficient of variation was 6.1%.
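The conversion stated above, 6.25 × (24-hour urinary nitrogen/0.81), is simple enough to express directly. The sketch below is illustrative only; the function name and the example nitrogen value are hypothetical.

```python
NITROGEN_TO_PROTEIN = 6.25  # g of protein per g of nitrogen, as given in the text
URINARY_RECOVERY = 0.81     # fraction of protein intake recovered in urine, as given in the text


def protein_intake_g_per_day(urinary_nitrogen_g_per_day: float) -> float:
    """Biomarker protein intake (g/day) from 24-hour urinary nitrogen (g/day)."""
    if urinary_nitrogen_g_per_day < 0:
        raise ValueError("urinary nitrogen cannot be negative")
    return NITROGEN_TO_PROTEIN * urinary_nitrogen_g_per_day / URINARY_RECOVERY


# Hypothetical 24-hour urinary nitrogen of 13 g/day.
example_protein = protein_intake_g_per_day(13.0)
print(f"Estimated protein intake: {example_protein:.1f} g/day")
```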

Prior to the second visit, participants collected urine over a 24-hour period. Participants kept a detailed diary of the number of voids that they missed or spilled and indicated whether they took 3 para-aminobenzoic acid (PABA) pills (100 mg/tablet; KAL-PABA, Nutraceutical Corporation, Park City, Utah), 1 at each meal, which are used to assess the completeness of the urinary collection. All pills were from the same lot (number 140308) and were quality tested by Rhumbline Consulting (Pasadena, Maryland) to assess dissolution of tablets and amount recovered. Given the recommendations to check only PABA for unreliable samples (21), we tested only those samples (n = 5) that we deemed unreliable: those reporting <25 g/day of protein and/or 24-hour urine samples with small volumes (<500 mL), as well as a 10% random sample of the SOLNAS 24-hour urine samples (n = 54). This test was performed to determine the level of urinary completion using gas chromatography at the Fred Hutchinson Cancer Research Center, Seattle, Washington. The average urinary completion rate (recovered PABA, ≥70%), excluding unreliable samples (n = 5) for the 10% random sample, was 44% (n = 54) (refer to the assessment of urinary completion in the Web Appendix). For the protein analyses, we excluded 27 participants from the main study and 7 from the reliability study because of a missing protein biomarker, a urine sample of <500 mL, or an inadequate sample due to 2 or more missed urine collections, leaving 450 individuals in the main study and 90 in the reliability study (Web Figure 1).

Dietary assessment

Two 24-hour dietary recalls were collected in the HCHS/SOL parent study (22). One in-person recall was collected at baseline with a second telephone recall occurring from 5 days to a year later with the majority of recalls collected 5–90 days after the baseline visit. In SOLNAS, an in-person 24-hour dietary recall was also collected at the first visit, mirroring the procedure used for the parent study diet assessment. In this analysis, dietary data for the 24-hour dietary recall are based on the second telephone recall from the parent study and the first SOLNAS in-person recall. By combining the SOLNAS in-person recall with the HCHS/SOL telephone recall, we used the 24-hour dietary recall measures closest to the SOLNAS baseline. For the reliability study, an in-person recall at visit 3 and then a telephone recall 5–90 days after visit 3 were collected to repeat the 24-hour dietary recall assessment protocol. Recalls were conducted by using Nutrition Data System for Research, version 11, software developed by the Nutrition Coordinating Center, University of Minnesota (Minneapolis, Minnesota). Recalls were conducted by bilingual interviewers, most of whom were native Spanish speakers and certified in the use of Nutrition Data System for Research software, using the language preferred by the respondent. In the analyses, we excluded seventeen 24-hour dietary recalls that were deemed unreliable by the interviewer or had energy intake of <500 kcal/day. No participant had both dietary recalls excluded.

Data on demographic, health, lifestyle, and acculturation characteristics were collected at the HCHS/SOL parent study baseline visit. Self-reported physical activity in a typical week was assessed using a modified Global Physical Activity Questionnaire. We constructed a diet behavior variable based on meals and snacks eaten at home from the 24-hour dietary recall (<75% of meals and snacks vs. ≥75% of meals and snacks at home after averaging the percentage of meals and snacks at home for each 24-hour dietary recall).

Statistical analyses

We used log-transformed consumption estimates for each of energy, protein, and protein density (percentage of energy derived from protein) for statistical analyses. From the 24-hour dietary recall, we estimated usual intake from the 2-day mean. Regression calibration equations were developed by using linear regression models that predicted true intakes of energy and protein by regressing the biomarker measures on the 2-day mean of self-reported intakes and other study subject characteristics, as described previously (7). Stepwise backwards selection was used to select the regression calibration model. The final model includes only those covariates that were significant at the 0.10 level. The linear regression models were fitted by using data from both the primary and reliability measures of intake and biomarkers, and coefficients were determined by using generalized estimating equations with a working independence assumption. P values were determined by Wald tests with robust variance, as estimated by generalized estimating equations. All statistical procedures were conducted with SAS, version 9.3, software (SAS Institute, Inc., Cary, North Carolina) and R, version 3.0.1, software (R Foundation for Statistical Computing, Vienna, Austria).
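For a linear model, generalized estimating equations with a working independence assumption give the same coefficients as ordinary least squares, with standard errors taken from a cluster-robust (sandwich) variance that accounts for repeated measures on the same person. The sketch below illustrates that idea on synthetic data; it is not the authors' SAS or R code, the function and variable names are hypothetical, and no small-sample correction is applied.

```python
import numpy as np


def gee_independence(X, y, cluster):
    """Linear GEE with working independence.

    Returns OLS coefficients and cluster-robust (sandwich) standard errors,
    where `cluster` identifies repeated measures on the same participant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(cluster):
        rows = cluster == c
        score = X[rows].T @ resid[rows]  # summed score for one participant
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))


# Synthetic data (assumption): 200 participants, 40 of whom repeat the protocol.
rng = np.random.default_rng(0)
n_subjects, n_repeat = 200, 40
subject = np.concatenate([np.arange(n_subjects), np.arange(n_repeat)])
log_self_report = rng.normal(7.5, 0.3, size=subject.size)
bmi_centered = rng.normal(0.0, 6.0, size=n_subjects)[subject]
subject_effect = rng.normal(0.0, 0.05, size=n_subjects)[subject]
log_biomarker = (7.9 + 0.05 * (log_self_report - 7.5) + 0.015 * bmi_centered
                 + subject_effect + rng.normal(0.0, 0.1, size=subject.size))

# Regress the log biomarker on centered log self-report and a covariate.
X = np.column_stack([np.ones(subject.size), log_self_report - 7.5, bmi_centered])
beta, se = gee_independence(X, log_biomarker, subject)
wald_z = beta / se  # Wald statistics based on the robust variance
print(np.round(beta, 4), np.round(se, 4))
```

Pooling the primary and reliability measurements in one fit, as the text describes, adds information without treating a participant's two visits as independent observations when the standard errors are computed.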

RESULTS

Table 1 shows demographic and lifestyle characteristics by sex in the SOLNAS participants (n = 477). Overall, SOLNAS participants resembled the HCHS/SOL parent study participants in age, body mass index, Hispanic/Latino background, Spanish language preference, and education. The mean age of the participants was 46.0 years at the HCHS/SOL baseline visit. The average number of days between the parent study visit and the first SOLNAS visit was 229 (standard deviation, 55) days. Of the sample, 30% were Mexican, 25.8% Puerto Rican, 14.5% Cuban, 10.7% Central American, 10.1% Dominican, and 9.0% South American. The mean body mass index was 29.6 (standard deviation, 6); 0.8% were underweight (body mass index (BMI), <18.5), 19.1% were normal weight (BMI, 18.5 to <25), 39.8% were overweight (BMI, 25 to <30), and 40.3% were obese (BMI, ≥30). Three of 4 participants preferred Spanish, and 68.5% had a household income of <$30,000. More than half of SOLNAS participants reported having high physical activity, with 61.8% meeting 2008 Physical Activity Guidelines for Americans (23). Half reported no current use of alcohol, and 21% were current smokers.

Table 1.

Demographic and Lifestyle Characteristics of Participants in the Study of Latinos: Nutrition and Physical Activity Assessment Study, by Sex, 2010–2011a

Characteristic No. Overall (n = 477), % Male (n = 189), % Female (n = 288), %
Age group, years
 18–24 42 8.8 12.7 6.3
 25–39 92 19.3 20.6 18.4
 40–54 211 44.2 42.9 45.1
 55–74 132 27.7 23.8 30.2
BMIb group
 Underweight (<18.5) 4 0.8 1.1 0.7
 Normal (18.5–24.9) 91 19.1 19.6 18.8
 Overweight (25–29.9) 190 39.8 40.2 39.6
 Obese (≥30) 192 40.3 39.2 41.0
Hispanic/Latino background
 Central American 51 10.7 11.6 10.1
 Cuban 69 14.5 16.9 12.8
 Dominican 48 10.1 9.5 10.4
 Mexican 143 30 25.9 32.6
 Puerto Rican 123 25.8 27.0 25.0
 South American 43 9.0 9.0 9.0
Language of preference, Spanish 364 76.3 69.8 80.6
Yearly household income
 Missing 42 8.8 9.0 8.7
 ≤$10,000 68 14.3 11.1 16.3
 $10,001–$20,000 163 34.2 32.3 35.4
 $20,001–$40,000 138 28.9 31.2 27.4
 $40,001–$75,000 56 11.7 12.2 11.5
 >$75,000 10 2.1 4.2 0.7
Education status
 Less than high school 153 32.1 27.0 35.4
 High school or equivalent (GED) 119 24.9 27.5 23.3
 Trade/vocational school 70 14.7 12.2 16.3
 University/college 135 28.3 33.3 25.0
Cigarette use
 Never 284 59.5 51.3 64.9
 Former 92 19.3 24.3 16.0
 Current 100 21.0 23.8 19.1
Alcohol use/drinking levelc
 No current use 250 52.4 40.2 60.4
 Low-level use 206 43.2 51.9 37.5
 High-level use 21 4.4 7.9 2.1
≥75% meals and snacks at home, yes 325 68.1 56.6 75.7
Self-reported physical activity leveld per 2008 guidelines
 Inactive 124 26.0 16.9 31.9
 Low 58 12.2 9.0 14.2
 Moderate 44 9.2 8.5 9.7
 High 251 52.6 65.6 44.1

Abbreviations: BMI, body mass index; GED, General Educational Development (test).

a Based on the Hispanic Community Health Study/Study of Latinos parent study baseline visit.

b BMI expressed as weight (kg)/height (m)².

c Current low-level use: <14 drinks/week; current high-level use: ≥14 drinks/week.

d Self-reported physical activity in a typical week, assessed using an interviewer-administered modified Global Physical Activity Questionnaire (available at https://www2.cscc.unc.edu/hchs/system/files/forms/UNLICOMMPhysicalActivityPAE02182008.pdf). The 2008 Physical Activity Guidelines for Americans are available at http://www.health.gov/paguidelines/guidelines/.

Tables 2 and 3 show age-adjusted geometric means for recovery biomarkers and 24-hour dietary recall measures of energy, protein, and protein density by Hispanic/Latino background and sex. The 2-day, 24-hour dietary recall mean underestimated energy and protein intakes and overestimated protein density. The ratios of 24-hour dietary recall to recovery biomarker (× 100) were 74.7% overall for energy (72.7% for women and 78.2% for men), 81.5% overall for protein (79.3% for women and 85.3% for men), and 110.7% overall for protein density (110.1% for women and 111.3% for men). There were differences in the underestimation of energy and protein intakes by Hispanic/Latino background and sex. Although all groups underreported energy intake, male and female Dominicans exhibited the highest, and South American males and females the lowest, level of underreporting. As for protein intake, male and female Dominicans exhibited the highest, and Puerto Rican females and Mexican males the lowest, level of underreporting. The level of overestimation for protein density was highest for Puerto Rican males and females.
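The link between these ratios and the percentages of under- and overestimation reported in the abstract is simple arithmetic: a ratio of 74.7% corresponds to 25.3% underestimation. The sketch below shows that conversion, together with an unadjusted geometric-mean ratio computed on the log scale. The helper names and the two-person example are hypothetical, and the published values are additionally age adjusted, which this sketch does not attempt.

```python
import math


def geometric_mean_ratio_pct(self_report, biomarker):
    """100 x geometric mean of per-person self-report/biomarker ratios (unadjusted)."""
    if len(self_report) != len(biomarker) or not self_report:
        raise ValueError("inputs must be non-empty and of equal length")
    log_ratios = [math.log(s / b) for s, b in zip(self_report, biomarker)]
    return 100.0 * math.exp(sum(log_ratios) / len(log_ratios))


def percent_bias(ratio_pct):
    """Negative values indicate underestimation; positive values, overestimation."""
    return ratio_pct - 100.0


# Overall ratios quoted in the text (100 x recall/biomarker).
reported_ratios = {"energy": 74.7, "protein": 81.5, "protein density": 110.7}
bias = {name: round(percent_bias(r), 1) for name, r in reported_ratios.items()}
print(bias)

# Hypothetical two-person example on the kcal/day scale.
example_ratio = geometric_mean_ratio_pct([1500.0, 2000.0], [2000.0, 2000.0])
print(f"Geometric mean ratio: {example_ratio:.1f}%")
```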

Table 2.

Age-Adjusted Geometric Mean Values for Nutritional Biomarker and Self-Reported Measures of Energy, Protein, and Protein Density by Hispanic/Latino Background, Females,a Study of Latinos: Nutrition and Physical Activity Assessment Study, 2010–2012

Values are geometric mean (95% CI).

| Assessment | Overall (n = 285) | Dominican (n = 29) | Central American (n = 29) | Cuban (n = 37) | Mexican (n = 93) | Puerto Rican (n = 72) | South American (n = 25) | P Value^b |
|---|---|---|---|---|---|---|---|---|
| Energy, kcal/day | | | | | | | | |
| 2-Day mean | 1,579 (1,513, 1,647) | 1,266 (1,115, 1,439) | 1,470 (1,293, 1,670) | 1,558 (1,376, 1,763) | 1,744 (1,624, 1,872) | 1,538 (1,418, 1,667) | 1,661 (1,446, 1,907) | 0.0009 |
| Biomarker | 2,170 (2,128, 2,213) | 2,047 (1,927, 2,173) | 2,024 (1,906, 2,149) | 2,233 (2,106, 2,367) | 2,240 (2,167, 2,316) | 2,254 (2,170, 2,341) | 2,011 (1,885, 2,147) | 0.0006 |
| 2-Day mean/biomarker | 72.7 (69.6, 76.1) | 61.9 (54.0, 71.0) | 72.6 (63.3, 83.3) | 69.8 (61.1, 79.7) | 77.8 (72.1, 84.0) | 68.2 (62.6, 74.4) | 82.6 (71.2, 95.7) | 0.0187 |
| Protein, g/day | | | | | | | | |
| 2-Day mean | 64.5 (61.7, 67.5) | 51.3 (44.8, 58.6) | 61.3 (53.6, 70.1) | 62.7 (54.9, 71.5) | 71.9 (66.6, 77.6) | 62.2 (57.0, 68.0) | 69.7 (60.4, 80.5) | 0.0007 |
| Biomarker | 81.3 (78.1, 84.7) | 77.4 (68.5, 87.6) | 79.5 (70.3, 90.0) | 85.5 (75.8, 96.5) | 88.9 (82.8, 95.3) | 72.0 (66.4, 78.1) | 86.5 (75.8, 98.8) | 0.0049 |
| 2-Day mean/biomarker | 79.3 (74.9, 83.9) | 66.2 (55.8, 78.6) | 77.1 (64.9, 91.5) | 73.3 (61.9, 86.7) | 80.9 (73.4, 89.2) | 86.4 (77.2, 96.8) | 80.5 (67.0, 96.8) | 0.1767 |
| Protein density^c | | | | | | | | |
| 2-Day mean | 16.5 (16.0, 17.0) | 16.1 (14.7, 17.7) | 17.0 (15.5, 18.6) | 16.4 (15.0, 17.9) | 16.4 (15.6, 17.3) | 16.7 (15.7, 17.7) | 17.0 (15.4, 18.7) | 0.9564 |
| Biomarker | 15.0 (14.4, 15.6) | 14.9 (13.1, 17.0) | 15.7 (13.9, 17.8) | 15.3 (13.5, 17.4) | 15.9 (14.8, 17.1) | 12.8 (11.8, 13.9) | 17.4 (15.2, 20.0) | 0.0008 |
| 2-Day mean/biomarker | 110.1 (105.2, 115.3) | 108.1 (93.9, 124.4) | 108.2 (94.2, 124.2) | 106.8 (93.2, 122.4) | 103.5 (95.6, 112.0) | 130.1 (118.8, 142.6) | 97.3 (83.6, 113.2) | 0.0030 |

Abbreviation: CI, confidence interval.

a Overall sample sizes among females for energy, protein, and protein density are 285, 275, and 272, respectively.

b Global test (5 df) of the null hypothesis that all Hispanic/Latino backgrounds have equal (geometric) mean intakes.

c Percentage of energy derived from protein.

Table 3.

Age-Adjusted Geometric Mean Values for Nutritional Biomarker and Self-Reported Measures of Energy, Protein, and Protein Density by Hispanic/Latino Background, Males,a Study of Latinos: Nutrition and Physical Activity Assessment Study, 2010–2012

Values are geometric mean (95% CI).

| Assessment | Overall (n = 186) | Dominican (n = 18) | Central American (n = 22) | Cuban (n = 30) | Mexican (n = 48) | Puerto Rican (n = 51) | South American (n = 17) | P Value^b |
|---|---|---|---|---|---|---|---|---|
| Energy, kcal/day | | | | | | | | |
| 2-Day mean | 2,127 (2,019, 2,242) | 1,541 (1,304, 1,822) | 1,965 (1,696, 2,278) | 2,175 (1,906, 2,482) | 2,393 (2,168, 2,641) | 2,046 (1,858, 2,254) | 2,547 (2,128, 3,050) | 0.0001 |
| Biomarker | 2,721 (2,655, 2,788) | 2,652 (2,452, 2,869) | 2,730 (2,547, 2,926) | 2,528 (2,376, 2,690) | 2,869 (2,738, 3,005) | 2,790 (2,666, 2,920) | 2,666 (2,449, 2,902) | 0.0364 |
| 2-Day mean/biomarker | 78.2 (74.0, 82.6) | 58.1 (48.6, 69.5) | 72.0 (61.5, 84.3) | 86.0 (74.7, 99.1) | 83.4 (75.0, 92.7) | 73.3 (66.1, 81.3) | 95.6 (78.8, 115.9) | 0.0012 |
| Protein, g/day | | | | | | | | |
| 2-Day mean | 86.4 (81.7, 91.4) | 65.6 (55.0, 78.2) | 85.0 (72.5, 99.7) | 84.8 (73.7, 97.5) | 97.8 (87.9, 108.7) | 82.8 (74.6, 91.9) | 105.5 (85.0, 131.0) | 0.0023 |
| Biomarker | 101.2 (96.2, 106.5) | 108.1 (92.0, 127.0) | 100.3 (86.6, 116.1) | 99.6 (87.5, 113.3) | 104.7 (95.0, 115.5) | 93.7 (85.2, 103.2) | 124.8 (102.3, 152.2) | 0.1618 |
| 2-Day mean/biomarker | 85.3 (79.5, 91.6) | 60.7 (48.5, 75.9) | 84.8 (69.2, 103.9) | 85.1 (71.1, 101.8) | 93.3 (81.5, 106.9) | 88.4 (77.3, 101.0) | 84.6 (64.1, 111.5) | 0.0555 |
| Protein density^c | | | | | | | | |
| 2-Day mean | 16.5 (15.9, 17.1) | 17.3 (15.3, 19.4) | 17.4 (15.6, 19.3) | 15.6 (14.2, 17.3) | 16.5 (15.4, 17.8) | 16.5 (15.4, 17.7) | 16.3 (14.1, 18.9) | 0.7702 |
| Biomarker | 14.8 (14.0, 15.6) | 16.3 (13.8, 19.2) | 14.7 (12.7, 17.1) | 15.6 (13.6, 17.9) | 14.5 (13.1, 16.0) | 13.5 (12.2, 14.9) | 17.9 (14.6, 22.0) | 0.1188 |
| 2-Day mean/biomarker | 111.3 (105.1, 118.0) | 105.9 (88.3, 126.8) | 118.0 (100.1, 139.0) | 100.1 (86.2, 116.4) | 114.2 (102.2, 127.5) | 122.4 (109.9, 136.3) | 91.1 (72.8, 113.9) | 0.1101 |

Abbreviation: CI, confidence interval.

a Overall sample sizes among males for energy, protein, and protein density are 186, 175, and 172, respectively.

b Global test (5 df) of the null hypothesis that all Hispanic/Latino backgrounds have equal (geometric) mean intakes.

c Percentage of energy derived from protein.

Table 4 shows the fitted multivariate regression model of log (self-report/biomarker), which is parameterized so that the intercept represents the mean for the baseline group of inactive, nonsmoking males of average age (46 years), average body mass index of 29.6, Spanish language preference, low income (≤$10,000), less than high school education, Mexican background, no alcohol use, and <75% of meals and snacks taken at home. The exponentiated coefficient indicates change in the ratio of the geometric means of the self-report and biomarker nutrients.

Table 4.

Linear Regression of Log(Self-Report) Minus Log(Biomarker) on the Participant Characteristics for Self-Reported Measures of Energy, Protein, and Protein Density, Study of Latinos: Nutrition and Physical Activity Assessment Study, 2010–2012a

| Characteristic | Energy β | Energy SE | Energy P Value^b | Protein β | Protein SE | Protein P Value^b | Protein Density β | Protein Density SE | Protein Density P Value^b |
|---|---|---|---|---|---|---|---|---|---|
| Intercept | −0.1927 | 0.075 | 0.010 | −0.0905 | 0.093 | 0.328 | 0.1077 | 0.080 | 0.179 |
| Age, years^c | −0.0010 | 0.001 | 0.486 | −0.0052 | 0.002 | 0.004 | −0.0042 | 0.002 | 0.007 |
| BMI^c | −0.0183 | 0.003 | <0.001 | −0.0153 | 0.003 | <0.001 | 0.0034 | 0.003 | 0.196 |
| Female | −0.0286 | 0.034 | 0.405 | −0.0397 | 0.042 | 0.346 | −0.0198 | 0.038 | 0.599 |
| Hispanic/Latino background | | | | | | | | | |
| Central American | −0.0914 | 0.053 | 0.083 | −0.0199 | 0.067 | 0.768 | 0.0839 | 0.056 | 0.136 |
| Cuban | 0.0037 | 0.056 | 0.948 | 0.0023 | 0.060 | 0.969 | −0.0173 | 0.066 | 0.792 |
| Dominican | −0.2861 | 0.056 | <0.001 | −0.2756 | 0.074 | <0.001 | 0.0221 | 0.065 | 0.735 |
| Puerto Rican | −0.1323 | 0.047 | 0.004 | −0.0184 | 0.057 | 0.746 | 0.1051 | 0.052 | 0.042 |
| South American | −0.0028 | 0.053 | 0.958 | −0.1132 | 0.080 | 0.158 | −0.0998 | 0.060 | 0.096 |
| English preference | 0.0601 | 0.044 | 0.170 | 0.1152 | 0.058 | 0.047 | 0.0882 | 0.054 | 0.103 |
| Income | | | | | | | | | |
| Missing | −0.0387 | 0.075 | 0.605 | −0.1744 | 0.078 | 0.025 | −0.1527 | 0.064 | 0.018 |
| $10,001–$20,000 | −0.0312 | 0.049 | 0.526 | −0.1090 | 0.061 | 0.075 | −0.0861 | 0.046 | 0.061 |
| $20,001–$40,000 | −0.0272 | 0.053 | 0.608 | −0.0753 | 0.069 | 0.274 | −0.0620 | 0.053 | 0.242 |
| $40,001–$50,000 | −0.0932 | 0.066 | 0.160 | −0.1625 | 0.078 | 0.038 | −0.0707 | 0.071 | 0.317 |
| $50,001–$75,000 | 0.0300 | 0.077 | 0.695 | 0.0068 | 0.095 | 0.943 | −0.0298 | 0.099 | 0.764 |
| >$75,000 | −0.0447 | 0.127 | 0.725 | −0.1926 | 0.149 | 0.196 | −0.2082 | 0.112 | 0.063 |
| Education | | | | | | | | | |
| High school | 0.0100 | 0.044 | 0.818 | 0.0374 | 0.054 | 0.492 | 0.0396 | 0.043 | 0.359 |
| Trade school | −0.0084 | 0.056 | 0.880 | 0.0005 | 0.059 | 0.994 | 0.0241 | 0.064 | 0.707 |
| University | 0.0439 | 0.039 | 0.260 | 0.0443 | 0.052 | 0.392 | −0.0052 | 0.046 | 0.910 |
| Current smoker | −0.0155 | 0.044 | 0.725 | 0.1165 | 0.056 | 0.037 | 0.1454 | 0.048 | 0.002 |
| Alcohol drinking level | | | | | | | | | |
| Low-level use | 0.0036 | 0.032 | 0.909 | −0.0329 | 0.038 | 0.391 | −0.0262 | 0.034 | 0.446 |
| High-level use | 0.0064 | 0.095 | 0.946 | 0.0550 | 0.144 | 0.702 | 0.0549 | 0.109 | 0.614 |
| ≥75% of meals and snacks at home | −0.0472 | 0.032 | 0.143 | −0.0292 | 0.042 | 0.489 | 0.0237 | 0.037 | 0.522 |
| Physical activity^d | | | | | | | | | |
| Low | 0.0316 | 0.050 | 0.524 | −0.0150 | 0.069 | 0.829 | −0.0372 | 0.052 | 0.476 |
| Moderate | 0.0238 | 0.057 | 0.676 | 0.0154 | 0.066 | 0.815 | −0.0002 | 0.058 | 0.997 |
| High | 0.0395 | 0.044 | 0.374 | 0.0110 | 0.049 | 0.821 | −0.0353 | 0.044 | 0.427 |

Abbreviations: BMI, body mass index; SE, standard error.

a Regression model was fitted by using data from both the primary and reliability studies and using generalized estimating equations with a working independence assumption.

b Overall P values for Hispanic/Latino background were significant for energy (P < 0.001), protein (P = 0.006), and protein density (P = 0.027).

c Baseline group: Mexican, age centered at mean of 46 years, BMI (weight (kg)/height (m)²) centered at mean of 29.6, male, Spanish language preference, low income (≤$10,000), low education (less than high school), nonsmoker, no alcohol use, and no physical activity.

d Global Physical Activity Questionnaire.

For energy intake, the ratio of geometric means for the self-reported intake/biomarker is approximately 12% lower for Puerto Ricans, compared with Mexicans, keeping all other factors the same. For energy intake, body mass index (P < 0.001) and Hispanic/Latino background (P < 0.001) were independently associated with the difference between the log values of self-reported and biomarker values. Increasing body mass index was associated with more underreporting. Adjusted for the other factors in the model, the reference group, Mexicans, had significant underreporting of energy, and Dominicans, Central Americans, and Puerto Ricans had greater underreporting compared with Mexicans.
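Because the outcome in Table 4 is log(self-report) minus log(biomarker), exponentiating a coefficient gives the multiplicative change in the self-report/biomarker ratio. The sketch below applies that conversion to a few energy coefficients copied from Table 4; the helper name is hypothetical.

```python
import math


def ratio_change_pct(beta: float) -> float:
    """Percent change in the self-report/biomarker ratio implied by a log-scale coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)


# Energy coefficients taken from Table 4.
energy_betas = {
    "Intercept (baseline group vs. no bias)": -0.1927,
    "Puerto Rican vs. Mexican": -0.1323,
    "Dominican vs. Mexican": -0.2861,
    "BMI, per 1-unit increase": -0.0183,
}

for label, beta in energy_betas.items():
    print(f"{label}: {ratio_change_pct(beta):+.1f}%")
```

The Puerto Rican coefficient of −0.1323 corresponds to a ratio about 12% lower than that for Mexicans, matching the figure quoted in the text.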

For protein intake, age (P = 0.004), body mass index (P < 0.001), Hispanic/Latino background (P = 0.006), language preference (P = 0.047), and smoking (P = 0.037) were significant independent predictors of misreporting. Increasing age, increasing body mass index, and Spanish language preference were associated with more underreporting of protein intake. Being a current smoker was associated with less underreporting and potentially overreporting of protein intake. For protein density, age (P = 0.007), Hispanic/Latino background (P = 0.027), and smoking (P = 0.002) were independent significant predictors of misreporting. Being younger or a smoker was associated with increased overreporting, while English language preference was borderline significant for overreporting (P = 0.103). Table 5 presents the regression calibration coefficients for the logarithm of energy, protein, and protein density biomarkers. Education level, physical activity, alcohol use, and frequent (≥75%) meals and snacks at home were not selected for any calibration equations (P > 0.10) and are not included in Table 5.

Table 5.

Regression Calibration Coefficients for Log-Transformed Biomarker, Where the Log-Transformed Self-Report Values Are Based on the 2-Day Mean of the 24-Hour Dietary Recall, Study of Latinos: Nutrition and Physical Activity Assessment Study, 2010–2012a

Energy
Protein
Protein Density
β SE P Valueb β SE P Valueb β SE P Valueb
Intercept 7.9232 0.026 <0.001 4.7273 0.037 <0.001 2.7858 0.027 <0.001
2-Day meanc 0.0495 0.018 0.006 0.1327 0.039 0.001 0.2622 0.061 <0.001
Agec −0.0018 <0.001 <0.001 0.0030 0.001 0.018
BMIc 0.0152 0.001 <0.001 0.0160 0.002 <0.001
Female −0.2306 0.015 <0.001 −0.2192 0.033 <0.001
Hispanic/Latino background
 Central American −0.0507 0.025 0.041 −0.1007 0.046 0.029 −0.0545 0.046 0.239
 Cuban −0.0510 0.021 0.016 −0.0621 0.046 0.174 −0.0053 0.049 0.914
 Dominican −0.0419 0.026 0.107 −0.0409 0.055 0.459 −0.0195 0.058 0.736
 Puerto Rican −0.0140 0.017 0.413 −0.1168 0.042 0.006 −0.1141 0.045 0.012
 South American −0.0685 0.021 0.001 0.0614 0.047 0.193 0.1296 0.049 0.009
English preference −0.1262 0.042 0.002 −0.0916 0.050 0.066
Income
 Missing 0.0087 0.029 0.762
 $10,001–$20,000 0.0068 0.023 0.767
 $20,001–$40,000 0.0337 0.024 0.166
 $40,001–$50,000 0.0200 0.030 0.509
 $50,001–$75,000 0.0236 0.034 0.493
 >$75,000 −0.0968 0.044 0.027
Current smoker 0.0419 0.016 0.009 −0.1223 0.040 0.002 −0.1704 0.042 <0.001

Abbreviations: BMI, body mass index; SE, standard error.

a Regression model was fitted by using data from both the primary and reliability studies and using generalized estimating equations with a working independence assumption.

b Overall P values for Hispanic/Latino background were significant for energy (P = 0.007), protein (P = 0.004), and protein density (P < 0.001). Income was significant for energy only (P = 0.04).

c Age centered on mean of 46 years, BMI (weight (kg)/height (m)²) centered on mean of 29.6, log-transformed 2-day mean of energy centered on 7.489026, log-transformed 2-day mean of protein centered on 4.285903, and log-transformed 2-day mean of protein density centered on 2.80012.
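To show how a calibration equation of this kind might be applied, the sketch below evaluates the energy model from Table 5 for one participant. It rests on several assumptions that should be checked against the published table: that the first coefficient triple in each row belongs to the energy model, that the reference categories are Mexican background and income ≤$10,000 (as in the Table 4 baseline group), and that exponentiating the predicted log value is an acceptable back-transformation (no retransformation correction is applied). All names are hypothetical.

```python
import math

# Energy calibration coefficients as read from Table 5 (see assumptions above).
ENERGY_INTERCEPT = 7.9232
ENERGY_LOG_MEAN_CENTER = 7.489026  # centering value for log 2-day mean energy
AGE_CENTER, BMI_CENTER = 46.0, 29.6
COEF = {"log_self_report": 0.0495, "age": -0.0018, "bmi": 0.0152,
        "female": -0.2306, "current_smoker": 0.0419}
BACKGROUND = {"Mexican": 0.0, "Central American": -0.0507, "Cuban": -0.0510,
              "Dominican": -0.0419, "Puerto Rican": -0.0140,
              "South American": -0.0685}
INCOME = {"<=10000": 0.0, "missing": 0.0087, "10001-20000": 0.0068,
          "20001-40000": 0.0337, "40001-50000": 0.0200,
          "50001-75000": 0.0236, ">75000": -0.0968}


def calibrated_energy_kcal(mean_recall_kcal, age, bmi, female,
                           background, income, current_smoker):
    """Predicted biomarker-scale energy intake (kcal/day) from self-report and covariates."""
    if mean_recall_kcal <= 0:
        raise ValueError("2-day mean recall must be positive")
    log_pred = (
        ENERGY_INTERCEPT
        + COEF["log_self_report"] * (math.log(mean_recall_kcal) - ENERGY_LOG_MEAN_CENTER)
        + COEF["age"] * (age - AGE_CENTER)
        + COEF["bmi"] * (bmi - BMI_CENTER)
        + COEF["female"] * (1.0 if female else 0.0)
        + COEF["current_smoker"] * (1.0 if current_smoker else 0.0)
        + BACKGROUND[background]
        + INCOME[income]
    )
    return math.exp(log_pred)


# Hypothetical participant: Puerto Rican woman, aged 52, BMI 33, nonsmoker,
# income $20,001-$40,000, reporting a 2-day mean of 1,500 kcal/day.
example = calibrated_energy_kcal(1500.0, 52, 33.0, True,
                                 "Puerto Rican", "20001-40000", False)
print(f"Calibrated energy intake: {example:.0f} kcal/day")
```

The small coefficient on the self-reported value (0.0495) means the prediction is driven mostly by sex, BMI, and age, so large differences in reported intake translate into modest differences in calibrated intake.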

Figure 2 shows the main and reliability studies for the biomarker and self-reported intakes, along with the within-person correlations. The reliability of the biomarker, that is, the correlation between the repeat measures from the reliability subset, was r = 0.81 for energy, 0.66 for protein, and 0.59 for protein density. For 24-hour dietary recall, r = 0.58 for energy intake, r = 0.51 for protein intake, and r = 0.24 for protein density. For protein and protein density, a sensitivity analysis was performed to investigate whether excluding observations with extremely low urinary volume (below the 10th percentile), reflecting potential incompleteness of the 24-hour urine collection, influenced the regression calibration coefficients. As presented in Web Table 1, the results were similar to those in Table 5, while the strength of evidence for language preference was weakened slightly.

Figure 2.

Comparison of the logarithm (log) of visit 1 and visit 3 measures (n = 96 and n = 90 for energy and protein, respectively), Study of Latinos: Nutrition and Physical Activity Assessment Study, 2010–2012. A) Biomarker energy (kcal), correlation = 0.81; B) 24-HR energy intake (kcal), correlation = 0.58; C) biomarker protein (g/day), correlation = 0.66; D) 24-HR protein intake (g/day), correlation = 0.51; E) biomarker protein density (percentage of energy derived from protein), correlation = 0.59; F) 24-HR protein density (percentage of energy derived from protein), correlation = 0.24. DLW, doubly labeled water; 24-HR, 24-hour dietary recall; UN, urinary nitrogen.
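The within-person correlations in Figure 2 compare log-transformed visit 1 and visit 3 measures. A minimal sketch of that calculation is shown here, assuming a Pearson correlation on natural-log values. The function names are hypothetical, and the input values are made-up numbers for illustration rather than study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_reliability(visit1, visit3):
    """Correlate the logs of paired repeat measures (same participants)."""
    log1 = [math.log(v) for v in visit1]
    log3 = [math.log(v) for v in visit3]
    return pearson_r(log1, log3)


# Made-up energy values (kcal) for five hypothetical participants.
visit1 = [2100.0, 2600.0, 1900.0, 3000.0, 2400.0]
visit3 = [2200.0, 2500.0, 2000.0, 2900.0, 2300.0]
print(round(log_reliability(visit1, visit3), 2))
```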

Web Table 2 shows the traditional R² and partial R² values for each of the covariates in the models in Table 4, estimated by using the model without the repeated measures. Adjusted R² coefficients, which adjust for the within-person variability in the biomarker, were also calculated by using a previously described algorithm (7). The model R² value was 54.0% for the energy calibration model; after adjustment, the R² for self-report increased from 7.9% to 9.7%, and that for body mass index increased from 12.5% to 15.5%. Being female had the highest adjusted partial R² (33.1%), followed by body mass index (15.5%). For the protein model, the model R² value was 26%, and being female had the highest adjusted partial R² at 12.2%, followed by body mass index (8.3%). For protein density, Hispanic/Latino background and language preference had the highest adjusted R² values of 7.5% and 7.2%, respectively. We considered possible sex differences in dietary reporting by testing for an interaction between sex and the self-reported intake in the calibration model in Table 5; none of these interactions was significant. The coefficients for the sex-stratified calibration model are presented in Web Tables 3 and 4. For men, the adjusted R² coefficients were 46%, 28%, and 28% for energy, protein, and protein density, respectively; for women, these values were 47%, 32%, and 30%.
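The adjustment algorithm itself is given in reference 7 and is not spelled out here. One common attenuation correction, assumed in the sketch below, divides the unadjusted R² by the reliability of the biomarker (the correlation between its repeat measures). Under that assumption, the reported energy values are reproduced to within rounding, which suggests (but does not confirm) that the published adjustment takes this form. The function name is hypothetical.

```python
def adjusted_r2(r2_percent, biomarker_reliability):
    """Divide an R-squared (in percent) by the biomarker reliability.

    Assumed form of the adjustment for within-person biomarker
    variability; see reference 7 for the authors' actual algorithm.
    """
    if not 0.0 < biomarker_reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return r2_percent / biomarker_reliability


# Energy: biomarker reliability r = 0.81 (Figure 2, panel A).
print(round(adjusted_r2(54.0, 0.81), 1))  # model R² of 54.0%
print(round(adjusted_r2(7.9, 0.81), 1))   # self-report, 7.9% unadjusted
print(round(adjusted_r2(12.5, 0.81), 1))  # body mass index, 12.5% unadjusted

# Protein: biomarker reliability r = 0.66 (Figure 2, panel C).
print(round(adjusted_r2(26.0, 0.66), 1))  # model R² of 26%
```

Small discrepancies from the published adjusted values are expected because the inputs used here are already rounded.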

DISCUSSION

This is the first study in the United States to use recovery biomarkers to describe the measurement error structure of the 24-hour dietary recall method in a diverse Hispanic/Latino cohort of males and females. Our findings indicate underreporting of energy intake, more modest underreporting of protein intake, and overreporting of protein density. The findings are comparable to those of other validation studies, such as the Women's Health Initiative, the Energetics Dietary Assessment Study, and the National Cancer Institute's Observing Protein and Energy (OPEN) Study (2, 5, 24, 25). Differences were identified within Hispanic/Latino groups, with Dominicans showing the most underestimation of energy and South Americans the least. Differences in the math and spatial skills needed to describe foods, which are difficult to assess, may explain this finding. In addition, different food preferences may tax these skills differently. For example, amorphous foods such as rice or mixed dishes may be more difficult to describe than protein dishes, which are less amorphous. Also, South Americans in our study had higher actual protein intake than other groups did, which may have facilitated recall of protein foods.

Although the urinary recovery of the PABA pill (tested on 10% of the sample) was low (44%), we cannot categorically state that the biomarker-based protein values should be higher than those we report. First, we did not test the entire sample, given the cost and the recent recommendation that PABA testing is not needed (21). Second, the mean biomarker protein did not differ significantly between participants deemed to have <70% vs. ≥70% PABA recovery (P = 0.109) (refer to the assessment of urinary completion in the Web Appendix). Third, we also performed a sensitivity analysis by excluding observations with low urinary volumes and did not see material differences in the coefficients for the calibration equations, adding independent evidence that we have reasonable compliance. Finally, although the participants were very dedicated, the study protocol had a high participant burden, and we cannot be certain that all 3 PABA pills were taken as stated. Note that participants had a relatively low level of education (32% with less than a high school education) and income (majority with household incomes of <$30,000) compared with participants in the Observing Protein and Energy or Women's Health Initiative biomarker studies. In addition, the reliabilities of the biomarker total energy expenditure from the DLW measure (r = 0.81), urinary nitrogen measure (r = 0.66), and protein density measure (r = 0.59) were higher, and those of the self-report measures for energy (r = 0.58), protein (r = 0.51), and protein density (r = 0.24) were lower, compared with the respective measures reported in the Women's Health Initiative biomarker studies for the 24-hour dietary recall (2, 5).

Strengths of this study include having a recovery biomarker measure in close temporal proximity to data collected in the HCHS/SOL parent study; an ethnically diverse cohort of Hispanics/Latinos in the United States; a wide age range and representation of both sexes; and a reliability study that retested 20% of the sample and showed strong correlation coefficients. The large sample size allowed us to test systematic biases associated with misreporting and to develop calibrated consumption estimates to enhance the ability to detect diet-disease relationships. The adjusted R² coefficient for the calibration model was nearly 70% for energy, 40% for protein intake, and 30% for protein density, suggesting that although the calibrated estimates recover more of the variance in protein intake and protein density than self-reported values alone, the calibrated estimates have some limitations as surrogate measures for the target intake. Much of the explained variation for calibrated nutrients came from variables other than the 2-day mean, including body mass index and sex for the energy and protein models and English language and ethnicity for protein density. The sex-specific adjusted R² coefficients were 46%, 28%, and 28% for energy, protein, and protein density, respectively, for men; for women, these values were 47%, 32%, and 30%.

Limitations of the study include a smaller number of some of the Hispanic/Latino subgroups, such as the Central Americans, Dominicans, and South Americans, particularly for males. Additional research with larger numbers from these ethnic groups is warranted.

In conclusion, we used recovery biomarkers to determine measurement error of self-report measures of energy, protein, and protein density from the 24-hour dietary recall in this diverse sample of Hispanics/Latinos residing in 4 cities in the United States. Overall, underreporting of energy and protein intakes and overreporting of protein density were prevalent. The extent of under- or overreporting of energy and protein intakes and protein density was characterized by body mass index, age, preferred language, and Hispanic/Latino background. These equations can be applied to diet association studies relating to diabetes, cardiovascular disease, cancer, sleep, or functional measures. In applying these equations, careful attention needs to be paid to time frames, causality, and assumptions about the stability of dietary intake, which may not apply to individuals experiencing major weight change. Additionally, participants in the calibration study may be healthier and have higher social desirability (which can influence self-report) than the sample as a whole. These equations will advance our understanding of the role diet plays in chronic diseases that place a major burden on this nation's health.

Supplementary Material

Web Material

ACKNOWLEDGMENTS

Author affiliations: Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York (Yasmin Mossavar-Rahmani, Judith Wylie-Rosett); Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania (Pamela A. Shaw); Agricultural Research Service, US Department of Agriculture, and Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, Texas (William W. Wong); Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina (Daniela Sotres-Alvarez, Fang-Shu Ou); Department of Psychology, University of Miami, Coral Gables, Florida (Marc D. Gellman); Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois (Linda Van Horn); Department of Public Health Sciences, Miller School of Medicine, University of Miami, Miami, Florida (Mark Stoutenberg); Institute for Minority Health Research, University of Illinois, Chicago, Illinois (Martha L. Daviglus); Departments of Nutrition and Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina (Anna Maria Siega-Riz); and Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington (Ross L. Prentice).

This work was supported by grant 11414319-R01HL095856 from the National Heart, Lung, and Blood Institute. The Hispanic Community Health Study/Study of Latinos was carried out as a collaborative study supported by contracts from the National Heart, Lung, and Blood Institute to the University of North Carolina (N01-HC65233), University of Miami (N01-HC65234), Albert Einstein College of Medicine (N01-HC65235), Northwestern University (N01-HC65236), and San Diego State University (N01-HC65237). The following contribute to the Hispanic Community Health Study/Study of Latinos through a transfer of funds to the National Heart, Lung, and Blood Institute: the National Center on Minority Health and Health Disparities, the National Institute of Deafness and Other Communications Disorders, the National Institute of Dental and Craniofacial Research, the National Institute of Diabetes and Digestive and Kidney Diseases, the National Institute of Neurological Disorders and Stroke, and the Office of Dietary Supplements. Additional support at the Albert Einstein College of Medicine was provided from the Clinical and Translational Science Award (UL1 TR001073) from the National Center for Advancing Translational Sciences at the National Institutes of Health, and support at the Fred Hutchinson Cancer Research Center was provided from National Cancer Institute grant CA53996.

We thank the investigators and staff of the Hispanic Community Health Study/Study of Latinos for their valuable contributions. A complete list of staff and investigators can be found elsewhere (11).

An excerpt of this work was presented at a minisymposium entitled “Nutritional Epidemiology: Epidemiologic Methods in Examining Health Outcomes in Diverse Populations” at the American Society for Nutrition Annual Meeting, April 27, 2014, San Diego, California.

This trial was registered at clinicaltrials.gov as NCT02060344.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Conflict of interest: none declared.

REFERENCES

  1. Prentice RL, Huang Y, Kuller LH, et al. Biomarker-calibrated energy and protein consumption and cardiovascular disease risk among postmenopausal women. Epidemiology. 2011;22(2):170–179.
  2. Prentice RL, Mossavar-Rahmani Y, Huang Y, et al. Evaluation and comparison of food records, recalls, and frequencies for energy and protein assessment by using recovery biomarkers. Am J Epidemiol. 2011;174(5):591–603.
  3. Prentice RL, Sugar E, Wang CY, et al. Research strategies and the use of nutrient biomarkers in studies of diet and chronic disease. Public Health Nutr. 2002;5(6A):977–984.
  4. Tooze JA, Subar AF, Thompson FE, et al. Psychosocial predictors of energy underreporting in a large doubly labeled water study. Am J Clin Nutr. 2004;79(5):795–804.
  5. Neuhouser ML, Tinker L, Shaw PA, et al. Use of recovery biomarkers to calibrate nutrient consumption self-reports in the Women's Health Initiative. Am J Epidemiol. 2008;167(10):1247–1259.
  6. Tinker LF, Sarto GE, Howard BV, et al. Biomarker-calibrated dietary energy and protein intake associations with diabetes risk among postmenopausal women from the Women's Health Initiative. Am J Clin Nutr. 2011;94(6):1600–1606.
  7. Mossavar-Rahmani Y, Tinker LF, Huang Y, et al. Factors relating to eating style, social desirability, body image and eating meals at home increase the precision of calibration equations correcting self-report measures of diet using recovery biomarkers: findings from the Women's Health Initiative. Nutr J. 2013;12:63.
  8. Harrison GG, Galal OM, Ibrahim N, et al. Underreporting of food intake by dietary recall is not universal: a comparison of data from Egyptian and American women. J Nutr. 2000;130(8):2049–2054.
  9. Prentice RL. Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika. 1982;69(2):331–342.
  10. Prentice RL, Shaw PA, Bingham SA, et al. Biomarker-calibrated energy and protein consumption and increased cancer risk among postmenopausal women. Am J Epidemiol. 2009;169(8):977–989.
  11. Sorlie PD, Avilés-Santa LM, Wassertheil-Smoller S, et al. Design and implementation of the Hispanic Community Health Study/Study of Latinos. Ann Epidemiol. 2010;20(8):629–641.
  12. Lavange LM, Kalsbeek WD, Sorlie PD, et al. Sample design and cohort selection in the Hispanic Community Health Study/Study of Latinos. Ann Epidemiol. 2010;20(8):642–649.
  13. Daviglus ML, Talavera GA, Avilés-Santa ML, et al. Prevalence of major cardiovascular risk factors and cardiovascular diseases among Hispanic/Latino individuals of diverse backgrounds in the United States. JAMA. 2012;308(17):1775–1784.
  14. Schoeller DA, Hnilicka JM. Reliability of the doubly labeled water method for the measurement of total daily energy expenditure in free-living subjects. J Nutr. 1996;126(1):348S–354S.
  15. Blanc S, Colligan AS, Trabulsi J, et al. Influence of delayed isotopic equilibration in urine on the accuracy of the 2H2 18O method in the elderly. J Appl Physiol. 2002;92(3):1036–1044.
  16. Wong WW, Lee LS, Klein PD. Deuterium and oxygen-18 measurements on microliter samples of urine, plasma, saliva, and human milk. Am J Clin Nutr. 1987;45(5):905–913.
  17. Wong WW, Clarke LL, Llaurador M, et al. A new zinc product for the reduction of water in physiological fluids to hydrogen gas for 2H/1H isotope ratio measurements. Eur J Clin Nutr. 1992;46(1):69–71.
  18. Weir JB. New methods for calculating metabolic rate with special reference to protein metabolism. J Physiol. 1949;109(1-2):1–9.
  19. Black AE, Prentice AM, Coward WA. Use of food quotients to predict respiratory quotients for the doubly-labelled water method of measuring energy expenditure. Hum Nutr Clin Nutr. 1986;40(5):381–391.
  20. Bingham SA. The use of 24-h urine samples and energy expenditure to validate dietary assessments. Am J Clin Nutr. 1994;59(1 suppl):227S–231S.
  21. Subar AF, Midthune D, Tasevska N, et al. Checking for completeness of 24-h urine collection using para-amino benzoic acid not necessary in the Observing Protein and Energy Nutrition Study. Eur J Clin Nutr. 2013;67(8):863–867.
  22. Siega-Riz AM, Sotres-Alvarez D, Ayala GX, et al. Food-group and nutrient-density intakes by Hispanic and Latino backgrounds in the Hispanic Community Health Study/Study of Latinos. Am J Clin Nutr. 2014;99(6):1487–1498.
  23. Kay MC, Carroll DD, Carlson SA, et al. Awareness and knowledge of the 2008 Physical Activity Guidelines for Americans. J Phys Act Health. 2014;11(4):693–698.
  24. Emond JA, Patterson RE, Jardack PM, et al. Using doubly labeled water to validate associations between sugar-sweetened beverage intake and body mass among white and African-American adults. Int J Obes (Lond). 2014;38(4):603–609.
  25. Kipnis V, Subar AF, Midthune D, et al. Structure of dietary measurement error: results of the OPEN biomarker study. Am J Epidemiol. 2003;158(1):14–21; discussion 22–26.


Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
