The Journal of Nutrition. 2023 Feb 18;152(4):1107–1117. doi: 10.1093/jn/nxac004

Biomarkers for Components of Dietary Protein and Carbohydrate with Application to Chronic Disease Risk in Postmenopausal Women

Ross L Prentice 1,2,*, Mary Pettinger 1, Cheng Zheng 3, Marian L Neuhouser 1,2, Daniel Raftery 4, G A Nagana Gowda 4, Ying Huang 1,2, Lesley F Tinker 1, Barbara V Howard 5, JoAnn E Manson 6, Linda Van Horn 7, Robert Wallace 8, Yasmin Mossavar-Rahmani 9, Karen C Johnson 10, Linda Snetselaar 8, Johanna W Lampe 1,2
PMCID: PMC8970980  PMID: 35015878

Abstract

Background

We recently developed protein and carbohydrate intake biomarkers using metabolomics profiles in serum and urine, and used them to correct self-reported dietary data for measurement error. Biomarker-calibrated carbohydrate density was inversely associated with chronic disease risk, whereas protein density associations were mixed.

Objectives

To elucidate and extend this earlier work through biomarker development for protein and carbohydrate components, including animal protein and fiber.

Methods

Prospective disease association analyses were undertaken in Women's Health Initiative (WHI) cohorts of postmenopausal US women, aged 50–79 y when enrolled at 40 US clinical centers. Biomarkers were developed using an embedded human feeding study (n = 153). Calibration equations for protein and carbohydrate components were developed using a WHI nutritional biomarker study (n = 436). Calibrated intakes were associated with chronic disease incidence in WHI cohorts (n = 81,954) over a 20-y (median) follow-up period, using HR regression methods.

Results

Previously reported elevations in cardiovascular disease (CVD) with higher-protein diets tended to be explained by animal protein density. For example, for coronary heart disease a 20% increment in animal protein density had an HR of 1.20 (95% CI: 1.02, 1.42) relative to the HR for total protein density. In comparison, cancer and diabetes risk showed little association with animal protein density beyond that attributable to total protein density. Inverse carbohydrate density associations with total CVD were mostly attributable to fiber density, with a 20% increment HR factor of 0.89 (95% CI: 0.83, 0.94). Cancer risk showed little association with fiber density, whereas diabetes risk had a 20% increment HR of 0.93 (95% CI: 0.88, 0.98) relative to the HRs for total carbohydrate density.

Conclusions

In a population of postmenopausal US women, CVD risk was associated with high-animal-protein and low-fiber diets, cancer risk was associated with low-carbohydrate diets, and diabetes risk was associated with low-fiber/low-carbohydrate diets.

Key words: animal protein, biomarker, cancer, carbohydrate, cardiovascular disease, diabetes, dietary measurement error, fiber, metabolomics

Introduction

We recently examined (1) the utility of serum and urine metabolomics profiles for the assessment of macronutrient intakes in a 153-participant feeding study (2) within Women's Health Initiative (WHI) cohorts (3). Biomarkers meeting criteria for explaining feeding study intake variation in participants were obtained for protein, carbohydrate, protein energy/total energy (protein density), and carbohydrate energy/total energy (carbohydrate density), with metabolomics data included in each biomarker specification. For absolute protein intake, the new biomarker increased the percentage of feeding study variation explained to 48%, compared with ∼40% for the established urinary nitrogen (UN) biomarker (4). For absolute carbohydrate, the biomarker made use of the doubly labeled water (DLW) total energy intake biomarker (5). Biomarkers for protein density and carbohydrate density relied primarily on metabolomics data. We also considered personal characteristics, such as age, race/ethnicity, and BMI, to augment the fraction of feeding study intake variation explained by linear regression biomarker equations.

Additionally, we calculated biomarker-calibrated intake estimates and examined macronutrient associations with cardiovascular diseases (CVDs), cancers, and type 2 diabetes (T2D) in WHI cohorts, over follow-up periods having a median of ≤20 y (6). Calibrated carbohydrate density related inversely to disease risk for several of the outcomes considered, whereas associations with calibrated protein density were mixed. We also noted a considerable epidemiological literature reporting positive associations of self-reported dietary animal protein with CVD incidence and mortality, as well as reports of differing associations of carbohydrate with chronic disease risk according to carbohydrate quality, including fiber content (e.g., 7, 8, 9, 10, 11, 12, 13, 14, 15), thereby suggesting additional important hypotheses to be tested using objective intake biomarkers.

Here we consider the ability to define intake biomarkers for animal and plant protein, and for the dietary fiber and added sugars components of carbohydrate, both for absolute intakes and for densities. As elaborated below, we were able to develop suitable biomarkers for animal protein, fiber, and added sugars and their densities, but our biomarker criteria were not met for plant protein or its density. We then considered the ability to develop suitable calibration equations for dietary variables by regressing newly identified biomarkers on pertinent self-reported dietary data and participant characteristics. Calibration equations meeting criteria could be developed for animal protein and its density and for fiber density, but not for the other dietary variables. Calibrated intakes for dietary variables having a suitable calibration equation were then calculated for participants in larger WHI cohorts, and these were used in association analyses to elucidate and to extend our previous results on macronutrient intakes and chronic disease risk.

Methods

Study cohorts

During 1993–1998, 48,835 participants were randomly assigned in the WHI Dietary Modification (DM) trial, with 29,294 assigned to the usual diet comparison group (DM-C) and with 19,541 assigned to a low-fat dietary pattern intervention; 93,676 participants were enrolled in the companion prospective WHI Observational Study (OS) (3). All participants were postmenopausal women in the age range 50–79 y when enrolled at 40 US clinical centers. The WHI FFQ (16) targeted dietary intake over the preceding 3-mo period, and was administered at baseline and year 1 in the DM trial, and approximately every 3 y thereafter during the trial intervention period (ended March 31, 2005), and was administered at baseline and at year 3 in the OS. Here, we used FFQs collected at 1 y following randomization in the DM-C, rather than at enrollment, to reduce assessment biases due to the DM trial eligibility criterion of FFQ energy from fat of ≥32%. The 1-y visit following randomization is regarded here as “baseline” in the DM-C, and the 1-y FFQ assessments are referred to as “baseline FFQs.” FFQs at enrollment were used for baseline self-reported intake estimation in the OS. All nutrient content estimates from self-report assessments were derived from the University of Minnesota's Nutrition Data System for Research (NDS-R version 2005). Participants completed core questionnaires at WHI enrollment including medical history, reproductive history, family history, personal habits, medications and dietary supplements, and also provided a fasting blood sample (3).

Nutrition and Physical Activity Assessment Study

Following an initial Nutrition Biomarker Study in the DM trial cohort (17), we conducted a Nutrition and Physical Activity Assessment Study (NPAAS) (18) in 450 OS participants during 2007–2009. Its purposes were to examine the measurement properties of dietary self-report data for nutritional variables having an established biomarker, and to use biomarker data to correct dietary self-report data for measurement error in disease association analyses. WHI participants were recruited at 9 clinical centers to NPAAS, with an overrepresentation of black and Hispanic women and of women having BMI >30.0 kg/m2. The study protocol required 2 clinic visits separated by 2 wk and at-home activities. A 20% reliability subsample repeated the protocol ∼6 mo after their initial study participation. The first NPAAS visit included DLW dosing for total energy consumption assessment, completion of an FFQ, dietary supplement, and other questionnaires, collection of a blood specimen, and instructions and a kit for at-home 24-h urine collection. At the second clinic visit, participants brought 24-h urine specimens collected over the preceding day, provided a fasting blood specimen, and provided spot urine specimens to complete the DLW protocol. For 3 participants 24-h urine collections were incomplete and these were excluded from calibration equation development analyses. Baseline characteristics in the NPAAS cohort have been presented (18). Participants were similar in age to other WHI participants, 60% were overweight or obese (i.e., BMI ≥25.0), 95% were nonsmokers; 51% had a college degree or higher education; and 19%, 14%, and 64% self-classified respectively as being of black, Hispanic, or non-Hispanic white race/ethnicity.

NPAAS Feeding Study

We conducted the NPAAS Feeding Study (NPAAS-FS) in 153 WHI women in the Seattle area during 2010–2014. Fourteen of the 153 had previously participated in NPAAS. Participants were provided food and beverages over a 2-wk feeding period, with individualized diets that were intended to approximate their usual diets (2), so that blood and urine concentrations would stabilize quickly and intake variations in the study cohort would be substantially retained during the feeding period. The usual diet for a participant was assessed by starting with a 4-d food record, then making adjustments based on participant interview by a study nutritionist. The individualized diet was formulated to retain the nutrient content in this usual diet, using foods similar to the participant's usual choices, with priority given to foods having well-characterized nutrient content. Biomarker development relies on metabolomics profiles from the second clinic visit serum and 24-h urine, along with the inclusion of readily available participant characteristics. Baseline demographic and lifestyle characteristics for participants in the NPAAS-FS have been reported (2). Participants were well educated (83% college degree or higher), and nonsmokers (98%). Most were white (95%), overweight or obese (60%), and overall were of similar ages to other WHI enrollees.

Metabolite profiling

Serum and 24-h urine metabolomics profiles for both NPAAS and NPAAS-FS participants were derived at the Northwest Metabolomics Lab at the University of Washington, and included the following (1).

Serum metabolite measurements

Serum samples were analyzed by targeted LC-MS/MS using LC coupled to a Sciex Triple Quad 6500+ mass spectrometer. A total of 303 metabolites were targeted, of which 155 were detected with <20% missing values. Median CV based on blind duplicates was 7.2%. Separately, lipid metabolites were measured using the Sciex QTRAP 5500 Lipidyzer platform including the SelexION differential mobility spectrometry method. The method targeted 1070 lipids in 13 major lipid classes. Absolute concentrations of lipids were obtained based on 54 isotope-labeled internal standards, which resulted in the measurement of 664 serum lipids that had <20% missing values, and these had a median CV of 5.5%.

Urine metabolite measurements

Metabolite profiles from 24-h urine samples were analyzed by NMR spectroscopy using a Bruker Avance III 800 MHz NMR spectrometer. Relative concentrations for 57 targeted metabolites were obtained, for which the median CV was 4.0%. None of these metabolites had missing values. Urine metabolites were analyzed also by untargeted GC-MS using an Agilent 7890A/5875C instrument resulting in the identification of 285 metabolites, of which 275 for the 24-h urine samples had <20% missing values, and these had a larger median CV of 31.3%.

Outcome ascertainment, follow-up, and disease categories

Clinical outcomes were reported biannually in the DM trial and annually in the OS, by self-administered questionnaire (19) throughout the time from enrollment in 1993–1998 to the end of the intervention period (March 31, 2005), and annually thereafter in both cohorts. An initial report of CVD during cohort follow-up was confirmed by review of medical records by physician-adjudicators. Additionally, coronary heart disease (CHD, defined as nonfatal myocardial infarction plus CHD death), stroke (ischemic plus hemorrhagic), heart failure, and all deaths were centrally reviewed by expert physician investigator committees. All invasive cancers, except nonmelanoma skin cancer, were centrally coded using the National Cancer Institute's Surveillance Epidemiology and End results (SEER) procedures. Prevalent, treated T2D at baseline was self-reported during eligibility screening. Incident T2D during follow-up was documented by self-report at each annual contact. These sources have been shown to be consistent with medication inventories of oral agents or insulin (20).

Following the intervention period, WHI participants had the opportunity to enroll in additional follow-up through September 30, 2010, and subsequently for additional open-ended follow-up, with >80% of women doing so on each occasion. Cancer, diabetes, and all-cause mortality (including National Death Index matching) outcomes through February 28, 2020 are included here. Follow-up for CVD incidence is included only through September 30, 2010, because self-reports for most WHI participants were not adjudicated after that date. Also heart failure adjudication in WHI cohorts stopped after March 31, 2005. The median follow-up duration is 11.3 y for CVD incidence, 7.8 y for heart failure, and ∼20 y for cancer, diabetes, and mortality. Disease outcome categories are those considered in our previous report on total protein and total carbohydrate and their densities (6).

Statistical methods

Biomarker development in NPAAS-FS

Biomarker equations for macronutrient variables were considered by linear regression of log-transformed feeding study intakes on log-transformed serum and 24-h urine metabolite concentrations and other variables. Available additional measures include DLW-based measures of total energy intake (5), and UN estimates of protein intake (4). Our biomarker development activities (6) that led to novel biomarkers for each of protein, protein density, carbohydrate, and carbohydrate density (1), arose from linear regression of log-transformed NPAAS-FS intake on log-transformed metabolite concentrations and log-transformed DLW and UN measures. Participant characteristics, including dietary supplement use, race/ethnicity, season, education, age, BMI, and self-reported leisure activity were also considered for inclusion, with a significance level of P < 0.10 for both inclusion and retention in model building. Biomarker identification required a regression equation with cross-validated R2 (CV-R2) of ≥36% in NPAAS-FS.

Using these same procedures, biomarker equations for animal and plant protein, dietary fiber, and added sugars, and their corresponding densities, were considered using linear regression of log-transformed feeding study intakes on log-transformed metabolite concentrations, on log-transformed DLW and UN measures, and on participant characteristics. As in our preceding macronutrient biomarker article (6), log-transformed baseline FFQ measures for pertinent nutritional variables were considered as potential additional variables that could increase the percentage of variation (R2) in consumed diet explained by the intake measures, with FFQ values retained in the biomarker specification if associated at P < 0.10. As elaborated below, this consideration prepares the biomarker for use in disease association analyses, which condition on baseline FFQ estimates.

Variable selection was performed in the NPAAS-FS using the LASSO (21) procedure with 5-fold cross-validation for choosing the tuning parameter. The linear regression cross-validation procedure for biomarker development involved randomly splitting the data into 2 approximately equal-sized components, carrying out model building in one (the training set) and calculating the regression R2 for the fitted model in the other (the test set). The CV-R2 values presented are averages over 100 such random splits.
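The split-sample cross-validation described above can be illustrated with a minimal Python sketch. It is not the study code: it assumes a single synthetic log-metabolite predictor, uses plain least squares in place of the LASSO-based model building, and defines the test-set R2 as 1 − SSE/SST, which is one common definition and may differ from the one used in the study. The function names (`fit_simple_ols`, `holdout_r2`, `cv_r2`) and the simulated data are hypothetical.

```python
import random


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def holdout_r2(intercept, slope, x, y):
    """R2 in the held-out half, taken here as 1 - SSE/SST (an assumption)."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def cv_r2(x, y, n_splits=100, seed=0):
    """Average test-set R2 over repeated random half/half splits."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    half = len(idx) // 2
    scores = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        train, test = idx[:half], idx[half:]
        # Model building uses the training half only.
        a, b = fit_simple_ols([x[i] for i in train], [y[i] for i in train])
        # Performance is evaluated in the other half.
        scores.append(holdout_r2(a, b, [x[i] for i in test], [y[i] for i in test]))
    return sum(scores) / len(scores)


# Synthetic illustration only: 153 hypothetical participants.
data_rng = random.Random(1)
log_metabolite = [data_rng.gauss(0.0, 1.0) for _ in range(153)]
log_intake = [0.5 * m + data_rng.gauss(0.0, 0.3) for m in log_metabolite]

score = cv_r2(log_metabolite, log_intake)
meets_36_percent_criterion = score >= 0.36
```

Because the model is refitted within each training half, the averaged test-set R2 is typically lower than the in-sample R2, which is the pattern seen when the R2 and CV-R2 rows of Table 2 are compared.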

Calibration equation development in NPAAS

Biomarker equations meeting a 36% CV-R2 criterion were used to calculate biomarker-based intakes for each of the log-transformed dietary variables under study for the 436 participants in NPAAS who were not a part of NPAAS-FS. These NPAAS biomarker values were regressed linearly on concurrent NPAAS log-FFQ assessments, and on a disease category–specific set of personal characteristics listed in Supplemental Table 1 for each of these dietary variables, toward developing calibration equations for estimating macronutrient intakes in larger WHI cohorts. An assumption of independent measurement errors for the 2 biomarker assessments in the 14-participant NPAAS and NPAAS-FS overlap, which were based on specimen collections separated by ∼4 y, leads to regression R2 values that are adjusted for temporal variation in the biomarker. The adjustment involves dividing the linear equation R2 values by the estimated correlations between the paired biomarker assessments (18). An adjusted R2 value of ≥36% was required for a suitable calibration equation.
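The temporal-variation adjustment amounts to one division, sketched below in Python. The function names are hypothetical. The paired correlation of 0.330 is the value reported for animal protein; the unadjusted R2 of 0.236 is not reported in the article and is back-calculated here, for illustration only, from the adjusted total of 71.6% shown in Table 3 for the CVD covariate set.

```python
def adjust_r2(raw_r2, paired_correlation):
    """Divide a regression R2 by the paired biomarker correlation."""
    if not 0.0 < paired_correlation <= 1.0:
        raise ValueError("paired correlation must be in (0, 1]")
    return raw_r2 / paired_correlation


def meets_criterion(adjusted_r2, threshold=0.36):
    """Apply the >=36% adjusted R2 requirement for a calibration equation."""
    return adjusted_r2 >= threshold


raw_r2 = 0.236              # illustrative, back-calculated (see lead-in)
paired_correlation = 0.330  # reported for animal protein

adjusted = adjust_r2(raw_r2, paired_correlation)  # about 0.715
suitable = meets_criterion(adjusted)
```

A smaller paired correlation produces a larger upward adjustment, so the adjusted values are sensitive to correlations estimated from only 14 paired assessments.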

The use of FFQ data both for the calculation of biomarker responses and as principal explanatory variables in calibration equations deserves further comment. The FFQ data used for biomarker development are from baseline, a decade or more before the concurrent FFQs used in regression variables in NPAAS for calibration equation development were obtained. Furthermore, the FFQs used for biomarker development (n = 153) were from a nonoverlapping set of participants from those used as calibration equation regression variables in NPAAS (n = 436), implying independence of the noise component measurement error of the 2 sets of FFQ assessments. The possibility of some weak dependence between systematic aspects of measurement error for the 2 sets of assessments is mitigated by the inclusion of personal characteristics (e.g., BMI, age, or race/ethnicity) that might explain any shared systematic bias, in calibration equations. It follows that the dual use of FFQs from a different set of participants at quite different points in time is unlikely to cause bias in calibration equation developments. Note, however, that a lack of consideration of baseline FFQ measures for potential inclusion in biomarker equations could lead to bias when calibrated estimates are used in disease association analyses, as has been demonstrated in extensive simulation studies (22). This bias typically occurs when biomarkers are of only moderate strength, opening the possibility that baseline FFQ data can contribute to an explanation of feeding study intake variation, beyond that otherwise explained by the biomarker equation.

Disease association analyses in the DM-C and OS using biomarker-calibrated FFQ data

Table 1 presents baseline demographic and lifestyle characteristics for the same set of 81,954 participants, 16,939 from the DM-C and 65,015 from the OS, used in our previous macronutrient and chronic disease association analyses (6). Figure 1, also presented in reference 6, shows cohorts and participant flow in the WHI DM-C and the OS, and in the NPAAS and NPAAS-FS subcohorts, over the intervention and postintervention phases of WHI.

TABLE 1.

Baseline demographic and lifestyle characteristics of participants from the DM-C and from the OS, enrolled during 1993–1998 at 40 US clinical centers¹

Columns: Characteristic; OS (n = 65,015): n, %; DM-C (n = 16,939): n, %
Age, y
50–54 9126 14.0 1522 9.0
55–59 12,573 19.3 3634 21.5
60–64 14,381 22.1 4286 25.3
65–69 14,204 21.8 3902 23.0
70–74 10,259 15.8 2518 14.9
≥75 4472 6.9 1077 6.4
BMI, kg/m2
<25 27,020 41.6 4579 27.0
25 to <30 22,140 34.1 6013 35.5
≥30 15,855 24.4 6347 37.5
Race/ethnicity
White 56,032 86.2 14,250 84.1
Black 4122 6.3 1401 8.3
Hispanic 2022 3.1 536 3.2
American Indian 223 0.3 58 0.3
Asian/PI 1799 2.8 477 2.8
Unknown 817 1.3 217 1.3
Education
<High school 2414 3.7 607 3.6
High school/GED 10,223 15.7 2876 17.0
School after high school 23,573 36.3 6648 39.2
College degree or higher 28,805 44.3 6808 40.2
Family income, USD/y
<$20k 9118 14.0 2258 13.3
$20k to <$35k 14,967 23.0 4084 24.1
$35k to <$50k 13,278 20.4 3664 21.6
$50k to <$75k 13,584 20.9 3671 21.7
≥$75k 14,068 21.6 3262 19.3
Season of FFQ completion
Spring 16,755 25.8 4406 26.0
Summer 18,135 27.9 4172 24.6
Fall 15,148 23.3 4180 24.7
Winter 14,977 23.0 4181 24.7
Current smoker
No 61,120 94.0 15,917 94.0
Yes 3895 6.0 1022 6.0
Alcohol²
Nondrinker 18,410 28.3 5830 34.4
<1 drink/wk 20,583 31.7 4934 29.1
1 to <7 drinks/wk 17,424 26.8 4591 27.1
≥7 drinks/wk 8598 13.2 1584 9.4
Any dietary supplement use 36,358 55.9 8349 49.3
Medication use
Antihyperlipidemic medication 5996 9.2 1562 9.2
Antidiabetic medication 1916 2.9 686 4.0
Antihypertensive medication 19,098 29.4 5611 33.1
Postmenopausal hormone use
Never 25,334 39.0 6782 40.0
Past 9637 14.8 3357 19.8
Estrogens alone 16,451 25.3 3932 23.2
Estrogens + progestin 13,593 20.9 2868 16.9
Recreational physical activity, MET-h/wk
None 8318 12.8 2952 17.4
>0 to ≤9.5 22,703 34.9 6910 40.8
>9.5 to ≤20.5 18,017 27.7 4110 24.3
>20.5 15,977 24.6 2967 17.5
History of CVD³
No 61,934 95.3 16,263 96.0
Yes 3081 4.7 676 4.0
History of MI 1410 2.2 330 1.9
History of CABG/PCI 1139 1.8 215 1.3
History of heart failure 643 1.0 136 0.8
History of stroke 833 1.3 184 1.1
History of cancer
No 56,826 87.4 16,104 95.1
Yes 8189 12.6 835 4.9
Breast 3743 5.8 74 0.4
Colorectal 586 0.9 15 0.1
Ovary 427 0.7 72 0.4
Endometrium 1120 1.7 158 0.9
Thyroid 354 0.5 64 0.4
Cervix 794 1.2 211 1.2
Melanoma 877 1.3 113 0.7
Liver 24 0.0 1 0.0
Lung 145 0.2 15 0.1
Brain 32 0.0 6 0.0
Bone 42 0.1 9 0.1
Stomach 34 0.1 1 0.0
Leukemia 58 0.1 6 0.0
Bladder 120 0.2 12 0.1
Non-Hodgkin lymphoma 148 0.2 6 0.0
Hodgkin lymphoma 42 0.1 6 0.0
History of treated hypertension 15,954 24.5 5197 30.7
History of treated type 2 diabetes 2360 3.6 826 4.9
Family history of MI 33,803 52.0 8740 51.6
Family history of stroke 24,694 38.0 6404 37.8
Family history of breast cancer 9882 15.9 2333 14.4
Family history of colorectal cancer 10,831 16.7 2687 15.9
Family history of diabetes 20,889 32.1 5859 34.6
Gail model breast cancer risk score (tertiles)
<1.26 18,972 29.2 5607 33.1
1.27–1.80 22,329 34.3 5900 34.8
>1.80 23,714 36.5 5432 32.1
¹ CABG/PCI, coronary artery bypass graft or percutaneous coronary intervention; CVD, cardiovascular disease; DM-C, dietary modification comparison group; GED, general educational development; MET, metabolic equivalent unit; MI, myocardial infarction; OS, Observational Study; PI, Pacific Islander.

² Drinks of alcohol defined as serving in milliliters (345 for beer, 177 for wine, 43 for liquor).

³ Nonfatal MI, CABG/PCI, heart failure, or stroke.

FIGURE 1.

Study samples and flow in the Women's Health Initiative (WHI) cohorts of postmenopausal women aged 50–79 y at enrollment during 1993–1998 at 40 US clinical centers, and in Nutrition and Physical Activity Assessment Study (NPAAS) subcohorts.

These participants averaged ∼62 y of age at baseline. About 60% were overweight or obese, 85% were white, >40% had a college degree or higher, and 94% were nonsmokers. Participants having CVD, invasive cancer, or treated T2D prior to enrollment were excluded from respective CVD, cancer, or diabetes analyses.

We entered calibrated intake values into Cox regression models (23), along with disease-specific potential confounding factors. We modeled log-HR as linear in the log-transformed dietary variables, which implies a fixed HR for a given fractional increase in a dietary variable, at specified values of the other modeled variables. We present HR estimates for a 20% increment in nutrient intake. For the macronutrient components under consideration, a 20% increase is well within the intake variation estimated in WHI cohorts. Specifically, the FFQ geometric means (95% confidence regions) in the combined cohorts (n = 81,954) are 41.4 (15.4, 111.0) g/d for animal protein, 0.11 (0.06, 0.21) for animal protein density, 18.5 (8.2, 42.0) g/d for plant protein, 0.05 (0.03, 0.08) for plant protein density, 15.0 (6.4, 35.0) g/d for fiber, 0.04 (0.02, 0.08) for fiber density, 38.5 (11.6, 127.4) g/d for added sugars, and 0.10 (0.04, 0.26) for added sugars density.
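Under a model with log-HR linear in log-intake, with coefficient β, the HR for multiplying intake by a factor c is c^β, so the HR for a 20% increment is 1.2^β. The following Python sketch converts between β and increment-specific HRs. The function names are hypothetical. The inputs are the coronary heart disease and total CVD estimates quoted in the Abstract, and the β values are derived here only as illustrations.

```python
import math


def hr_for_increment(beta, fraction=0.20):
    """HR for a fractional intake increase when log-HR = beta * log(intake)."""
    return math.exp(beta * math.log(1.0 + fraction))


def beta_from_hr(hr, fraction=0.20):
    """Log-log coefficient implied by an HR reported for a given increment."""
    return math.log(hr) / math.log(1.0 + fraction)


# Animal protein density and CHD: HR 1.20 per 20% increment (from the Abstract).
beta_chd = beta_from_hr(1.20)               # equals 1.0
hr_chd_10 = hr_for_increment(beta_chd, 0.10)  # same association per 10% increment

# Fiber density and total CVD: HR 0.89 per 20% increment (from the Abstract).
beta_fiber = beta_from_hr(0.89)             # negative, i.e. inverse association
hr_fiber_40 = hr_for_increment(beta_fiber, 0.40)
```

Because the HR depends only on the ratio of intakes, the same 20% increment HR applies at any starting intake, provided the other modeled variables are held fixed.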

As in reference 6, we stratified baseline hazard rates in the Cox model analyses on baseline age (i.e., year 1 in DM-C, enrollment in OS) in 5-y categories, race/ethnicity, on cohort (DM-C or OS), and, in the DM-C, also on participation in the WHI hormone therapy trials (estrogen, estrogen placebo, estrogen plus progestin, estrogen plus progestin placebo, not randomized). Importantly, for density variables the (log-transformed) macronutrient components were considered in conjunction with total protein density, total carbohydrate density and total energy (each log-transformed). This implies that HRs for macronutrient component density variables estimate a disease risk factor beyond that for the specific macronutrient density. Similarly, for absolute intake variables the (log-transformed) macronutrient components were considered in conjunction with (log-transformed) total protein and total carbohydrate intake, so that HRs for the macronutrient variable estimate a disease risk factor beyond that for the specific macronutrient intake. The set of disease-specific potential confounding factors considered are those listed in Supplemental Table 1, exclusive of BMI. Briefly, CVD outcome analyses included age (linear); family income; education; cigarette smoking history; alcohol consumption; leisure physical activity; any dietary supplement use; prior menopausal hormone use; hypertension; personal history of cancer; family history of myocardial infarction, stroke, or diabetes; use of medications to lower blood pressure, blood lipids, or blood glucose; and season in which the FFQ was completed. Invasive cancer analyses included these same variables, exclusive of personal history of CVD and of family history of myocardial infarction, stroke, or diabetes, and inclusive of Gail model 5-y breast cancer risk score, family history of colorectal cancer, and personal history of colon polyp removal. 
T2D analyses included the same variables as the CVD analyses except for family history of myocardial infarction or stroke. The proportion with missing data was generally low, but ≤20% of participants had missing data on ≥1 modeled covariate in some analyses. Participants were excluded from outcome-specific analyses if any modeled covariate was missing. Based on sensitivity analyses that dropped covariates having relatively high missingness, thereby including additional participants, this exclusion was not expected to materially affect disease association HR estimates.

We defined disease occurrence time for a “case” developing a study outcome as days from “baseline” (year 1 in the DM-C and enrollment in the OS) to diagnosis. We defined censoring time for “noncases” as days from baseline to the earliest of date of death without the outcome under study, last contact, or March 31, 2005 for heart failure, September 30, 2010 for other CVD incidence outcomes, or February 28, 2020 for cancer, diabetes, and mortality outcomes. Because of uncertainty in the coefficients in the calibrated intake estimating equations, a “sandwich-type” estimator was used to estimate the variance for the log-HR parameter estimates in calibrated intake analyses (24, 25, 26). We present disease rates and numbers of included participants with events during follow-up in Supplemental Table 2.
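The event and censoring time definitions above can be written as a short Python sketch. This is an illustration rather than the study code. The function name, the outcome group keys, and the rule that a diagnosis dated after the censoring date is treated as censored are assumptions; the three administrative end dates are those stated in the text.

```python
from datetime import date

# Administrative end of follow-up by outcome group (dates from the text).
ADMIN_END = {
    "heart_failure": date(2005, 3, 31),
    "other_cvd": date(2010, 9, 30),
    "cancer_diabetes_mortality": date(2020, 2, 28),
}


def follow_up(baseline, outcome_group, diagnosis=None, death=None, last_contact=None):
    """Return (days from baseline, is_case) for one participant and outcome."""
    candidates = [ADMIN_END[outcome_group]]
    if death is not None:
        candidates.append(death)
    if last_contact is not None:
        candidates.append(last_contact)
    censor_date = min(candidates)  # earliest of death, last contact, admin end
    # Assumption: only diagnoses on or before the censoring date count as events.
    if diagnosis is not None and diagnosis <= censor_date:
        return (diagnosis - baseline).days, True
    return (censor_date - baseline).days, False


# Hypothetical participant with baseline on 1 January 1996.
case_example = follow_up(date(1996, 1, 1), "other_cvd", diagnosis=date(2001, 1, 1))
censored_example = follow_up(
    date(1996, 1, 1), "other_cvd",
    diagnosis=date(2012, 1, 1), last_contact=date(2015, 1, 1),
)
```

Here `baseline` would be the year 1 visit date for DM-C participants and the enrollment date for OS participants.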

Because BMI could be an important mediator of the relation between macronutrient variables and chronic disease risk, our principal analyses excluded BMI from the disease risk model. However, we also carried out additional analyses with BMI added to the disease risk model for further insight.

Log-transformation of dietary variables is used in modeling for two reasons. First, for dietary variables that take only positive values, log-transformation typically yields regression variables that are close to normally distributed and homoscedastic, and amenable to classical measurement error modeling. Second, log-transformation leads to calibrated values that can be inserted into log-HR models in Cox regression, giving log-HRs that depend linearly on the underlying log-transformed dietary intake under measurement model assumptions. This gives HRs that are constant for a specified fractional increase in the dietary variable under modeling assumptions, and a simple interpretation for dietary variable associations with disease risk.

Linearity of the associations between log-HR and macronutrient component intake was studied by adding quadratic terms in calibrated intake to the log-HR regression equations, and examining evidence for nonzero quadratic coefficients.

Ethics

The WHI is funded primarily by the National Heart, Lung, and Blood Institute. Participants provided written informed consent for their overall WHI, NPAAS, and NPAAS-FS activities. Related protocols were approved by the Institutional Review Boards at the Fred Hutchinson Cancer Research Center and at each participating clinical center (clinicaltrials.gov identifier: NCT00000611).

Results

Biomarker equations meeting a 36% CV-R2 criterion were developed for (log-transformed) animal protein, fiber, and added sugars, both for absolute intake and for densities, but biomarkers meeting our 36% CV-R2 criterion did not emerge for plant protein or its density. CV-R2 values for macronutrient component biomarkers meeting this criterion are shown in Table 2, and the details of the biomarker specifications are shown in Supplemental Tables 3–5. Each biomarker relies on both serum and urine metabolites, with some dependencies also on established biomarkers and baseline FFQ measures. Also, each of the 4 instruments used for metabolite profiling contributes to these specifications. The biomarker specification is relatively strong for animal protein and animal protein density, with respective CV-R2 values of 46.5% and 42.5%, whereas CV-R2 values for fiber and added sugars and their densities are only a little above the 36% threshold.

TABLE 2.

Linear regression percentage of feeding study intake explained by serum and urine metabolite profiles, total energy and protein biomarkers, and baseline FFQ intakes in NPAAS-FS conducted during 2010–2014 in 153 Seattle Women's Health Initiative participants¹

Measure | Animal protein, g/d | Animal protein density | Fiber, g/d | Fiber density | Added sugars, g/d | Added sugars density
R2, % | 72.6 | 72.0 | 59.6 | 59.8 | 59.8 | 61.7
CV-R2, % | 46.5 | 42.5 | 38.2 | 36.7 | 37.7 | 37.5
¹ R2 is percentage of log-transformed feeding study intake variation explained, and CV-R2 is the corresponding cross-validated percentage of log-feeding study intake variation explained, by linear regression analyses. NPAAS-FS, Nutrition and Physical Activity Assessment Study Feeding Study.

These biomarker values were calculated for participants in NPAAS who were not in NPAAS-FS (n = 436), and the log biomarker-based intake assessments were regressed linearly on concurrent log-FFQ intake and participant characteristics. R2 values were adjusted for temporal biomarker variation by dividing by the following paired biomarker sample correlations: 0.330 and 0.367, respectively, for animal protein and its density; 0.853 and 0.356 for fiber and its density; and 0.518 and 0.488 for added sugars and its density. Table 3 shows adjusted R2 values for concurrent FFQ assessments in these linear equations, as well as the adjusted total regression R2 values, in NPAAS. Calibration equations meeting a 36% adjusted total R2 criterion could be developed for animal protein and its density, for fiber density, and for added sugars, but not for fiber or added sugars density. Details of potential calibration equations are given in Supplemental Tables 6–8. The adjusted R2 value for added sugars was close to the threshold value and, importantly, the log-FFQ added sugars variable contributed a very small amount to the adjusted R2 value. Therefore, we chose to proceed with disease association analyses for animal protein and its density and for fiber density, but not for fiber, and not for added sugars or its density. The sample correlations between calibrated log–animal protein density and calibrated log–total protein density in the cohorts (n = 81,954) used for disease association analyses were 0.61, 0.70, and 0.58 according to whether CVD, cancer, or T2D covariates were considered in calibration equation development. Corresponding sample correlations for calibrated log–fiber density and calibrated log–total carbohydrate density were 0.33, 0.34, and 0.33.

TABLE 3.

Adjusted R2 values in linear regression of log–macronutrient component biomarkers on corresponding log-FFQ intake and disease-specific covariates in NPAAS (n = 436), conducted during 2007–2009 in Women's Health Initiative participants enrolled during 1994–1998 at 9 US clinical centers¹

Adjusted R2 values²
Covariate set | Equation component | Animal protein | Animal protein density | Fiber density | Added sugars
CVD | Log(FFQ)³ | 29.8 | 43.9 | 24.6 | 4.3
CVD | Total | 71.6 | 64.1 | 72.0 | 39.9
Cancer | Log(FFQ)³ | 28.3 | 36.8 | 25.2 | 2.2
Cancer | Total | 73.8 | 54.4 | 70.3 | 37.9
T2D | Log(FFQ)³ | 31.1 | 40.7 | 28.2 | 2.6
T2D | Total | 80.6 | 62.6 | 76.9 | 35.2

¹ CVD, cardiovascular disease; NPAAS, Nutrition and Physical Activity Assessment Study; NPAAS-FS, Nutrition and Physical Activity Assessment Study Feeding Study; T2D, type 2 diabetes.

² Adjusted R2 values from linear regression of biomarker log-intake on log-FFQ intake and participant characteristics selected in model building, divided by paired correlation of replicate log-intake biomarker measures for participants (n = 14) in both NPAAS and NPAAS-FS. The estimated correlations are 0.330 and 0.367 for animal protein and its density; 0.356 for fiber density; and 0.518 for added sugars.

³ Adjusted R2 values for components of calibration equations formed by fitting components individually and rescaling adjusted R2 values so that they add to the total regression adjusted R2 value.

Tables 4–6 show HR estimates and 95% CIs for a 20% increment in calibrated animal protein density (left side) and for a 20% increment in calibrated fiber density (right side), in disease risk analyses that also include calibrated total protein density, calibrated total carbohydrate density, and DLW-calibrated energy.

TABLE 4.

Cardiovascular disease risk HRs and 95% CIs for 20% increments in animal protein density (left) and fiber density (right), in analyses that also included total protein density, total carbohydrate density, and total energy, in Women's Health Initiative cohorts of postmenopausal US women enrolled during 1993–1998 at 40 US clinical centers and followed through 2020¹

Columns 2–4: HR estimation in analyses that include an animal protein density factor. Columns 5–7: HR estimation in analyses that include a fiber density factor.
Outcome (n participants with events) | Animal protein density, HR (95% CI)² | Protein density, HR (95% CI) | Carbohydrate density, HR (95% CI) | Fiber density, HR (95% CI) | Protein density, HR (95% CI) | Carbohydrate density, HR (95% CI)
Nonfatal MI (2102) | 1.19 (0.98, 1.43) | 0.74 (0.53, 1.05) | 0.86 (0.70, 1.06) | 0.83 (0.74, 0.94) | 1.01 (0.86, 1.17) | 0.99 (0.75, 1.30)
Coronary death (3254) | 1.23 (1.04, 1.44) | 0.71 (0.53, 0.94) | 0.95 (0.79, 1.14) | 0.82 (0.75, 0.91) | 1.02 (0.90, 1.15) | 1.10 (0.87, 1.39)
Total CHD³ (2869) | 1.20 (1.02, 1.42) | 0.75 (0.56, 1.01) | 0.90 (0.74, 1.08) | 0.80 (0.72, 0.89) | 1.05 (0.92, 1.20) | 1.09 (0.86, 1.38)
Ischemic stroke (1776) | 1.02 (0.84, 1.25) | 0.86 (0.60, 1.25) | 0.83 (0.68, 1.03) | 0.92 (0.81, 1.04) | 0.89 (0.76, 1.04) | 0.88 (0.65, 1.18)
Hemorrhagic stroke (395) | 1.28 (0.84, 1.94) | 0.79 (0.37, 1.70) | 1.27 (0.77, 2.10) | 0.88 (0.68, 1.14) | 1.22 (0.88, 1.70) | 1.33 (0.73, 2.43)
Total stroke³ (2425) | 1.07 (0.90, 1.27) | 0.86 (0.63, 1.17) | 0.86 (0.72, 1.03) | 0.91 (0.82, 1.02) | 0.96 (0.84, 1.10) | 0.91 (0.70, 1.17)
CHD + stroke³ (5023) | 1.12 (0.99, 1.26) | 0.82 (0.66, 1.03) | 0.88 (0.76, 1.01) | 0.85 (0.79, 0.92) | 1.01 (0.91, 1.11) | 1.01 (0.84, 1.21)
CABG + PCI³ (3119) | 0.94 (0.82, 1.08) | 1.13 (0.87, 1.46) | 0.88 (0.73, 1.05) | 0.98 (0.89, 1.08) | 1.02 (0.90, 1.16) | 0.92 (0.73, 1.15)
Total CVD³,⁴ (6964) | 1.05 (0.95, 1.16) | 0.92 (0.77, 1.10) | 0.89 (0.79, 1.01) | 0.89 (0.83, 0.94) | 1.02 (0.94, 1.11) | 1.02 (0.87, 1.18)
Heart failure (1381) | 0.95 (0.76, 1.18) | 1.14 (0.76, 1.72) | 0.72 (0.56, 0.91) | 0.87 (0.75, 1.01) | 1.05 (0.87, 1.27) | 0.85 (0.59, 1.21)

¹ CABG/PCI, coronary artery bypass graft or percutaneous coronary intervention; CHD, coronary heart disease; CVD, cardiovascular disease; DM-C, dietary modification comparison group; MI, myocardial infarction; OS, Observational Study.

² HR estimates and 95% CIs are based on Cox models with baseline hazard rates stratified on study component (DM-C or OS), hormone therapy trial status (estrogen plus progestin, estrogen plus progestin placebo, estrogen-alone, estrogen-alone placebo, not randomized), age at enrollment (50–54, 55–59, 60–64, 65–69, 70–74, ≥75), and race/ethnicity, and with adjustment for a disease-specific set of potential confounding factors. Note that the animal protein density HR reflects disease associations with animal protein density at specified total protein density, and that the fiber density HR reflects disease associations with fiber density at specified total carbohydrate density.

³ Time to event for composite outcomes defined as time to the earliest of the outcomes being combined.

⁴ Total CVD comprised of CHD + CABG + PCI + stroke.

TABLE 6.

Type 2 diabetes HRs and 95% CIs for 20% increments in animal protein density (left) and fiber density (right) in analyses that also included total protein density, total carbohydrate density, and total energy, in Women's Health Initiative cohorts of postmenopausal US women enrolled during 1993–1998 at 40 US clinical centers and followed through 2020¹

Columns 2–4: HR estimation in analyses that include an animal protein density factor. Columns 5–7: HR estimation in analyses that include a fiber density factor.
Outcome (n participants with outcome) | Animal protein density, HR (95% CI)² | Protein density, HR (95% CI) | Carbohydrate density, HR (95% CI) | Fiber density, HR (95% CI) | Protein density, HR (95% CI) | Carbohydrate density, HR (95% CI)
T2D (12,145) | 1.03 (0.95, 1.12) | 1.11 (0.97, 1.28) | 0.74 (0.66, 0.82) | 0.93 (0.88, 0.98) | 1.20 (1.12, 1.29) | 0.83 (0.73, 0.93)

¹ DM-C, dietary modification comparison group; OS, Observational Study; T2D, type 2 diabetes.

² HR estimates and 95% CIs are based on Cox models with baseline hazard rates stratified on study component (DM-C or OS), hormone therapy trial status (estrogen plus progestin, estrogen plus progestin placebo, estrogen-alone, estrogen-alone placebo, not randomized), age at enrollment (50–54, 55–59, 60–64, 65–69, 70–74, ≥75), and race/ethnicity, and with adjustment for a disease-specific set of potential confounding factors. Note that the animal protein density HR reflects disease associations with animal protein density at specified total protein density, and that the fiber density HR reflects disease associations with fiber density at specified total carbohydrate density.

Table 4 shows results for CVDs. The animal protein density factor has an elevated HR of 1.23 (95% CI: 1.04, 1.44) for coronary death, whereas the coronary death HR for total protein density is a reduced 0.71 (95% CI: 0.53, 0.94) after separating off the animal protein component. HRs for myocardial infarction and total CHD incidence were consistent with these coronary death HRs. Otherwise cardiovascular outcomes do not depend significantly on the animal protein density factor or on total protein density in these analyses. From the right side of Table 4 one sees reduced HR factors for a 20% increment in fiber density for several CVD outcomes, whereas HRs for total carbohydrate density in these analyses do not differ significantly from the null. For example, fiber density factor HRs are 0.80 (95% CI: 0.72, 0.89) for total CHD, and 0.89 (95% CI: 0.83, 0.94) for total CVD.
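As a reading aid for the HR tables, a reported HR and 95% CI can be converted back to an approximate log-HR, standard error, and two-sided P value. The sketch below assumes the intervals are Wald-type intervals that are symmetric on the log scale with a 1.96 critical value; this is an assumption made for illustration, and the rounding of published values limits the precision of anything recovered this way. The function names are hypothetical.

```python
import math


def wald_from_ci(hr: float, lower: float, upper: float, z_crit: float = 1.96):
    """Approximate log-HR, SE, z statistic, and two-sided P from an HR and its CI.

    Assumes a Wald interval that is symmetric on the log scale.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = log_hr / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return log_hr, se, z, p_two_sided


def coef_per_log_unit(hr_20pct: float) -> float:
    """Coefficient of log-intake implied by an HR for a 20% intake increment.

    A 20% increment adds log(1.2) to log-intake, so HR = exp(beta * log(1.2)).
    """
    return math.log(hr_20pct) / math.log(1.2)


# Total CHD entries from Table 4.
animal_protein = wald_from_ci(1.20, 1.02, 1.42)
fiber = wald_from_ci(0.80, 0.72, 0.89)

print("animal protein density: z = %.2f, P = %.3f" % animal_protein[2:])
print("fiber density: z = %.2f, P = %.5f" % fiber[2:])
print("implied log-intake coefficient for HR 0.80: %.2f" % coef_per_log_unit(0.80))
```

Under these assumptions an interval that excludes 1 corresponds to P < 0.05, which is the sense in which the text calls an HR significant.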

In contrast, Table 5 shows little association between animal protein density and cancer risk beyond that attributable to total protein density, with a nominally significant lymphoma elevation as a possible exception. Corresponding total protein density HRs are also mostly nonsignificant in these analyses. Fiber density was not significantly associated with cancer risk beyond that attributable to total carbohydrate density, with an ovarian cancer risk elevation as a possible exception. Total carbohydrate density, however, remained inversely related to cancer risk for several cancer sites after allowing fiber density to have a separate HR factor. For example, 20% increment HRs are 0.85 (95% CI: 0.74, 0.97) for obesity-related cancers and 0.88 (95% CI: 0.79, 0.97) for total invasive cancers.

TABLE 5.

Cancer incidence HRs and 95% CIs for 20% increments in animal protein density (left) and fiber density (right) in analyses that also included total protein density, total carbohydrate density, and total energy, in Women's Health Initiative cohorts of postmenopausal US women enrolled during 1993–1998 at 40 US clinical centers and followed through 2020¹

Columns 2–4: HR estimation in analyses that include an animal protein density factor. Columns 5–7: HR estimation in analyses that include a fiber density factor.
Cancer site (n participants with events) | Animal protein density, HR (95% CI)² | Protein density, HR (95% CI) | Carbohydrate density, HR (95% CI) | Fiber density, HR (95% CI) | Protein density, HR (95% CI) | Carbohydrate density, HR (95% CI)
Breast (5139) | 1.03 (0.91, 1.17) | 0.92 (0.75, 1.12) | 0.84 (0.74, 0.95) | 0.97 (0.90, 1.06) | 0.96 (0.87, 1.06) | 0.85 (0.72, 1.00)
Colon (1060) | 1.28 (0.97, 1.70) | 0.59 (0.38, 0.92) | 0.93 (0.73, 1.18) | 0.99 (0.82, 1.19) | 0.82 (0.67, 1.01) | 0.81 (0.57, 1.15)
Rectum (158) | 0.81 (0.50, 1.31) | 0.76 (0.31, 1.88) | 1.21 (0.59, 2.50) | 1.39 (0.86, 2.25) | 0.51 (0.29, 0.90) | 0.88 (0.34, 2.26)
Endometrium (881) | 0.82 (0.65, 1.02) | 1.07 (0.70, 1.62) | 0.82 (0.59, 1.14) | 1.21 (0.98, 1.48) | 0.75 (0.59, 0.97) | 0.71 (0.47, 1.07)
Ovary (471) | 0.97 (0.72, 1.30) | 0.95 (0.56, 1.63) | 1.31 (0.81, 2.10) | 1.40 (1.08, 1.81) | 0.84 (0.61, 1.15) | 0.85 (0.49, 1.49)
Leukemia (439) | 0.95 (0.67, 1.35) | 0.85 (0.47, 1.52) | 0.70 (0.53, 0.91) | 1.02 (0.78, 1.32) | 0.72 (0.53, 0.99) | 0.60 (0.37, 0.98)
Lung (1426) | 0.97 (0.77, 1.23) | 1.07 (0.73, 1.57) | 0.88 (0.69, 1.12) | 0.89 (0.77, 1.03) | 1.06 (0.89, 1.27) | 1.03 (0.77, 1.38)
Lymphoma (804) | 1.45 (1.01, 2.07) | 0.74 (0.43, 1.27) | 0.85 (0.76, 1.15) | 0.90 (0.74, 1.11) | 1.27 (1.00, 1.62) | 0.83 (0.57, 1.22)
Bladder (166) | 0.73 (0.50, 1.06) | 1.28 (0.63, 2.61) | 0.82 (0.44, 1.54) | 1.57 (0.96, 2.58) | 0.71 (0.42, 1.21) | 0.53 (0.21, 1.34)
Kidney (309) | 0.89 (0.55, 1.44) | 1.38 (0.62, 3.10) | 0.92 (0.55, 1.54) | 0.83 (0.59, 1.17) | 1.23 (0.82, 1.84) | 1.21 (0.64, 2.32)
Pancreas (416) | 1.04 (0.69, 1.55) | 1.04 (0.55, 1.96) | 1.14 (0.71, 1.83) | 0.88 (0.67, 1.16) | 1.13 (0.80, 1.59) | 1.32 (0.75, 2.30)
Obesity-related³,⁴ (7313) | 1.01 (0.91, 1.12) | 0.90 (0.76, 1.06) | 0.86 (0.77, 0.95) | 1.00 (0.93, 1.07) | 0.91 (0.84, 0.98) | 0.85 (0.74, 0.97)
Total invasive³ (12,804) | 1.04 (0.96, 1.12) | 0.89 (0.79, 1.01) | 0.88 (0.81, 0.95) | 0.98 (0.93, 1.04) | 0.94 (0.88, 1.00) | 0.88 (0.79, 0.97)

¹ DM-C, dietary modification comparison group; OS, Observational Study.

² HR estimates and 95% CIs are based on Cox models with baseline hazard rates stratified on study component (DM-C or OS), hormone therapy trial status (estrogen plus progestin, estrogen plus progestin placebo, estrogen-alone, estrogen-alone placebo, not randomized), age at enrollment (50–54, 55–59, 60–64, 65–69, 70–74, ≥75), and race/ethnicity, and with adjustment for a disease-specific set of potential confounding factors. Note that the animal protein density HR reflects disease associations with animal protein density at specified total protein density, and that the fiber density HR reflects disease associations with fiber density at specified total carbohydrate density.

³ Time to event for composite outcomes defined as time to the earliest of the outcomes being combined.

⁴ Obesity-related cancer defined here as breast, colon, rectum, endometrium, or kidney cancer.

For T2D (Table 6), neither the animal protein density factor nor total protein density was significantly associated with risk in the analysis that included the animal protein density factor. In contrast, fiber density had a 20% increment HR of 0.93 (95% CI: 0.88, 0.98) relative to that for total carbohydrate density. The corresponding total carbohydrate density also retained a strong inverse association with diabetes risk after including a separate fiber density HR factor in the analysis, with a 20% increment HR of 0.83 (95% CI: 0.73, 0.93).

The analyses in Tables 4–6 were extended by adding quadratic terms in the logarithms of the macronutrient components to the HR model. These further analyses suggest associations for clinical outcome log-HRs that are substantially linear as a function of both calibrated log–animal protein density and calibrated log–fiber density. Specifically, none of the quadratic coefficients for log–animal protein density when added to these analyses differed significantly from zero, whereas for log–fiber density quadratic coefficients were significant only for nonfatal myocardial infarction (P = 0.046) and for CHD plus stroke (P = 0.04), each with a small positive estimated quadratic coefficient.

Supplemental Tables 9–11 display results from the same sets of analyses as Tables 4–6, but using FFQ assessments without biomarker calibration. HR estimates for intake without biomarker calibration were mostly consistent with those shown in Tables 4–6, but tended to be much attenuated toward the null. In spite of this attenuation, a few of the HRs without biomarker calibration were slightly more significant than their counterparts with biomarker calibration, as can occur because the disease association analyses of Tables 4–6 allow for random variation in the calibration equation coefficient estimates.
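The attenuation toward the null seen with uncalibrated intake is the pattern expected when an exposure is measured with error. A minimal simulation sketch of this effect follows, using ordinary linear regression rather than the Cox models of this report, and assuming classical additive error that is independent of true intake. Both are simplifying assumptions: FFQ error is generally more complex, and all variable names and parameter values here are hypothetical.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000
true_slope = 1.0

# True log-intake, a noisy self-report of it, and an outcome driven by true intake.
true_intake = [rng.gauss(0.0, 1.0) for _ in range(n)]
self_report = [x + rng.gauss(0.0, 1.0) for x in true_intake]  # error variance = signal variance
outcome = [true_slope * x + rng.gauss(0.0, 1.0) for x in true_intake]

slope_true_exposure = ols_slope(true_intake, outcome)
slope_noisy_exposure = ols_slope(self_report, outcome)

# With equal signal and error variance, the expected attenuation factor is 1 / (1 + 1) = 0.5.
print(f"slope using true intake: {slope_true_exposure:.3f}")
print(f"slope using self-report: {slope_noisy_exposure:.3f}")
```

In this simplified setting the slope estimated from the noisy exposure shrinks by roughly the ratio of true-intake variance to self-report variance, which is the bias that calibration against an objective biomarker is intended to correct.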

Analyses in Tables 4–6 were also repeated with BMI added to the disease risk model, leading to only minimal change in HR factors and corresponding CIs for both animal protein density and fiber density (Supplemental Tables 12–14).

Analyses like those shown in Tables 4–6 were also carried out for calibrated (log-transformed) absolute animal protein, along with absolute total protein and absolute total carbohydrate intake, in analyses that did not include log–total energy, with and without biomarker intake calibration (Supplemental Tables 15–17). The resulting HR factors for a 20% increment in calibrated animal protein did not differ significantly from the null for any of the CVD outcomes. For obesity-related and total invasive cancer and for T2D, however, calibrated absolute protein intake was associated with higher risk, with respective 20% increment HRs of 1.62 (95% CI: 1.20, 2.18), 1.40 (95% CI: 1.11, 1.75), and 1.54 (95% CI: 1.18, 2.01), elevations that can presumably be attributed to correlations with total energy intake, and elevations that were not evident in corresponding analyses without biomarker calibration. These HRs were attenuated toward the null when BMI was added to the disease risk model (Supplemental Tables 18–20) but remained elevated for T2D, with an HR of 1.70 (95% CI: 1.31, 2.19).

Discussion

In our previous macronutrient intake and chronic disease report (6), protein density, with or without biomarker calibration, was not significantly related to incidence rates for the CVD outcomes considered. Recent nutritional epidemiology literature, however, provides evidence that diets based substantially on plant rather than animal sources might be preferable (7, 8, 9, 10). We were able to develop a biomarker for animal protein and for animal protein density using serum and 24-h urine metabolomics measures. The metabolites selected in biomarker development have a direct or indirect association with dietary components. In particular, metabolites that contributed most to CV-R2 are urinary creatinine and serum cholesteryl ester. Creatinine is a major urinary nitrogenous compound associated with muscle metabolism. It is well known that dietary protein consumption increases serum creatinine concentrations (27), which results in higher urinary creatinine concentrations. Cholesteryl ester is a dietary component of dairy and meat products (28). The new biomarkers led to biomarker-calibrated intake estimates for animal protein and for animal protein density in WHI cohorts. In analyses that also included biomarker-calibrated total protein density, total carbohydrate density, and total energy, we found evidence for a comparatively elevated coronary death risk at higher animal protein density, with a corresponding significant risk reduction at higher total protein density after allowing for animal protein density (Table 4). In our previous report we found modestly reduced obesity-related and total invasive cancer risk at higher calibrated protein density, but after separating out animal protein density this reduction was only suggestive, whereas the animal protein density factor was mostly not significantly associated with these cancer categories. 
Note, however, the intriguing exception that protein density is inversely related to colon cancer risk after including an HR factor for animal protein density, perhaps suggesting a benefit for plant protein. For T2D our previous findings included a risk elevation at higher protein density, whereas after including an animal protein density factor, neither it, nor total protein density, related significantly to risk. Calibrated carbohydrate density remained inversely associated with risk for these chronic diseases following the inclusion of an animal protein density factor, in most of these analyses.

Our current analyses also show that our previously reported (6) inverse carbohydrate density associations with CVDs, cancers, and T2D could be largely attributed to the dietary fiber component for CVD and T2D, but not for invasive cancer. There is a substantial epidemiological literature relating self-reported dietary fiber, or self-reported vegetable and fruit intake, to lower chronic disease risk (e.g., 15, 29, 30, 31, 32). We were able to develop a suitable biomarker for fiber and for fiber density, primarily based on serum and 24-h urine metabolites. In particular, indole-3-propionate in serum and methylguanidine in urine showed substantial contributions to biomarker CV-R2. Indole-3-propionate is bacterially synthesized in the gut and it is known to be highly associated with fiber and carbohydrate intake (29). Methylguanidine is derived from protein catabolism as well as creatinine, and it is rich in beef (30). Concerning CVD risk, our modeled fiber density factor is inversely associated with CHD, total CVD, and heart failure, with corresponding total carbohydrate density no longer significantly related to these diseases after making a fiber density allowance. This is not the case for total invasive and site-specific cancers, however, for which the fiber density factor is nonsignificant whereas total carbohydrate density remains inversely related to cancer risk after separating out dietary fiber. In comparison, fiber density was inversely related to T2D risk beyond that attributable to total carbohydrate, whereas total carbohydrate density also remained inversely related to disease risk.

We were also able to develop metabolomics-based biomarkers for sugars and sugars density, though the FFQ added sugars assessment did not lead to a suitable corresponding calibration equation, preventing us from using these biomarkers in disease association analyses. The sugars biomarkers involved biologically plausible metabolites, including urinary maltose, gentiobiose, and dodecanoic acid methyl ester, each contributing to CV-R2. Both maltose and gentiobiose are sugars that occur naturally in plant-based foods (31, 32). Maltose is also a component of added sugar (33). Dodecanoic acid methyl ester is a derivative of dodecanoic acid, which is an SFA found naturally in many plant and animal fats and oils.

Strengths of this report include novel intake biomarkers that rely on metabolomic profiles from both serum and 24-h urine, and a substantial biomarker study for calibration equation development, both nested within large WHI cohorts having careful outcome ascertainment and long-term follow-up for disease incidence.

Limitations include those that typically attend observational studies in the nutritional epidemiology area, including potential residual confounding and measurement error in dietary exposure assessment. The latter limitation is potentially dominating, if unattended. Here this limitation is mitigated by the use of objective intake biomarkers to correct dietary self-report assessments for measurement error, and to strengthen such assessments more generally. Note that the comparatively large median CV in the untargeted GC-MS metabolite quality control analyses suggests that more precise and comprehensive untargeted assessments could lead to a fuller use of urine metabolites in biomarker development. Also, the study is restricted to postmenopausal women in the United States.

In summary, novel metabolomics-based intake biomarkers have been proposed for key protein and carbohydrate dietary components. In biomarker-calibrated dietary intake analyses, CVD risk tends to be elevated with higher-animal-protein diets and reduced with plant-based higher-fiber diets, cancer risk tends to be reduced with higher-total-carbohydrate diets, and T2D risk is also reduced with higher-carbohydrate diets, especially if the diet is also high in dietary fiber. Accordingly, plant-based, fiber-rich diets are evidently associated with relatively low chronic disease risk in postmenopausal US women.

Acknowledgments

We acknowledge the following investigators in the WHI Program: Program Office (National Heart, Lung, and Blood Institute, Bethesda, MD): Jacques Rossouw, Shari Ludlam, Joan McGowan, Leslie Ford, and Nancy Geller. Clinical Coordinating Center (Fred Hutchinson Cancer Research Center, Seattle, WA): Garnet Anderson, Ross Prentice, Andrea LaCroix, and Charles Kooperberg. Investigators and Academic Centers: (Brigham and Women's Hospital, Harvard Medical School, Boston, MA) JoAnn E Manson; (MedStar Health Research Institute/Howard University, Washington, DC) Barbara V Howard; (Stanford Prevention Research Center, Stanford, CA) Marcia L Stefanick; (The Ohio State University, Columbus, OH) Rebecca Jackson; (University of Arizona, Tucson/Phoenix, AZ) Cynthia A Thomson; (University at Buffalo, Buffalo, NY) Jean Wactawski-Wende; (University of Florida, Gainesville/Jacksonville, FL) Marian Limacher; (University of Iowa, Iowa City/Davenport, IA) Jennifer Robinson; (University of Pittsburgh, Pittsburgh, PA) Lewis Kuller; (Wake Forest University School of Medicine, Winston-Salem, NC) Sally Shumaker; (University of Nevada, Reno, NV) Robert Brunner. Women's Health Initiative Memory Study (Wake Forest University School of Medicine, Winston-Salem, NC): Mark Espeland. For a list of all the investigators who have contributed to WHI science, please visit: https://www-whi-org.s3.us-west-2.amazonaws.com/wp-content/uploads/WHI-Investigator-Long-List.pdf. The authors’ responsibilities were as follows: RLP, MLN, DR, CZ, GANG, YH, LFT, and JWL designed the research; RLP, MP, MLN, DR, GANG, CZ, LFT, YH, and JWL conducted the research and drafted the manuscript; RLP had primary responsibility for the final content; and all authors participated actively in critical evaluation of the manuscript and read and approved the final manuscript.

Data Availability

Data, codebook, and analytic code used in this report may be accessed in a collaborative mode as described on the Women's Health Initiative website (www.whi.org).

Footnotes

This work was supported by the National Heart, Lung, and Blood Institute (NHLBI), National Institutes of Health, US Department of Health and Human Services (contracts HHSN268201600046C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C, HHSN268201600004C, and HHSN271201600004C); National Institute for Diabetes and Digestive and Kidney Diseases grant P30DK035816; National Cancer Institute grants R01 CA119171 and P30 CA15704, and NIH instrumentation grant S10 OD021562.

The views expressed are those of the authors and do not necessarily represent the views of the NHLBI, the NIH, or the US Department of Health and Human Services. Decisions concerning study design, data collection and analysis, interpretation of the results, the preparation of the manuscript, and the decision to submit the manuscript for publication resided with committees comprised of Women's Health Initiative investigators that included NHLBI representatives. The contents of the article are solely the responsibility of the authors.

Author disclosures: The authors report no conflicts of interest. MLN is an Associate Editor on the Journal of Nutrition and played no role in the Journal's evaluation of the manuscript.

Supplemental Tables 1–20 are available from the “Supplementary data” link in the online posting of the article and from the same link in the online table of contents at https://academic.oup.com/jn/.

Supplementary Material

mmc1-sup1-supinfo.docx (82.4KB, docx)

References

1. Zheng C, Nagana Gowda GA, Raftery D, Neuhouser ML, Tinker LF, Prentice RL, Beresford SAA, Zhang Y, Djukovic D, Gu H, et al. Development of potential metabolomics-based biomarkers of protein, carbohydrate and fat intakes using a controlled feeding study. Eur J Nutr. 2021;60:4207–4218. doi: 10.1007/s00394-021-02577-1.
2. Lampe JW, Huang Y, Neuhouser ML, Tinker LF, Song X, Schoeller DA, Kim S, Raftery D, Di C, Zheng C, et al. Dietary biomarker evaluation in a controlled feeding study in women from the Women's Health Initiative cohort. Am J Clin Nutr. 2017;105(2):466–475. doi: 10.3945/ajcn.116.144840.
3. Women's Health Initiative Study Group. Design of the Women's Health Initiative clinical trial and observational study. Control Clin Trials. 1998;19(1):61–109. doi: 10.1016/s0197-2456(97)00078-0.
4. Bingham SA. Urine nitrogen as a biomarker for the validation of dietary protein intake. J Nutr. 2003;133(3):921S–924S. doi: 10.1093/jn/133.3.921S.
5. Schoeller DA. Recent advances from application of doubly-labeled water to measurement of human energy expenditure. J Nutr. 1999;129(10):1765–1768. doi: 10.1093/jn/129.10.1765.
6. Prentice RL, Pettinger M, Neuhouser ML, Raftery D, Zheng C, Gowda N, Huang Y, Tinker LF, Howard BV, Manson JE, et al. Biomarker-calibrated macronutrient intake and chronic disease risk among postmenopausal women. J Nutr. 2021;151(8):2330–2341. doi: 10.1093/jn/nxab091.
7. Kelemen LE, Kushi LH, Jacobs DR, Cerhan JR. Associations of dietary protein with disease and mortality in a prospective study of postmenopausal women. Am J Epidemiol. 2005;161(3):239–249. doi: 10.1093/aje/kwi038.
8. Halton TL, Willett WC, Liu S, Manson JE, Albert CM, Rexrode K, Hu FB. Low-carbohydrate diet score and the risk of coronary heart disease in women. N Engl J Med. 2006;355:1991–2002. doi: 10.1056/NEJMoa055317.
9. de Koning L, Fung TT, Liao X, Chiuve SE, Rimm EB, Willett WC, Spiegelman D, Hu FB. Low-carbohydrate diet scores and risk of type 2 diabetes in men. Am J Clin Nutr. 2011;93:844–850. doi: 10.3945/ajcn.110.004333.
10. Song M, Fung TT, Hu FB, Willett WC, Longo VD, Chan AT, Giovannucci EL. Association of animal and plant protein intake with all-cause and cause-specific mortality. JAMA Intern Med. 2016;176(10):1453–1463. doi: 10.1001/jamainternmed.2016.4182.
11. Fung TT, van Dam RM, Hankinson SE, Stampfer M, Willett WC, Hu F. Low-carbohydrate diets and all-cause and cause-specific mortality: two cohort studies. Ann Intern Med. 2010;153(5):289–298. doi: 10.1059/0003-4819-153-5-201009070-00003.
12. Noto H, Goto A, Tsujimoto T, Noda M. Low-carbohydrate diets and all-cause mortality: a systematic review and meta-analysis of observational studies. PLoS One. 2013;8(1):e55030. doi: 10.1371/journal.pone.0055030.
13. Dehghan M, Mente A, Zhang X, Swaminathan S, Li W, Mohan V, Iqbal R, Kumar R, Wentzel-Viljoen E, Rosengren A, et al. Associations of fats and carbohydrate intake with cardiovascular disease and mortality in 18 countries from 5 continents (PURE): a prospective cohort study. Lancet. 2017;390(10107):2050–2062. doi: 10.1016/S0140-6736(17)32252-3.
14. Seidelmann SB, Claggett B, Cheng S, Henglin M, Shah A, Steffen LM, Folsom AR, Rimm EB, Willett WC, Solomon SD. Dietary carbohydrate intake and mortality: a prospective cohort study and meta-analysis. Lancet Public Health. 2018;3(9):e419–e428. doi: 10.1016/S2468-2667(18)30135-X.
15. Ho FK, Gray SR, Welsh P, Petermann-Rocha F, Foster H, Waddell H, Anderson J, Lyall D, Sattar N, Gill JMR, et al. Associations of fat and carbohydrate intake with cardiovascular disease and mortality: a prospective study of UK Biobank participants. BMJ. 2020;368:m688. doi: 10.1136/bmj.m688.
16. Patterson RE, Kristal AR, Tinker LF, Carter RA, Bolton MP, Agurs-Collins T. Measurement characteristics of the Women's Health Initiative food frequency questionnaire. Ann Epidemiol. 1999;9(3):178–187. doi: 10.1016/s1047-2797(98)00055-6.
17. Neuhouser ML, Tinker L, Shaw PA, Schoeller D, Bingham SA, Van Horn L, Beresford SAA, Caan B, Thomson C, Satterfield S, et al. Use of recovery biomarkers to calibrate nutrient consumption self-reports in the Women's Health Initiative. Am J Epidemiol. 2008;167(10):1247–1259. doi: 10.1093/aje/kwn026.
18. Prentice RL, Mossavar-Rahmani Y, Huang Y, Van Horn L, Beresford SA, Caan B, Tinker L, Schoeller D, Bingham S, Eaton CB, et al. Evaluation and comparison of food records, recalls and frequencies for energy and protein assessment using recovery biomarkers. Am J Epidemiol. 2011;174(5):591–603. doi: 10.1093/aje/kwr140.
19. Curb JD, McTiernan A, Heckbert SR, Kooperberg C, Stanford J, Nevitt M, Johnson KC, Proulx-Burns L, Pastore L, Criqui M, et al. Outcomes ascertainment and adjudication methods in the Women's Health Initiative. Ann Epidemiol. 2003;13(9):S122–S128. doi: 10.1016/s1047-2797(03)00048-6.
20. Margolis KL, Qi L, Brzyski R, Bonds DE, Howard BV, Kempainen S, Liu S, Robinson JG, Safford MM, Tinker LF, et al. Validity of diabetes self-reports in the Women's Health Initiative: comparison with medication inventories and fasting glucose measurements. Clinical Trials. 2008;5(3):240–247. doi: 10.1177/1740774508091749.
21. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B Stat Methodol. 1996;58:267–288.
22. Cox DR. Regression models and life-tables (with discussion). J R Stat Soc Series B Methodol. 1972;34:187–220.
23. Huang Y, Zheng C, Tinker LF, Neuhouser ML, Prentice RL. Biomarker-based methods and study designs to calibrate dietary intake for assessing diet-disease associations. J Nutr. [Internet] 2021;nxab420.
24. Prentice RL. Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika. 1982;69(2):331–342.
25. Wang CY, Hsu L, Feng ZD, Prentice RL. Regression calibration in failure time regression. Biometrics. 1997;53(1):131–145.
26. Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement error in nonlinear models, a modern perspective. 2nd ed. Boca Raton (FL): Chapman and Hall/CRC; 2006.
27. Butani L, Polinsky MS, Kaiser BA, Baluarte HJ. Dietary protein intake significantly affects the serum creatinine concentration. Kidney Int. 2002;61(5):1907. doi: 10.1046/j.1523-1755.2002.00342.x.
28. Jacobsen FK, Christensen CK, Mogensen CE, Andreasen F, Heilskov NS. Pronounced increase in serum creatinine concentration after eating cooked meat. BMJ. 1979;1(6170):1049–1050. doi: 10.1136/bmj.1.6170.1049.
29. De Mello VD, Paananen J, Lindstrom J, Lankinen MA, Shi L, Kuusisto J, Pihlajamaki J, Auriola S, Lehtonen M, Rolandsson O, et al. Indolepropionic acid and novel lipid metabolites are associated with a lower risk of type 2 diabetes in the Finnish Diabetes Prevention Study. Sci Rep. 2017;7(1):46337. doi: 10.1038/srep46337.
30. Gonella M, Barsotti G, Lupetti S, Giovanetti S. Factors affecting the metabolic production of methylguanidine. Clin Sci Mol Med. 1975;48:341–347. doi: 10.1042/cs0480341.
31. Qi X, Tester RF. Lactose, maltose, and sucrose in health and disease. Mol Nutr Food Res. 2020;64(8):1901082. doi: 10.1002/mnfr.201901082.
32. Ucar RA, Perez-Diaz IM, Dean LL. Gentiobiose and cellobiose content in fresh and fermenting cucumbers and utilization of such disaccharides by lactic acid bacteria in fermented cucumber juice medium. Food Sci Nutr. 2020;8(11):5798–5810. doi: 10.1002/fsn3.1830.
33. US Department of Health and Human Services, US Department of Agriculture. 2015-2020 dietary guidelines for Americans, 8th ed [Internet]. 2015. Available from: https://health.gov/dietaryguidelines/2015/guidelines/. Accessed October 31, 2021.

Articles from The Journal of Nutrition are provided here courtesy of American Society for Nutrition
