Current Developments in Nutrition. 2025 Apr 5;9(5):107435. doi: 10.1016/j.cdnut.2025.107435

The Dietary Biomarkers Development Consortium: An Initiative for Discovery and Validation of Dietary Biomarkers for Precision Nutrition

Hrishikesh Chakraborty 1,2,⁎, Qi Sun 3,4,5, Shilpa N Bhupathiraju 3,4, Jeannette M Schenk 6, Darya O Mishchuk 7, James R Bain 8, Xuan He 7,9, Jianghao Sun 10, James Harnly 10, William Simmons 2, Daniel Raftery 6,11, Liming Liang 5, John W Newman 9,12,13, Oliver Fiehn 13,14, Clary B Clish 15, Johanna W Lampe 6, Brian J Bennett 9,12, Sandi L Navarro 6, Ying Wang 16, Cheng Zheng 17, Yasmin Mossavar-Rahmani 18, Marjorie L McCullough 16, Ying Huang 6, Ali Shojaie 19, Wentao Zhu 11, Danijel Djukovic 11, Frank Sacks 3, Jonathan Williams 3,4, Francene M Steinberg 9, Sean H Adams 20,21, Frank B Hu 3,4,5, Marian L Neuhouser 6, Carolyn M Slupsky 7,9, Padma Maruvada 22
PMCID: PMC12242990  PMID: 40641655

Abstract

Diet is a complex exposure that affects health across the lifespan. Objective biomarkers that can reliably reflect intake of nutrients, foods, and dietary patterns with sufficient accuracy are an important tool for assessing associations of diet with health outcomes. Advances in metabolomics, coupled with feeding trials and high-dimensional bioinformatics analyses, pave the way for discovering compounds that can serve as sensitive and specific biomarkers of dietary exposures. The Dietary Biomarkers Development Consortium (DBDC) is leading the first major effort to improve dietary assessment through the discovery and validation of biomarkers for foods commonly consumed in the United States diet. To achieve this goal, a 3-phase approach will be implemented to identify, evaluate, and validate food biomarkers. In phase 1, 3 controlled feeding trial designs will be implemented by administering test foods in prespecified amounts to healthy participants, followed by metabolomic profiling of blood and urine specimens collected during the feeding trials to identify candidate compounds. Data from these studies will characterize the pharmacokinetic parameters of candidate biomarkers associated with specific foods. In phase 2, the ability of candidate biomarkers to identify individuals eating the biomarker-associated foods will be evaluated using controlled feeding studies of various dietary patterns. In phase 3, the validity of candidate biomarkers to predict recent and habitual consumption of specific test foods will be evaluated in independent observational settings. Data generated during all study phases will be archived in a publicly accessible database as a resource for the research community. The DBDC aims to significantly expand the list of validated biomarkers of intake for foods consumed in the United States diet, which can help advance understanding of how diet influences human health. 
This manuscript discusses the DBDC’s organizational infrastructure, study design, laboratory methods, and strategies for dietary biomarker discovery and validation.

Trial registration number

This trial was registered at Phase 1 Seattle Dietary Biomarkers Development Center (P1-SDBDC) as NCT05580653, at Fruit and Vegetable Biomarker Discovery (UCD-DBDC) as NCT05621863, and at Dietary Biomarkers Intervention Core as NCT05616585.

Keywords: dietary biomarkers, dietary exposures, metabolomics, food chemistry, consortium

Introduction

Poor diet quality is among the most important modifiable chronic disease risk factors [1,2]. However, the accurate assessment of diet in free-living populations remains a challenge in nutrition research. Diet is a complex pattern of intercorrelated exposures, both of known and unknown constituents, coupled with relatively large intra- and interpersonal variability [3–6]. Current dietary assessment approaches rely heavily on self-reported methodologies, such as food frequency questionnaires (FFQs), multiple-day food diaries, 24-h recalls, and similar instruments. These approaches are often distorted by a variety of systematic and random measurement errors [7–10]. Increasing evidence indicates that dietary biomarkers measured in biological specimens, such as serum, plasma, and urine, may provide an objective means for measuring the intake of specific nutrients and foods. Biomarkers that represent the true “bioavailable” dose of the dietary exposure can potentially complement the development and validation of other dietary assessment methods. However, many existing dietary and nutrient biomarkers are not sensitive to intake or have low specificity, and only a limited number of dietary biomarkers have been identified for the intake of specific foods or food groups [7,11,12].

Recent advances in metabolomic profiling techniques offer exciting new opportunities for food-based biomarker discovery [12]. Using metabolomic technologies, the Food Biomarker Alliance, or FoodBAll Consortium, explored markers of food intake among different populations in Europe [13–15]. Similar systematic and concerted efforts for dietary biomarker discovery in United States populations are lacking, especially in view of transatlantic differences in food preferences and governmental regulations of foodstuffs and recommendations regarding diet [16–20]. Importantly, few metabolites have met the criteria for serving as valid biomarkers of food intake as proposed by Dragsted et al [13,21], including plausibility, dose–response (DR), time–response, analytic detection performance, chemical stability, robustness, and temporal reliability in free-living populations consuming complex diets. Of note, most dietary biomarker studies have not examined pharmacokinetic (PK) and DR relationships between food intake and metabolite levels, which could facilitate the development of new methods to quantify and calibrate measurement errors in self-reported measures [12,22]. Feeding trials with known food intake, coupled with high-dimensional bioinformatics analyses of metabolite patterns and postprandial kinetics, are essential for discovering novel compounds that can serve as sensitive and specific biomarkers of target foods [12].

In 2020, the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) and the USDA-National Institute of Food and Agriculture (USDA-NIFA) called for “objective biomarkers of dietary intake that can serve as independent markers of dietary intake and complement current dietary intake assessment methods” (RFA-DK-20-005). The Dietary Biomarkers Development Consortium (DBDC) was subsequently formed in 2021 to lead a pioneering effort to improve dietary assessment through the discovery and validation of food biomarkers. The DBDC is conducting systematic controlled feeding studies to characterize blood and urine metabolite patterns associated with a variety of foods across a diverse United States population. The collective purpose of the DBDC is to systematically catalog specific, reliable, and validated postingestion plasma and urine metabolomic signatures of commonly consumed foods in the United States using the USDA MyPlate Guidelines for selecting test foods [23].

Structure of the DBDC

As part of the DBDC consortium, 3 study centers were established at academic medical centers throughout the United States, which include Harvard University, in collaboration with the Broad Institute of the Massachusetts Institute of Technology and Harvard; the Fred Hutchinson Cancer Center, in collaboration with the University of Washington; and the University of California Davis, in collaboration with the USDA Agricultural Research Service (ARS) at Davis. All study centers have an independent infrastructure, comprising multiple cores that focus on dietary intervention trials (intervention core), metabolomic profiling (metabolomics core), statistical analyses (data analysis core), and administration (administrative core).

A data coordinating center (DCC) was established at Duke University (RFA-DK-20-007). The DBDC DCC spearheads administrative activities, which include data quality control (QC), data analysis for monthly progress reports and Data Safety Monitoring Board reports, and streamlined operations using a central document repository. The DBDC DCC will make all trial data available to internal and external researchers as appropriate and will submit data to both the NIDDK Central Repository and Metabolomics Workbench at the end of the trial [24–26]. Through a coordinated effort, the DCC is responsible for detailed data monitoring of consented participant data and overall trial monitoring through guidance and oversight. The DCC analysis team ensures efficient and standardized data capture for regular reporting and analyses. The DCC coordinates communications and management of standing committee and subcommittee teleconferences, working group meetings, and in-person Steering Committee meetings. A website (https://dietarybiomarkerconsortium.org/) developed and maintained by the DCC enables data deposition and includes a cloud analysis platform and a SharePoint site for central filing of all finalized consortium-wide and site-specific essential documents.

A Data Safety Monitoring Board of independent experts in relevant fields of biomedical science regularly reviews the progress of the DBDC, including how well the DBDC is meeting its overarching goals, with a special emphasis on the safety of human participants, protection and integrity of accrued data, scientific rigor, and timely progress of the study. All studies operate under the aegis of their local Institutional Review Boards.

Collectively, the NIDDK, USDA-NIFA, DCC, and study centers constitute the DBDC consortium. The DBDC organizational structure is similar to that of other multicenter trials that enroll human participants [27–30].

Governing committees

The Steering Committee is the governing body of the DBDC, comprising principal investigators and administrative core leads from the study centers and DCC, and project scientists and program officers from the NIDDK and USDA-NIFA (Figure 1). The Steering Committee participates in strategic decisions regarding the scientific and administrative objectives of the DBDC while keeping consortium goals and vision in view. Meetings are led by the Steering Committee chair, who is elected annually by the principal investigators.

FIGURE 1.

Dietary Biomarkers Development Consortium organizational structure. DSMB, Data Safety Monitoring Board; NIDDK, National Institute of Diabetes and Digestive and Kidney Diseases; USDA-ARS, USDA-Agricultural Research Service; USDA-NIFA, USDA-National Institute of Food and Agriculture.

An Executive Committee supports the Steering Committee in planning consortium activities and setting monthly agenda topics. The Executive Committee comprises the Steering Committee chair, program officers from the NIDDK and USDA-NIFA, and the principal investigator from the DCC. The Executive Committee addresses time-sensitive issues, supports the Steering Committee with process-related tasks, and oversees biospecimen sharing across the sites and with the broader scientific community.

The publications, presentations, and ancillary studies committee develops guidelines and provides oversight for ancillary studies that use consortium data or specimens. Additionally, this committee oversees the review and approval of presentations and publications (manuscripts and abstracts) that will disseminate findings to the public and the scientific community.

Working groups

Three working groups support activities identified by the DBDC Steering Committee. The Dietary Intervention Working Group ensures that the DBDC fulfills its goals through a harmonized approach to execute feeding study protocols. In practice, the Dietary Intervention Working Group carries out these responsibilities by providing leadership in the scientific and logistic planning of feeding interventions, implementation of study protocols and amendments, and development of consortium intervention goals and key deliverables related to the studies. The Dietary Intervention Working Group harmonizes common data collection procedures and data elements across studies [31,32] to facilitate the dissemination of descriptors of standardized participant characteristics cross-consortium. These common procedures and data elements include: inclusion and exclusion criteria (Supplemental Material), participant baseline demographic characteristics, refractive index targets and protocols for urine screening and dilution, clinical and laboratory protocols, USDA food specimen processing and analysis protocols, 24-h PK data collection points, adverse event collection and reporting processes, and stool sample collection for potential future studies.

The Metabolomics Working Group coordinates and implements strategies for identifying sensitive and specific food biomarkers using biospecimens collected from participants at each study center. The main role of the Metabolomics Working Group is to ensure that the DBDC fulfills its goals through a harmonized approach to collecting data. In practice, the group leads the development and implementation of analytical methods for identifying food-associated markers and for optimizing data and metabolomics analyses. Each study center will use liquid chromatography-MS (LC-MS) and hydrophilic-interaction liquid chromatography (HILIC) protocols, increasing the likelihood of identifying similar molecules and molecule classes; however, site-to-site differences in instrumentation, columns, protocols, and chemical libraries are expected to yield variances in the specific metabolites identified across sites for each analytical platform. A major goal of the Metabolomics Working Group is to create systems to enhance harmonization of metabolite identifications across platforms, based on MS/MS ion patterns and retention times.

The Data Analysis/Harmonization Working Group is tasked with harmonizing data collection and analysis methods for identifying food-associated markers and implementing a coordinated approach for analyzing data. The Working Group carries out these responsibilities by providing leadership in the development of data dictionaries and data analysis plans for all 3 phases of the studies (Figure 1).

Studies of the DBDC

The DBDC study centers will use a 3-phase approach to identify, evaluate, and validate candidate food biomarkers. Using harmonized participant enrollment criteria (see Supplemental Material for the Harmonized Inclusion and Exclusion Criteria), each study center will focus on a set of test foods from a selection of USDA MyPlate food groups [33–35]. MyPlate food groups include fruits, vegetables, grains, protein foods, and dairy. During phase 1, each study center will conduct controlled feeding PK and DR studies to administer test foods in prespecified amounts to healthy participants for metabolomic profiling of plasma and urine samples collected from participants (see example in Figure 2). Data from these phase 1 studies will be used to collectively characterize the PK parameters and dose-responsiveness of candidate food biomarkers, with the overarching goal of identifying chemical markers associated with the intake of specific foods. Additionally, the USDA-ARS Methods and Application of Food Composition Laboratory in Beltsville, Maryland will analyze the test foods consumed by participants at study centers using MS-based metabolomics platforms. In phase 2, the performance of candidate food biomarkers identified in phase 1 will be evaluated in the context of various dietary patterns by leveraging controlled feeding studies. In phase 3, using independent observational settings, study centers will evaluate the validity of the candidate food biomarkers to predict recent and habitual consumption in cross-sectional cohort studies. The study was approved by the Institutional Review Boards of Duke University, Fred Hutch Cancer Center, Harvard T.H. Chan School of Public Health, and the University of California, Davis. Written informed consent was obtained from all participants enrolled in the DBDC data and biospecimen collection protocols at the respective participating sites. 
The study phases, metabolomics approaches, and statistical analyses for the 3 study centers are further described in the next section and Supplemental Table 1 in the Supplemental Material.

FIGURE 2.

Dosing scheme of each food for determination of biomarkers during phase 1 in the Dietary Biomarkers Development Consortium–Davis study design. Participants will be provided a test meal consisting of either zero dose, a medium dose (1 half-cup equivalent), or a high dose (2 half-cup equivalent) of the indicated fruit and vegetable. Test meals will be prepared using other foods (for example, chicken broth, milk, etc.) that will not interfere with the biomarker determination to create an acceptable product for consumption. In brackets are gram-weight equivalents of corresponding raw food.

Harvard University

The DBDC study center at Harvard University (DBDC at Harvard) was established to systematically identify and validate metabolomic signatures of intakes of major carbohydrate and protein sources consumed in the United States.

Phase 1

Phase 1 will include PK and DR feeding trials of 8 test foods: beef, potatoes, chicken, whole-wheat bread, corn, cheese, yogurt, and oats. Two additional foods, soy and salmon, will be tested in the DR trials only. PK studies will use a randomized crossover design, in which each eligible participant is randomly assigned to a sequence of the 8 test foods. Each food will be tested among a total of 15 participants and will be administered in the morning as a single meal consisting of 50% of the participant’s daily recommended amount for the corresponding food group. Each 8-h PK trial is preceded by a 2-d run-in diet that does not contain the test food. Participants will then spend the day at the Brigham and Women’s Hospital Center for Clinical Investigation unit for each test food feeding trial. Blood samples will be collected via an indwelling catheter at time 0 (before ingestion of the study food, after an overnight fast starting at 21:00) and every hour after eating over a period of 8 h. Urine samples will be collected at time 0 and every 2 h for 8 h. Additional fasting blood and urine samples will be collected at 24 h (Figure 3A). After each test food, participants will return to their normal diet and will be scheduled for the next test food cycle after another 2-d run-in period.

FIGURE 3.

Study design at Dietary Biomarkers Development Consortium at Harvard.

The DR studies will use a randomized crossover design, in which participants are first assigned to 1 of 5 food pairings (beef/whole-wheat bread, chicken/potato, salmon/corn, cheese/soybeans, or yogurt/oats) and then randomly assigned to 3 feeding periods of a specific combination of doses (n = 15 participants for each food pair). Using the chicken and potato pair as an example, low-dose chicken is paired with high-dose potato; high-dose chicken is paired with low-dose potato; and medium doses of chicken and potato are paired together. For each food pair, the “low” dose of a test food does not contain the food at issue; the “medium” dose corresponds to the average United States intake; and the “high” dose corresponds to the 90th percentile of United States intake, both derived from the 2017 NHANES [36]. Soybeans are benchmarked against intakes in Japan [37], where larger amounts of soy products are commonly eaten. The three 6-d isocaloric controlled-feeding periods will be preceded by a 3-d control diet period with ≥2–3 d of washout between feeding periods. Over the 6-d controlled-feeding period, participants will consume the assigned foods within a standard diet (20% protein, 50% carbohydrate, and 30% fat) that excludes the 2 test foods of interest at all 3 dose levels. To ensure that meals are isocaloric, calorie-containing foods from all food groups, except the test food pair, will be reduced to make room for the test food pair. At the end of each dose period, fasting blood and urine samples will be collected for measuring food biomarker levels. Stool samples will be self-collected by study participants using a stool collection kit with vials containing 95% ethanol at the start of the dose diet (2 d after the run-in diet) and at the end of each dose period for potential future measurement of changes in stool metabolomics and microbiome (Figure 3B). The DR trials with 3 feeding periods will be conducted on an entirely ambulatory basis, and participants will complete a daily meal checklist of food, drink, and physical activity during the controlled feeding period. At the end of each dose period, participants will begin the next dose period after a 2-d run-in diet.
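The dose pairing described above can be sketched in a few lines of Python. This is an illustration only: the function name `dose_periods`, the seed argument, and the use of a simple shuffle to order the 3 feeding periods are assumptions, not the consortium's randomization procedure.

```python
import random
from typing import Dict, List, Optional

# The three dose combinations for a food pair, as described in the text:
# low/high, medium/medium, and high/low.
DOSE_COMBOS = [("low", "high"), ("medium", "medium"), ("high", "low")]


def dose_periods(food_a: str, food_b: str,
                 seed: Optional[int] = None) -> List[Dict[str, str]]:
    """Return the 3 feeding periods for one participant in a random order.

    Hypothetical helper: the shuffle stands in for whatever allocation
    scheme the trial actually uses.
    """
    rng = random.Random(seed)
    combos = DOSE_COMBOS[:]  # copy so the module-level list is not mutated
    rng.shuffle(combos)
    return [{food_a: dose_a, food_b: dose_b} for dose_a, dose_b in combos]


periods = dose_periods("chicken", "potato", seed=1)
```

Each participant receives every dose level of both foods exactly once, and a low dose of one food always coincides with a high dose of its partner.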

For both PK and DR trials, adherence will be monitored through staff observation, interviews, and information from the daily meal checklist. Information on habitual diet is collected from each participant using a semiquantitative FFQ [38] once eligibility is confirmed and before the start of a PK test food or a DR food pair. Additionally, objective measures, including urinary urea nitrogen levels reflective of protein intake, fasting plasma triglyceride levels reflective of relative carbohydrate to fat composition, and body weight, will be used to document adherence. For the PK phase, if a test food is consumed at <60% of the targeted dose, data will be considered missing and a new participant will be randomly assigned to receive the food using rerandomization sequences. For the DR phase, a participant must complete ≥2 different dose levels with >60% completion of test foods (not the entire meal) at each dose concentration; otherwise, they will be withdrawn from the DR phase for that food pairing and replaced with a new participant.
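As a minimal sketch, the two completion rules above could be encoded as follows. The function names and the representation of intake as a fraction of the targeted dose are assumptions made for illustration; the thresholds follow the text as written (<60% treated as missing in the PK phase, >60% counted as complete in the DR phase).

```python
from typing import Dict

ADHERENCE_THRESHOLD = 0.60  # fraction of the targeted test-food dose


def pk_observation_usable(fraction_consumed: float) -> bool:
    """PK phase: data are treated as missing if <60% of the dose was eaten."""
    return fraction_consumed >= ADHERENCE_THRESHOLD


def dr_participant_retained(fraction_by_period: Dict[str, float]) -> bool:
    """DR phase: retain a participant with >=2 dose levels completed at >60%.

    Keys are hypothetical labels for the feeding periods; values are the
    fraction of the test foods (not the entire meal) that was consumed.
    """
    completed = [period for period, fraction in fraction_by_period.items()
                 if fraction > ADHERENCE_THRESHOLD]
    return len(completed) >= 2
```

A participant with completion fractions of 1.0, 0.7, and 0.3 across the 3 periods would be retained, whereas one with 1.0, 0.6, and 0.3 would be replaced under the strict ">60%" reading.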

Phases 2 and 3

In phase 2, performance of the identified biomarkers will be evaluated using samples from the OmniHeart study [39], a completed 6-wk controlled feeding trial of 3 healthy dietary patterns emphasizing different macronutrient compositions. In phase 3, the identified dietary biomarkers will be validated in free-living populations. For this phase, data from 2 longitudinal United States observational cohorts will be used: the Lifestyle Validation Study (n = 600) and the Study of Latinos Nutrition and Physical Activity Assessment Study (n = 477). The 3 studies considered for phases 2 and 3 have rich data and resources, including detailed dietary assessments by FFQs, 24-h recalls, diet records, plasma and urine samples, nutrient biomarkers, and cardiometabolic health measures. Performance of identified food biomarkers will be evaluated using multiple approaches, including a comparison with intake assessed using other instruments, recorded menu data (OmniHeart study), and benchmark nutrient biomarkers. The biomarkers will be jointly evaluated to determine whether they collectively correlate with dietary patterns in these studies.

Metabolomic data generation

Data acquisition

Plasma and urine samples collected from the feeding trials, OmniHeart study, and observational studies will be analyzed using an untargeted LC-MS metabolomics platform comprising 5 complementary protocols. Each method uses high-resolution, accurate-mass profiling in full-scan MS mode to provide a hybrid analysis of hundreds of identified metabolites and food compounds, along with thousands of yet-to-be-identified peaks [40]. The metabolomics approaches include: 1) reversed-phase (RP) C8 ultra-HPLC (UHPLC), with positive-ion mode detection (+) MS (plasma only) for broad coverage of both polar and nonpolar lipids; 2) RP C18 UHPLC, +MS (plasma and urine) for metabolites and food biomarkers of intermediate polarity; 3) RP C18 UHPLC, –MS (plasma and urine) for information on free fatty acids, bile acids, and diverse other metabolites of intermediate polarity; 4) acidic HILIC, +MS (plasma and urine) to measure cationic polar metabolites; and 5) basic HILIC, –MS (plasma and urine) to measure anionic polar metabolites. Data quality is assured using several approaches, including: 1) confirmation of LC-MS system performance using mixtures of synthetic reference compounds as well as repeated analyses of extracts from human pooled plasma before study sample analyses; 2) daily evaluation of internal standard signals to ensure that each sample was injected properly and to monitor MS sensitivity; and 3) analysis of pairs of pooled samples (PREF) inserted in the analysis queue at intervals of approximately every 20 study samples. One sample from each pair is used to correct for drift in MS instrument sensitivity using “nearest neighbor” scaling, whereas the second reference sample serves as a passive QC for determination of the analytical coefficient of variation of every identified metabolite and unknown.
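The drift correction and passive QC steps can be illustrated with a short Python sketch for a single metabolite feature. The exact scaling rule is not given in the text; the version below assumes that each study sample is rescaled by the pooled reference injected closest to it in the queue, relative to the median reference intensity. Function and argument names are hypothetical.

```python
from statistics import mean, median, stdev
from typing import List, Sequence


def nearest_neighbor_scale(
    intensities: Sequence[float],      # study-sample intensities, one feature
    injection_order: Sequence[int],    # queue position of each study sample
    ref_order: Sequence[int],          # queue position of each pooled reference
    ref_intensities: Sequence[float],  # the feature's intensity in each reference
) -> List[float]:
    """Rescale each sample by its nearest pooled reference (assumed rule)."""
    target = median(ref_intensities)
    corrected = []
    for position, value in zip(injection_order, intensities):
        nearest = min(range(len(ref_order)),
                      key=lambda i: abs(ref_order[i] - position))
        corrected.append(value * target / ref_intensities[nearest])
    return corrected


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Analytical CV of a feature across the passive QC injections."""
    return stdev(values) / mean(values)


# A feature whose signal decays across the run is flattened by the scaling.
corrected = nearest_neighbor_scale(
    [100.0, 80.0, 50.0], [5, 22, 39], [0, 20, 40], [100.0, 80.0, 50.0]
)
```

Keeping the second reference of each pair out of the correction step means the CV it yields is an independent estimate of analytical precision after drift correction.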

Data processing

Processing of the metabolomics data begins with feature detection and identification. Targeted analysis of known metabolites is achieved using TraceFinder (Thermo Fisher Scientific) software. The laboratory at the DBDC at Harvard has amassed a set of over 3000 authentic, in-house reference standards and has confirmed the identities of >700 metabolites measurable in typical human plasma to a Metabolomics Standards Initiative confidence level of 1 [41]. Reference standards and reference samples are included in each analysis batch to confirm compound identities and to provide a basis for any later normalization procedures. Untargeted high-resolution, accurate-mass data are processed using Progenesis QI software (Nonlinear Dynamics, Waters Corporation) to detect peaks, perform chromatographic retention time alignment, and integrate peak areas. Metabolites of confirmed identity are annotated, and unknowns are tagged using their measured mass-to-charge ratios (m/z) and retention times. PREF QC samples are used to correct for drift in MS sensitivity as previously described [41,42]. Redundant ion features generated by electrospray ionization (ESI) are identified by determining Spearman correlation coefficients among coeluting peaks, and the most abundant feature is retained. Unknowns are aligned between batches using unambiguous landmark features (peaks with locally unique m/z and retention time values) to develop locally weighted scatterplot smoothing (LOWESS) regression models that enable fitting all measured m/z and retention time values in 1 batch to the other for accurate matching. The alignment is done in both directions (for example, batch 1 to batch 2 and then batch 2 to batch 1), and only robustly aligned features are kept.
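The removal of redundant ESI ion features can be sketched as follows. This is a simplified illustration, not the laboratory's pipeline: the retention-time tolerance (`rt_tol`), the correlation cutoff (`rho_min`), and the dictionary layout of a feature are all assumptions, and total intensity is used as a stand-in for "most abundant".

```python
from typing import Dict, List, Sequence


def _ranks(values: Sequence[float]) -> List[float]:
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation as the Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def drop_redundant(features: List[Dict], rt_tol: float = 0.02,
                   rho_min: float = 0.9) -> List[Dict]:
    """Keep the most abundant feature of each coeluting, correlated group.

    Each feature is a dict with keys 'id', 'rt', and 'intensities'
    (one intensity per sample). Thresholds are illustrative assumptions.
    """
    # Visit features from most to least abundant so the first member of a
    # group to be kept is the most abundant one.
    ranked = sorted(features, key=lambda f: sum(f["intensities"]), reverse=True)
    kept: List[Dict] = []
    for feature in ranked:
        redundant = any(
            abs(feature["rt"] - k["rt"]) <= rt_tol
            and spearman(feature["intensities"], k["intensities"]) >= rho_min
            for k in kept
        )
        if not redundant:
            kept.append(feature)
    return kept
```

A feature is discarded only when it both coelutes with and tracks a more abundant feature, so coeluting but uncorrelated peaks, and correlated peaks at different retention times, are retained.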

Statistical analyses

Standard and novel methods will be used to identify biomarkers that best discriminate intakes of different foods, including a novel log-linear regression model based on the γ function to fully capture the response dynamics of metabolite concentrations from the 10 dense timepoints in the PK trial. Incorporating a time-series design, sparse partial least squares discriminant analysis, bilinear/trilinear partial least squares [43], and support vector machine [44] will be used to determine the best combination of metabolites across biofluids (plasma and urine) that can distinguish different foods. Additionally, penalized regressions, including LASSO and Elastic Net [45,46], will be used to select the set of biomarkers and timepoints most predictive of specific food intake. The kinetic profiles of targeted and untargeted metabolites will be characterized to better understand the formation, distribution, metabolism, and excretion of the food biomarkers. Kinetic parameters of biomarkers, including elimination rate constant, volume of distribution, total clearance, plasma maximum concentration (Cmax) or urinary maximum excretion (Emax), time (Tmax) from food consumption to achieving Cmax or Emax, and half-life (T1/2), will be determined based on several well-established PK models [47–55]. The consistency of parameter estimates will be compared among models and across foods to determine the optimal PK modeling method for dietary biomarkers. To develop a calibration equation for DR relationships, linear mixed regression will be used to model the effects of test food doses, feeding period, and dose-by-period interactions on individual biomarkers collected from the DR trial [56]. The fitted model will be input to curve-fitting software (WinCurveFit) [56] to develop calibration curves. Additionally, machine learning models, including elastic net regression, will be used to develop metabolomic signatures of foods and dietary patterns in the OmniHeart study, Lifestyle Validation Study, and Study of Latinos Nutrition and Physical Activity Assessment Study.
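The text does not give the functional form of the γ-based log-linear model. As one plausible illustration, the sketch below assumes a gamma-shaped postprandial curve, C(t) = A·t^b·exp(−c·t) for t > 0, which becomes linear after a log transform: log C = log A + b·log t − c·t. Under that assumption, Tmax = b/c and Cmax = A·(b/c)^b·exp(−b). The function names are hypothetical, and the baseline (t = 0) sample must be excluded or handled separately because log 0 is undefined.

```python
import math
from typing import List, Sequence, Tuple


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_gamma_curve(times: Sequence[float],
                    conc: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares on log C = log A + b*log t - c*t.

    Requires times > 0 and concentrations > 0. Returns (A, b, c).
    """
    rows = [[1.0, math.log(t), t] for t in times]
    y = [math.log(value) for value in conc]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    log_a, b, slope_t = solve_linear(xtx, xty)
    return math.exp(log_a), b, -slope_t


def peak(a: float, b: float, c: float) -> Tuple[float, float]:
    """Tmax and Cmax implied by the fitted curve (valid for b > 0, c > 0)."""
    tmax = b / c
    cmax = a * tmax ** b * math.exp(-b)
    return tmax, cmax


# Postdose sampling times in hours (hourly to 8 h, then 24 h).
times = [1, 2, 3, 4, 5, 6, 7, 8, 24]
conc = [5.0 * t ** 2 * math.exp(-0.5 * t) for t in times]  # synthetic data
a_hat, b_hat, c_hat = fit_gamma_curve(times, conc)
tmax, cmax = peak(a_hat, b_hat, c_hat)
```

With noise-free synthetic data the fit recovers the generating parameters, which makes it a convenient sanity check before applying any such model to measured concentrations.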

Fred Hutchinson Cancer Center

The central mission of the DBDC study center at Fred Hutchinson Cancer Center (DBDC-Fred Hutch) is to advance the science of measuring complex dietary exposures by rigorous identification and validation of dietary biomarkers that reduce the measurement error associated with a self-reported diet. DBDC-Fred Hutch will focus on plant- and animal-based protein foods commonly consumed in the United States.

Phase 1

The objective of phase 1 is to discover biomarkers for 4 individual foods from the USDA MyPlate protein food group. Phase 1 will consist of 2 crossover feeding trials for discovery of metabolomics-based plasma and urine biomarkers of intake of 4 high-protein test foods: beef, eggs, black beans, and pinto beans. Each trial will involve three 7-d DR feeding periods and 2 PK evaluations and will pair an animal- and plant-based protein food (beef and pinto beans; eggs and black beans). Each DR feeding period within a trial will use the same controlled 2-d rotating background menu, and only the source of protein will vary between all protein from the animal source (feeding period A), half from animal and half from plant (feeding period B), and all protein from the plant source (feeding period C) (Figure 4). Test food doses are equivalent to 100% of the MyPlate daily recommended target for protein foods for prespecified calorie levels [35]. Before enrollment, participants will complete an FFQ, an at-home 24-h urine collection, and an in-clinic fasting plasma collection. The FFQ was developed and validated by the Nutrition Assessment Shared Resource of the Fred Hutchinson Cancer Center, Seattle, Washington [57].

FIGURE 4.

Study design at Dietary Biomarkers Development Consortium–Fred Hutch. PK, pharmacokinetic; HEI, Healthy Eating Index.

For each phase 1 trial, 15 healthy adults will complete 1 of 3 randomly assigned feeding period orders: A–B–C, B–C–A, or C–A–B. Each 7-d eucaloric controlled-feeding period will be preceded by a 2-d run-in, with at least a 7-d washout between feeding periods. Run-in menus are identical to the 7-d feeding period menus but are devoid of the test plant or animal protein food. Fasting plasma (after an overnight fast starting at 21:00) and 12- or 24-h urine samples will be collected on days 0, 3, and 7 of each feeding period to capture metabolome changes as the protein source shifts (Figure 4). At enrollment and at the end of each feeding period, stool samples will be self-collected by participants in vials containing 95% ethanol and stored for future use.
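The three feeding-period orders form a Latin square, so each diet appears once in every position. A minimal sketch of allocating 15 participants to these orders is shown below; the balanced allocation (5 participants per order), the function name, and the seed argument are assumptions, since the text states only that orders are randomly assigned.

```python
import random
from typing import Dict, List, Optional

ORDERS = ["A-B-C", "B-C-A", "C-A-B"]  # feeding-period orders from the text


def assign_orders(participant_ids: List[str],
                  seed: Optional[int] = None) -> Dict[str, str]:
    """Randomly allocate participants to the three orders in equal numbers.

    Hypothetical helper: balance across orders is an assumption.
    """
    rng = random.Random(seed)
    # Repeat the orders to cover all participants, then shuffle the pool.
    pool = [ORDERS[i % len(ORDERS)] for i in range(len(participant_ids))]
    rng.shuffle(pool)
    return dict(zip(participant_ids, pool))


allocation = assign_orders([f"P{i:02d}" for i in range(1, 16)], seed=2025)
```

With 15 participants, each order is used exactly 5 times, which balances period effects across the animal, mixed, and plant protein diets.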

A PK evaluation will be conducted on day 0 of the all-animal and all-plant protein feeding periods for each phase 1 trial. At the start of the PK evaluation (hour 0 before ingestion of the test food), an indwelling catheter will be placed and fasting plasma and urine will be collected, followed by participant consumption of the entire day’s test food dose. Postprandial plasma and urine samples will then be collected at 1 (plasma only), 2, 4, 6, and 8 h after consumption of the test food dose (Figure 4). At hour 6, the remaining background foods from breakfast and lunch, in addition to coffee and tea (if requested), will be provided for consumption; any uneaten background foods will be packaged with the study dinner to be consumed at home. Participants will be provided ad libitum water throughout the PK evaluation. After the hour 8 plasma and urine collections, participants will leave the Fred Hutch clinic, continue to collect and pool urine at home, eat the study dinner before 21:00, and return to the clinic the following morning (at hour 24) for the final fasting plasma and urine collections of the PK evaluation. For the feeding trials, adherence will be evaluated through weigh-back of all returned foods and containers and through measurement of participant weight throughout the feeding periods.

Phase 2

The objective of phase 2 is to discover metabolomics-based biomarkers of lower and higher Healthy Eating Index-2020 (HEI-2020) [58] dietary patterns, confirm phase 1 food biomarkers, and assess the performance of metabolomics-based biomarkers under habitual dietary conditions. Phase 2 will involve a 2-period crossover feeding trial, comparing lower HEI-2020 (HEI score of ∼45; feeding period A) and higher HEI-2020 (HEI score of ∼75; feeding period B) diets designed to reflect poorer and better quality diets in the United States, respectively [58]. Phase 2 diets will include selected phase 1 test foods from all DBDC study centers. Each 7-d eucaloric controlled-feeding period will use the same 2-d rotating background menus, but the amount or form of individual test foods will vary to achieve the low and high HEI-2020 scores. Before enrollment, participants will complete an FFQ, a 4-d food record (phase 2 only), and an at-home 24-h urine collection and will provide fasting plasma collected in the clinic.

For phase 2, 30 healthy adults will complete 2 feeding periods in random order: A–B or B–A. Within each feeding period, participants will also receive the 2-d rotating menu in random order: day 1–day 2 or day 2–day 1. Each 7-d controlled feeding period will be preceded by a 2-d run-in, with at least a 14-d washout between feeding periods. Run-in menus will be devoid of the test foods. Fasting plasma and 12- or 24-h urine samples will be collected on days 0, 3, and 7 of each feeding period to capture metabolome changes as the HEI score shifts (Figure 4). Stool samples collected in vials containing 95% ethanol at enrollment and at the end of each feeding period will be banked for future use.

Phase 3

The objective of phase 3 is to externally validate and determine the transferability of metabolomic markers to a broader population. Discovery biomarkers in phases 1 and 2, together with those identified by the other study centers, will be used to create 1 or more targeted LC-MS panels for use in phase 3. Validation will use archived serum/plasma and urine specimens from 3 diverse cohorts: American Cancer Society’s Cancer Prevention Study-3 Dietary Assessment Substudy [59], Study of Latinos Nutrition and Physical Activity Assessment Study [60], and Women’s Health Initiative Nutrition and Physical Activity Assessment Study [61]. These cohorts have existing dietary self-report (24-h recalls, FFQs, or 4-d food records) and benchmark biomarker data (for example, urinary nitrogen for protein, urinary sodium and potassium, or doubly labeled water for energy), and biospecimens available for metabolomic analysis.

Metabolomic data generation

Data acquisition

During the initial discovery phase, plasma and urine samples will be analyzed using an untargeted LC/MS metabolomics platform that includes 4 complementary protocols: 1) RP C18 UHPLC, +MS; 2) RP C18 UHPLC, -MS; 3) HILIC UHPLC, +MS; and 4) HILIC UHPLC, -MS. The RP C18 methods are optimized for high throughput using 5-min gradients, whereas the HILIC methods operate in normal mode with 25-min gradients, to allow the required column reconditioning. In a typical analysis of urine or plasma, the quadrupole, time-of-flight 6546 MS system (Agilent Technologies) detects ∼600–800 annotated aqueous metabolites (via full-scan, MS1 annotation), together with ∼2000 unidentified features/compounds from the aqueous and organic fractions [62]. The candidate biomarkers identified through this approach will be compiled to enhance the previously established LC-MS/MS-based dietary biomarker assessment panels [63].

Subsequently, putative biomarkers will be validated and quantified using targeted metabolomics. Targeted metabolomics is performed on a duplex-LC-MS system composed of 2 Shimadzu ultra-performance liquid chromatography pumps, a CTC Analytics PAL HTC-xt temperature-controlled auto-sampler, and an AB Sciex 6500+ Triple Quadrupole MS equipped with an ESI source (AB Sciex) [64]. MS data acquisition is performed in multiple-reaction-monitoring mode.

Data processing

Untargeted data will be processed using Progenesis QI software to reduce missing values and to perform initial metabolite identification via comparison with an in-house database and the METLIN spectral library [65]. To annotate metabolites, the high-resolution m/z ratio, isotope abundance patterns, tandem MS analysis (MS/MS or MS2), and spectral matching to mass-spectral reference libraries of metabolites will be reviewed [66]. Molecular dynamics of major fragmentation reactions, based on established organic chemistry principles, will also be considered to inform structural elucidation [67]. Additionally, features will be annotated using the collaborative Global Natural Products Social Molecular Networking methods, which provide a very effective data-sharing, web-based platform for metabolite identification [68].

Statistical analyses

Phase 1 will include analyses for both PK and DR. For PK analyses, a 1-compartment PK model, with measures of maximum metabolite concentration (Cmax), time of maximum concentration (Tmax), metabolite half-life, absorption half-life, and AUC, will be used to characterize the absorption and disposition of identified metabolites for each test food. Paired t-tests with multiple-comparison adjustments will be used to test the difference in PK measures between the animal and plant protein feeding periods. For phase 1 DR analysis, a linear mixed-effect model will regress test food intake on each metabolite measured at the end of the feeding period, adjusting for participant characteristics and design factors (for example, age, sex, race, BMI, smoking status, percentage of test food consumed, and assigned feeding period order). Metabolite–test food associations with a false discovery rate <20% will be identified. Sensitivity analyses will adjust for habitual test food intake.
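For orientation, the PK measures listed above have closed forms under one common version of the 1-compartment model, with first-order absorption and first-order elimination. The sketch below assumes that form, which the protocol does not specify. The rate constants `ka` and `ke`, the lumped parameter `scale` (bioavailable dose divided by volume of distribution), and all numeric values are illustrative assumptions.

```python
import math


def concentration(t, scale, ka, ke):
    """Concentration at time t (h) for a one-compartment model with
    first-order absorption (ka, 1/h) and elimination (ke, 1/h), ka != ke.
    `scale` stands for F * dose / V and is a hypothetical lumped parameter."""
    return scale * ka / (ka - ke) * (math.exp(-ke * t) - math.exp(-ka * t))


def pk_summary(scale, ka, ke):
    """Closed-form PK measures for the model above."""
    tmax = math.log(ka / ke) / (ka - ke)  # time at which dC/dt = 0
    return {
        "tmax_h": tmax,
        "cmax": concentration(tmax, scale, ka, ke),
        "absorption_half_life_h": math.log(2) / ka,
        "elimination_half_life_h": math.log(2) / ke,
        "auc_0_inf": scale / ke,  # integral of C(t) from 0 to infinity
    }


# Illustrative values only; not estimates from any DBDC data.
summary = pk_summary(scale=10.0, ka=1.5, ke=0.25)
```

In practice the parameters would be estimated by fitting the model to the measured concentration-time points; the closed forms only show how the reported measures relate to the fitted rate constants.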

Each food tested in phase 1 is expected to yield several metabolites reflecting the intake of that specific food rather than a single-food/single-metabolite paradigm. To reflect this likely scenario, biomarker panels will be developed by regressing the test food intake on the identified candidate biomarkers. At the end of the feeding period, candidate biomarker panels will be identified by selecting metabolites detected during the PK and DR tests for each food using model selection based on a mixed-effect model combined with LASSO. For each test food in phase 1, panel performance will be characterized through random 5-fold cross-validated R². Phase 1 panels with the largest cross-validated R² will be forwarded to phase 2 for confirmation.
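The cross-validation step can be illustrated with a deliberately simplified sketch. It computes a random 5-fold cross-validated coefficient of determination for a single-predictor least-squares model; the LASSO selection and the mixed-effect structure described above are omitted, and the function names and the data are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cross_validated_r2(x, y, k=5, seed=0):
    """Coefficient of determination from out-of-fold predictions, k random folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(x)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b0, b1 = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            preds[i] = b0 + b1 * x[i]  # predict only observations not used to fit
    my = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: a panel score (x) and test food intake (y) for 15 participants.
score = [float(i) for i in range(15)]
intake = [2.0 * s + 1.0 + (0.5 if i % 2 else -0.5) for i, s in enumerate(score)]
r2 = cross_validated_r2(score, intake, k=5, seed=1)
```

Because each prediction comes from a model that never saw that observation, the resulting value is a less optimistic measure of panel performance than the in-sample value.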

Similar analytic methods will be used for phase 2, where analyses will focus on metabolomics discovery of low and high HEI-2020 pattern score diets, confirmation of panels discovered in phase 1, and evaluation of self-reported diets. For each metabolite, a generalized linear mixed-effect model will be fit that regresses HEI-2020 pattern score (low, high) on the metabolite measured at the end of the feeding period, adjusting for participant characteristics and diet order. Metabolite–pattern score associations with a false discovery rate <20% will be identified.
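The protocol states the false discovery rate threshold but not the procedure used to control it. As one possibility, the sketch below applies the Benjamini-Hochberg step-up procedure at 20%; the choice of procedure, the function name, and the P values are assumptions for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.20):
    """Return a list of booleans, True where the hypothesis is rejected
    by the Benjamini-Hochberg step-up procedure at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected


# Hypothetical P values for six metabolite associations.
p = [0.001, 0.04, 0.03, 0.50, 0.12, 0.90]
flags = benjamini_hochberg(p, fdr=0.20)
```

With these values the four smallest P values are retained, including 0.12, which would fail a conventional 0.05 threshold; this reflects the discovery-oriented choice of a 20% rate.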

For biomarker panel development, a multivariate linear regression model will regress the HEI-2020 pattern score on the identified metabolites measured at the end of the corresponding feeding period. Additionally, a binary indicator for high and low HEI-2020 pattern scores will be regressed on metabolites measured at the end of the corresponding feeding period using a generalized linear mixed model with a logistic link. The panel with the largest cross-validated R² for each food subgroup will be forwarded to phase 3 for confirmation.

For each selected phase 1 test food evaluated within phase 2, a linear mixed-effect model will be fit that regresses the amount of test food consumed on the univariate metabolite panel score generated in phase 1, adjusting for participant characteristics and diet order. Panels with an estimated R² >36% will move to phase 3. Models developed in phase 2 will also be applied to evaluate the strength of associations of FFQ and 4-d food record (4DFR)-derived HEI scores with metabolites under habitual dietary conditions. FFQ and 4DFR-derived HEI pattern scores will be regressed (in separate models) on the HEI-2020 biomarker panel identified in phase 2, and R² values from the FFQ and 4DFR models will be compared with the R² values from the feeding study models.

In phase 3, to validate candidate biomarkers confirmed in phase 2 under habitual dietary conditions, self-report-derived HEI-2020 scores and protein food intake from the Women’s Health Initiative Nutrition and Physical Activity Assessment Study, American Cancer Society’s Cancer Prevention Study-3, and Study of Latinos Nutrition and Physical Activity Assessment Study cohorts will be regressed on metabolites from targeted panels (developed in phase 2), and R² will be calculated. Each cohort will first be examined separately, but if appropriate, data will be harmonized and joint analyses presented.

University of California Davis

The overall objective of the biomarkers project at the DBDC study center at the University of California, Davis (DBDC-Davis) is to identify and validate biomarkers for selected fruits and vegetables commonly eaten in the United States population. The rationale for fruit and vegetable selection was multitiered. First, the food items selected in our study are in the top 85% of NHANES-based fruit and vegetable intake in the United States population, ensuring the maximum value of the identified markers. Second, we subselected 1 fruit (banana) and 1 vegetable (tomato) with previously reported biomarkers that could be used as a positive control for our discovery effort while also providing validation of reported markers to the scientific community. We then selected 1 fruit (strawberries) and 1 vegetable (carrots) with reported, but nonspecific, biomarkers to facilitate the search for more specific markers. Finally, we selected 1 fruit (peaches) and 1 vegetable (green beans) for which no markers were reported at the time of the study design. The overall study design is illustrated in Figure 5.

FIGURE 5.

Study design at Dietary Biomarkers Development Consortium–Davis. A total of 280 healthy subjects from the Davis and Sacramento areas in California will be enrolled in this study, which is divided into 3 phases. In phase 1, 30 participants will undergo a randomly assigned crossover dietary intervention involving 3 meal challenges with varying levels of fruits and vegetables for breakfast (refer to Figure 1 for dosing schemes). In phase 2, 50 participants will be stratified and randomly assigned into 3 dietary groups: 1) typical American diet (TAD), consisting of 10 participants receiving a diet devoid of target foods (bananas, peaches, strawberries, tomatoes, green beans, and carrots); 2) TAD+, involving 20 participants, includes the same diet as TAD but incorporates low levels of the target fruits and vegetables; 3) Dietary Guidelines for Americans (DGA), also including 20 participants, offers a diet enriched with higher levels of fruits and vegetables, including the target foods. Phase 3 involves a cross-sectional study with 200 participants who will provide dietary data and biological samples.

Phase 1

In phase 1, 30 clinically healthy adult males and females will participate in a randomly assigned 3-session crossover dietary intervention with a factorial design. Different levels (or doses) of specific fruit or vegetable will be served in the meal challenge: a zero dose, a medium dose [1 half-cup equivalent (HCE)], or a high dose (2 HCEs). Each meal challenge includes a total of 6 HCEs of fruits and vegetables (fruits: banana, peaches, strawberries; vegetables: tomatoes, carrots, green beans), and the dosing scheme is presented in Figure 2. Between each of the 3 sessions, a minimum 48-h washout period will be employed. The order in which meal challenges are provided will be randomly determined. Participant enrollment will follow a set of harmonized consortium guidelines. After enrollment, participants will complete an FFQ and the Stanford Brief Physical Activity Survey (SBPAS). The SBPAS, along with height, weight, and age data, will be used to determine calorie requirements for the 2-d run-in diet (devoid of study fruits and vegetables), and test day diets, using the Mifflin-St. Jeor equation. The FFQ used at this site was the same as used at the Fred Hutchinson site [57].
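The Mifflin-St. Jeor equation estimates resting energy expenditure from weight, height, age, and sex. The sketch below implements the published equation; how the SBPAS category is converted into an activity multiplier is not stated in the text, so `activity_factor` and its example value are hypothetical inputs.

```python
def mifflin_st_jeor(weight_kg, height_cm, age_y, sex):
    """Resting energy expenditure (kcal/d) from the Mifflin-St. Jeor equation."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_y
    if sex == "male":
        return base + 5.0
    if sex == "female":
        return base - 161.0
    raise ValueError("sex must be 'male' or 'female'")


def daily_energy_need(weight_kg, height_cm, age_y, sex, activity_factor):
    """Scale resting expenditure by an activity factor (hypothetical input;
    the mapping from SBPAS category to a factor is not given in the text)."""
    return mifflin_st_jeor(weight_kg, height_cm, age_y, sex) * activity_factor


# Illustrative participant: 70 kg, 175 cm, 30 y, male.
ree = mifflin_st_jeor(70, 175, 30, "male")
need = daily_energy_need(70, 175, 30, "male", activity_factor=1.5)
```

For the illustrative participant, the resting estimate is 10 × 70 + 6.25 × 175 − 5 × 30 + 5 = 1648.75 kcal/d.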

Participants will adhere to a run-in diet for 2 d before the test day and record all foods and beverages consumed during this period. They will be required to arrive at the facility in an overnight-fasted state, having collected all urine from 21:00 the previous evening. On the test day, participants will be catheterized for blood collection around 08:00, and a baseline fasting blood sample will be taken. After the meal challenge is provided, blood will be collected hourly for 8 h, and urine will be pooled in 2-h increments throughout this period. Participants may request coffee or tea in the morning and will be given water ad libitum during the day. A predefined lunch with limited fruits and vegetables will be served 6 h after the meal challenge, and participants will be provided with another predefined dinner to consume at home. After leaving the testing site, participants will continue to pool all urine and return the next morning to provide another fasting blood sample. This protocol is designed to determine the kinetics of the appearance and disappearance of dietary markers over a 24-h period. Additionally, a stool sample will be collected for future analysis correlating metabolites with the fecal microbiome.

Phase 2

This phase will evaluate whether the biomarkers identified in phase 1 are reliable indicators of fruit and vegetable intake from habitual dietary patterns that include differing amounts of test foods. Specifically, the study will compare a typical American diet (TAD), which is low in fruits and vegetables, with the fruit- and vegetable-enriched Dietary Guidelines for Americans (DGA) diet designed based on differences in HEI scores [69], utilizing a parallel design. A total of 50 participants will be recruited and stratified by sex and BMI to ensure equal representation in each group, and randomly assigned into 3 groups: 1) TAD, where 10 participants will be provided controlled meals consistent with a TAD without any target fruits and vegetables (fruits: banana, peaches, strawberries; vegetables: tomatoes, carrots, green beans) (HEI score of ∼42); 2) TAD Plus (TAD+), where 20 participants will follow the TAD diet, including low levels of target fruits [0.71 cup equivalents (CE) or 105 g] and vegetables (0.58 CE or 75 g) (HEI score of ∼49); and 3) DGA, where 20 participants will be provided controlled meals that are rich in a variety of fruits and vegetables, including the target fruits and vegetables (HEI score of ∼71). This phase of the study includes target foods tested at the other study centers, including beef, eggs, pinto and black beans, chicken, potatoes, oats, corn, and cheese, which will be used to validate biomarkers generated at these study centers in phase 1 of the overall study, especially for test foods consumed in the final 2 intervention days. Diets were designed by a postdoctoral scholar (PhD, RDN) in collaboration with a principal dietitian (MS, RDN) at the Western Human Nutrition Research Center and a registered dietetic technician at the Western Human Nutrition Research Center Metabolic Kitchen. Each individual working on the diets has 10–20 y of culinary experience and specialized experience in planning and implementing research study diets.

Upon enrollment, participants will be required to complete an FFQ and the SBPAS, adhering to the harmonized consortium guidelines. The SBPAS, along with height, weight, and age data, will be used to determine caloric need for the run-in diet and the controlled diets using the Mifflin-St. Jeor equation. Similar to phase 1, participants will start with a 2-d controlled run-in diet, during which overnight urine and fasting blood will be collected. All participants will be instructed to strictly adhere to the controlled diet for 6 d. Compliance will be monitored through daily food checklists, and assessments will be made based on any reported deviations from the study menu.

After completing the 6-d controlled diet, participants will provide overnight urine, fasting blood, and a stool sample the following day. The 40 individuals in the TAD+ and DGA groups will then follow a 1-d run-out diet that excludes the target fruits and vegetables. After this day, participants will return the following morning to provide a final set of overnight urine and fasting blood samples. This design allows researchers to assess the persistence of biomarkers 36–48 h after the last consumption of the target fruits and vegetables during the feeding trial.

Phase 3

This phase will be a cross-sectional study designed to evaluate the robustness and reliability of the food-exposure markers identified and validated in phase 1 and phase 2. A total of 200 participants will be enrolled. Before sample collection, participants’ habitual dietary patterns and recent dietary intake over the past 3 d will be assessed via FFQ and the Automated Self-Administered 24-h Dietary Assessment Tool (ASA-24). After the completion of the FFQ and 3-d ASA-24, overnight urine (from 21:00 the previous evening), stool, and fasting blood samples will be collected.

Metabolomic data generation

Data acquisition

During the initial discovery phase of untargeted metabolomics, plasma and urine samples will undergo protein precipitation with solvents in the presence of a panel of selected, heavy-isotope-labeled internal standards. Metabolites will be separated by RP and HILIC protocols and analyzed using a Q-Exactive HF orbitrap MS (Thermo Fisher). For tandem MS (MS/MS) analyses, samples will receive back-to-back injections in both positive- and negative ESI modes. LC-MS/MS analyses in the positive mode will be performed using acetonitrile/water and buffered isopropanol/acetonitrile gradients, whereas ammonium acetate-buffered gradients will be used for the negative ESI mode. Centroided data will be acquired with a heated ESI ion source (Thermo Fisher) in data-dependent MS/MS mode. To maximize the number of MS/MS spectra obtained, 5 runs with iterative MS/MS exclusions will be performed using the R package, IE-Omics (innovativeomics.com). Data will be collected across full scans of m/z 100–1200. The identified dietary biomarkers will be compiled and utilized to enhance the previously established LC-MS/MS-based dietary biomarker assessment panels [63].

Data processing

All raw data files will be processed using MS-DIAL software [70], version 4.90, for deconvolution, peak picking, alignment, and compound identification. Adduct ions under consideration include: [M + H]+, [M + NH4]+, [M + Na]+, [2M + H]+, [2M + NH4]+, and [2M + Na]+ for positive ESI, and [M-H]–, [M + Cl]–, and [M + HAc-H]– for negative ESI. To enhance sensitivity and detect postingestion secondary metabolites, HILIC analyses will be performed using neutral-loss and precursor-scan MS experiments, with ramped collision energies, and data-dependent MS/MS scan collections triggered by ions associated with common metabolite conjugates (for example, glucuronides, glucosides, sulfates, glycines, and taurines). Candidate biomarkers identified through this approach will inform subsequent validation and quantitation via targeted metabolomics. These analyses will be performed on the QTRAP 6500 instrument (AB Sciex) using RP chromatography and precursor-product ion mass transitions. For metabolites with commercially available authentic reference reagents, calibration curves will be constructed. In cases where standards are not available, dilutions of pooled participant samples will be used to create pseudocalibration curves and retention time controls for select targets, enabling semiquantitative analysis.
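To make the adduct list concrete, the expected m/z of each singly charged adduct can be computed from a compound's neutral monoisotopic mass. The sketch below is illustrative and is not how MS-DIAL is configured: the mass shifts are approximate values rounded to 4 decimals, and the function name and the glucose example are assumptions.

```python
# Approximate monoisotopic mass shifts in Da, rounded to four decimals.
# Each entry maps an adduct label to (number of molecules M, mass shift).
POSITIVE_ADDUCTS = {
    "[M+H]+": (1, 1.0073),
    "[M+NH4]+": (1, 18.0338),
    "[M+Na]+": (1, 22.9892),
    "[2M+H]+": (2, 1.0073),
    "[2M+NH4]+": (2, 18.0338),
    "[2M+Na]+": (2, 22.9892),
}
NEGATIVE_ADDUCTS = {
    "[M-H]-": (1, -1.0073),
    "[M+Cl]-": (1, 34.9694),
    "[M+HAc-H]-": (1, 59.0139),  # acetic acid adduct minus a proton
}


def adduct_mz(neutral_mass, adducts):
    """Expected m/z of singly charged adducts for a neutral monoisotopic mass."""
    return {name: n * neutral_mass + shift for name, (n, shift) in adducts.items()}


# Example compound (assumption): glucose, C6H12O6, about 180.0634 Da.
pos = adduct_mz(180.0634, POSITIVE_ADDUCTS)
neg = adduct_mz(180.0634, NEGATIVE_ADDUCTS)
```

For glucose this gives about 181.0707 for [M + H]+ and 179.0561 for [M-H]–, so one compound can appear as several features that need to be grouped during alignment.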

Statistical analyses

In phase 1, univariate analyses will be conducted using a linear mixed-effect model and nonparametric Friedman repeated measures analysis of variance (ANOVA) to identify metabolites that differ significantly by postprandial time and test food dose. To test for potential confounders, we will use linear mixed-effect models. Additionally, to determine the effect of the meal challenge on each metabolite, planned Conover’s tests (a nonparametric test for pairwise comparisons following a repeated measures ANOVA) will be conducted on each sequential time point (for example, baseline compared with 1 h, 1 h compared with 2 h, etc.) and to compare each time point with the baseline (for example, baseline compared with 2 h, baseline compared with 3 h, etc.). Following the preliminary univariate analysis, multivariate statistical techniques, such as principal component analysis, cluster analysis, random forest techniques, and a mixed-effect multinomial logistic regression model, will be used to explore the complex interrelations between multiple metabolites and their collective impact on distinguishing dietary interventions.

Next, to understand the absorption, distribution, metabolism, and excretion of potential biomarkers of dietary intake (BDI), statistics similar to PK modeling will be integrated using potential biomarkers that originate from food compounds. For compounds that meet a univariate statistical threshold of P<0.05 for time, dose, or time × dose interaction effects, PK models will be constructed to describe the time course of these potential BDI-metabolite concentrations in the body using noncompartmental analysis for initial explorations, followed by more complex compartmental models as needed. Key PK parameters, such as maximum concentration (Cmax), time to maximum concentration (Tmax), half-life (t1/2), and AUC, will be estimated. Importantly, statistical analyses can only be part of the process to discover and validate biomarkers of dietary intake. For example, endogenous metabolites may be affected in an unspecific way by foods. Hence, metabolites will be scrutinized and narrowed to exogenous compounds (and their derivatives) to be used in validation phases 2 and 3.
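A noncompartmental summary of a single concentration-time profile can be sketched as follows. The sketch assumes a linear trapezoidal AUC and a log-linear fit to the last 3 sampled points for the terminal half-life; these choices, the function name, and the concentrations are illustrative assumptions and are not taken from the protocol.

```python
import math


def nca_summary(times_h, conc, n_terminal=3):
    """Noncompartmental summary of one concentration-time profile."""
    cmax = max(conc)
    tmax = times_h[conc.index(cmax)]
    # Linear trapezoidal AUC over the sampled interval.
    auc = sum((t1 - t0) * (c0 + c1) / 2.0
              for t0, t1, c0, c1 in zip(times_h, times_h[1:], conc, conc[1:]))
    # Terminal slope from a least-squares fit of ln(concentration) on time,
    # using the last n_terminal points (all must be positive).
    t = times_h[-n_terminal:]
    lnc = [math.log(c) for c in conc[-n_terminal:]]
    mt, ml = sum(t) / len(t), sum(lnc) / len(lnc)
    slope = (sum((ti - mt) * (li - ml) for ti, li in zip(t, lnc))
             / sum((ti - mt) ** 2 for ti in t))
    return {"cmax": cmax, "tmax_h": tmax, "auc_0_tlast": auc,
            "half_life_h": math.log(2) / -slope}


# Hypothetical profile: baseline plus five postprandial samples (arbitrary units).
times = [0.0, 1.0, 2.0, 4.0, 6.0, 8.0]
levels = [0.0, 4.0, 8.0, 4.0, 2.0, 1.0]
result = nca_summary(times, levels)
```

In this hypothetical profile the concentration halves every 2 h after the peak, so the estimated half-life is 2 h, Cmax is 8 at 2 h, and the trapezoidal AUC is 29.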

In phase 2, biomarker efficacy (individual or a combination of biomarkers indicating consumption of the specific fruit or vegetable) in approximating both consumption (yes or no) and the level of consumption (none, low or high consumption) will be evaluated. Bayesian statistics will be used to determine the accuracy of detecting consumption and the level of consumption within 48 h.

In phase 3, biomarkers will be validated in free-living participants and biomarkers for specific fruits and vegetables will be correlated with dietary recall. Statistical analyses will be harmonized with the other centers [71].

Food sample analysis at the USDA-ARS

Sample acquisition

The Methods and Application of Food Composition Laboratory at USDA-ARS in Beltsville, Maryland will analyze frozen samples collected for all phase 1 test foods. Foods acquired at the 3 study centers are summarized in Table 1. All samples will be extracted for analysis of nonlipid metabolites. All analyses will be nontargeted, and no analyses are planned for inorganic elements.

TABLE 1.

USDA-Agricultural Research Service food sample analysis.

Food group, by study center: Harvard University (n = 15), Fred Hutchinson Cancer Center (n = 15), University of California Davis (n = 30)

Protein: Beef, Chicken, Salmon, Soy, Beef, Eggs
Dairy: Cheese, Yogurt
Pulses: Pinto beans, Black beans
Fruits: Bananas, Strawberries, Peaches
Vegetables: Potatoes, Corn, Tomatoes, Carrots, Green beans
Grains: Whole-wheat bread, Oats

Data acquisition

Raw and cooked lyophilized samples will be ground and extracted with methanol/water (70:30, vol:vol) and analyzed by reverse-phase UHPLC-high-resolution accurate mass/tandem MS [72,73]. For comprehensive metabolite coverage, separation will be performed on an Agilent RRHD Eclipse Plus C18 column (2.1 × 150 mm, 1.8 μm). Data will be acquired using atmospheric pressure chemical ionization and ESI under both positive- and negative-ionization modes. The mass spectrometer will be calibrated to specification on the same day of the sample run. Pooled QC samples will be prepared by mixing aliquots of different batches of samples. Blank samples and QC samples will be analyzed at routine intervals. QC pool samples will be monitored visually during data acquisition.

Data processing

All raw data files will be processed using Xcalibur 4.2 Data Processing. Compounds in each food sample will be annotated using high-resolution mass spectrometric full-scan data, multistage MSn data, and UV spectra data. Additionally, compounds will be annotated using the previous literature and public databases, such as the Food Database, Human Metabolome Database, Metabolomics Workbench, and METLIN.

Data and sample dissemination for future research

In accordance with the NIDDK Data Sharing Policy [74] and in consultation with the study centers, the DCC developed a plan for transferring DBDC study specimens and data to the NIDDK Central Repository [24]. Each study center developed site-specific data and specimen-sharing plans, with a proposed release timeline, specimen shipment procedures, and specimen-collection quantities. The centers will submit to the NIDDK-CR deidentified data and biospecimens from study participants who have consented to future research use.

The study centers will set aside a representative volume of 20% for each biospecimen type (blood products and urine) for each participant and each timepoint, which will be archived at the NIDDK-CR. The DCC will submit clinical data in a limited dataset format (and corresponding study metadata) and linking files to connect specimens and clinical data to the NIDDK-CR at the end of each collection phase. Additionally, the DCC will submit informed consent metadata to enable the NIDDK-CR to properly manage and share these resources with the external research community.

The metabolomics and clinical data generated during all 3 study phases from all 3 study centers will be available in a DCC cloud space to all DBDC investigators and ancillary study investigators. In addition, data derived from the USDA metabolomic analysis of food samples will be stored on the DBDC cloud for access. The DCC has established workstations to enable cross-site analysis of data. At the end of the trial, all metabolomics and clinical data with necessary linkage files will be deposited in the Metabolomics Workbench as a resource to the research community [25,26].

Discussion

The DBDC consortium represents a unique collaboration among the NIH, USDA, and academic centers in one of the first major efforts to improve dietary assessment through the discovery and validation of food biomarkers in the United States. The combination of feeding studies and metabolomic profiling offers a new approach for agnostic discovery of novel biomarkers of specific foods. The number of publications on food biomarker discovery has been rapidly increasing since the early 2000s [15].

The DBDC is charged with the overarching goal of discovering and validating novel biomarkers for a spectrum of food items commonly consumed by the United States population. The DBDC will examine >20 common foods that are among the primary sources of proteins, carbohydrates, or phytochemicals in United States diets. To achieve its goal, the DBDC will use controlled feeding trials, real-world epidemiological and cross-sectional studies in diverse populations, cutting-edge metabolomic profiling, diverse food sample collection, and biospecimen collection. Although data collection and participant enrollment strategies will be harmonized across the 3 sites, each site will select their own study designs and metabolomics profiling approaches based on considerations balancing costs and efficiency. The test foods will be largely independent among the 3 sites (except beef), although each site will include the food biomarkers discovered at all sites in their own metabolomics platforms, which will allow for the cross-examination of the biomarkers at all sites. These combined approaches will greatly advance the discovery of sensitive and specific food markers and enhance the generalizability of findings to the broader United States population. Additionally, each study center shares a common structure, consisting of administration, intervention, metabolomics, data analysis, and biomarkers project cores to ensure consistency of food biomarker discovery and validation across the DBDC. The Steering Committee, DCC, and multiple working groups further reinforce the coordination, timeline, and consistency of study activities at each site. Human biospecimens and food samples collected during each DBDC study phase will give rise to a unique resource for the broader research community that will facilitate the discovery and characterization of food biomarkers in the current project and future studies on food biomarkers.

Food biomarkers discovered and validated by the DBDC are expected to significantly expand the list of sensitive and specific food biomarkers for these selected major sources of macronutrients in the United States diet. The application of these and other existing food biomarkers in epidemiological and clinical studies will greatly enhance the ability to more accurately measure human diets, evaluate compliance, and calibrate measurement errors. These biomarkers are particularly valuable in research settings where the collection of dietary data using questionnaires or other traditional instruments is challenging or infeasible, such as among children or individuals with cognitive impairment.

In conclusion, the DBDC is among the first consortia established in the United States to focus on novel food biomarker discovery, characterization, and validation. Findings from the DBDC studies, in conjunction with existing knowledge, will likely facilitate research in the realms of dietary interventions, nutritional epidemiology, precision nutrition, and multidisciplinary nutrition research.

Author contributions

The authors’ responsibilities were as follows – HC, QS, SNB, JRB, JWN, OF, JWL, BJB, SLN, FS, JW, FMS, SHA, FBH, MLN, CMS, PM: designed the research; JMS, DOM, XH, JS, DR, SNB, QS, WS, HC, JRB, PM: conducted the research; DR, OF, YM-R: provided essential reagents or materials; HC, QS, SNB, JMS, DOM, JRB, XH, JS, JH, WS, DR, LL, JWN, OF, CBC, JWL, BJB, SLN, YW, CZ, YM-R, MM, YH, AS, WZ, DD, FS, JW, FMS, SHA, FBH, MLN, CMS, PM: wrote the manuscript; JH coordinated processing and analysis of food samples; and all authors: read and approved the final manuscript.

Data availability

A data share statement is not applicable. This manuscript describes a study design and does not include any data.

Funding

Research reported in this publication was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) of the NIH under award numbers 1U24DK129557 and U2CDK129670, and the USDA under award numbers 13370729, 13231928, 2022-67017-38475, 1021411, and 2021-67017-35783. Funding for BJB and JWN was provided by the USDA, Agricultural Research Service projects 2032-10700-003-24R and USDA Project 2032-51530-025-00D. The USDA is an equal-opportunity employer and provider. Additional support was provided by the NIH under award number P30 CA015704 and the NIDDK under award number P30 DK035816.

Conflict of interest

SHA is the founder and principal of XenoMed, LLC (dba XenoMet), which is focused on research and discovery in the area of microbial metabolism. XenoMet had no part in the research design, funding, results or writing of the manuscript. The other authors do not have any conflicts of interest to disclose. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH or the United States Department of Agriculture. SNB is an Editor and Editorial Board Member for Current Developments in Nutrition and played no role in the Journal’s evaluation of the manuscript.

Acknowledgments

We thank Brooke Walker, Duke Clinical Research Institute, who provided editorial support. Walker did not receive compensation for her contributions, apart from her employment at the institution in which this study was conducted. DBDC-Davis site investigators would like to thank Debra Tacad, Dustin Burnett, and Annie Kan, Registered Dietetic Technician, for designing the study diets.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.cdnut.2025.107435 (multimedia component 1: mmc1.docx, 34.6 KB).

References

  • 1. US Burden of Disease Collaborators, Mokdad A.H., Ballestros K., Echko M., Glenn S., Olsen H.E., et al. The State of US Health, 1990–2016: burden of diseases, injuries, and risk factors among US states. JAMA. 2018;319(14):1444–1472. doi: 10.1001/jama.2018.0158.
  • 2. Danaei G., Ding E.L., Mozaffarian D., Taylor B., Rehm J., Murray C.J., et al. The preventable causes of death in the United States: comparative risk assessment of dietary, lifestyle, and metabolic risk factors. PLOS Med. 2009;6(4). doi: 10.1371/journal.pmed.1000058.
  • 3. Turrini A. Perspectives of dietary assessment in human health and disease. Nutrients. 2022;14(4):830. doi: 10.3390/nu14040830.
  • 4. Booth S.L., Sallis J.F., Ritenbaugh C., Hill J.O., Birch L.L., Frank L.D., et al. Environmental and societal factors affect food choice and physical activity: rationale, influences, and leverage points. Nutr. Rev. 2001;59(3 Pt 2):S21–S39; discussion S57–S65. doi: 10.1111/j.1753-4887.2001.tb06983.x.
  • 5. Wardle J., Haase A.M., Steptoe A., Nillapun M., Jonwutiwes K., Bellisle F. Gender differences in food choice: the contribution of health beliefs and dieting. Ann. Behav. Med. 2004;27(2):107–116. doi: 10.1207/s15324796abm2702_5.
  • 6. Glanz K., Basil M., Maibach E., Goldberg J., Snyder D. Why Americans eat what they do: taste, nutrition, cost, convenience, and weight control concerns as influences on food consumption. J. Am. Diet. Assoc. 1998;98(10):1118–1126. doi: 10.1016/S0002-8223(98)00260-0.
  • 7. Neuhouser M.L., Prentice R.L., Tinker L.F., Lampe J.W. Enhancing capacity for food and nutrient intake assessment in population sciences research. Annu. Rev. Public Health. 2023;44:37–54. doi: 10.1146/annurev-publhealth-071521-121621.
  • 8. Briefel R.R., Flegal K.M., Winn D.M., Loria C.M., Johnson C.L., Sempos C.T. Assessing the nation's diet: limitations of the food frequency questionnaire. J. Am. Diet. Assoc. 1992;92(8):959–962.
  • 9. Krall E.A., Dwyer J.T. Validity of a food frequency questionnaire and a food diary in a short-term recall situation. J. Am. Diet. Assoc. 1987;87(10):1374–1377.
  • 10. Schatzkin A., Kipnis V., Carroll R.J., Midthune D., Subar A.F., Bingham S., et al. A comparison of a food frequency questionnaire with a 24-hour recall for use in an epidemiological cohort study: results from the biomarker-based Observing Protein and Energy Nutrition (OPEN) study. Int. J. Epidemiol. 2003;32(6):1054–1062. doi: 10.1093/ije/dyg264.
  • 11. Guasch-Ferre M., Bhupathiraju S.N., Hu F.B. Use of metabolomics in improving assessment of dietary intake. Clin. Chem. 2018;64(1):82–98. doi: 10.1373/clinchem.2017.272344.
  • 12. Maruvada P., Lampe J.W., Wishart D.S., Barupal D., Chester D.N., Dodd D., et al. Perspective: dietary biomarkers of intake and exposure-exploration with omics approaches. Adv. Nutr. 2020;11(2):200–215. doi: 10.1093/advances/nmz075.
  • 13. Dragsted L.O., Gao Q., Pratico G., Manach C., Wishart D.S., Scalbert A., et al. Dietary and health biomarkers-time for an update. Genes Nutr. 2017;12:24. doi: 10.1186/s12263-017-0578-y.
  • 14. Brouwer-Brolsma E.M., Brennan L., Drevon C.A., van Kranen H., Manach C., Dragsted L.O., et al. Combining traditional dietary assessment methods with novel metabolomics techniques: present efforts by the Food Biomarker Alliance. Proc. Nutr. Soc. 2017;76(4):619–627. doi: 10.1017/S0029665117003949.
  • 15. Rafiq T., Azab S.M., Teo K.K., Thabane L., Anand S.S., Morrison K.M., et al. Nutritional metabolomics and the classification of dietary biomarker candidates: a critical review. Adv. Nutr. 2021;12(6):2333–2357. doi: 10.1093/advances/nmab054.
  • 16. Shill J., Mavoa H., Allender S., Lawrence M., Sacks G., Peeters A., et al. Government regulation to promote healthy food environments—a view from inside state governments. Obes. Rev. 2012;13(2):162–173. doi: 10.1111/j.1467-789X.2011.00937.x.
  • 17. Popkin B.M., Hawkes C. Sweetening of the global diet, particularly beverages: patterns, trends, and policy responses. Lancet Diabetes Endocrinol. 2016;4(2):174–186. doi: 10.1016/S2213-8587(15)00419-2.
  • 18. Dunford E.K., Ng S.W., Taillie L.S. How does the healthfulness of the US food supply compare to international guidelines for marketing to children and adolescents? Matern. Child Health J. 2019;23(6):768–776. doi: 10.1007/s10995-018-02693-1.
  • 19. Herforth A., Arimond M., Alvarez-Sanchez C., Coates J., Christianson K., Muehlhoff E. A global review of food-based dietary guidelines. Adv. Nutr. 2019;10(4):590–605. doi: 10.1093/advances/nmy130.
  • 20. Rosi A., Paolella G., Biasini B., Scazzina F. SINU Working Group on Nutritional Surveillance in Adolescents. Dietary habits of adolescents living in North America, Europe or Oceania: a review on fruit, vegetable and legume consumption, sodium intake, and adherence to the Mediterranean diet. Nutr. Metab. Cardiovasc. Dis. 2019;29(6):544–560. doi: 10.1016/j.numecd.2019.03.003.
  • 21. Dragsted L.O., Gao Q., Scalbert A., Vergeres G., Kolehmainen M., Manach C., et al. Validation of biomarkers of food intake-critical assessment of candidate biomarkers. Genes Nutr. 2018;13:14. doi: 10.1186/s12263-018-0603-9.
  • 22. Lampe J.W., Huang Y., Neuhouser M.L., Tinker L.F., Song X., Schoeller D.A., et al. Dietary biomarker evaluation in a controlled feeding study in women from the Women's Health Initiative cohort. Am. J. Clin. Nutr. 2017;105(2):466–475. doi: 10.3945/ajcn.116.144840.
  • 23. Britten P., Cleveland L.E., Koegel K.L., Kuczynski K.J., Nickols-Richardson S.M. Updated US Department of Agriculture Food Patterns meet goals of the 2010 dietary guidelines. J. Acad. Nutr. Diet. 2012;112(10):1648–1655. doi: 10.1016/j.jand.2012.05.021.
  • 24. National Institute of Diabetes and Digestive and Kidney Diseases. NIDDK-CR Resources for Research (R4R) [Internet]. [date updated; March 19, 2025]. Available from: https://repository.niddk.nih.gov/home/.
  • 25. University of California, San Diego. Metabolomics workbench [Internet]. [April 21, 2025; March 19, 2025]. Available from: https://www.metabolomicsworkbench.org/.
  • 26. Sud M., Fahy E., Cotter D., Azam K., Vadivelu I., Burant C., et al. Metabolomics Workbench: an international repository for metabolomics data and metadata, metabolite standards, protocols, tutorials and training, and analysis tools. Nucleic Acids Res. 2016;44(D1):D463–D470. doi: 10.1093/nar/gkv1042.
  • 27. Cheng K., Gupta S.K., Kantor S., Kuhl J.T., Aceves S.S., Bonis P.A., et al. Creating a multi-center rare disease consortium—the Consortium of Eosinophilic Gastrointestinal Disease Researchers (CEGIR). Transl. Sci. Rare Dis. 2017;2(3–4):141–155. doi: 10.3233/TRD-170016.
  • 28. Bonifacio E., Beyerlein A., Hippich M., Winkler C., Vehik K., Weedon M.N., et al. Genetic scores to stratify risk of developing multiple islet autoantibodies and type 1 diabetes: a prospective study in children. PLOS Med. 2018;15(4). doi: 10.1371/journal.pmed.1002548.
  • 29. Frank N.M., Lynch K.F., Uusitalo U., Yang J., Lonnrot M., Virtanen S.M., et al. The relationship between breastfeeding and reported respiratory and gastrointestinal infection rates in young children. BMC Pediatr. 2019;19(1):339. doi: 10.1186/s12887-019-1693-2.
  • 30. Ginsburg G.S., Cavallari L.H., Chakraborty H., Cooper-DeHoff R.M., Dexter P.R., Eadon M.T., et al. Establishing the value of genomics in medicine: the IGNITE Pragmatic Trials Network. Genet. Med. 2021;23(7):1185–1191. doi: 10.1038/s41436-021-01118-9.
  • 31. Reid M.C., Eccleston C., Pillemer K. Management of chronic pain in older adults. BMJ. 2015;350. doi: 10.1136/bmj.h532.
  • 32. Lesko C.R., Jacobson L.P., Althoff K.N., Abraham A.G., Gange S.J., Moore R.D., et al. Collaborative, pooled and harmonized study designs for epidemiologic research: challenges and opportunities. Int. J. Epidemiol. 2018;47(2):654–668. doi: 10.1093/ije/dyx283.
  • 33. Levine E., Abbatangelo-Gray J., Mobley A.R., McLaughlin G.R., Herzog J. Evaluating MyPlate: an expanded framework using traditional and nontraditional metrics for assessing health communication campaigns. J. Nutr. Educ. Behav. 2012;44(4):S2–S12. doi: 10.1016/j.jneb.2012.05.011.
  • 34. Chrisman M., Diaz Rios L.K. Evaluating MyPlate after 8 years: a perspective. J. Nutr. Educ. Behav. 2019;51(7):899–903. doi: 10.1016/j.jneb.2019.02.006.
  • 35. US Department of Agriculture. MyPlate plan [Internet]. [date updated; March 19, 2025]. Available from: https://www.myplate.gov/myplate-plan.
  • 36. Chen T.C., Clark J., Riddles M.K., Mohadjer L.K., Fakhouri T.H.I. National Health and Nutrition Examination Survey, 2015–2018: sample design and estimation procedures. Vital Health Stat. 2020;2(184):1–35.
  • 37. Katagiri R., Sawada N., Goto A., Yamaji T., Iwasaki M., Noda M., et al. Association of soy and fermented soy product intake with total and cause specific mortality: prospective cohort study. BMJ. 2020;368. doi: 10.1136/bmj.m34.
  • 38. Harvard T.H. Chan School of Public Health. Nutrition Questionnaire Service Center [Internet]. 2025 [date updated; March 19, 2025]. Available from: https://hsph.harvard.edu/department/nutrition/nutrition-questionnaire-service-center/.
  • 39. Appel L.J., Sacks F.M., Carey V.J., Obarzanek E., Swain J.F., Miller E.R., 3rd, et al. Effects of protein, monounsaturated fat, and carbohydrate intake on blood pressure and serum lipids: results of the OmniHeart randomized trial. JAMA. 2005;294(19):2455–2464. doi: 10.1001/jama.294.19.2455.
  • 40. Schirmer M., Strazar M., Avila-Pacheco J., Rojas-Tapias D.F., Brown E.M., Temple E., et al. Linking microbial genes to plasma and stool metabolites uncovers host-microbial interactions underlying ulcerative colitis disease course. Cell Host Microbe. 2024;32(2):209–226.e7. doi: 10.1016/j.chom.2023.12.013.
  • 41. Evans A.M., O'Donovan C., Playdon M., Beecher C., Beger R.D., Bowden J.A., et al. Dissemination and analysis of the quality assurance (QA) and quality control (QC) practices of LC-MS based untargeted metabolomics practitioners. Metabolomics. 2020;16(10):113. doi: 10.1007/s11306-020-01728-5.
  • 42. Townsend M.K., Clish C.B., Kraft P., Wu C., Souza A.L., Deik A.A., et al. Reproducibility of metabolomic profiles among men and women in 2 large cohort studies. Clin. Chem. 2013;59(11):1657–1667. doi: 10.1373/clinchem.2012.199133.
  • 43. Gao Q., Dragsted L.O., Ebbels T. Comparison of bi- and tri-linear PLS models for variable selection in metabolomic time-series experiments. Metabolites. 2019;9(5):92. doi: 10.3390/metabo9050092.
  • 44. Meyer D., Wien F.T. Support vector machines. The interface to libsvm in package. 2015.
  • 45. Zou H., Hastie T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. B. 2005;67:301–320.
  • 46. Tibshirani R. Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. B. 1996;58(1):267–288.
  • 47. Goutelle S., Woillard J.B., Buclin T., Bourguignon L., Yamada W., Csajka C., et al. Parametric and nonparametric methods in population pharmacokinetics: experts' discussion on use, strengths, and limitations. J. Clin. Pharmacol. 2022;62(2):158–170. doi: 10.1002/jcph.1993.
  • 48. Gabrielsson J., Weiner D. Non-compartmental analysis. Methods Mol. Biol. 2012;929:377–389. doi: 10.1007/978-1-62703-050-2_16.
  • 49. Mould D.R., Upton R.N. Basic concepts in population modeling, simulation, and model-based drug development. CPT Pharmacometrics Syst. Pharmacol. 2012;1(9):e6. doi: 10.1038/psp.2012.4.
  • 50. Harrold J.M., Abraham A.K. Ubiquity: a framework for physiological/mechanism-based pharmacokinetic/pharmacodynamic model development and deployment. J. Pharmacokinet. Pharmacodyn. 2014;41(2):141–151. doi: 10.1007/s10928-014-9352-6.
  • 51. Bustad A., Terziivanov D., Leary R., Port R., Schumitzky A., Jelliffe R. Parametric and nonparametric population methods: their comparative performance in analysing a clinical dataset and two Monte Carlo simulation studies. Clin. Pharmacokinet. 2006;45(4):365–383. doi: 10.2165/00003088-200645040-00003.
  • 52. Bonate P.L., Strougo A., Desai A., Roy M., Yassen A., van der Walt J.S., et al. Guidelines for the quality control of population pharmacokinetic-pharmacodynamic analyses: an industry perspective. AAPS J. 2012;14(4):749–758. doi: 10.1208/s12248-012-9387-9.
  • 53. CERTARA. The industry standard for pharmacokinetic/pharmacodynamic (PK/PD) analysis [Internet]. [date updated; March 19, 2025]. Available from: https://www.certara.com/software/phoenix-winnonlin/.
  • 54. Pharmacokinetic software [Internet]. [date updated; March 19, 2025]. Available from: https://www.pharmpk.com/soft.html.
  • 55. Denney W., Duvvuri S., Buckeridge C. Simple, automatic noncompartmental analysis: the PKNCA R package. J. Pharmacokinet. Pharmacodyn. 2015;42:S65.
  • 56. Gibbons H., Michielsen C.J.R., Rundle M., Frost G., McNulty B.A., Nugent A.P., et al. Demonstration of the utility of biomarkers for dietary intake assessment; proline betaine as an example. Mol. Nutr. Food Res. 2017;61(10). doi: 10.1002/mnfr.201700037.
  • 57. Patterson R.E., Kristal A.R., Tinker L.F., Carter R.A., Bolton M.P., Agurs-Collins T. Measurement characteristics of the Women's Health Initiative food frequency questionnaire. Ann. Epidemiol. 1999;9(3):178–187. doi: 10.1016/s1047-2797(98)00055-6.
  • 58. Reedy J. The Evolving Healthy Eating Index: advancing metrics to capture dietary patterns across a healthy eating trajectory. J. Acad. Nutr. Diet. 2023;123(9):1267–1268. doi: 10.1016/j.jand.2023.05.010.
  • 59. Patel A.V., Jacobs E.J., Dudas D.M., Briggs P.J., Lichtman C.J., Bain E.B., et al. The American Cancer Society's Cancer Prevention Study 3 (CPS-3): recruitment, study design, and baseline characteristics. Cancer. 2017;123(11):2014–2024. doi: 10.1002/cncr.30561.
  • 60. Sorlie P.D., Aviles-Santa L.M., Wassertheil-Smoller S., Kaplan R.C., Daviglus M.L., Giachello A.L., et al. Design and implementation of the Hispanic Community Health Study/Study of Latinos. Ann. Epidemiol. 2010;20(8):629–641. doi: 10.1016/j.annepidem.2010.03.015.
  • 61. The Women's Health Initiative Study Group. Design of the Women's Health Initiative clinical trial and observational study. Control. Clin. Trials. 1998;19(1):61–109. doi: 10.1016/s0197-2456(97)00078-0.
  • 62. Zhang X., Dong J., Raftery D. Five easy metrics of data quality for LC-MS-based global metabolomics. Anal. Chem. 2020;92(19):12925–12933. doi: 10.1021/acs.analchem.0c01493.
  • 63. Beckmann M., Wilson T., Zubair H., Lloyd A.J., Lyons L., Phillips H., et al. A standardized strategy for simultaneous quantification of urine metabolites to validate development of a biomarker panel allowing comprehensive assessment of dietary exposure. Mol. Nutr. Food Res. 2020;64(20). doi: 10.1002/mnfr.202000517.
  • 64. Zheng C., Gowda G.A.N., Raftery D., Neuhouser M.L., Tinker L.F., Prentice R.L., et al. Evaluation of potential metabolomic-based biomarkers of protein, carbohydrate and fat intakes using a controlled feeding study. Eur. J. Nutr. 2021;60(8):4207–4218. doi: 10.1007/s00394-021-02577-1.
  • 65. Guijas C., Montenegro-Burke J.R., Domingo-Almenara X., Palermo A., Warth B., Hermann G., et al. METLIN: a technology platform for identifying knowns and unknowns. Anal. Chem. 2018;90(5):3156–3164. doi: 10.1021/acs.analchem.7b04424.
  • 66. Neto F.C., Raftery D. Expanding urinary metabolite annotation through integrated mass spectral similarity networking. Anal. Chem. 2021;93(35):12001–12010. doi: 10.1021/acs.analchem.1c02041.
  • 67. Demarque D.P., Crotti A.E., Vessecchi R., Lopes J.L., Lopes N.P. Fragmentation reactions using electrospray ionization mass spectrometry: an important tool for the structural elucidation and characterization of synthetic and natural products. Nat. Prod. Rep. 2016;33(3):432–455. doi: 10.1039/c5np00073d.
  • 68. Nothias L.F., Petras D., Schmid R., Duhrkop K., Rainer J., Sarvepalli A., et al. Feature-based molecular networking in the GNPS analysis environment. Nat. Methods. 2020;17(9):905–908. doi: 10.1038/s41592-020-0933-6.
  • 69. US Department of Agriculture and US Department of Health and Human Services. Dietary Guidelines for Americans, 2020–2025 [Internet]. 9th ed. 2020 [2025; March 19, 2025]. Available from: https://www.dietaryguidelines.gov/resources/2020-2025-dietary-guidelines-online-materials.
  • 70. Tsugawa H., Cajka T., Kind T., Ma Y., Higgins B., Ikeda K., et al. MS-DIAL: data-independent MS/MS deconvolution for comprehensive metabolome analysis. Nat. Methods. 2015;12:523–526. doi: 10.1038/nmeth.3393.
  • 71. Cuparencu C., Bulmus-Tuccar T., Stanstrup J., La Barbera G., Roager H.M., Dragsted L.O. Towards nutrition with precision: unlocking biomarkers as dietary assessment tools. Nat. Metab. 2024;6(8):1438–1453. doi: 10.1038/s42255-024-01067-y.
  • 72. Sun J., Liu X., Yang T., Slovin J., Chen P. Profiling polyphenols of two diploid strawberry (Fragaria vesca) inbred lines using UHPLC-HRMS(n). Food Chem. 2014;146:289–298. doi: 10.1016/j.foodchem.2013.08.089.
  • 73. Geng P., Sun J., Chen P., Brand E., Frame J., Meissner H., et al. Characterization of maca (Lepidium meyenii/Lepidium peruvianum) using a mass spectral fingerprinting, metabolomic analysis, and genetic sequencing approach. Planta Med. 2020;86(10):674–685. doi: 10.1055/a-1161-0372.
  • 74. National Institutes of Health. Final NIH policy for data management and sharing [Internet]. 2020 [January 25, 2023; March 19, 2025]. Available from: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html.



Articles from Current Developments in Nutrition are provided here courtesy of American Society for Nutrition
