Author manuscript; available in PMC: 2025 Apr 15.
Published in final edited form as: Trends Analyt Chem. 2023 Dec 12;171:117478. doi: 10.1016/j.trac.2023.117478

Best Practices in NMR Metabolomics: Current State

Robert Powers 1,*, Erik R Andersson 2, Amanda L Bayless 3, Robert B Brua 4, Mario C Chang 5, Leo L Cheng 6, Chaevien S Clendinen 7, Darcy Cochran 1, Valérie Copié 8, John R Cort 7, Alexandra A Crook 1, Hamid R Eghbalnia 9, Anthony Giacalone 6, Goncalo J Gouveia 10, Jeffrey C Hoch 9, Micah J Jeppesen 1, Amith S Maroli 1, Matthew E Merritt 5, Wimal Pathmasiri 11, Heidi E Roth 1, Anna Rushin 5, Isin T Sakallioglu 1, Saurav Sarma 12, Tracey B Schock 3, Lloyd W Sumner 12, Panteleimon Takis 13, Mario Uchimiya 14, David S Wishart 15; Metabolomics Association of North America (MANA), NMR Special Interest Group; Metabolomics Quality Assurance & Quality Control Consortium (mQACC), NMR Technology Group
PMCID: PMC11999570  NIHMSID: NIHMS2033748  PMID: 40237011

Abstract

A literature survey was conducted to identify current practices used by NMR metabolomics investigators when conducting and reporting their metabolomics studies. A total of 463 papers from 2020 and 80 papers from 2010 were selected from PubMed and were manually analyzed by a team of investigators to assess the extent and completeness of the experimental procedures and protocols reported. A significant number of the papers did not report on essential experimental details, incompletely stated which statistical methods were used, improperly applied supervised multivariate statistical analyses, or lacked validation of statistical models. A large diversity of protocols and software were identified, which suggests a lack of consensus and a relatively limited use of commonly agreed upon standards for conducting and reporting NMR metabolomics studies. The overall intent of the survey is to inform and encourage the NMR metabolomics community to develop and adopt best-practices for the field.

Keywords: NMR, Metabolomics, Best Practices, Literature survey


1. Introduction

NMR metabolomics has beneficially impacted a diversity of fields, from issues related to the environment [1] and nutrition [2] to drug discovery [3–5] and disease evaluation [6–9]. As a result, NMR metabolomics, as a field, continues to experience expansive growth [10, 11], particularly for medical applications due to its potential for biomarker discovery [6–9, 12] and in vivo examinations [13–17]. Overall, metabolomics has benefited from the technical advantages of NMR, which include its simplicity in measurements, often without the need for sample pre-processing, ready metabolite quantification, and redundant means to validate metabolite assignments [16, 18–23]. These advantages present a low barrier of entry into the field that has contributed to the growth of NMR metabolomics but, unfortunately, has also led to an abundance of tenuous metabolomics research studies populating the scientific literature. Routine concerns include misunderstandings about the proper usage of statistical tools [24–28], missing details fundamental to the rigor and reproducibility of the study [29, 30], and the absence of quality controls necessary to validate the conclusions [21, 31]. Research studies without a clearly defined scientific hypothesis that just catalog and measure metabolites raise additional concerns. Numerous poorly designed or executed studies that fail to generate reproducible or meaningful data have the potential to severely undermine the long-term perspective and value of metabolomics [32]. Simply put, the field risks a loss of confidence by the broader scientific community.

The lack of well-established best-practices that have been widely adopted, deployed, and validated by the broader scientific community is clearly a contributing factor to the proliferation of scientifically questionable metabolomics reports. This is despite several community-led efforts by COSMOS: Coordination of Standards in Metabolomics, MSI: Metabolomics Standards Initiative, mQACC: Metabolomics Quality Assurance & Quality Control Consortium, and MANA: Metabolomics Association of North America to establish standard practices and minimum reporting criteria [33–39]. Furthermore, a properly executed metabolomics study is highly dependent on an assortment of unique skills that includes expertise in analytical chemistry, separation techniques, and statistics, among others. The false perception that metabolomics is easy often leads to poorly conducted or designed studies due to missing expertise, an absence of pertinent experience, or methods lacking a clear explanation. Again, insufficient guidance from community-certified experimental protocols may lead to these problematic outcomes. To address this critical issue, the NMR metabolomics community must move toward establishing and widely adopting a set of best practices along with standards specifically relevant to an NMR metabolomics study. Of course, to achieve this laudable goal, it is first necessary to identify the common problems or limitations routinely encountered in reported NMR metabolomics studies. This, in turn, requires assessing and characterizing the different experimental protocols, statistical methods, and software used by the metabolomics community because each investigator or research group follows a different if not unique data processing pipeline. Herein, we report the results of a detailed survey of the scientific literature to document exactly how NMR has been routinely used in metabolomics research.
The overarching intent of this analysis is to outline both the successes and existing problems with current NMR metabolomics studies to inform best practices. The outcome of the literature survey lays a foundation for how an NMR metabolomics study should be designed, executed, and reported in the future.

2. Discussion

2.1. The set of 2010 and 2020 NMR-based metabolomics papers.

The PubMed database was first queried to identify papers published in 2020 that described an NMR-based metabolomics study. An initial query identified 844 potentially relevant papers. A manual analysis to remove papers focused on software, NMR protocol reviews and natural product identification reduced the total number of papers that were surveyed to 487. A query check confirmed that 61 of the removed papers were reviews. A further evaluation of the assigned papers by the project participants removed an additional 24 papers for various reasons, but mainly because the focus of the study was not metabolomics. In total, 463 manuscripts from 2020 were used in the survey. Using an identical approach, a total of 80 manuscripts from 2010 were used in the survey. Please see Appendix A for a detailed description of the methods used to complete the literature survey. The survey results from the 2010 and 2020 papers were then compared to determine if the practices used to report NMR metabolomics data and findings remained consistent or improved with time.

2.2. Differences in reported protocols between 2010 and 2020 papers.

Excluding magnitude differences due to the larger number of papers in 2020, a comparison of the survey data between the two periods indicated an overall high similarity in the results. Most responses were remarkably similar between the two sets of papers. This overall lack of improvement in reporting standards between 2010 and 2020 suggests a strong reliance on citing prior studies as a source of established procedures, which themselves may not contain a complete protocol. This practice has resulted in the perpetuation of poor reporting practices. Breaking this cycle requires the community to adhere to new recommendations of best practices promoted by societies and consortia while acknowledging that the current scientific literature is riddled with inaccurate, confusing, or incomplete content.

There were a few minor areas that indicated a modest improvement from 2010 to 2020. The average number of identified and quantified metabolites increased by over 50% from 2010 to 2020. Similarly, there was a small increase from 0 to ~11% in the number of studies that reported submitting metabolomics data to a repository. Another promising trend was the general increase of ~20 to ~30% in the application and/or the reporting of standard statistical validation parameters such as p-values [40], R²/Q² [41], and false discovery rate (FDR) corrections [42–44].

2.3. Study Design Results.

A total of nine Yes-No survey questions were grouped together to broadly characterize how investigators reported their overall study design. The bar graphs in Figure 1 summarize these survey responses.

Figure 1. Bar graphs of Yes-No survey questions related to study design of NMR-based metabolomic studies.

One of the most surprising and concerning outcomes of the survey was the observation that fewer than 50% of all papers provided a clearly stated scientific hypothesis. One potential explanation for this result is the fact that metabolomics lends itself to metabolite discovery and the relatively simple goal of just cataloging detectable metabolites in a given biological sample. While the rationale behind such a study may be to inform future investigations, it may not be particularly valuable since any follow-up study would also likely discover the metabolites that are detectable as a first step of the project. Further, without the availability of the complete data set and procedures, the potential utility of any metabolomics project limited to cataloging the observed metabolites is dubious. In this regard, not a single paper in 2010 reported depositing their metabolomics data to an appropriate repository. To be fair, few journals required in 2010 or currently require the deposition of metabolomics data in data repositories, as the Metabolomics Workbench (established 2016) [45] and MetaboLights [46] are relatively new enterprises. With the emergence of metabolomics specific repositories, approximately 11% of the 2020 papers indicated the data were deposited in such a repository, which represents a modest improvement. The use and reuse of metabolomics data is an important issue and an expanding need as evidenced by a recent request for information by the National Cancer Institute (NOT-CA-23–00). Thus, the deposition of NMR-based metabolomics data into publicly accessible repositories must become more routine, seamless, and mandated by journals and funding agencies to enhance data deposition compliance.

Not unexpectedly, the vast majority, nearly 100%, of both the 2010 and 2020 papers employed one-dimensional (1D) 1H or 13C NMR experiments. Troublingly, only 45% (2010) or 36% (2020) of the papers surveyed relied on 2D NMR experiments for metabolite identification and validation. This correlates with the very small fraction (~6–7%) of NMR metabolomics projects that relied on isotope labeling. Since 2D NMR experiments provide a valuable and complementary approach to validating metabolite assignments, the lack of routine usage of 2D NMR raises concerns about the accuracy and reproducibility of metabolite assignments for compounds in matrices that are not well-established within the NMR metabolomics literature. Instead, it would be prudent to commonly employ a series of 2D NMR experiments such as the 2D 1H-13C HSQC, HMBC, HSQC-TOCSY, and 2D 1H-1H TOCSY for spectra collected on pooled samples to validate metabolite annotations [47]. Specifically, 2D NMR provides multiple, correlated chemical shifts that may uniquely identify a metabolite. COLMAR and similar software tools provide an efficient and reliable approach to leverage these 2D NMR spectral signals to accurately annotate complex metabolomics mixtures [48–50]. This level of confidence cannot be achieved by relying solely on 1D NMR.

On the other hand, the semi-automatic assignment of well-known and abundant metabolites commonly detected in 1D 1H NMR spectra is likely to be reliable when analyzing a specific type of biofluid using specific small molecule spectral libraries [51–53]. It is important to note that the false positive rate increases proportionally with the size of the reference library, necessitating the use of a targeted library for accurate metabolite annotation extracted from 1D 1H NMR spectral patterns [53]. Nevertheless, NMR can partially overcome this challenge given the fact that most metabolites produce numerous signals in a 1D 1H NMR spectrum. In this regard, matching multiple redundant signals between the experimental and reference spectrum greatly improves confidence in the assignment. This is a unique and valuable advantage of NMR compared to other analytical techniques that produce only a single signal per compound, especially for the analysis of complex mixtures.
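The value of matching multiple redundant signals can be illustrated with a minimal sketch. The function name `match_fraction`, the 0.02 ppm tolerance, and the approximate chemical shifts used in the usage note are illustrative assumptions, not values taken from the survey or from any curated spectral library.

```python
from typing import Sequence


def match_fraction(observed_ppm: Sequence[float],
                   reference_ppm: Sequence[float],
                   tolerance: float = 0.02) -> float:
    """Fraction of a reference compound's peaks found in the observed peak list.

    A reference peak counts as matched when any observed peak lies within
    `tolerance` ppm of it (the tolerance is an assumed, illustrative value).
    """
    if not reference_ppm:
        raise ValueError("reference_ppm must contain at least one peak")
    matched = sum(
        any(abs(obs - ref) <= tolerance for obs in observed_ppm)
        for ref in reference_ppm
    )
    return matched / len(reference_ppm)
```

For example, with a hypothetical observed peak list of `[1.33, 1.48, 4.11]` ppm, a two-peak reference of `[1.33, 4.11]` is fully matched, whereas a reference of `[1.48, 3.78]` is only half matched. An assignment supported by every expected signal deserves more confidence than one supported by a single coincident peak.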

Only ~46–50% of the surveyed papers reported an observed consistency between the key identified metabolites and those of other replicate studies. An observed consistency with these previous studies raises the likelihood that the metabolites identified have been correctly assigned. The reason why a comparison with prior literature is not a common occurrence in the metabolomics field is perplexing. Routine literature retrospection in the discussion section of any NMR metabolomics publication provides confidence in the analyzed data.

A failure to demonstrate consistency with previous studies raises broad concerns regarding the robustness and reproducibility of metabolomics data. Part of the issue may be a conservative outlook in which variations in sample collections, handling, and preparations may deem a comparison between studies inappropriate. It is also perfectly legitimate that a given study is unique and lacks prior replicate studies to compare against. Given the exponential growth of metabolomics studies, especially the large number of replicate clinical studies designed to identify biomarkers, the complete absence of any related prior studies is highly unlikely. Even in cases where there are legitimate concerns regarding important differences in experimental design decisions, a comparison of outcomes and procedures may still be informative. Metabolomics papers that incorrectly present a study “as the first of its kind” and fail to place their results in context with the scientific literature may hinder progress in the field. Opportunities to establish results that have been successfully replicated or to identify potential contradictions have been missed. Importantly, if disagreements are not revealed at the time of publication, the opportunity to understand or identify the source of the problem and to potentially be able to fix the inconsistencies may be lost. More troubling is the possibility that incorrect scientific conclusions may be propagated through the community leading to detrimental outcomes that may involve patients.

2.4. Statistics and Quality Control.

The application of univariate [54] and multivariate [41] statistics is a common and often necessary aspect of a metabolomics study. This is consistent with the results of the survey (Figure 1) where ~70–75% of the 2010 and 2020 papers reported using univariate statistics, ~75% of the studies used unsupervised multivariate statistics, ~52–73% of the papers reported using supervised multivariate statistics, and ~61–68% of the studies used both univariate and multivariate statistics. Since most metabolomics investigators are not necessarily trained biostatisticians, the proper application of statistics to a metabolomics study is a common concern, especially if a biostatistician has not been part of the study design [26–28, 40, 41, 55]. This concern is underscored by the results of our survey. Summarized in Figure 2A are the results of the five Yes-No survey questions related to proper statistical usage. The poor performance in terms of properly reporting or using statistics is readily apparent. The good news is that there is a notable improvement between papers published in 2010 compared to 2020. For example, only ~60% of the 2010 papers reported a p-value, a fundamental assessment of the statistical significance of identified metabolites. Without this statistical measure of confidence, how is it even possible to determine reliability or the utility of the results from a metabolomics study? The number reached nearly 80% for 2020 papers. Nevertheless, the number of papers reporting an FDR or multiple hypothesis correction method such as Benjamini-Hochberg [56] or Bonferroni [43, 44] was unacceptably low, corresponding to only ~14% for 2010 and ~34% for 2020. An FDR correction of any omics data set is essential since errors accumulate as illustrated in eqn. 1:

$$p = 1 - (1 - \alpha)^{m} \tag{1}$$

where p is the probability of obtaining at least one false positive, m is the number of hypotheses or metabolites, and α is the statistical significance level or desired p-value, typically ≤ 0.05. The lack of reported FDR corrected p-values likely means a significant number of the putatively identified metabolites are false positives. This, in turn, will negatively impact any biological significance or outcomes of the study. An FDR corrected p-value [57, 58] along with a quantitative measure such as fold change is recommended as a requirement for reporting all metabolites identified as differing significantly between two or more groups in a published metabolomics manuscript.
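As a minimal sketch of the two ideas above, the following code evaluates eqn. 1 and applies a Benjamini-Hochberg adjustment to a list of p-values. The function names are illustrative, eqn. 1 assumes independent tests, and a production analysis would normally rely on an established statistics package rather than this hand-written version.

```python
from typing import List, Sequence


def familywise_error_rate(alpha: float, m: int) -> float:
    """Eqn. 1: chance of at least one false positive among m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

With α = 0.05 and m = 50 metabolites, `familywise_error_rate(0.05, 50)` is roughly 0.92, so at least one false positive is nearly guaranteed if no correction is applied. The adjusted values returned by `benjamini_hochberg` can then be compared directly against the chosen significance level.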

Figure 2. Bar graphs of Yes-No survey questions related to (A) statistical validation and (B) quality control methods of NMR-based metabolomic studies.

Effect size is another important statistical parameter to report, but rarely is, and, accordingly, was not specifically tabulated in our survey. Fold change, which was tabulated in our survey, is only one of various metrics that can be used to assess effect size. The measurement of an effect size is mandatory for the validation of a biomarker (e.g., a disease model predictive accuracy) and requires an independent set of samples to accomplish this assessment, which was not the main subject of the papers comprising our literature survey. Therefore, it has been difficult to make any assessment on the usage of effect size from the current survey.

The problems with statistics are not limited to univariate analysis. The survey results also indicated that a low number of papers listed standard quality factors, R² and Q², associated with multivariate models, such as principal component analysis (PCA), partial least squares or projections to latent structures (PLS), orthogonal projections to latent structures (OPLS), and projections to latent structures-discriminant analysis (PLSDA) that are routinely used in analysis of metabolomics data [41]. Only ~30% of 2010 papers reported R² and Q² values, but fortunately, the number more than doubled to ~63% in 2020. Still, in the same way that a p-value is needed to assess the statistical significance of a metabolite change, R² and Q² values are important to evaluate the quality of a multivariate statistical model. Specifically, R² measures the degree of fit to the model and Q² provides a measure of consistency between the predicted and original data. Typically, R² is greater than Q², Q² ≥ 0.4, and Q² is within 20% of R² for an acceptable model [41]. Importantly, R² and Q² do not provide a measure of model validation, especially for supervised methods including PLS, OPLS, and PLSDA. Instead, an additional model validation approach is needed, which usually consists of p-values calculated by CV-ANOVA [59] and/or a permutation test (n = 1000) [60]. Disappointingly, our literature survey indicated that only ~31% of 2010 papers reported an acceptable validation method for a supervised multivariate statistical model. The numbers did modestly improve to ~43% in 2020. Nevertheless, the low routine validation of statistical models raises serious concerns regarding the scientific validity of the entirety of these studies. Simply, if the statistical models are revealed to be invalid by being overfit, then any resulting scientific insight or biological inference is equally suspect. In essence, the entire outcome of the study is in doubt.
Therefore, we suggest statistical model validations be reported for metabolomics studies that employ multivariate statistical methods, which may include reporting R², Q² and p-values calculated by CV-ANOVA [59] and a permutation test, or other validation results (CER, AUROC, random forest analysis).
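The quality heuristic and the permutation test described above can be sketched as follows. This is an illustration under stated assumptions: "within 20% of R²" is interpreted here as a relative difference, the function names are hypothetical, and the `score` callable stands in for whatever refits the supervised model on a given labeling and returns its quality metric (for example, a cross-validated Q²).

```python
import random
from typing import Callable, Sequence


def passes_fit_heuristic(r2: float, q2: float) -> bool:
    """Rule of thumb: R2 > Q2, Q2 >= 0.4, and Q2 within 20% of R2 (relative, assumed)."""
    return r2 > q2 and q2 >= 0.4 and (r2 - q2) <= 0.2 * r2


def permutation_p_value(labels: Sequence[int],
                        score: Callable[[Sequence[int]], float],
                        n_permutations: int = 1000,
                        seed: int = 0) -> float:
    """Estimate how often randomly permuted class labels score as well as the true ones."""
    rng = random.Random(seed)
    observed = score(labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score(shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (at_least_as_good + 1) / (n_permutations + 1)
```

A small p-value indicates that the class separation found with the true labels is unlikely to arise from random group assignments, which is the evidence against overfitting that R² and Q² alone do not provide.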

Unfortunately, the survey results summarizing the NMR community’s application of quality control (QC) methods directly pertinent to the reported study were worse than the reported statistical data (Figure 2B). In fact, the response rate was so low, near zero for most categories, that a few of the questions were eliminated from the 2010 survey. The response rates ranged from a low of ~2% to a high of nearly 17% for the 2020 papers. Given the intrinsically high reproducibility of NMR experiments, the high instrument stability, and the relatively low variance (CV 5–10%) of quantitative NMR (qNMR) [22], QC has not been as strong a concern among NMR-based metabolomics investigators as it is for LC/GC-MS metabolomics data [61]. In fact, a few of the survey questions, such as pooled QC samples, are standard protocols in MS-based metabolomics studies, but have not been universally adopted by the NMR community. Nevertheless, randomization of samples to minimize potential sample-order bias is an important practice for all metabolomics studies. To note, pooled QC samples offer an evaluation of the experimental variance that may occur during sample preparation and metabolite extraction and remain important for an NMR based study. For example, combining pooled samples with 2D NMR may prove to be an efficient approach to validate metabolite assignments. Other important QC samples include process blanks, buffer blanks, standard reference samples for system suitability, standards for chemical shift referencing and even standard reference materials for quantitation and assessment of technical variance. Of course, most NMR facility managers conduct routine quality control checks of NMR instrument performance that are rarely reported in the literature. 
We suggest that manuscripts reporting on the use of NMR for metabolomics include pertinent information regarding the QA/QC procedures implemented for assessing spectrometer performance, including stability of 90° pulses, water suppression efficiency, and signal-to-noise values on standard samples, sample temperatures, or ERETIC signal calibrations, among other routine QA/QC procedures. The survey questions regarding QA/QC did not address whether these routine protocols were employed, but, instead, were directed to QA/QC protocols relevant to the reported study to determine variability that may emerge during sample preparation. Thus, we also recommend that other QC protocols specific to the published study such as the use of pooled, standard, and blank samples, and the randomization of sample preparation and data collection be reported.
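Randomizing run order and interleaving pooled QC samples can be sketched in a few lines. The function name, the QC label, and the choice of one QC per ten study samples are illustrative assumptions rather than recommendations drawn from the survey.

```python
import random
from typing import List, Sequence


def randomized_run_order(sample_ids: Sequence[str],
                         qc_every: int = 10,
                         qc_label: str = "pooled_QC",
                         seed: int = 42) -> List[str]:
    """Shuffle study samples and insert a pooled QC at the start, at intervals, and at the end."""
    rng = random.Random(seed)  # a fixed seed makes the order reportable and reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    run_order = [qc_label]
    for position, sample in enumerate(shuffled, start=1):
        run_order.append(sample)
        if position % qc_every == 0:
            run_order.append(qc_label)
    if run_order[-1] != qc_label:
        run_order.append(qc_label)
    return run_order
```

Reporting the seed and the resulting order alongside the data would let readers confirm that group membership was not confounded with acquisition order.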

2.5. Reporting of experimental parameters and data processing parameters.

The inclusion of standard experimental parameters should be considered relatively routine and common practice. In this regard, the reporting rate found in a survey of the scientific literature would be expected to approach 100%. Any outcome significantly below this mark would be troubling, which is generally what was observed (Figure 3A). The worst outcome from both the 2010 and 2020 survey was reporting sample pH, which only reached ~56% of papers. Now, the relatively low reporting rate may be a partial artifact of studies that use only organic solvents where pH has no meaning. This may also partially explain the observation that only ~81% of 2010 papers and ~74% of 2020 papers reported a buffer. Nevertheless, it doesn’t explain why only ~65% (2010) or ~75% (2020) of the survey papers reported a sample temperature. The number of samples used in a study was commonly reported, but the reporting rate decreased from ~96% in 2010 to ~83% in 2020. More importantly, the number of samples was often not summarized in the methods section. Thus, readers had to deduce sample numbers by counting symbols in plots or by carefully examining tables. A summary of the experimental design in the results section of a manuscript that includes the number of groups, the number of replicates (biological and analytical) per group, and the number and type of NMR experiments would be a simple remedy to this problem.

Figure 3. Bar graphs of Yes-No survey questions related to (A) experimental parameters and (B) processing parameters of NMR-based metabolomic studies.

Unfortunately, the reporting of processing parameters was significantly worse than the listing of experimental parameters for both the 2010 and 2020 papers (Figure 3B). It is well-known that the methods and parameters of NMR data processing, such as baseline correction method (~68–75%), automated or manual phasing (~58–98%), choice of window function (~34–55%), application of zero-filling (~21–33%), and the removal of solvent/buffer peaks (~58–63%), will directly impact the composition of the data matrix and the resulting statistical analysis. The 2020 papers reported a lower percentage for most of these important parameters.

As discussed above, the low reporting rate of important statistical parameters occurred again in the context of processing parameters. The fact that the data matrix was normalized and scaled is of critical importance [62, 63]. This also requires knowledge of the type of normalization and data scaling technique that has been used. The absence or improper usage of normalization and scaling techniques would completely negate the value of any multivariate statistical model and any subsequent biological interpretation. Despite the importance of reporting such data manipulation, only ~60% of both the 2010 and 2020 papers reported a normalization method, and only ~45% reported a data scaling method.
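To make concrete why the specific choices must be reported, the sketch below applies one possible normalization and one possible scaling to a data matrix whose rows are spectra and whose columns are bins or metabolites. Total-area normalization and Pareto scaling are chosen here only as familiar examples; they are assumptions for illustration and are not methods singled out by the survey.

```python
import math
from typing import List


def total_area_normalize(spectra: List[List[float]]) -> List[List[float]]:
    """Divide each spectrum (row) by its summed intensity."""
    normalized = []
    for row in spectra:
        total = sum(row)
        if total == 0:
            raise ValueError("cannot normalize a spectrum with zero total intensity")
        normalized.append([value / total for value in row])
    return normalized


def pareto_scale(matrix: List[List[float]]) -> List[List[float]]:
    """Mean-center each column and divide by the square root of its standard deviation."""
    n_rows = len(matrix)
    if n_rows < 2:
        raise ValueError("at least two rows are needed to estimate a standard deviation")
    n_cols = len(matrix[0])
    scaled = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        mean = sum(column) / n_rows
        variance = sum((x - mean) ** 2 for x in column) / (n_rows - 1)
        # A constant column has zero variance; leave it mean-centered only.
        divisor = math.sqrt(math.sqrt(variance)) or 1.0
        for i in range(n_rows):
            scaled[i][j] = (column[i] - mean) / divisor
    return scaled
```

Substituting a different normalization or scaling changes the relative weight of every variable in the matrix, so a multivariate model cannot be reproduced unless both choices are stated.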

A potential explanation for the low reporting of experimental, processing, and statistical parameters is the reliance on citing previously published papers that are expected to provide a detailed explanation of the metabolomics protocols used by the study. Frustratingly, this approach sometimes leads to backward referencing through multiple generations of papers before the original manuscript with the complete list of protocols is located. While the procedures themselves may eventually be found in the literature, the simple specification of the protocol is inadequate. As noted in a Nature survey from 2016 [64], more than 60% of respondents were unable to repeat other scientists’ experiments. The American Society for Cell Biology similarly found that more than 70% of its members were unable to replicate a published experimental result using standard well-established protocols because of incomplete details reported in the original protocols (https://www.ascb.org/science-policy-public-outreach/advocacy-policy/ascb-examines-difficulty-reproducing-research-data/). The individual parameters for each of those procedures are critical to ensure reproducibility of results and findings, which are often study and/or sample dependent and unlikely to be identical to the cited literature reference. A reference to a previous manuscript could be adequate if the citing paper contained sufficient data to indicate that all parameters from the previous study are identical to the current study. Lack of specific clear-cut statements indicating identical conditions, or mentioning specific exceptions, indicates that the simple description of the protocol or literature citation is insufficient. Regardless of the reason, the absence of key experimental and processing parameters (Figure 3) may lead to legitimate concerns regarding the validity of the presented study and the scientific value of any biological interpretation.
In our view, a well-defined and established procedure is one where the specific biological, physiological, experimental, and study conditions are fully described in the published manuscript. At a minimum, detailed study-specific experimental and processing parameters should be described in supplemental material and included when depositing data in public data repositories.
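One lightweight way to act on this recommendation is a pre-submission completeness check of the kind sketched below. The field names are hypothetical labels for the experimental and processing parameters discussed in this section; they do not correspond to an existing reporting standard or repository schema.

```python
from typing import Dict, List

# Hypothetical field names for the parameters discussed in Section 2.5.
REQUIRED_FIELDS = (
    "sample_ph", "buffer", "sample_temperature", "number_of_samples",
    "baseline_correction", "phasing", "window_function", "zero_filling",
    "solvent_peak_removal", "normalization", "scaling",
)


def missing_report_fields(report: Dict[str, object]) -> List[str]:
    """List required fields that are absent or left empty in a methods summary."""
    # An explicit value such as "not applicable" counts as reported,
    # e.g. pH for samples prepared in organic solvents.
    return [name for name in REQUIRED_FIELDS if report.get(name) in (None, "")]
```

A manuscript whose check returns an empty list has at least stated every parameter, even when the stated value is "not applicable".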

2.6. Distribution of experimental parameters and study outcomes.

Another clear outcome of the literature survey is the diversity of experimental protocols and the range of study designs used by the NMR metabolomics community. For example, the number of metabolites identified and quantified ranged from 0 to >100, with peaks at 0 and 31–40 metabolites in 2020 (Figure 4A). Encouragingly, the average number of identified/quantified metabolites increased from 23/26 in 2010 to 44/39 in 2020. The standard deviations in all cases were essentially equal to the average values, which again highlights the high diversity of NMR metabolomics studies. To further clarify the study outcomes, the number of metabolites identified and quantified were also tabulated for several different sample types: animal models, biofluids, cell cultures, food/beverages, humans, plants, and tissues. There was only a modest difference in metabolite counts per sample type. For example, in 2020, biofluids and human samples had the largest average number of identified metabolites ranging from approximately 45 to 56, while animal models and plants had the smallest average number of metabolites ranging from 12 to 29. The remaining groups had an average that ranged from 32 to 37. Consistent with the overall trends, the number of metabolites identified and quantified were lower for 2010 publications, but the relative rankings of sample types changed. In 2010, animal models and food/beverages had the lowest average number of metabolites, ranging from approximately 7 to 13. Cell cultures, plants and tissues had the largest average number of metabolites, which ranged from 18 to 77. The remaining sample types had an average number ranging from 18 to 21. Given both group type and yearly variations, it is difficult to assign any importance to differences in sample types beyond limited knowledge and/or complexity of some metabolomes that were sampled by NMR.
Nevertheless, it is concerning that an equally likely outcome of NMR-based metabolomics studies is to report 0 identified/quantified metabolites. This is easily fixable and largely a result of the authors’ simple failure to report a summary of the number of metabolites identified and quantified.

Figure 4. Bar graphs summarizing the number of 2010 and 2020 papers reporting (A) the number of identified and quantified metabolites (binned), (B) the number of biological replicates per group, and (C) the total number of samples. The insert in (A) plots the average and standard deviation of the number of identified and quantified metabolites in 2010 and 2020.

The number of biological replicates per group and the total number of samples provide another factor to gauge the overall quality of a metabolomics study (Figure 4B,C). In general, the more samples that are analyzed, the better for establishing statistical significance and the reliability of any observations. The peak in the number of biological replicates per group being centered around bins of 11–20 and 21–50 is a positive trend, as is the number of studies that analyzed 50–200 total samples. Again, a large range of values was observed, further establishing the diversity of NMR metabolomics studies that are undertaken. A common occurrence was zero, in which the study doesn’t clearly state the number of samples used. Equally troubling is the large number of studies using a few (< 6) and even a single biological replicate per group or only 1 to 10 total samples. The biological significance of these studies is highly suspect. Of course, there are practical considerations that constrain the number of available samples, such as in rare medical conditions, but a minimal number of samples still needs to be acquired to obtain the necessary statistical power. In highly controlled metabolomics studies using cell cultures, animal models, food, and beverage samples, etc., 10 to 20 biological replicates per group are likely sufficient for statistical significance. Conversely, human clinical metabolomics studies may require 60 or more replicates per group to obtain sufficient statistical power (1 − β = 0.8). Of course, these are only minimal recommendations for the number of biological replicates, as more replicates are always better.
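The relationship between effect size, statistical power, and replicates per group can be sketched with the standard normal-approximation formula for comparing two group means. This is an illustrative calculation only: the standardized effect sizes used in the usage note are assumptions, and a formal power analysis for a specific study design should be done with a biostatistician.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size: float,
                      alpha: float = 0.05,
                      power: float = 0.8) -> int:
    """Approximate replicates per group for a two-sided comparison of two means.

    Uses n = 2 * ((z_(1 - alpha/2) + z_power) / d)^2, where d is the
    standardized effect size (difference in means divided by the common SD).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)
```

Under these assumptions, a large standardized effect (d = 1.0), as might be seen in a tightly controlled cell culture experiment, gives `samples_per_group(1.0)` = 16, while a moderate effect (d = 0.5), more typical of heterogeneous human cohorts, gives `samples_per_group(0.5)` = 63. Both values are in line with the ranges suggested above.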

It should also be mentioned that the number of analytical replicates reported across all 2010 and 2020 papers was one. There were only a few rare exceptions of a study reporting more than one analytical replicate. While the reproducibility of the NMR measurement is superior, possible processing deviations can be detected when including analytical replicates for one or more samples. The practice of using analytical replicates ensures consistency within the metabolomics analytical workflow and offers confidence that the data is of high quality.
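One simple way to use analytical replicates for this purpose is to compute the coefficient of variation (CV) of each metabolite across repeated measurements of the same sample. The sketch below is only an illustration: the metabolite names, intensity values, and the 5% threshold are hypothetical and are not taken from the survey.

```python
from statistics import mean, stdev


def percent_cv(values):
    """Coefficient of variation in percent: 100 * sample standard deviation / mean."""
    if len(values) < 2:
        raise ValueError("need at least two analytical replicates")
    average = mean(values)
    if average == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / average


def flag_inconsistent(replicates, max_cv=5.0):
    """Return metabolites whose replicate CV exceeds max_cv (a hypothetical threshold)."""
    return [name for name, values in replicates.items() if percent_cv(values) > max_cv]


if __name__ == "__main__":
    # Hypothetical normalized intensities from three analytical replicates of one sample.
    replicate_intensities = {
        "lactate": [1.02, 1.00, 0.98],
        "alanine": [0.50, 0.62, 0.41],
    }
    for name, values in replicate_intensities.items():
        print(f"{name}: CV = {percent_cv(values):.1f}%")
    print("flagged:", flag_inconsistent(replicate_intensities))
```

A metabolite flagged in this way points to a possible processing deviation (for example, in phasing, baseline correction, or binning) rather than to biological variation, because the replicates come from the same sample.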

Other experimental parameters exhibited a similarly large range of values from the literature survey (data not shown). For example, reported pHs ranged from 2.5 to 7.50, with peaks at pH 6, 7, and 7.4. Despite the use of an apparently large range of pHs, about 80 (2010) to 90 (2020) percent of the reported pHs were within the 6.5 to 7.5 range. Although many studies used a pH consistent with typical biological conditions (i.e., pH 7.2) for their metabolomics analyses, some studies demonstrated an occasional need to use different pHs based upon the specific nature of the samples. For example, analysis of samples at lower pH is important in food preservation studies. Similarly, reported temperatures ranged from 198K to 448K, with peaks at 298K, 300K, and 310K. Despite the apparently large range of temperatures used in diverse studies, about 79 (2010) and 82 (2020) percent of the reported temperatures were either 298K or 300K, essentially ambient temperature. However, it is also clear that other temperatures were needed for specific studies to accommodate, for example, the need to cool samples for solid state NMR. For comparisons across studies and between different groups, it would be ideal to establish sample-type specific pH and temperature values for conducting metabolomics studies. Several studies have previously reported the evaluation of sample preparation for metabolomics studies [2, 65, 66].

Every field between 400 and 900 MHz has been used, where 600 MHz was the most popular choice, followed by 500 MHz. All modern NMR spectrometers can perform sophisticated multidimensional experiments that are pertinent to metabolomics, in addition to simple 1D NMR experiments. The advent of ultra-high field instruments, with the state of the art now being 1.2 GHz (1H frequency), affords unprecedented sensitivity and resolution, and simplification of the spectra for strongly coupled spin systems [67]. While their staggering cost precludes widespread application, ambiguities resolved using high field instruments will enhance the utility of lower field instruments. Recent advances in inexpensive benchtop instruments [68], which can execute the sophisticated pulse sequences available on high field instruments, will enable wider application and penetration of NMR-based metabolomics in more diverse settings, for example field studies and point-of-care. Thus, the choice of a specific spectrometer frequency for a given metabolomics study is likely dictated by instrument availability, project needs, and access to high-throughput capabilities including automatic sample loading systems or cryoprobe systems. Consequently, standardizing NMR field strength for metabolomics studies is not practical, which, in turn, requires more flexible pipelines and software for NMR data processing and analyses.

Finally, the choice of 1D NMR experiments, 2D NMR experiments, and the software and databases used to process and analyze the metabolomics datasets exhibited the largest variance between research groups. Consistent with prior responses, “No Response” was the most common or among the most common answers to the type of NMR experiments and software used. This is a concerning outcome considering how fundamentally important it is to know specific details regarding the type of NMR experiments conducted and how the spectral data were processed and analyzed. Again, a partial explanation for the omission of this critical information may be the reliance on citing previous work to provide experimental details. Instead, including this information in supplementary material is a reasonable minimal requirement. The popular choices of 1D NMR experiments were 1D NOESY, CPMG, and a generic 1D 1H NMR pulse sequence with or without the explicit inclusion of a presaturation pulse [69–74]. While widely used, each of these 1D NMR experiments has well-known and unique limitations with associated problems that may impact the veracity of the reported biological insights. Accordingly, a community-led recommendation to adopt specific NMR pulse sequences and experimental parameters would again be beneficial to establish standards and enable the comparison of datasets across studies and groups. Conversely, popular choices for 2D NMR experiments are routinely employed for metabolite identification and validation and do not present any unique concerns. These experiments include expected selections such as HSQC, TOCSY, COSY, and HMBC [47, 75]. A notable exception is the 2D J-resolved experiment, which could be viewed as an alternative 1D 1H NMR experiment [76].

By far, the largest diversity in experimental protocols was the choice of software and database used to process and analyze the NMR metabolomics data sets. Once again, the number one response was “No Response.” While this assertion has become redundant, it is extremely challenging to evaluate the value of a metabolomics study and infer the reliability of its outcomes without a fundamental understanding of the data analysis protocols. The common reliance on prior published work to describe experimental protocols is a likely partial explanation for the absence of software being listed in the 2010 and 2020 papers. The top five software out of the 20 commonly reported programs were, in order, Chenomx, TopSpin, SIMCA, and R [77]. The missing member from this list, which was third overall, was the general category of “Other.” The other category contained over 90 different software programs. To further emphasize this point, of the 543 papers surveyed from 2010 and 2020, over 110 unique software packages were identified as being used to process or analyze the NMR metabolomics data sets. Nearly one out of every five papers used a unique NMR spectra processing approach. Again, community-led standardization of data processing pipelines and software may greatly benefit the field of metabolomics while minimizing problematic or tenuous outcomes. NMRbox (https://nmrbox.nmrhub.org/) [78], MetaboAnalyst 5.0 (https://www.metaboanalyst.ca/) [79], and MVAPACK (https://mvapack.unl.edu/) [80], among others, are current on-going efforts to resolve this issue.

3. Conclusion.

Herein we reported on an assessment of NMR-based metabolomics studies published in the scientific literature during the years 2010 and 2020. The outcome of our literature survey identified several areas of concern, but it also provided an important framework to establish best practices. One troubling observation was the high occurrence of papers that lacked important experimental details. A reliance on citing prior published papers as the source of these experimental details remains problematic. Instead, at a minimum, a proposed practice of including all experimental details and protocols in supplementary files is suggested. This information is critical to evaluating the proper execution of an NMR-based metabolomics study and ensuring the veracity of the results and biological insights, in addition to providing a means to reproduce the study results. The most severe omission was the incomplete reporting of statistics, which was made worse by routine improper usage of both univariate and multivariate statistical methods. Thus, robust guidelines need to be adopted by the NMR metabolomics community to ensure proper application, reporting, and validation of statistical methods critical to metabolomics analysis.

In addition to the lack of reporting key information regarding the design and execution of a study, it was extremely unlikely that the resulting data would be deposited in publicly available metabolomics repositories. Metabolomics data that is not publicly accessible eliminates the possibility to reexamine and/or reprocess the datasets to determine the reliability and reproducibility of the original results or even to reanalyze the research as software and data analytic tools progress. A lack of test data sets also hinders progress in software and method development and undermines the broad impact of metabolomics findings. Further perplexing was the observation that a majority of the papers did not include a stated scientific hypothesis. The importance of metabolomics goes well beyond simply cataloging the metabolites detected in a biological sample. Instead, the value of metabolomics comes from its complementarity with other experimental techniques that provide a distinct and holistic view of the system under investigation. Thus, metabolomics is often uniquely positioned for hypothesis generation.

The final impression garnered from the survey data was the large diversity of protocols employed by NMR communities. The absence of generally accepted and adhered-to standards for data acquisition and processing pipelines almost guarantees that the results acquired across multiple investigations will not be reproducible or interoperable. Without any semblance of consistency, the accuracy of the results is also highly questionable. The diversity of protocols presented in the literature presents a valuable framework for devising standards and best practices for NMR metabolomics protocols that address: (i) sample handling and processing, (ii) NMR data collection, (iii) data preprocessing protocols and software choice, (iv) statistical analysis, validation, and software choice, (v) metabolite identification, quantification, validation, software and database choices, (vi) network analysis, and (vii) combining NMR with other analytical platforms. In fact, a body of suggested standards related to various segments of the metabolomics pipeline does exist [31, 33–37, 81], including recommendations about data acquisition and processing [27, 28, 62, 63, 82–84], but compliance with reporting standards is lacking. For example, a study by Spicer et al. (2017) [38] pointed out that many of the minimal reporting standards established by the Metabolomics Standards Initiative (MSI) [39] in 2007 were not adhered to in metabolomics datasets deposited in MetabolomeXchange (http://www.metabolomexchange.org/) or other data repositories. Unfortunately, our recent survey of the NMR metabolomics scientific literature confirms this ongoing challenge facing the metabolomics community, and the lack of an approved and agreed upon set of best practices is still a major drawback. Resolving this issue will require a concerted and joint effort by metabolomics researchers, journal publishers, metabolomics data repositories, societies, and funding agencies to embrace and enforce an adherence to community standards.

As a final note, given the assortment of protocols revealed by the survey, it is difficult to imagine a single agreed-upon approach to NMR-based metabolomics among so many different studies involving humans, cells, animals, food, and others. Instead, we expect that by focusing on a single type of biological system, for example, on human serum or urine, only then will an improved set of best practices appear for the specific system. For instance, metabolomics studies performed with cells or animal models usually require fewer biological replicates than human studies because of the controlled environment and lower sources of variance. Similarly, different sample matrices define sample handling and metabolome extraction protocols which determine the number and chemical class of metabolites that can be detected. Describing the reporting habits among the many investigators comprising this survey was important to better understand whether the methods currently employed are sufficiently detailed and adequate for achieving such a consensus reporting standard. Conversely, it is difficult to conclude something about the standardization of a metabolomics methodology if the data are not presented in a coherent and homogeneous manner. Establishing an analytical standard framework for NMR-based metabolomics necessitates a thorough assessment of the experimental, processing, and statistical parameters currently employed by the scientific community.

Supplementary Material

NMR literature survey 2010 data

Tables B.2 and B.3 list the raw 2010 and 2020 literature survey data, respectively. Tables B.4 and B.5 list the tabulation of the 2010 and 2020 literature survey data, respectively.

NMR literature survey 2020 data
NMR literature survey questions

Table B.1 provides a summary of survey questions and possible answers.

Acknowledgments

This work was supported in part by funding from the National Institutes of Health (R01 AI148160, NIAID to R.P.), and the Nebraska Center for Integrated Biomolecular Communication (NIH NIGMS P20 GM113126 to R.P.). L.L.C. is supported by NIH AG070257. P.T. is supported by the NIHR Imperial Biomedical Research Centre (BRC). V.C. received support from NIH R01-DK117473 and NSF MCB-1714556. This material is based upon work supported by the Defense Health Agency and U.S. Strategic Command under Contract No. FA4600-12-D-9000. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Health Agency, U.S. Strategic Command, or 55th Contracting Squadron.

Appendix A. Methods

A.1. Assembling a collection of NMR-based metabolomics papers.

The scientific literature deposited in the PubMed [85–87] database (https://pubmed.ncbi.nlm.nih.gov/) was searched for peer-reviewed NMR-based metabolomics manuscripts published in 2010 and 2020. The database search employed a combination of an automatic and manual filtering process. The following PubMed query was used to generate an initial list of manuscripts:

((((NMR[Title/Abstract]) AND (metabolomics[Title/Abstract]))) AND ((“2020/01/01”[Date - Publication] : “2020/12/31”[Date - Publication]))) NOT (Review).

Any paper focused on software and/or NMR protocols, and any reviews that were not detected with the first filter, were manually removed. Also, any paper describing the identification of a natural product that used mass spectrometry (MS) metabolomics to detect the compounds followed by NMR to determine the structure was removed. The following PubMed query, which omits the review filter, was then used to cross check the list of publications identified with the first PubMed query and to confirm that the articles excluded by the review criterion were review articles:

((((NMR[Title/Abstract]) AND (metabolomics[Title/Abstract]))) AND ((“2020/01/01”[Date - Publication] : “2020/12/31”[Date - Publication])))

The process was repeated with the year 2020 changed to 2010.
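Because the two queries differ only in the publication year and in the presence of the review filter, they can be generated from a single template. The sketch below reproduces the query strings given above with straight quotes. The use of the NCBI E-utilities ESearch endpoint is an assumption made for illustration; the text does not state whether the search was run through the PubMed web interface or programmatically. The function names and the `retmax` value are hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (assumed here; the search may have used the web interface).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(year: int, exclude_reviews: bool = True) -> str:
    """Build the PubMed query for NMR metabolomics papers published in a given year."""
    date_range = (
        f'("{year}/01/01"[Date - Publication] : "{year}/12/31"[Date - Publication])'
    )
    query = (
        "((((NMR[Title/Abstract]) AND (metabolomics[Title/Abstract]))) "
        f"AND ({date_range}))"
    )
    if exclude_reviews:
        query += " NOT (Review)"
    return query


def esearch_url(year: int, exclude_reviews: bool = True, retmax: int = 1000) -> str:
    """Return an ESearch URL for the query; no network request is made here."""
    params = {"db": "pubmed", "term": build_query(year, exclude_reviews), "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    for year in (2020, 2010):
        print(build_query(year))
        print(build_query(year, exclude_reviews=False))
```

Comparing the result lists of the filtered and unfiltered queries for the same year gives the set of articles removed by the review filter, which can then be inspected manually as described above.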

A.2. Creating and conducting the survey of NMR-based metabolomics papers.

A list of 72 survey questions was developed by the project participants and is available in the supplemental information (Table B.1). The survey questions were designed to thoroughly catalog current NMR metabolomics best practices as described in the scientific literature. The number of survey questions was reduced to 59 for the 2010 papers to eliminate uninformative questions. Omitted questions were associated with automation; the number of analytical replicates, which was either omitted or nearly always one; the number of controls and cases, which was redundant with the number of groups; the number of zero fills, final file size, and baseline correction function, which were difficult to properly assess; and quality control and machine learning protocols, which were not employed by NMR investigators. Most questions were designed with a simple response of “No”, “Yes”, or “Not Indicated.” A few questions had a defined pull-down list of likely answers, such as spectrometer frequencies. The remaining questions were open-ended, such as the type of buffer, software, or biological sample used in the study. Overall, the survey questions comprised the following general categories: (i) study design, type of samples, number of groups and biological replicates, (ii) experimental details like field strength, temperature, pH, and solvent, (iii) processing details like zero fills, baseline correction, normalization, and scaling, (iv) quality controls, software used, metabolites identified, and metabolites quantified, and (v) statistical details like univariate techniques, multivariate techniques, and validation methods.

An Excel file of the list of authors, titles, and PMIDs from the PubMed query results was combined with the survey questions. A separate file was created for the 2010 and the 2020 papers. Each Excel file was shared with the survey participants. The 2010 papers were randomly distributed between 14 research groups and the 2020 papers were randomly distributed between 17 research groups to manually complete the survey questions. Survey participants read each assigned paper and any available supplemental information to manually answer all survey questions. A master 2010 (Table B.2) and 2020 (Table B.3) Excel file was created by concatenating the individual literature survey results compiled by each research group. The master Excel files were then used to tabulate a summary of the 2010 (Table B.4) and 2020 (Table B.5) results for each individual question using standard Excel COUNT and SUM functions such as COUNTA, COUNTBLANK, and COUNTIF. For Yes-No questions, the number of “Yes”, “No”, “Not Indicated”, and blank cells were separately counted. For questions with a defined list of answers, the number of each defined answer, each generic “other”, and each blank cell was separately counted. Open ended questions were manually cataloged and counted. The two Excel files that summarize the results of the literature survey are available as supplemental information.
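The tabulation described above was carried out in Excel. As an illustration only, the equivalent counting for a single Yes-No question can be sketched in Python as follows. The function name and the example responses are hypothetical and do not reproduce any entry from Tables B.2 to B.5.

```python
from collections import Counter


def tabulate_yes_no(responses):
    """Count "Yes", "No", "Not Indicated", and blank cells for one survey question.

    Mirrors a set of COUNTIF and COUNTBLANK calls over one spreadsheet column.
    Any unexpected answer is kept under its own key so that it is not silently lost.
    """
    counts = Counter({"Yes": 0, "No": 0, "Not Indicated": 0, "Blank": 0})
    for cell in responses:
        text = "" if cell is None else str(cell).strip()
        if text == "":
            counts["Blank"] += 1   # empty or whitespace-only cell
        else:
            counts[text] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical column of answers to one question across six papers.
    column = ["Yes", "No", "", "Not Indicated", "Yes", None]
    print(tabulate_yes_no(column))
```

Keeping blank cells as a separate category, rather than merging them with “Not Indicated”, preserves the distinction between a question the reader could not answer from the paper and a cell that was never filled in.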

Footnotes

NIST Disclaimer

Certain commercial equipment, instruments, or materials are identified in this paper in order to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.

Data availability

All data is available as supplementary data.

References

  • [1].Viant MR, Mol. BioSyst. 4 (2008) 980. [DOI] [PubMed] [Google Scholar]
  • [2].Brennan L, Prog. Nucl. Magn. Reson. Spectrosc. 83 (2014) 42. [DOI] [PubMed] [Google Scholar]
  • [3].Powers R, Magn. Reson. Chem. 47 (2009) S2. [DOI] [PubMed] [Google Scholar]
  • [4].Powers R, J. Med. Chem. 57 (2014) 5860. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [5].Wishart DS, Drugs R&D 9 (2008) 307. [DOI] [PubMed] [Google Scholar]
  • [6].Duarte IF, Diaz SO, Gil AM, J. Pharm. Biomed. Anal. 93 (2014) 17. [DOI] [PubMed] [Google Scholar]
  • [7].Emwas A-HM, Salek RM, Griffin JL, Merzaban J, Metabolomics 9 (2013) 1048. [Google Scholar]
  • [8].Gebregiworgis T, Powers R, Comb. Chem. High Throughput Screening 15 (2012) 595. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [9].Griffiths WJ, Koal T, Wang Y, Kohl M, Enot DP, Deigner H-P, Angew. Chem., Int. Ed. 49 (2010) 5426. [DOI] [PubMed] [Google Scholar]
  • [10].Wishart DS, Cheng LL, Copié V, Edison AS, Eghbalnia HR, Hoch JC, Gouveia GJ, Pathmasiri W, Powers R, Schock TB, Sumner LW, Uchimiya M, Metabolites 12 (2022) 678. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [11].Markley JL, Bruschweiler R, Edison AS, Eghbalnia HR, Powers R, Raftery D, Wishart DS, Curr. Opin. Biotechnol. 43 (2017) 34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [12].Schult TA, Lauer MJ, Berker Y, Cardoso MR, Vandergrift LA, Habbel P, Nowak J, Taupitz M, Aryee M, Mino-Kenudson MA, Christiani DC, Cheng LL, Proc. Natl. Acad. Sci. U. S. A. 118 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [13].Daly PF, Cohen JS, Cancer Res. 49 (1989) 770. [PubMed] [Google Scholar]
  • [14].Bastawrous M, Jenne A, Anaraki MT, Simpson AJ, Metabolites 8 (2018) 35/1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [15].Anaraki MT, Lane D, Bastawrous M, Jenne A, Simpson AJ, Methods Mol. Biol. 2037 (2019) 395. [DOI] [PubMed] [Google Scholar]
  • [16].Nagana Gowda GA, Raftery D, Adv. Exp. Med. Biol. 1280 (2021) 19. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [17].Wu CL, Jordan KW, Ratai EM, Sheng J, Adkins CB, Defeo EM, Jenkins BG, Ying L, McDougal WS, Cheng LL, Sci. Transl. Med. 2 (2010) 16ra8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [18].Nicholson JK, Lindon JC, Holmes E, Xenobiotica 29 (1999) 1181. [DOI] [PubMed] [Google Scholar]
  • [19].Fan TWM, Prog. Nucl. Magn. Reson. Spectrosc. 28 (1996) 161. [Google Scholar]
  • [20].Wishart DS, Trends Anal. Chem. 27 (2008) 228. [Google Scholar]
  • [21].Maroli AS, Powers R, NMR Biomed. e4594. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [22].Crook AA, Powers R, Molecules 25 (2020) 5128. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [23].Takis PG, Ghini V, Tenori L, Turano P, Luchinat C, Trends Anal. Chem. 120 (2019) 115300. [Google Scholar]
  • [24].Brereton RG, J. Chemom. 28 (2014) 749. [Google Scholar]
  • [25].Vaux DL, Nature 492 (2012) 180. [DOI] [PubMed] [Google Scholar]
  • [26].Mutter S, Worden C, Paxton K, Mäkinen V-P, Metabolomics 16 (2019) 5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [27].Kjeldahl K, Bro R, Journal of Chemometrics 24 (2010) 558. [Google Scholar]
  • [28].Brereton RG, TrAC Trends in Analytical Chemistry 25 (2006) 1103. [Google Scholar]
  • [29].Eghbalnia HR, Romero PR, Westler WM, Baskaran K, Ulrich EL, Markley JL, Curr. Opin. Biotechnol. 43 (2017) 56. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [30].Mische SM, Fisher NC, Meyn SM, Sol-Church K, Hegstad-Davies RL, Weis-Garcia F, Adams M, Ashton JM, Delventhal KM, Dragon JA, Holmes L, Jagtap P, Kubow KE, Mason CE, Palmblad M, Searle BC, Turck CW, Knudtson KL, J. Biomol. Tech. 31 (2020) 11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [31].Kirwan JA, Gika H, Beger RD, Bearden D, Dunn WB, Goodacre R, Theodoridis G, Witting M, Yu LR, Wilson ID, Metabolomics 18 (2022) 70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [32].Roth HE, Powers R, Cancers 14 (2022) 3992. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [33].Goodacre R, Broadhurst D, Smilde AK, Kristal BS, Baker JD, Beger R, Bessant C, Connor S, Capuani G, Craig A, Ebbels T, Kell DB, Manetti C, Newton J, Paternostro G, Somorjai R, Sjöström M, Trygg J, Wulfert F, Metabolomics 3 (2007) 231. [Google Scholar]
  • [34].Lindon JC, Nicholson JK, Holmes E, Keun HC, Craig A, Pearce JT, Bruce SJ, Hardy N, Sansone SA, Antti H, Jonsson P, Daykin C, Navarange M, Beger RD, Verheij ER, Amberg A, Baunsgaard D, Cantor GH, Lehman-McKeeman L, Earll M, Wold S, Johansson E, Haselden JN, Kramer K, Thomas C, Lindberg J, Schuppe-Koistinen I, Wilson ID, Reily MD, Robertson DG, Senn H, Krotzky A, Kochhar S, Powell J, van der Ouderaa F, Plumb R, Schaefer H, Spraul M, Nat Biotechnol 23 (2005) 833. [DOI] [PubMed] [Google Scholar]
  • [35].Sansone S-A, Fan T, Goodacre R, Griffin JL, Hardy NW, Kaddurah-Daouk R, Kristal BS, Lindon J, Mendes P, Morrison N, Nikolau B, Robertson D, Sumner LW, Taylor C, van der Werf M, van Ommen B, Fiehn O, Members MSIB, Nature Biotechnology 25 (2007) 846. [DOI] [PubMed] [Google Scholar]
  • [36].Sumner LW, Amberg A, Barrett D, Beale MH, Beger R, Daykin CA, Fan TW, Fiehn O, Goodacre R, Griffin JL, Hankemeier T, Hardy N, Harnly J, Higashi R, Kopka J, Lane AN, Lindon JC, Marriott P, Nicholls AW, Reily MD, Thaden JJ, Viant MR, Metabolomics 3 (2007) 211. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [37].Rubtsov DV, Jenkins H, Ludwig C, Easton J, Viant MR, Gunther U, Griffin JL, Hardy N, Metabolomics 3 (2007) 223. [Google Scholar]
  • [38].Spicer RA, Salek R, Steinbeck C, Sci. Data 4 (2017) 170138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [39].Fiehn O, Robertson D, Griffin J, van der Werf M, Nikolau B, Morrison N, Sumner LW, Goodacre R, Hardy NW, Taylor C, Fostel J, Kristal B, Kaddurah-Daouk R, Mendes P, van Ommen B, Lindon JC, Sansone S-A, Metabolomics 3 (2007) 175. [Google Scholar]
  • [40].Goodman S, Semin. Hematol. 45 (2008) 135. [DOI] [PubMed] [Google Scholar]
  • [41].Worley B, Powers R, Curr. Metabolomics 1 (2013) 92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [42].Benjamini Y, Hochberg Y, J. R. Stat. Soc. Ser. B Methodol. 57 (1995) 289. [Google Scholar]
  • [43].Neyman J, Pearson ES, Biometrika 20A (1928) 175. [Google Scholar]
  • [44].Armstrong RA, Ophthalmic Physiol. Opt. 34 (2014) 502. [DOI] [PubMed] [Google Scholar]
  • [45].Markley JL, Bruschweiler R, Edison AS, Eghbalnia HR, Powers R, Raftery D, Wishart DS, Curr. Opin. Biotechnol. 43 (2017) 34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [46].Haug K, Salek RM, Conesa P, Hastings J, de Matos P, Rijnbeek M, Mahendraker T, Williams M, Neumann S, Rocca-Serra P, Maguire E, González-Beltrán A, Sansone SA, Griffin JL, Steinbeck C, Nucleic Acids Res. 41 (2013) D781. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [47].Bhinderwala F, Vu T, Smith TG, Kosacki J, Marshall DD, Xu Y, Morton M, Powers R, Anal. Chem. 94 (2022) 16308. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [48].Borges RM, de Assis Ferreira G, Campos MM, Teixeira AM, das Neves Costa F, das Chagas FO, Colonna MB, ChemRxiv (2023) 1. [DOI] [PubMed] [Google Scholar]
  • [49].Robinette SL, Zhang F, Bruschweiler-Li L, Bruschweiler R, Anal. Chem. (Washington, DC, U. S.) 80 (2008) 3606. [DOI] [PubMed] [Google Scholar]
  • [50].Wang C, Zhang B, Timari I, Somogyi A, Li D-W, Adcox HE, Gunn JS, Bruschweiler-Li L, Bruschweiler R, Anal. Chem. (Washington, DC, U. S.) 91 (2019) 15686. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [51].Ravanbakhsh S, Liu P, Bjordahl TC, Mandal R, Grant JR, Wilson M, Eisner R, Sinelnikov I, Hu X, Luchinat C, Greiner R, Wishart DS, PLOS ONE 10 (2015) e0124219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [52].Foroutan A, Fitzsimmons C, Mandal R, Berjanskii MV, Wishart DS, Metabolites 10 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [53].Vu T, Xu Y, Qiu Y, Powers R, Biostatistics 24 (2023) 140. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [54].Saccenti E, Hoefsloot HCJ, Smilde AK, Westerhuis JA, Hendriks MMWB, Metabolomics 10 (2014) 361. [Google Scholar]
  • [55].Percival B, Gibson M, Leenders J, Wilson PB, Grootveld M, in: Wilson PB and Grootveld M (Eds.), Univariate and Multivariate Statistical Approaches to the Analysis and Interpretation of NMR-based Metabolomics Datasets of Increasing Complexity, The Royal Society of Chemistry. 2020, p. 0. [Google Scholar]
  • [56].Benjamini Y, Hochberg Y, Journal of the Royal Statistical Society. Series B (Methodological) 57 (1995) 289. [Google Scholar]
  • [57].Benjamini Y, Yekutieli D, The Annals of Statistics 29 (2001) 1165. [Google Scholar]
  • [58].Storey JD, Journal of the Royal Statistical Society Series B: Statistical Methodology 64 (2002) 479. [Google Scholar]
  • [59].Eriksson L, Trygg J, Wold S, J. Chemom. 22 (2008) 594. [Google Scholar]
  • [60].Lindgren F, Hansen B, Karcher W, Sjostrom M, Eriksson L, J. Chemom. 10 (1996) 521. [Google Scholar]
  • [61].Begou O, Gika HG, Theodoridis GA, Wilson ID, Methods Mol. Biol. 1738 (2018) 15. [DOI] [PubMed] [Google Scholar]
  • [62].Craig A, Cloarec O, Holmes E, Nicholson JK, Lindon JC, Anal Chem 78 (2006) 2262. [DOI] [PubMed] [Google Scholar]
  • [63].Vu T, Riekeberg E, Qiu Y, Powers R, Metabolomics 14 (2018) 108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [64].Anonymous, Nature 533 (2016) 437. [Google Scholar]
  • [65].Emwas A-H, Luchinat C, Turano P, Tenori L, Roy R, Salek RM, Ryan D, Merzaban JS, Kaddurah-Daouk R, Zeri AC, Nagana Gowda GA, Raftery D, Wang Y, Brennan L, Wishart DS, Metabolomics 11 (2015) 872. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [66].Snytnikova OA, Khlichkina AA, Sagdeev RZ, Tsentalovich YP, Metabolomics 15 (2019) 1. [DOI] [PubMed] [Google Scholar]
  • [67].Banci L, Barbieri L, Calderone V, Cantini F, Cerofolini L, Ciofi-Baffoni S, Felli IC, Fragai M, Lelli M, Luchinat C, Luchinat E, Parigi G, Piccioli M, Pierattelli R, Ravera E, Rosato A, Tenori L, Turano P, arXiv.org, e-Print Arch., Phys. (2019) 1. [Google Scholar]
  • [68].Alonso-Moreno P, Rodriguez I, Izquierdo-Garcia JL, Metabolites 13 (2023) 614. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [69].Beckonert O, Keun HC, Ebbels TMD, Bundy J, Holmes E, Lindon JC, Nicholson JK, Nat. Protoc. 2 (2007) 2692. [DOI] [PubMed] [Google Scholar]
  • [70].Nicholson JK, Foxall PJD, Spraul M, Farrant RD, Lindon JC, Anal. Chem. 67 (1995) 793. [DOI] [PubMed] [Google Scholar]
  • [71].Hwang TL, Shaka AJ, J. Magn. Reson., Ser. A 112 (1995) 275. [Google Scholar]
  • [72].Dona AC, Jiménez B, Schäfer H, Humpfer E, Spraul M, Lewis MR, Pearce JTM, Holmes E, Lindon JC, Nicholson JK, Anal. Chem. 86 (2014) 9887. [DOI] [PubMed] [Google Scholar]
  • [73].Simpson AJ, Brown SA, J. Mag. Res. 175 (2005) 340. [DOI] [PubMed] [Google Scholar]
  • [74].Le Guennec A, Tayyari F, Edison AS, Anal. Chem. 89 (2017) 8582. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [75].van der Hooft JJJ, Rankin N, in: Webb GA(Ed.), Metabolite Identification in Complex Mixtures Using Nuclear Magnetic Resonance Spectroscopy, Springer International Publishing. Cham, 2018, pp. 1309. [Google Scholar]
  • [76].Ludwig C, Viant MR, Phytochem. Anal. 21 (2010) 22. [DOI] [PubMed] [Google Scholar]
  • [77].Dessau RB, Pipper CB, Ugeskr Laeger 170 (2008) 328. [PubMed] [Google Scholar]
  • [78].Maciejewski MW, Schuyler AD, Gryk MR, Moraru II, Romero PR, Ulrich EL, Eghbalnia HR, Livny M, Delaglio F, Hoch JC, Biophys. J. 112 (2017) 1529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [79].Pang Z, Chong J, Zhou G, Anderson de Lima Morais D, Chang L, Barrette M, Gauthier C, Jacques P-E, Li S, Xia J, Nucleic Acids Res. 49 (2021) W388. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [80].Worley B, Powers R, ACS Chem. Biol. 9 (2014) 1138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [81].Alseekh S, Aharoni A, Brotman Y, Contrepois K, D’Auria J, Ewald J, Ewald JC, Fraser PD, Giavalisco P, Hall RD, Heinemann M, Link H, Luo J, Neumann S, Nielsen J, Perez de Souza L, Saito K, Sauer U, Schroeder FC, Schuster S, Siuzdak G, Skirycz A, Sumner LW, Snyder MP, Tang H, Tohge T, Wang Y, Wen W, Wu S, Xu G, Zamboni N, Fernie AR, Nature Methods 18 (2021) 747. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [82].Vu T, Siemek P, Bhinderwala F, Xu Y, Powers R, J. Proteome Res. 18 (2019) 3282. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [83].Bhinderwala F, Powers R, Methods Mol Biol 2037 (2019) 265. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [84].Bhinderwala F, Lei S, Woods J, Rose J, Marshall DD, Riekeberg E, De Lima Leite A, Morton M, Dodds ED, Franco R, Powers R, Methods Mol Biol 1996 (2019) 217. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [85].Lu Z, Database (2011) 1. [Google Scholar]
  • [86].Fiorini N, Lipman DJ, Lu Z, Elife 6 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [87].Falagas ME, Pitsouni EI, Malietzis GA, Pappas G, FASEB J. 22 (2008) 338. [DOI] [PubMed] [Google Scholar]
