. 2025 Jun 14;21(4):79. doi: 10.1007/s11306-025-02272-w

Untargeted analysis of hydrophilic metabolites using enhanced LC-MS separation with a pentafluoro phenylpropyl-functionalized column and prediction-based MS/MS spectrum annotation

Masaru Sato 1, Kazutaka Ikeda 1,2
PMCID: PMC12167282  PMID: 40515963

Abstract

Introduction

Wide-targeted metabolome analysis has been applied in studies on the biology of mammals, plants, and microorganisms. However, there are still issues regarding both analytical and informatics technologies for establishing an untargeted and comprehensive analysis of hydrophilic primary and secondary metabolites.

Objectives

This study aimed to develop an improved chromatographic method for analyzing hydrophilic metabolites and an annotation method for these diverse metabolites.

Methods

We investigated the performance of a pentafluoro phenylpropyl-functionalized column (PFP column) for the comprehensive analysis of hydrophilic metabolites by liquid chromatography-mass spectrometry (LC-MS). Peaks were annotated using MS/MS spectral similarity searches of the predicted and experimental MS/MS spectra in metabolite structure databases.

Results

LC-MS analysis with a PFP column improved the retention and peak shapes of the standard compounds. The mobile phases comprised water with 0.1% formic acid and methanol with 0.1% formic acid and 10 mM ammonium formate. Of the 48 standard compounds evaluated, 54% were correctly annotated at the chemical structure level. However, over 70% of the compounds received biologically relevant annotations, that is, annotations that matched at the natural product classification level. When these methods were applied to the analysis of tomato fruits, 658 and 458 peaks were detected and annotated in the positive and negative ion analyses, respectively.

Conclusion

Metabolome analysis combined with LC-MS analysis and annotation can contribute to the comprehensive analysis of hydrophilic metabolites.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11306-025-02272-w.

Keywords: Untargeted metabolome analysis, Hydrophilic metabolite, LC-MS analysis, Metabolite annotation

Introduction

Untargeted and comprehensive metabolite analysis is the technological goal of metabolomics. Since the first report of yeast metabolome analysis (Oliver et al., 1998), wide-targeted metabolome analysis has been widely applied in studies on the biology of mammals (Ramautar et al., 2013), plants (Fiehn et al., 2000), and microorganisms (Tang, 2011). Metabolite analysis technology has developed rapidly through advances in high-performance chromatographic and spectrometric instruments (Munjal et al., 2022) and informatics (Chen et al., 2022). However, issues remain regarding both analytical and informatics technologies for establishing untargeted and comprehensive metabolite analyses. In particular, detecting hydrophilic metabolites, including primary and secondary metabolites, with a single analytical instrument remains a challenge in untargeted analysis. In addition, because the structures of hydrophilic metabolites are more diverse and less regular than those of lipids and peptides, the annotation methods available for them are less sophisticated.

Several analytical technologies have been applied to the analysis of hydrophilic metabolites. Targeted analysis of primary metabolites is typically performed using capillary electrophoresis-mass spectrometry (CE-MS; Soga, 2023) or gas chromatography-mass spectrometry (GC-MS; Fiehn, 2016), whereas liquid chromatography-mass spectrometry (LC-MS) is often used in targeted and untargeted analysis of secondary metabolites (Perez de Souza et al., 2019). Because primary metabolites mainly consist of ionic molecules, CE-MS is suitable for their analysis. However, most secondary metabolites are non-ionic and are therefore difficult to analyze using CE-MS. GC-MS is a powerful technique for primary metabolite analysis (Fiehn, 2016), but the analytes are limited to volatile and thermally stable molecules, even though volatility can be increased by derivatization. Therefore, LC-MS is considered a promising technology for the comprehensive untargeted analysis of hydrophilic metabolites.

To utilize LC-MS analysis, the chromatographic separation of primary metabolites needs to be improved. Stationary phases based on reversed-phase (RP) chromatography are often used to retain and separate metabolites (Perez de Souza et al., 2019). However, these stationary phases show poor retention of hydrophilic primary metabolites, in contrast to their good retention and separation of secondary metabolites. Therefore, hydrophilic interaction chromatography (HILIC) (Narduzzi et al., 2023), a combination of RP and ion exchange (IEX) (Yanes et al., 2011), and a combination of HILIC and IEX (Nakatani et al., 2022) have been investigated. Because stationary phases based on IEX chromatography retain primary metabolites strongly, high-concentration additives such as ammonium formate (AF), ammonium acetate, and ammonium bicarbonate are indispensable for eluting these metabolites from the column. However, high concentrations (≥ 40 mM) of these additives may suppress MS sensitivity and clog the LC flow path, so extra care of the equipment is necessary. Although HILIC is an interesting technique for the analysis of primary metabolites, HILIC columns tend to require a longer re-equilibration time than columns based on RP chromatography. As an alternative, pentafluorinated and pentabrominated phenyl columns have been developed to achieve greater retention of hydrophilic compounds in RP chromatography (Nakatani et al., 2020; Ozaki et al., 2022). As hydrophobic interactions are the main separation principle of these columns, good retention and separation of secondary metabolites are expected. Moreover, if these columns also retain and separate primary metabolites efficiently, comprehensive LC-MS analysis of hydrophilic metabolites can be achieved. Hence, the performance of these columns for the comprehensive analysis of hydrophilic metabolites is worth investigating.

Another issue in the untargeted and comprehensive analysis of hydrophilic metabolites is metabolite annotation. Metabolite annotation using LC-MS data has been successful in some metabolite categories, such as peptides (Eng et al., 1994; Perkins et al., 1999) and lipids (Tsugawa et al., 2020). However, annotation of hydrophilic metabolites is difficult because their structures are more diverse and less regular. Since peptides consist of 20 types of proteinogenic amino acids, their fragmentation patterns in MS/MS analyses can be predicted theoretically, and peptide structures are identified by assigning the fragment ions detected in the MS/MS spectrum to a theoretical fragment pattern. Lipids, which also have regular structures, can likewise be annotated using MS/MS spectra. In contrast, hydrophilic metabolites have less regular structures, making the theoretical assignment of the MS/MS spectrum difficult. The only way to use the MS/MS spectrum of a hydrophilic metabolite for annotation is to conduct a similarity search against previously acquired MS/MS spectra stored in databases. Although the number of entries in these MS/MS spectrum databases is increasing, mzCloud stores the spectra of only 26,417 compounds, whereas the metabolite structure databases Human Metabolome Database (HMDB; Wishart et al., 2022) and LOTUS (Rutz et al., 2022) store 253,245 and 276,518 compounds, respectively. Therefore, the availability of MS/MS spectra is limited, even for known metabolites. To fill the gap between the entries in the MS/MS spectrum databases and the structure databases, MS/MS spectrum prediction is considered a promising technology. Wang et al. developed CFM-ID version 4.0, an MS/MS spectrum prediction tool (Wang et al., 2021). Using CFM-ID 4.0, a predicted MS/MS spectrum can be generated from the SMILES string of a metabolite structure, which is expected to facilitate the similarity-search-based annotation of diverse hydrophilic metabolites.

We investigated the performance of the pentafluoro phenylpropyl-functionalized column (PFP column) for the comprehensive analysis of hydrophilic metabolites and developed software to semiautomatically process and annotate LC-MS data. Improved chromatographic separation of hydrophilic metabolites and comprehensive annotation using experimental and predicted MS/MS spectra were demonstrated.

Methods

Chemicals

The Prestwick Phytochemical Library was purchased from PerkinElmer (catalog number: JP-PPHYL-10-50-P; Waltham, MA, USA). The library is currently updated and provided by Green Pharma S.A.S in France. An isotope-labeled metabolite standard mixture (Metabolomics QReSS Kit, catalog number: MSK-QRESS-KIT) was purchased from Cambridge Isotope Laboratories (Andover, MA, USA). LC-MS grade ultrapure water, acetonitrile, and methanol were purchased from Fuji Film Wako Pure Chemicals (Osaka, Japan). Other chemicals were of reagent grade and available from any chemical supplier.

Sample preparation

Two types of tomatoes (tomatoes A and B) distributed for food were used in this study. The tomato fruit (n = 3) was frozen in liquid nitrogen, ground using a Multibead Shocker (Yasui Kikai, Osaka, Japan), and lyophilized. The lyophilized powder (5 mg) was extracted twice with 500 µL of 75% v/v aqueous methanol. The extract was applied to a solid-phase extraction column (Discovery DSC-18, 100 mg, Merck, Darmstadt, Germany) equilibrated with 75% aqueous methanol, and the unbound fraction was collected. The fraction was evaporated to dryness using a centrifugal evaporator (CVE-3100, EYELA, Tokyo, Japan) and resuspended in 100 µL of ultrapure water containing 0.1% formic acid (FA) and 10-fold diluted QReSS Kit Standard Mix.

Standard compounds in the Prestwick library (319 compounds) were divided into 14 groups and analyzed as compound mixtures. The minimum difference in the exact mass of the compounds in each group was > 2 Da.
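The grouping rule above (no two compounds in a mixture closer than 2 Da in exact mass) can be illustrated with a small sketch. The greedy strategy, the function name `assign_groups`, and the example masses are assumptions made for illustration; the source does not state how the 14 groups were actually constructed.

```python
def assign_groups(masses, min_diff=2.0):
    """Greedily place compounds into mixtures so that every pair of
    compounds within a mixture differs by more than `min_diff` Da.

    `masses` maps a compound name to its exact mass (hypothetical input).
    """
    groups = []
    # Visit compounds in order of increasing mass.
    for name, mass in sorted(masses.items(), key=lambda kv: kv[1]):
        for group in groups:
            # Accept the first mixture with no mass conflict.
            if all(abs(mass - other) > min_diff for _, other in group):
                group.append((name, mass))
                break
        else:
            # No existing mixture fits, so open a new one.
            groups.append([(name, mass)])
    return groups


# Illustrative masses only, not compounds from the library.
example = {"a": 100.0, "b": 101.0, "c": 103.5, "d": 100.5}
mixtures = assign_groups(example)
```

A greedy pass like this does not guarantee the minimum number of mixtures, so the group count it yields for a real library may differ from the 14 groups used in the study.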

LC-MS analysis

An LC-MS system consisting of an Ultimate 3000 ultra-high-performance liquid chromatograph and Q Exactive mass spectrometer was used for metabolite analysis. A PFP column, Discovery HS F5 (2.1 mm i.d., 150 mm length, 3 μm particle size, Merck), or an ODS-functionalized column (ODS column), InertSustain AQ-C18 (2.1 mm i.d., 150 mm length, 3 μm particle size, GL-Science, Tokyo, Japan), was used for metabolite separation. The ODS column is designed to retain hydrophilic compounds more strongly than ordinary ODS columns and was used for metabolome analysis of a transgenic plant in our previous study (Shimizu et al., 2019). For the LC-MS analysis, ultrapure water containing 0.1% FA was used as mobile phase A. Acetonitrile containing 0.1% FA, methanol containing 0.1% FA, or methanol containing 0.1% FA and 10 mM AF was used as mobile phase B. The gradient elution program consisted of 2% B (3 min), 98% B (30 min), 98% B (35 min), 2% B (35.1 min), and 2% B (40 min). The flow rate and the column temperature were maintained at 0.2 mL/min and 40 °C, respectively. MS analysis was performed in the positive or negative ion mode with electrospray ionization over an m/z range of 70–1050. The resolving power was set to 70,000 for the precursor ion scan (full MS analysis) and 17,500 for the product ion scan (MS/MS analysis). Product ions were generated by higher-energy collision-induced dissociation with stepped collision energies of 10, 50, and 80% in MS/MS analysis. MS/MS analysis was performed in the data-dependent acquisition mode with the dynamic exclusion function of the Q Exactive. Mass calibration of the Q Exactive was performed according to the low-mass calibration procedure provided by the manufacturer.
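As a reading aid, the gradient program can be written as a table of (time, %B) points and interpolated. This sketch assumes that the run starts at 2% B at 0 min and that the composition changes linearly between the listed points; neither detail is stated explicitly in the text, and the names `GRADIENT` and `percent_b` are illustrative.

```python
from bisect import bisect_right

# (time in min, % mobile phase B); the 0 min point is an assumption.
GRADIENT = [(0.0, 2.0), (3.0, 2.0), (30.0, 98.0),
            (35.0, 98.0), (35.1, 2.0), (40.0, 2.0)]


def percent_b(t_min, program=GRADIENT):
    """Return %B at time `t_min` by linear interpolation of the program."""
    times = [t for t, _ in program]
    if t_min <= times[0]:
        return program[0][1]
    if t_min >= times[-1]:
        return program[-1][1]
    # Index of the first program point strictly after t_min.
    i = bisect_right(times, t_min)
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

Under these assumptions the composition is held at 2% B for the first 3 min, reaches 50% B at 16.5 min (the midpoint of the 27 min ramp), and the final 4.9 min at 2% B serves as column re-equilibration.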

Raw data processing

The LC-MS raw data were processed using data processing software developed in our laboratory to obtain the peak features from each raw data file and the peak area matrix of all analyzed samples. Briefly, extracted ion chromatograms (XICs) were generated for all m/z values detected in the raw data. Subsequently, peaks were detected from the XICs. Peaks were filtered using the minimum scan points, signal-to-noise ratios, and peak widths. The filtered peaks in all the analyzed samples were aligned according to their retention time and m/z. After alignment, the peaks originating from in-source fragment ions were determined from the m/z differences between two peaks detected at the same retention time. Peaks originating from protonated, deprotonated, and other adduct ions were determined in the same way. The in-source fragment patterns and adduct ions designated for data processing are listed in Supplementary Table S1.
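The pairing of co-eluting peaks by their m/z difference can be sketched as follows. This is not the in-house software: the function name, the tolerances, and the two example mass differences (water loss and sodium versus proton adduct) are assumptions chosen for illustration, whereas the patterns actually used are those in Supplementary Table S1.

```python
# Illustrative m/z differences (Da); the real list is in Supplementary Table S1.
KNOWN_DELTAS = {
    "H2O loss (in-source fragment)": 18.010565,
    "[M+Na]+ vs [M+H]+": 21.981944,
}


def link_related_peaks(peaks, rt_tol=0.05, mz_tol_ppm=5.0):
    """Find pairs of co-eluting peaks separated by a known m/z difference.

    `peaks` is a list of (mz, rt_min) tuples. Returns a list of
    (index_low_mz, index_high_mz, label) tuples.
    """
    links = []
    for i, (mz_i, rt_i) in enumerate(peaks):
        for j, (mz_j, rt_j) in enumerate(peaks):
            # Only consider a heavier partner at (nearly) the same retention time.
            if mz_j <= mz_i or abs(rt_i - rt_j) > rt_tol:
                continue
            diff = mz_j - mz_i
            for label, delta in KNOWN_DELTAS.items():
                if abs(diff - delta) <= mz_j * mz_tol_ppm * 1e-6:
                    links.append((i, j, label))
    return links


# Hypothetical peak list: three co-eluting ions and one unrelated peak.
peaks = [(163.0601, 5.00), (181.0707, 5.01), (203.0526, 5.00), (181.0707, 9.00)]
links = link_related_peaks(peaks)
```

Linked peaks can then be collapsed into one feature per metabolite so that fragments and adducts are not annotated as separate compounds.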

Metabolite annotation

Metabolite entries in HMDB, LOTUS, and MassBank (Horai et al., 2010) were merged on the basis of identical SMILES strings regenerated using RDKit version 2023.3.3, and the metabolite name, chemical formula, exact mass value, SMILES string, and MS/MS spectrum (experimental and predicted) were stored in a local database. In the annotation of peaks detected in LC-MS analysis, database entries whose exact mass matched that of a detected peak within 5 ppm and whose number of carbon atoms matched within 2 were first selected as annotation candidates. The number of carbon atoms of each detected peak was estimated from the intensity ratio of its 13C1 isotopic peak. Subsequently, the MS/MS spectrum availability of each candidate was checked, and if no spectrum was available, a predicted MS/MS spectrum was generated using CFM-ID 4.0 (Wang et al., 2021). The similarity of the MS/MS spectra between a detected peak and a candidate was then calculated using spectral entropy (Li et al., 2021). The annotation candidates were ranked according to spectral similarity, and the top-ranked annotations were further considered in this study. These annotation candidates were classified using the NP Classifier (Kim et al., 2021), which is an automated structural classifier for natural products. This annotation process was performed semi-automatically using an in-house Python script.
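The candidate filter and similarity score described above can be sketched in a few lines. This is a minimal illustration and not the in-house script: the function names, the 0.01 m/z matching tolerance, and the greedy peak matching are assumptions, the carbon estimate ignores contributions of other isotopes to the M+1 peak, and the similarity is the unweighted form of entropy similarity, 1 − (2·S_AB − S_A − S_B)/ln 4, without the intensity re-weighting that Li et al. (2021) apply to low-entropy spectra.

```python
import math

C13_TO_C12 = 0.0107 / 0.9893  # natural 13C/12C abundance ratio


def ppm_error(observed, theoretical):
    """Signed mass error in ppm."""
    return (observed - theoretical) / theoretical * 1e6


def estimate_carbon_count(mono_intensity, c13_intensity):
    """Estimate the carbon count from the 13C1 / monoisotopic intensity ratio."""
    return round((c13_intensity / mono_intensity) / C13_TO_C12)


def _normalize(spectrum):
    total = sum(i for _, i in spectrum)
    return [(mz, i / total) for mz, i in spectrum if i > 0]


def _entropy(probabilities):
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def entropy_similarity(spec_a, spec_b, mz_tol=0.01):
    """Unweighted entropy similarity of two spectra given as (mz, intensity) lists."""
    a, b = _normalize(spec_a), _normalize(spec_b)
    unmatched_b = dict(enumerate(b))
    merged = []
    for mz_a, p_a in a:
        # Closest not-yet-matched peak of B within the tolerance, if any.
        best = None
        for idx, (mz_b, _) in unmatched_b.items():
            d = abs(mz_a - mz_b)
            if d <= mz_tol and (best is None or d < best[1]):
                best = (idx, d)
        if best is None:
            merged.append(p_a / 2)
        else:
            _, p_b = unmatched_b.pop(best[0])
            merged.append((p_a + p_b) / 2)
    merged.extend(p_b / 2 for _, p_b in unmatched_b.values())
    s_ab = _entropy(merged)
    s_a = _entropy([p for _, p in a])
    s_b = _entropy([p for _, p in b])
    return 1.0 - (2 * s_ab - s_a - s_b) / math.log(4)
```

With these helpers, a candidate would be kept when `abs(ppm_error(...)) <= 5` and its carbon count lies within 2 of the estimate, and the surviving candidates would be sorted by `entropy_similarity` against the measured MS/MS spectrum. The score is 1 for identical spectra and 0 for spectra with no matching peaks.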

Results

Improved separation of hydrophilic metabolites using the PFP column

Compounds in the Prestwick library were analyzed using LC-MS with ODS or PFP columns. The improvement in the retention of the 19 compounds that were weakly retained on the ODS column is shown in Fig. 1. When acetonitrile containing 0.1% FA was used, these compounds, except for compound 15, showed longer retention times on the PFP column than on the ODS column. Using methanol instead of acetonitrile in mobile phase B slightly improved the compound retention further.

Fig. 1.

Retention time improvement of the compounds (compounds 1–19) weakly retained on the ODS column. The combinations of the column and mobile phase B are as follows: the ODS column and acetonitrile with 0.1% FA (red); the PFP column and acetonitrile with 0.1% FA (orange); the PFP column and methanol with 0.1% FA (blue); and the PFP column and methanol with 0.1% FA and 10 mM AF (green). The common names and PubChem CIDs of compounds 1–19 are shown in Supplementary Table S2

A representative chromatogram of the compound mixture from the Prestwick Library is shown in Fig. 2. When using acetonitrile containing 0.1% FA, some compounds (Fig. 2d) were detected as distorted peaks (Fig. 2a). When methanol containing 0.1% FA was used, these compounds were strongly adsorbed onto the column and were not detected (Fig. 2b). We compared the structures of the compounds that formed distorted peaks with those that formed well-shaped peaks and found that the proportion of compounds possessing tertiary amine structures was high among the compounds that formed distorted peaks (data not shown). The retention times and peak shapes of all detected compounds are summarized in Supplementary Table S4.

Fig. 2.

Representative total ion chromatogram of a compound mixture of the Prestwick Library. The LC-MS analysis was performed using the PFP column with acetonitrile containing 0.1% FA (a), methanol containing 0.1% FA (b), and methanol containing 0.1% FA and 10 mM AF (c) as mobile phase B. The chemical structures of the four compounds that formed the distorted peaks are depicted in (d). The common names and PubChem CIDs of the distorted peaks (1–4) and commonly detected major peaks (5–8) are shown in Supplementary Table S3

Mobile phase B, which consisted of methanol, 0.1% FA, and 10 mM AF, drastically improved the peak shapes of the compounds that formed distorted peaks (Fig. 2c). Therefore, we used this mobile phase B together with mobile phase A (water containing 0.1% FA) for further analysis.

Evaluation of annotation results for standard compounds in the metabolite library

Data processing and metabolite annotation were performed using the LC-MS data. As shown in Fig. 3, 94% of the standard compounds were correctly annotated at the molecular formula level. The three compounds whose molecular formulae were not annotated correctly were cationic compounds detected as [M]+ ions in the LC-MS analysis. At the molecular structure level, 54% of the standard compounds were correctly annotated, ignoring the differences between stereoisomers. Among these annotations, the spectral similarity values were > 0.5, except for compounds 44 (0.380) and 45 (0.374). To evaluate whether the structures of the annotation results and the standard compounds were consistent at the metabolite class level, they were compared using the NP Classifier. As a result, 71% of the standard compounds shared the same classification as their annotations at the NP Class level, which is the most subdivided classification. At the NP Superclass and NP Pathway levels, 75% and 88% of the standard compounds shared the same classification as their annotations, respectively.

Fig. 3.

Annotation of 48 compounds in the Prestwick library. The chemical structure of each compound and the annotated chemical structure are shown on the left and right, respectively. The index of the compound, the match indicator, and the annotation score are shown below the chemical structures. The match indicator shows whether the annotation matches at the level of the chemical formula (red), chemical structure (blue), NP Class (green), NP Superclass (light blue), and NP Pathway (yellow)

Practical metabolome analysis of tomato fruits

The hydrophilic metabolites in the two types of tomato fruits were analyzed and subsequently annotated. In the total ion chromatograms shown in Fig. 4, the overall retention times of the metabolites on the PFP column were longer than those on the ODS column. The chromatographic data for the isotope-labeled metabolites spiked into the samples are summarized in Table 1. These spiked metabolites were retained more strongly on the PFP column than on the ODS column. In addition, the C.V. values for L-alanine (13C3/15N) and creatinine (D3), which showed almost no retention on the ODS column, were lower when the PFP column was used. Although citric acid (13C3) and α-ketoglutaric acid (13C4) were not detected at the spiked amounts, the peaks corresponding to the [M-H]- ions of the two isotope-labeled metabolites were observed at 3.48 and 4.91 min, respectively, when an excess amount (20 ng on-column) was injected into the LC-MS.

Fig. 4.

Total ion chromatogram of tomato A (a, c) and tomato B (b, d). The ODS column (a, b) and the PFP column (c, d) were used in the LC-MS analysis

Table 1.

Retention times and peak areas of the isotope-labeled metabolites (QReSS Kit) by the LC-MS analysis

| Internal standard | On-column amount (ng) | ODS column¹: average R.T. (min) | ODS column¹: average peak area (× 10⁶) | ODS column¹: C.V. (%) | PFP column²: average R.T. (min) | PFP column²: average peak area (× 10⁶) | PFP column²: C.V. (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| L-Alanine (13C3/15N) | 2 | 1.74 | 14 | 97.1 | 2.29 | 10 | 14.8 |
| Creatinine (D3) | 2 | 1.93 | 574 | 16.1 | 4.20 | 710 | 1.65 |
| Nicotinamide (13C6) | 0.1 | 3.42 | 8 | 10.5 | 5.35 | 17 | 5.27 |
| Hypoxanthine (13C5) | 0.2 | 3.75 | 30 | 7.78 | 4.72 | 42 | 16.0 |
| L-Tyrosine (13C6) | 2 | 4.92 | 93 | 4.69 | 9.30 | 84 | 6.44 |
| L-Leucine (13C6) | 0.1 | 5.24 | 8 | 3.54 | 10.41 | 5 | 6.69 |
| Thymine (15N2) | 0.4 | 6.22 | 13 | 13.5 | 6.26 | 6 | 8.03 |
| Guanosine (15N5) | 0.04 | 7.27 | 1 | 11.3 | 8.82 | 1 | 10.9 |
| L-Phenylalanine (13C6) | 2 | 9.16 | 159 | 2.84 | 13.74 | 123 | 8.54 |
| L-Tryptophan (13C11) | 2 | 11.05 | 126 | 2.38 | 18.39 | 97 | 10.60 |
| Indole-3-acetic acid (13C6) | 0.1 | 16.53 | 2 | 7.26 | 21.21 | 14 | 9.15 |
| Citric acid (13C3) | 0.2 | n.d.³ | n.d.³ | n.a.⁴ | n.d.³ | n.d.³ | n.a.⁴ |
| Fumaric acid (13C4) | 2 | 4.98 | 15 | 2.1 | 5.21 | 17 | 1.9 |
| α-Ketoglutaric acid (13C4) | 2 | n.d.³ | n.d.³ | n.a.⁴ | n.d.³ | n.d.³ | n.a.⁴ |

¹ Acetonitrile with 0.1% FA was used as mobile phase B. ² Methanol with 0.1% FA and 10 mM AF was used as mobile phase B. ³ n.d.: not detected. ⁴ n.a.: not available.
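For reference, a C.V. value of the kind reported in Table 1 is the standard deviation of the replicate peak areas divided by their mean, expressed as a percentage. The sketch below assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and uses made-up replicate areas rather than values from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """C.V. in percent: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical peak areas from three replicate injections.
replicate_areas = [90.0, 100.0, 110.0]
cv_percent = coefficient_of_variation(replicate_areas)
```

A lower C.V. indicates more reproducible peak integration, which is the basis for comparing the two columns for weakly retained standards such as L-alanine and creatinine.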

In the annotation of metabolites in the tomato fruits analyzed using the PFP column, 658 and 458 peaks were annotated in the positive and negative ion analyses, respectively, with spectral similarity above 0.4 (Supplementary Tables S5 and S6). Based on the annotation results for the compounds in the Prestwick library, annotations with similarity scores below 0.4 were excluded from consideration in this study.

The classification of annotations by the NP Classifier at the NP Pathway level is shown in Fig. 5a and b. In these pie charts, an annotation classified into two NP Pathways was counted once in each of the two pathways. Annotations in “Amino acids and Peptides” and “Alkaloids” were detected more often in the positive ion analysis than in the negative ion analysis, whereas annotations in “Fatty acids” and “Carbohydrates” were detected more often in the negative ion analysis. The volcano and strip plots of each annotated metabolite detected in the positive ion analysis are shown in Fig. 5c. In the strip plots, the number of metabolites uniquely detected in tomato A was higher than that in tomato B. In the semiquantitative comparison of the metabolites in tomatoes A and B shown in the volcano plot, significant changes in metabolite abundance were mainly found in the metabolites classified into “Amino acids and Peptides” and “Shikimates and Phenylpropanoids.”
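A volcano plot places each metabolite by its fold change between the two groups and by a measure of statistical significance. The text does not state which test was used, so the sketch below is only one plausible formulation: it computes the log2 fold change of mean peak areas and Welch's t statistic with its degrees of freedom for two groups of replicates (n = 3 each in this study). The function name and the example areas are hypothetical.

```python
import math
from statistics import mean, variance


def volcano_coordinates(areas_a, areas_b):
    """Return (log2 fold change B/A, Welch t statistic, Welch degrees of freedom)."""
    mean_a, mean_b = mean(areas_a), mean(areas_b)
    log2_fc = math.log2(mean_b / mean_a)
    # Squared standard errors of the two group means.
    se2_a = variance(areas_a) / len(areas_a)
    se2_b = variance(areas_b) / len(areas_b)
    t_stat = (mean_b - mean_a) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(areas_a) - 1) + se2_b ** 2 / (len(areas_b) - 1)
    )
    return log2_fc, t_stat, df


# Hypothetical peak areas for one metabolite in tomato A and tomato B.
log2_fc, t_stat, df = volcano_coordinates([10.0, 12.0, 14.0], [40.0, 48.0, 56.0])
```

The p-value plotted on the vertical axis would be obtained from the t distribution with `df` degrees of freedom, for example with a statistics library; that step is omitted here to keep the sketch dependency-free.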

Fig. 5.

Annotation and classification of tomato metabolites using the NP Classifier. The annotations obtained by positive (a) and negative (b) ion analysis were classified by the NP Classifier at the NP Pathway level, and the numbers of annotations in each NP Pathway are shown. The volcano plot and strip plot (c) indicate the significant variation of the hydrophilic metabolites found in tomatoes A and B

Discussion

In this study, we demonstrated a comprehensive method for analyzing hydrophilic metabolites. The primary feature of this study is the improved chromatographic retention and separation of hydrophilic metabolites using a PFP column. The PFP column showed sufficient retention of the primary and secondary metabolites (Fig. 4). These improvements resulted in more stable and reproducible detection of highly hydrophilic metabolites (Table 1). Chromatographic separation using a PFP column is necessary for the comprehensive and semiquantitative analysis of hydrophilic metabolites. Compared with the ODS column, whose main retention mechanism is hydrophobic interaction, the PFP column additionally exhibits hydrogen bonding, dipole-dipole, and π-π interactions. These additional interactions likely contributed to the improved retention of hydrophilic metabolites (Fig. 1). In addition, the retention of hydrophilic metabolites on the PFP column increased when methanol was used in mobile phase B. Since methanol is a protic solvent and weakens the hydrogen bonding between the stationary phase and analytes, whereas acetonitrile was reported to suppress the dipole-dipole and π-π interactions between them more strongly than methanol (Croes et al., 2005), we suppose that the increased retention was due to the dipole-dipole and π-π interactions. Another characteristic of the analysis using the PFP column is that compounds containing secondary amine, tertiary amine, and quaternary ammonium structures showed distorted peak shapes when acetonitrile or methanol containing 0.1% FA was used (Fig. 2a and b). Because these structures are often found in alkaloids, which are important plant metabolites, it is indispensable to be able to analyze them. These peak distortions suggested additional interactions between the PFP column and the compounds. We assumed that these interactions could be reduced or controlled by the addition of ammonium ions. As expected, the peak shapes of these compounds were improved upon the addition of 10 mM AF (Fig. 2c).

The second feature of this study is the comprehensive annotation method using MS/MS spectrum prediction. The accuracy of the annotation was above 70% at the NP Class level, which suggests that the annotation is useful for providing an overview of the profile of hydrophilic metabolites (Fig. 3). However, only 54% of the 48 compounds in the Prestwick library were correctly annotated at the chemical structure level. Therefore, further improvement of the algorithms for MS/MS spectrum prediction and spectrum similarity search is needed. Moreover, the correct annotation of metabolites that exist as cations, such as compounds 31, 37, and 48 in Fig. 3, was impossible using this annotation method. In the untargeted analysis, the [M]+ and [M+H]+ ions were indistinguishable, resulting in the assignment of the wrong annotation in the exact mass search, which is the first step of this annotation method. At present, a spectrum similarity search against the experimentally acquired MS/MS spectrum, combined with retention time consistency with the standard compound, is the only way to annotate these compounds correctly.
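The [M]+ versus [M+H]+ ambiguity can be made concrete with a small calculation. The example uses choline, a well-known permanently charged metabolite, purely as an illustration; it is not stated to be one of compounds 31, 37, or 48. The element masses are standard monoisotopic values, and the function names are hypothetical.

```python
# Monoisotopic masses (Da).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858
PROTON = 1.00727647


def formula_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in ppm."""
    return (observed - theoretical) / theoretical * 1e6


# Choline is observed as the intact cation [M]+ (C5H14NO+).
choline = {"C": 5, "H": 14, "N": 1, "O": 1}
observed_mz = formula_mass(choline) - ELECTRON

# An untargeted workflow that assumes [M+H]+ subtracts a proton ...
assumed_neutral_mass = observed_mz - PROTON

# ... and therefore searches for a formula with one hydrogen too few.
one_h_less = {"C": 5, "H": 13, "N": 1, "O": 1}
error_wrong = ppm_error(assumed_neutral_mass, formula_mass(one_h_less))
error_true = ppm_error(assumed_neutral_mass, formula_mass(choline))
```

Under the [M+H]+ assumption, the searched mass agrees with C5H13NO to well within 5 ppm and misses the true formula by roughly 1 Da, so the exact mass search selects candidates with the wrong formula before any MS/MS comparison takes place.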

The practical applicability of the annotation method was evaluated for the two types of tomato fruit, and 658 and 458 peaks were annotated in the positive and negative ion analyses, respectively. Alseekh et al. reported the genetic robustness of tomato fruit metabolism using genomic and metabolomic data comprising 63 primary and 145 secondary metabolites (Alseekh et al., 2017). In that study, metabolites were analyzed using LC-MS and annotated using standard compounds, literature, and tomato metabolomic databases. The analysis and annotation method in this study provided more annotated peaks than the previous report without any species-specific knowledge, i.e., the literature and tomato-specific databases in this case. Therefore, this method is robust and can provide deep insights into metabolomics and multi-omics research.

Annotations belonging to the NP Pathway classes of “Amino acids and Peptides” and “Alkaloids” were found more often in the positive ion analysis than in the negative ion analysis (Fig. 5a and b). In contrast, annotations belonging to “Carbohydrates” and “Fatty acids” were found more often in the negative ion analysis than in the positive ion analysis. These tendencies are considered to reflect the detectability of compounds in MS; that is, compounds possessing amino groups are preferentially detected in the positive ion analysis, and compounds possessing hydroxy and carboxy groups are preferentially detected in the negative ion analysis. Therefore, the validity of the annotation results is supported by this known tendency of MS detection. In the volcano plot shown in Fig. 5c, the metabolites that changed significantly between tomatoes A and B were mainly classified into the NP Pathways of “Amino acids and Peptides” and “Shikimates and Phenylpropanoids.” Therefore, the analysis suggests that the main difference between tomatoes A and B lies in the metabolites related to “Amino acids and Peptides” and “Shikimates and Phenylpropanoids.” Metabolomic analysis plays an important role in the discovery of single metabolites that can be used as biomarkers. In addition, metabolome analysis that enables consideration of changes in biologically relevant metabolite groups is useful for comprehensively interpreting the whole profiles of metabolome samples, as demonstrated in this study.

Conclusions

In this study, we performed a comprehensive metabolome analysis of hydrophilic metabolites. Sufficient retention and separation of primary and secondary metabolites were achieved using a PFP column and appropriate mobile phases. In the annotation of metabolites in the Prestwick library using the predicted MS/MS spectra, > 70% of the metabolites were correctly annotated at the NP Superclass level, although the proportion of metabolites that were correctly annotated at the chemical structure level was limited to 54%. Metabolome analysis, combining LC-MS analysis and annotation, can contribute to the comprehensive analysis of hydrophilic metabolites.

Abbreviations

LC-MS: Liquid chromatography-mass spectrometry
GC-MS: Gas chromatography-mass spectrometry
CE-MS: Capillary electrophoresis-mass spectrometry
C.V.: Coefficient of variation
RP: Reversed phase
IEX: Ion exchange
HILIC: Hydrophilic interaction chromatography
PFP: Pentafluoro phenylpropyl
ODS: Octadecylsilyl
FA: Formic acid
AF: Ammonium formate

Author contributions

All authors contributed to the conception and design of this study. Material preparation, data collection, and analysis were performed by MS. The first draft of the manuscript was written by MS. All authors commented on the previous versions of the manuscript. All the authors have read and approved the final version of the manuscript.

Funding

This research was funded by Kazusa DNA Research Institute foundation and Japan Agency for Medical Research and Development-Core Research for Evolutionary Medical Science and Technology (AMED-CREST) (grant number: 23gm1710006s0201; K. Ikeda).

Data availability

The data supporting the findings of this study are available from the corresponding author upon request.

Declarations

Competing interests

The authors declare no competing interests.

References

  1. Alseekh, S., Tong, H., Scossa, F., Brotman, Y., Vigroux, F., Tohge, T., Ofner, I., Zamir, D., Nikoloski, Z., & Fernie, A. R. (2017). Canalization of tomato fruit metabolism. The Plant Cell, 29(11), 2753–2765. 10.1105/tpc.17.00367
  2. Chen, Y., Li, E. M., & Xu, L. Y. (2022). Guide to metabolomics analysis: A bioinformatics workflow. Metabolites, 12(4), 357. 10.3390/metabo12040357
  3. Croes, K., Steffens, A., Marchand, D. H., & Snyder, L. R. (2005). Relevance of π-π and dipole-dipole interactions for retention on cyano and phenyl columns in reverse-phase liquid chromatography. Journal of Chromatography A, 1098(1–2), 123–130. 10.1016/j.chroma.2005.08.090
  4. Perez de Souza, L., Alseekh, S., Naake, T., & Fernie, A. (2019). Mass spectrometry-based untargeted plant metabolomics. Current Protocols in Plant Biology, 4(4), e20100. 10.1002/cppb.20100
  5. Eng, J. K., McCormack, A. L., & Yates, J. R. (1994). An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. Journal of the American Society for Mass Spectrometry, 5(11), 976–989. 10.1016/1044-0305(94)80016-2
  6. Fiehn, O. (2016). Metabolomics by gas chromatography–mass spectrometry: Combined targeted and untargeted profiling. Current Protocols in Molecular Biology, 114(1), 30.4.1–30.4.32
  7. Fiehn, O., Kopka, J., Dörmann, P., Altmann, T., Trethewey, R. N., & Willmitzer, L. (2000). Metabolite profiling for plant functional genomics. Nature Biotechnology, 18(11), 1157–1161. 10.1038/81137
  8. Horai, H., Arita, M., Kanaya, S., Nihei, Y., Ikeda, T., Suwa, K., Ojima, Y., Tanaka, K., Tanaka, S., Aoshima, K., Oda, Y., Kakazu, Y., Kusano, M., Tohge, T., Matsuda, F., Sawada, Y., Hirai, M. Y., Nakanishi, H., Ikeda, K., & Nishioka, T. (2010). MassBank: A public repository for sharing mass spectral data for life sciences. Journal of Mass Spectrometry, 45(7), 703–714. 10.1002/jms.1777
  9. Kim, H. W., Wang, M., Leber, C. A., Nothias, L. F., Reher, R., Kang, K., Van Der Bin, J. J. J., Dorrestein, P. C., Gerwick, W. H., & Cottrell, G. W. (2021). NPClassifier: A deep neural network-based structural classification tool for natural products. Journal of Natural Products, 84(11), 2795–2807. 10.1021/acs.jnatprod.1c00399 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Li, Y., Kind, T., Folz, J., Vaniya, A., Mehta, S. S., & Fiehn, O. (2021). Spectral entropy outperforms MS/MS Dot product similarity for small-molecule compound identification. Nature Methods, 18(12), 1524–1531. 10.1038/s41592-021-01331-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Munjal, Y., Tonk, R. K., & Sharma, R. (2022). Analytical techniques used in metabolomics: A review. Systematic Reviews in Pharmacy, 13(8), 515–521. 10.31858/0975-8453.13.8.515-521 [Google Scholar]
  12. Nakatani, K., Izumi, Y., Hata, K., & Bamba, T. (2020). An analytical system for single-cell metabolomics of typical mammalian cells based on highly sensitive nano-liquid chromatography tandem mass spectrometry. Mass Spectrometry, 9(1), A0080. 10.5702/massspectrometry.A0080 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Nakatani, K., Izumi, Y., Takahashi, M., & Bamba, T. (2022). Unified-hydrophilic-interaction/anion-exchange liquid chromatography mass spectrometry (Unified-HILIC/AEX/MS): A single-run method for comprehensive and simultaneous analysis of Polar metabolome. Analytical Chemistry, 94(48), 16877–16886. 10.1021/acs.analchem.2c03986 [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Narduzzi, L., Delgado-Povedano, M. M., Lara, F. J., Le Bizec, B., García-Campaña, A. M., Dervilly, G., & Hernández-Mesa, M. (2023). A comparison of hydrophilic interaction liquid chromatography and capillary electrophoresis for the metabolomics analysis of human serum. Journal of Chromatography A, 1706, 464239. 10.1016/j.chroma.2023.464239 [DOI] [PubMed] [Google Scholar]
  15. Oliver, S. G., Winson, M. K., Kell, D. B., & Baganz, F. (1998). Systematic functional analysis of the yeast genome. Trends in Biotechnology, 16(9), 373–378. 10.1016/S0167-7799(98)01214-1 [DOI] [PubMed] [Google Scholar]
  16. Ozaki, M., Shimotsuma, M., & Hirose, T. (2022). Separation of nicotinamide metabolites using a PBr column packed with pentabromobenzyl group modified silica gel. Analytical Biochemistry, 655, 114837. 10.1016/j.ab.2022.114837 [DOI] [PubMed] [Google Scholar]
  17. Perkins, D. N., Pappin, D. J. C., Creasy, D. M., & Cottrell, J. S. (1999). Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 20(18), 3551–3567. [DOI] [PubMed] [Google Scholar]
  18. Ramautar, R., Berger, R., van der Greef, J., & Hankemeier, T. (2013). Human metabolomics: Strategies to understand biology. Current Opinion in Chemical Biology, 17(5), 841–846. 10.1016/j.cbpa.2013.06.015 [DOI] [PubMed] [Google Scholar]
  19. Rutz, A., Sorokina, M., Galgonek, J., Mietchen, D., Willighagen, E., Gaudry, A., Graham, J. G., Stephan, R., Page, R., Vondrášek, J., Steinbeck, C., Pauli, G. F., Wolfender, J. L., Bisson, J., & Allard, P. M. (2022). The LOTUS initiative for open knowledge management in natural products research. eLife, 11, e70780. 10.7554/eLife.70780 [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Shimizu, Y., Rai, A., Okawa, Y., Tomatsu, H., Sato, M., Kera, K., Suzuki, H., Saito, K., & Yamazaki, M. (2019). Metabolic diversification of nitrogen-containing metabolites by the expression of a heterologous lysine decarboxylase gene in Arabidopsis. The Plant Journal, 100(3), 505–521. 10.1111/tpj.14454 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Soga, T. (2023). Advances in capillary electrophoresis mass spectrometry for metabolomics. Trends in Analytical Chemistry, 158. 10.1016/j.trac.2022.116883
  22. Tang, J. (2011). Microbial metabolomics. Current Genomics, 12, 391–403. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Tsugawa, H., Ikeda, K., Takahashi, M., Satoh, A., Mori, Y., Uchino, H., Okahashi, N., Yamada, Y., Tada, I., Bonini, P., Higashi, Y., Okazaki, Y., Zhou, Z., Zhu, Z. J., Koelmel, J., Cajka, T., Fiehn, O., Saito, K., Arita, M., & Arita, M. (2020). A lipidome atlas in MS-DIAL 4. Nature Biotechnology, 38(10), 1159–1163. 10.1038/s41587-020-0531-2 [DOI] [PubMed] [Google Scholar]
  24. Wang, F., Liigand, J., Tian, S., Arndt, D., Greiner, R., & Wishart, D. S. (2021). CFM-ID 4.0: More accurate ESI-MS/MS spectral prediction and compound identification. Analytical Chemistry, 93(34), 11692–11700. 10.1021/acs.analchem.1c01465 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Wishart, D. S., Guo, A. C., Oler, E., Wang, F., Anjum, A., Peters, H., Dizon, R., Sayeeda, Z., Tian, S., Lee, B. L., Berjanskii, M., Mah, R., Yamamoto, M., Jovel, J., Torres-Calzada, C., Hiebert-Giesbrecht, M., Lui, V. W., Varshavi, D., Varshavi, D., & Gautam, V. (2022). HMDB 5.0: The human metabolome database for 2022. Nucleic Acids Research, 50(D1), D622–D631. 10.1093/nar/gkab1062 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Yanes, O., Tautenhahn, R., Patti, G. J., & Siuzdak, G. (2011). Expanding coverage of the metabolome for global metabolite profiling. Analytical Chemistry, 83(6), 2152–2161. 10.1021/ac102981k [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Supplementary Material 1 (223.6KB, xlsx)

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon request.


Articles from Metabolomics are provided here courtesy of Springer