Evaluation of the current methods for assigning molecular formulas to dissolved organic matter using high resolution mass spectrometry

Zhiyuan Deng; Xinxia He; Wenjuan Ouyang; Hao Ying; Qiong Tang; Dianchang Wang; Chong Li; Yao Chen; Zhe Li

doi:10.1038/s41598-025-87539-x

. 2025 Mar 8;15:8105. doi: 10.1038/s41598-025-87539-x

Evaluation of the current methods for assigning molecular formulas to dissolved organic matter using high resolution mass spectrometry

Zhiyuan Deng ^1,², Xinxia He ^2,^✉, Wenjuan Ouyang ², Hao Ying ³, Qiong Tang ^1,², Dianchang Wang ⁴, Chong Li ⁴, Yao Chen ¹, Zhe Li ^2,^✉

PMCID: PMC11890610 PMID: 40057519

Abstract

Understanding DOM composition is essential for investigating the biogeochemical carbon cycle and related elements. However, methods for assigning molecular formulas (MF) in high-resolution mass spectrometry remain poorly defined, which often resulting in the misidentification of DOM components. In this study, we established a metrics evaluation framework for assessing the assignment results of each method based on similarity, accuracy and correctness. We selected six methods and then evaluated them using different settings of elemental limits, filter rules and selection rules. Our findings reveal that Formularity and TRFU are the most suitable methods for MF assignment in DOM. These two methods show high similarity ratios (93–99%) and low Bray-Curtis distances (0.13–0.14), indicating more substantial assignment capability. Their correctness rates (86–87%) and low chemical diversity errors (0.14–0.39) indicate more accurate assignment results. Other methods, TEnvR, ICBM and MFAssignR, with separate filters, show unassigned error rates of up to 47% ± 18%, potentially omitting certain DOM components. At moderate dissolved organic carbon concentrations, TRFU performs better, while Formularity outperforms at both high and low concentrations. This study provides recommendations for selecting analytical methods for DOM, facilitating a deeper understanding of its properties and role in aquatic ecosystems.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-87539-x.

Keywords: High-resolution mass spectrometry, Dissolved organic matter, Molecular formula assignment methods, Assessment framework, Chemodiversity

Subject terms: Analytical chemistry, Environmental sciences

Introduction

Dissolved organic matter (DOM) is a type of crucial organic ligand and transport carrier in the aquatic environment¹. DOM serves as both energy source and carbon pool for the metabolism of organisms, which in turn affects biogeochemical processes through catabolism, photosynthesis and food webs². Human activities, such as wastewater discharge and changes in land use, can alter DOM components, affecting ecosystem carbon storage, disrupting ecological balance and increasing ecological risks³. It is essential to resolve DOM components to probe into the biogeochemical cycle effectively⁴. Assignment of molecular formula (MF) is a crucial approach to analyzing the chemical composition of DOM using high-resolution mass spectrometry, such as Orbitrap Mass Spectrometry (Orbitrap MS) or Fourier Transform Ion Cyclotron Resonance Mass Spectrometry (FT-ICR MS)^5–7.

In the analysis of DOM using high-resolution mass spectrometry, various MF assignment methods employed, yet users often overlook the reasons for selecting a particular method. However, the assignment results can also vary in terms of DOM component identification⁸. The classic MF assignment method, Formularity, was proposed in 2017, is a software that assigns MF by comparing them with the Woods Hole Oceanographic Institution (WHOI) database⁹. However, it requires integration with a large database and separate calculations for additional indicators, including double bond equivalents (DBE), number of structural carbons (NOSC) and others. Therefore, methods based on different programming languages have been developed, which generate MF libraries through chemical rules and calculating molecular indicators. TRFU method, a MATLAB-based code, improved the calculation of related components and indicators after MF assignment¹⁰. However, the rule of selecting the MF with the fewest heteroatoms when faced with ambiguous peaks had yet to be validated. This led to the development of MFAssignR, an R-based code designed to resolve ambiguous peaks using homologous series¹¹. ICBM-OCEAN (denoted as ICBM) was developed to handle assignments of multiple elements¹². In recent years, methods such as TEnvR and NOMspectra have also been developed^13,14. While commonly used as prerequisite tools for analyzing chemical diversity in DOM samples, these methods may inherently exhibit inconsistencies in MF assignments due to differences in chemical rules for limiting MF libraries (filtering rules), selection rules for ambiguous peaks (selecting the MF when a single peak corresponds to multiple possibilities), and the number of elements set during assignment (elemental limits)^15,16. These rules play a crucial role in the assignment process, influencing the assignment results. Inconsistencies in MF assignments lead to misclassification of DOM components, potentially resulting in a misinterpretation of DOM’s role in biogeochemical cycles. Therefore, selecting methods that provide relatively correct assignment results is essential for subsequent studies¹⁷.

Currently, the evaluation of assignment methods includes two main types: (1) Assigning the environmental sample and comparing the differences in the results across methods. (2) Assigning based on a known chemical formula database and calculating the proportion of correct assignments. Schum et al. compared the assignment accuracy of Composer, Formularity, UltraMassExplorer (UME) and MFAssignR by assigning atmospheric organic aerosol sample data, focusing on the proportion of common MF¹¹. Fu et al. assigned a natural organic matter (NOM)-like emerging chemical dataset with known chemical formulas and evaluated the assignment accuracy of methods such as TRFU, Formularity and WHOI based on the ratio of correctly assigned MF¹⁰. Using both experimental data and theoretical data from a known chemical formula database may allow for a more comprehensive evaluation of each method. However, relying solely on a single indicator to assess the performance of assignment methods may lack comprehensive evaluation. Therefore, it is essential to establish an evaluation framework that combines multiple experimental samples, a known chemical formula database, and various assessment indicators to evaluate method performance.

This study selected six MF assignment methods based on a literature review. Combining the known chemical formula dataset with experimental data, the evaluation framework had been conducted by quantifying six metrics to assess the assignment capability and accuracy of each MF assignment method. The experimental data includes activated sludge samples from seven different dissolved organic carbon (DOC) concentration gradients, representing DOM samples from various environmental conditions. The differences in the elemental limits, the filter and selection rules analyses were considered factors influencing the results of MF assignments. We can minimize the challenges of selecting assignment methods in high-resolution mass spectrometry data by choosing the appropriate MF assignment method. This will lead to a more accurate examination of DOM.

Materials and methods

Selecting process of assignment methods

To track the development of MF assignment methods, this study conducted a literature review using four steps: (1) Searches were performed for the keywords “high-resolution mass spectrometer” and “MF assignment” based on four databases. Web of Science Core Collection (WOSCC: https://www.webofscience.com/) included 679 peer-reviewed papers, Science Citation Index Expanded (Scopus: https://www.scopus.com/) were 1241 papers, China National Knowledge Infrastructure (CNKI: https://www.cnki.net/) and Wanfang Data Knowledge Service Platform (WFKS: https://www.wanfangdata.com.cn/) were 65 papers and 25 papers, respectively. (2) After removing the duplicates, there were 1382 related papers. (3) A total of 65 papers published between 1965 and 2024 referring to MF assignment methods were filtered by refining the title and abstract. (4) The full-text search yielded 25 papers devoted to the MF assignment methods and their principles. The detailed literature screening process was shown in Fig. S1.

Data preparation

To compare the methods of MF assignment, this study created two groups of datasets. The first dataset combined with NORMAN Substance Database (https://www.norman-network.com/nds/) and PubChem Database (https://pubchem.ncbi.nlm.nih.gov/) of 8,719 chemicals with predefined MF as known chemical formula dataset. Chemicals were converted to negative mass-to-charge ratio (m/z) values by subtracting a hydrion (¹H⁺) prior. To prevent intensity from influencing the assignment results, in the analysis, identical intensity was used for all chemicals¹⁰. This is a cross-validation of the dataset, where only known chemical formulas of NOM are considered, while unknown or emerging compounds outside the dataset are not included in the scope. It should be noted that only the correct MF assignments were considered here, without accounting for isomerism or structural ambiguities. Therefore, this evaluation is based on relative consistency, not absolute validation.

The second dataset were obtained from the activated sludge mixture incubation experiments conducted by the authors’ research team at a wastewater treatment plant. These experiments yielded a dataset that covers a broader range of DOC concentrations by varying the temperature, adding carbon sources and using dilution-to-extinction methods to modify biological activity and DOM components. A comprehensive description of the incubation experiments was available in Text S1. The dataset comprised three consecutive samples collected at the start of each incubation experiment, which reflected the rapid changes in DOC concentrations (Table 1). Detailed information about the pre-treatment and testing procedures for the samples can be found in Text S2. In addition, we calculated the diversity and divided the composition of DOM from another two natural water samples. The diversity of the experimental data fell within the range of natural waters, and the composition of DOM showed certain similarities (Fig. S2). Thus, the experiment data were representative and revealed more conditions of DOM.

Table 1.

Conditions of incubation experiments and DOC concentrations.

Experiments	DOC (mg·L^− 1)
High temperature (HT)	6.0-6.8
Low temperature (LT)	5.6–6.7
High temperature + carbon source (HT + C)	761.5–984.8
Low temperature + carbon source (LT + C)	605.0–716.9
Dilution-to-extinction 10 times (D10)	14.8–15.7
Dilution-to-extinction 100 times (D100)	1.1–1.7
Dilution-to-extinction 1000 times (D1000)	0.58–0.61

Open in a new tab

Establishment of evaluation framework

To remove the effect of environmental samples, we utilized three metrics to evaluate the results of each MF assignment method based on a known chemical formula dataset, including similarity ratio (SR), accuracy (A) and correctness (C). The similarity ratio measures the degree of similarity between the assigned MF and the original MF. Accuracy reflects the dispersion of the assignment results, while correctness indicates how much the assignment results deviate from the actual value. These metrics were calculated as shown in Eq. (1).

graphic file with name 41598_2025_87539_Article_Equ1.gif

where FN_ass is the number of assigned formulas, FN_tot is the total number of formulas, and FN_cor is the number of correct assignment formulas.

In this study, we selected the different samples from our experiments in wastewater treatment plants to compare different MF assignment methods. The Bray-Curtis dissimilarity is a commonly used metric for assessing differences in the species composition¹⁸. We used the Bray-Curtis dissimilarity (BC) to characterize the similarity of the assignment results of each MF assignment method in this study, as shown in Eq. (2). Mass error (ME) indicates the method’s effectiveness in MF assignment. Mass accuracy (MA) reveals the overall mass error of the results assigned by the method to show the accuracy of the results¹⁹, as shown in Eq. (3). Chemical diversity (CD) is an essential metric for evaluating the compositional diversity of DOM²⁰. The correctness of the assignment method is reflected by comparing the error of chemical diversity obtained before and after the MF assignment. The error of chemical diversity (CD_e) is calculated across different m/z segments, divided into nine segments with the m/z range from 100 to 1000, shown in Eq. (4). An evaluation framework of the MF assignment methods was formed based on the above metrics. For the assignment method, the ability to assign and the accuracy of the assignment were equally important^8,10,11,16. Normalization of each metric (M_nor) was required to ensure that metrics fall within the interval of 0 to 1, calculated a comprehensive score in each method (S_a), as shown in Eq. (5).

graphic file with name 41598_2025_87539_Article_Equ2.gif

where x_i and y_i denote the ith MF in samples x and y, n is the total number of times of the MFs. The smaller the BC, the more similar the assignment results.

where m is the relative molecular mass of MF, Δm is the value that measured mass (instrument output mass minus adduct mass) minus m, ME_Q3 is the third quartile of the mass error, and ME_Q1 is the first quartile of the mass error. The smaller the MA, the more accurate the assignment results.

graphic file with name 41598_2025_87539_Article_Equ4.gif

where P_i is the relative intensity of the jth m/z in a sample, m is the total number of m/z in a sample, CD_bef is the chemical diversity of the before assignment m/z and intensity, CD_aft is the chemical diversity after assigning m/z and intensity for each MF assignment method. The smaller the CD_e, the higher the correct assignment results.

graphic file with name 41598_2025_87539_Article_Equ5.gif

where X is the value corresponding to each method of an indicator, X_max is the maximum value of the corresponding indicator, and X_min is the minimum value of the indicator. M_{a, nor} is the normalization of metrics used in the method a.

Metrics such as Bray-Curtis dissimilarity, mass accuracy and error in chemical diversity are relative, not absolute. These metrics primarily compare differences between methods rather than definitively measuring the correctness of the assignments. While useful for assessing method precision, they cannot confirm accuracy without external validation.

Setting of impact factors

The elemental limits in the known chemical formula dataset were set to C_1–50N_0–5S_0–3P_0–2¹³C_0–1 with H/C < 2.5 and O/C < 1.2¹⁰. The elemental limits of C in the experiment dataset was set as 50 due to the m/z maximum of 1000^21–24.To assign more MFs, we set H and O as H_1–125O_1–60 via H/C < 2.5 and O/C < 1.2^25,26. N, S and P were set as N_0–3S_0–2P_0–2 according to water filter⁹. Three other cases (Table 2) were summarized through literature to recognize the effect of the elemental limits number on assignment results.

Table 2.

The three additional cases of elemental limits.

Cases	C	H	O	H/C	O/C	N	S	P	¹³C	References
Case I	1–50	2–73	1–26	3.0	1.2	3	2	2	1	^22,40
Case II	1–60	1-120	1–30	2.5	1.2	5	3	2	1	^25,41
Case III	1–40	1–82	1–40	2.5	1.0	3	2	1	1	^42,21

Open in a new tab

In Case I, the hydrogen-to-carbon (H/C) ratio of the original case was adjusted from 2.5 to 3.0. Case II altered the atomic limits for carbon (C), nitrogen (N) and sulfur (S), while maintaining the same H/C and oxygen-to-carbon (O/C) ratios. In Case III, the O/C ratio, the atomic limits for carbon (C) and phosphorus (P) were modified. These cases were identified through a review of the literature.

To investigate how filter and selection rules impact the assignment results, we quantified the incorrect rates of MF assignments based on each set of rules. If more than half of the MF assignment methods obtained the same result, it was deemed relatively correct within the experimental dataset, because consistency among methods does not necessarily imply correctness. If multiple methods share biases, they may produce similar yet inaccurate results. We divided the m/z values of the known chemical formula dataset, ranging from 0 to 1000, into ten segments, each spaced 100 m/z units apart. We selected five m/z values from each segment, resulting in 50 m/z chosen values. We selected three parts to identify representative MF for each experimental sample: (i) The low m/z ranged from 100 to 200 with 15 MFs. (ii)The medium m/z ranged from 450 to 550 with 15 MFs. (iii) The high m/z ranged from 900 to 1000 with 15 MFs (Fig. S3). One sample was selected for each experimental method to summarize the rules for assigning incorrect MF.

Results and discussion

Determination of assignment methods

The development of current assignment methods can be divided into three phases. Rules exploration from 1965 to 2005, MF were assigned one by one based on specific chemical rules. Methods development from 2006 to 2016, methods for batch assignment of MF emerged at this stage. And methods refinement from 2017 to the present (Fig. 1A), the methods emerged were relatively mature and capable of meeting the analytical needs of mass spectrometry data. During the rules exploration phase, the MF was primarily assigned through the Kendrick mass defect (KMD) principle, the nitrogen rule and elemental limits^27,28. With the purpose of the automated assignment tool compound identification algorithm (CIA) by Elizabeth’s team²⁹, the phase of methods development presented seven heuristic golden rules³⁰ and the double bond equivalent minus oxygen (DBE-O) limit³¹ in NOM. CHOFIT algorithm was developed during this period³². Until the introduction of Formularity in 2017, the MF assignment methods have been continuously refined, primarily including HaloSeeker 1.0³³, UME, ICBM¹², MFAssignR¹¹, TRFU¹⁰, QualAnalysis³⁴, TEnvR¹³, NOMspectra¹⁴ and FTMS-Analysis³⁵ (Fig. 1B). In the field of NOM, a mass range of 0–1000 was utilized. Formularity (v1.0.8476), TRFU, ICBM, MFAssignR, TEnvR and NOMspectra, these open-source codes were selected for subsequent comparison.

Fig. 1 — Development of assignment methods. (A) Number of publications between 1965 and 2024. (B) Assignment rules and methods under key time points.

The variability in assignment results of these MF assignment methods was mainly due to the conditions for the elemental limits, the filter rules for the assignable MF, and the selection rules for ambiguous peaks (Table 3), detailed descriptions can be found in Text S3. For elemental limits, TRFU and MFAssignR did not limit the number of H and O but limited H/C and O/C. Formularity, ICBM and NOMspectra were not limited to H/C and O/C value. For filter rules, Formularity filtered MF by chemistry rules, TRFU added the DBE-O limit, and further, MFAssignR added the separate filter. ICBM and TEnvR also have individual filters, while ICBM added NSP rules. Filter rules of NOMspectra included DBE-O limit and nitrogen rule. Formularity and TRFU selected the formula with the lowest number of heteroatoms (N, S and P) and lower errors for selection rules. Still, TRFU included lower black C (AI_mod > 0.66 and C > 15). ICBM, MFassignR and TEnvR chose the highest number of homologous sequences, and NOMspectra chose the lowest mass error. Current assignment methods focused solely on MF assignment, without considering the structure of the formula. Therefore, this study addressed the correctness of the MF assignment but did not include structural validation.

Table 3.

Overview of the rules for each method.

Method	Elemental limits	Filter rules	Selection rules	Cited	References
Formularity	C, H, O, N, S, P	Chemical rules	Lower of heteroatoms, Lower errors	173	Tolic et al.⁹
TRFU/FuJHA	C, N, S, P, H/C, O/C	Chemical rules, DBE-O limit	Lower black carbon, Lower of heteroatoms, Lower errors	114	Fu et al.¹⁰
ICBM-OCEAN	C, H, O, N, S, P	NSP ruler, separate filter	Homologous sequence	83	Merder et al.¹²
MFAssignR	O, N, S, P, H/C, O/C	Chemical rules, DBE-O limit, separate filter	Homologous sequence	69	Schum et al.¹¹
TEnvR	C, H, O, N, S, P, H/C, O/C	Separate filter	Homologous sequence, simpler MF, lower errors	8	Goranov et al.¹³
NOMspectra	C, H, O, N, S, P	Nitrogen rule*, DBE-O limit	Lower errors	1	Volikov et al.¹⁴

Open in a new tab

More detailed information about each of the rules among MF assignment methods was shown in Tables S2 and S3. Literature cited as of 2024.9.30. Nitrogen rule* unlike the nitrogen rule, see Table S3 for details.

Variability of assignment results among different methods

Comprehensive scores for Formularity and TRFU were significantly higher than those of the other methods in each case (Fig. 2). In the original case, under the high temperature + carbon source (HT + C) and low temperature + carbon source (LT + C) conditions with higher DOC concentrations, Formularity scored 4.22 and 4.34, higher than TRFU (Fig. 2A). In case I, at the conditions of high temperature (HT), low temperature (LT) and dilution-to-extinction 1000 times (D1000), scores of TRFU were the highest with 4.20, 4.01 and 3.96 (Fig. 2B), respectively. In cases II and III, Formularity showed the best results in all experimental conditions (Fig. 2C, D).

Formularity was more straightforward in filter rules, included only basic chemistry rules such as seven golden rules and nitrogen rule⁹ (Table S2) and did not lead to incorrect MF assignments. TRFU removed some MFs through the DBE-O limit, which led to up to 11% incorrect MF assignments. ICBM, MFAssignR and TEnvR have different filters that removed total MFs of 47%±18%. NSP rule of ICBM resulted in a 6%±5% incorrect rate. The DBE-O limit resulted in up to a 7% error rate for MFAssignR. The nitrogen rule and fixed elemental limit (H/C < 2.2 and O/C < 1.0) resulted in up to 15% and 24% error rates in NOMspectra, respectively. In selection rules, the incorrect rate of TRFU was up 4% through the lower heteroatoms, which was lower than that of Formularity in 9% ± 8%. The incorrect rates of ICBM and MFAssignR were up to 8% due to homologous sequences, while TEnvR had an incorrect rate of up to 13% due to simpler MFs. The highest incorrect rate in NOMspectra was 31% ± 10% due to lower errors (Table 4). Stricter filter rules reduced the incorrect rate caused by selection rules.

Table 4.

Each rule resulted in totally incorrect rates of assigning MF for each method.

Rules		Formularity	TRFU	ICBM	MFAssignR	NOMspectra	TEnvR
Filter rules	Nitrogen rule*	–	–	–	–	0–15%	–
	NSP rules	–	–	0–17%	–	–	–
	DBE-O limit	–	0–11%	–	0–7%	–	–
	Separate filter	–	–	0–48%	26–40%	–	26–70%
	Elemental limits	/	/	/	/	0–24%	/
Selection rules	Lower of heteroatoms	0–26%	0–4%	–	–	–	–
	Homologous sequence	–	–	0–8%	0–8%	–	0–4%
	Lower errors	/	/	–	–	28–48%	/
	Simpler MF	–	–	–	–	/	0–13%
	Lower black carbon	–	0–4%	–	–	–	–

Open in a new tab

Calculation of the experimental dataset is shown in Table 6. The dataset used to assess incorrect rates was summarized in Tables S5–S11. The slash indicated that the rule in the method did not result in an error rate. The short horizontal line represented that the technique lacked the rule. The bolded numbers indicated the incorrect rates due to unassigned MF, while the unbolded numbers represented the incorrect rates due to misassigned MF.

Comparison of evaluation parameters

The similarity rates of Formularity, TRFU and NOMspectra were high at 99.5%, 93.7% and 93%, respectively (Fig. 3A). These three methods equipped with high assignment ability in lower unassignment rates ranged from 0 to 14% (Table 5), because more stringent filter rules led to lower similarity rates³⁶. TEnvR, ICBM and MFAssignR achieved high accuracy at 98%, 96% and 96%, respectively (Fig. 3B). The ratios of assignment rate and correct assignment rate in TEnvR, ICBM and MFAssignR, with 0.89, 0.91 and 1.28, were higher among these methods (Table 5). Correctness of Formularity, TRFU and MFAssignR was higher at 87%, 86% and 83%, respectively (Fig. 3C). The correct assignment rate of these three methods ranged from 62% to 76% (Table 5). Methods with separate filter, such as TEnvR, ICBM and MFAssignR, may achieve higher accuracy due to this feature. Although these methods exhibited the highest unassigned error rates of up to 47%±18%, the resulting error rate due to incorrect MF assignments was only 13%, much lower than that of other methods (Table 6).

Table 5.

Influence rules for known chemical formula dataest assignment results.

Rules	Detailed rules	Formularity	TRFU	ICBM	MFAssign	NOMspectra	TEnvR
Rules	Detailed rules	Rate at which rules lead to incorrect assignment results (%)
Filter rules	Nitrogen rule*	–	–	–	–	8	–
	NSP Rules	–	–	10 + 8	–	–	–
	DBE-O limit	–	8 + 4	–	4	/	–
	Separate filter	–	–	2	24	–	6
	Elemental limits	/	/	/	/	2	/
Selection rules	Lower of heteroatoms	12	12	–	–	–	–
	Homologous sequence	–	–	8	8	/	8
	Lower errors	/	/	–	–	12	/
	Simpler MF	–	–	–	–	–	8
	Lower black carbon	–	4	–	–	–	–
Else		6 + 6	10	/	/	6	/

Open in a new tab

The dataset used to assess incorrect rates was summarized in Table S4. The slash indicated that the rule in the method did not cause error rate. The short horizontal line represented that the method did not have the rule. Bolded and unbolded numbers were the incorrect rates due to unassigned MF and misassigned MF, respectively. Else is an assignment error caused by an isotope or an unidentifiable cause.

Table 6.

Each rule resulted in incorrect rates of assigning MF for each method in each experimental condition.

Rules		Formularity	TRFU	ICBM	MFAssignR	NOMspectra	TEnvR
Rules		Rates of rules leading to incorrect assignment results in seven experimental conditions. (%)
Filter rules	Nitrogen rule*	–	–	–	–	15,0,0,7,0,0,10	–
	NSP rules	–	–	11,4,0,7,0,0,15	–	–	–
	DBE-O limit	–	11,0,6,4,0,0,5	–	4,0,6,4,0,0,5	–	–
	Separate filter	–	–	11,12,12,48,8,3,0	26,32,27,30,34,15,25	–	41,64,42,36,63,21,70
	Elemental limits	/	/	/	/	15,24,3,0,0,0,0	/
Selection rules	Lower of heteroatoms	26,12,3,0,13,3,5	0,0,0,0,4,0,0	–	–	–	–
	Homologous sequence	–	–	0,0,6,0,0,3,0	4,8,6,0,0,3,0	–	0,0,0,4,0,0,0
	Lower errors	/	/	–	–	45,28,39,34,34,12,25	/
	Simpler MF	–	–	–	–	/	7,0,6,0,13,0,0
	lower black carbon	–	0,4,0,0,0,0,0	–	–	–	–

Open in a new tab

Which are the results under seven different conditions, HT, LT, HT + C, LT + C, D10, D100, D1000, respectively. The slash indicated that the rule in the method did not result in an error rate. The short horizontal line represented that the technique lacked the rule. The bolded numbers indicated the incorrect rates due to unassigned MF, while the unbolded numbers represented the incorrect rates due to misassigned MF.

The relative accuracy of MF assignment methods was compared by assessing the similarity of assignment results across different methods. Filter and selection rules affected the ability of MF assignment and the accuracy of assignment results³⁷. The average Bray-Curtis distances between Formularity, TRFU and NOMspectra were smaller than other methods (Fig. 4A). Specifically, the TRFU and Formularity in average Bray-Curtis dissimilarity were the lowest among the six MF assignment methods, with 0.13 and 0.14, respectively (Fig. 4B). The incorrect rates of 0% in Formularity and up to 8% in TRFU were the lowest among all methods using filter rules (Table 4). The average mass accuracy of NOMspectra was the smallest, from 0.38 to 0.46. In comparison, the average mass accuracy of Formularity and TRFU were higher, ranging from 0.58 to 0.92 and 0.63 to 0.98, respectively (Fig. 5). NOMspectra selected the MF with lower error, leading the lower mass accuracy³⁸. The mass error was considered the heteroatoms⁹, and Formularity and TRFU chose the MF with the lower heteroatoms in selection rules to prove the mass accuracy (Tables 4 and 6). The average error of chemical diversity was lowest in NOMspectra, Formularity and TRFU (Fig. 6A), with values ranging from 0.15 to 0.36, 0.14 to 0.39 and 0.15 to 0.39 in m/z of 100–1000, respectively (Fig. 6B). The incorrect rates of filter and selection rules in Formularity and TRFU ranged from 0% to 26% and 0% to 19%, which was lower than other methods (Tables 4 and 6). Overall, Formularity and TRFU achieved the highest overall scores by demonstrating strong partitioning capability through the use of chemical rules as filtering rules. This allowed them to gain a significant advantage in similarity rates and Bray-Curtis distance metrics, resulting in an error rate of 0% due to unassigned formulas. By combining the selection rule of minimizing heteroatoms, they achieved a higher correct assignment ratio, gaining a significant advantage in correctness and chemical diversity error, with average correct assignment ratios of 91.4% and 96.4% (Table 6). In contrast, due to different nitrogen rule and selection rules, NOMspectra obtained a 31%±10% assignment error rate, leading to lower scores in accuracy and correctness, and consequently a lower correct assignment results.

Fig. 4 — Evaluation metrics about average Bray-Curtis dissimilarity. (A) Bray-Curtis dissimilarity in seven experimental conditions. (B) Average Bray-Curtis dissimilarity of each MF assignment method.

Fig. 5 — Average mass accuracy of each MF assignment method.

Fig. 6 — Evaluation metrics about chemical diversity. (A) Error of chemical diversity in seven experimental conditions. (B) Average error of chemical diversity of each MF assignment method.

By using a known chemical formula dataset and experimental data, we obtained consistent DOM evaluation results, which mutually validated each other and were almost identical. This reflected the integrity of the evaluation framework. This work can guide further research on molecular formula matching in the future.

It is important to emphasize that all evaluations in this study were based solely on MF assignment. We did not consider the structural aspects of these formulas or account for isomeric variations. Further analysis using techniques such as Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) or Nuclear Magnetic Resonance (NMR) can be performed to confirm the accuracy of the assigned molecular structures³⁹.

Conclusions

Considering both assigning capacity and accuracy, Formularity and TRFU are the most suitable assignment methods. These two methods show better similarity and correctness than others in assignment results. ICBM, MFAssignR and TEnvR have lower assignment capacities but higher accuracy than Formularity and TRFU, and these methods can achieve accuracy in assignment results as high as 96–98%. The assignment capability of NOMspectra is similar to that of Formularity and TRFU, but its accuracy is lower than that of the two methods. This study recommends choosing the most suitable analytical method for DOM components in aquatic ecosystems.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1^{(615.5KB, docx)}

Acknowledgements

This study was primarily supported by the National Key Research and Development Program (2022YFC3203504) and the National Natural Science Foundation of China (Project No. U2340222). It was also supported by general research funds from the Chongqing Science and Technology Bureau (CSTB2022NSCQ-MSX0599). We thank Dr. Chen He from China University of Petroleum-Beijing for her valuable support in ensuring the thoroughness and accuracy of our methods.

Author contributions

Zhiyuan Deng: Formal analysis, Writing—Original Draft, Visualization. Xinxia He: Methodology, Validation, Writing—Review & Editing. Wenjuan Ouyang: Validation, Data curation. Hao Ying: Methodology, Validation, Resources. Qiong Tang: Investigation, Data curation. Dianchang Wang: Funding Acquisition, Project administration. Chong Li: Investigation, Funding Acquisition. Yao Chen: Supervision. Zhe Li: Conceptualization, Supervision, Writing review & editing, Funding Acquisition.

Data availability

Data are available upon request to the corresponding author.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Xinxia He, Email: hexinxia@cigit.ac.cn.

Zhe Li, Email: lizhe@cigit.ac.cn.

References

1.Zhang, X. et al. Application of Fourier transform ion cyclotron resonance mass spectrometry in deciphering molecular composition of soil organic matter: a review. Sci. Total Environ.756, 144140 (2021). [DOI] [PubMed] [Google Scholar]
2.Panneer Selvam, B. et al. Photo-reactivity of dissolved organic carbon in the freshwater continuum. Aquat. Sci.81, 57 (2019). [Google Scholar]
3.Jiao, N. et al. Microbial production of recalcitrant dissolved organic matter: long-term carbon storage in the global ocean. Nat. Rev. Microbiol.8, 593–599 (2010). [DOI] [PubMed] [Google Scholar]
4.Shi, W., Zhuang, W. E., Hur, J. & Yang, L. Monitoring dissolved organic matter in wastewater and drinking water treatments using spectroscopic analysis and ultra-high resolution mass spectrometry. Water Res.188, 116406 (2020). [DOI] [PubMed] [Google Scholar]
5.Hertkorn, N. et al. Natural organic matter and the event horizon of mass spectrometry. Anal. Chem.80, 8908–8919 (2008). [DOI] [PubMed] [Google Scholar]
6.Qi, Y. & O’Connor, P. B. Data processing in Fourier transform ion cyclotron resonance mass spectrometry. Mass. Spectrom. Rev.33, 333–352 (2014). [DOI] [PubMed] [Google Scholar]
7.Zubarev, R. A. & Makarov, A. Orbitrap mass spectrometry. Anal. Chem.85, 5288–5296 (2013). [DOI] [PubMed] [Google Scholar]
8.Pailler, L., Renard, P., Nicol, E., Deguillaume, L. & Bianco, A. How well do we handle the sample preparation, FT-ICR mass spectrometry analysis, and data treatment of atmospheric waters? Molecules27, 1 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Tolić, N. et al. Formularity: Software for automated formula assignment of natural and other organic matter from ultrahigh-resolution mass spectra. Anal. Chem.89, 12659–12665 (2017). [DOI] [PubMed] [Google Scholar]
10.Fu, Q. L., Fujii, M. & Riedel, T. Development and comparison of formula assignment algorithms for ultrahigh-resolution mass spectra of natural organic matter. Anal. Chim. Acta1125, 247–257 (2020). [DOI] [PubMed] [Google Scholar]
11.Schum, S. K., Brown, L. E., Mazzoleni, L. R. & MFAssignR Molecular formula assignment software for ultrahigh resolution mass spectrometry analysis of environmental complex mixtures. Environ. Res.191, 110114 (2020). [DOI] [PubMed] [Google Scholar]
12.Merder, J. et al. ICBM-OCEAN: Processing ultrahigh-resolution mass spectrometry data of complex molecular mixtures. Anal. Chem.92, 6832–6838 (2020). [DOI] [PubMed] [Google Scholar]
13.Goranov, A. I., Sleighter, R. L. & Yordanov, D. A. TEnvR: MATLAB-based toolbox for environmental research. Anal. Methods15, 5390–5400 (2023). [DOI] [PubMed] [Google Scholar]
14.Volikov, A. B., Rukhovich, G. D., Perminova, I. V. & NOMspectra An open-source python package for processing high resolution mass spectrometry data on natural organic matter. J. Am. Soc. Mass. Spectrom.1, 1 (2023). [DOI] [PubMed] [Google Scholar]
15.Boiko, D. A., Kozlov, K. S., Burykina, J. V., Ilyushenkova, V. V. & Ananikov, V. P. Fully automated unconstrained analysis of high-resolution mass spectrometry data with machine learning. J. Am. Chem. Soc.144, 14590–14606 (2022). [DOI] [PubMed] [Google Scholar]
16.Fakouri Baygi, S. et al. Comparison between automated and user-interactive non-targeted screening tools: isotopic profile deconvoluted chromatogram (IPDC) algorithm and HaloSeeker 1.0. Int. J. Environ. Sci. Technol.19, 10535–10546 (2022). [Google Scholar]
17.Gao, S. et al. Detection and exclusion of false-positive molecular formula assignments via mass error distributions in UHR mass spectra of natural organic matter. Anal. Chem.96, 10210–10218 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Rosselló-Mora, R. & Amann, R. The species concept for prokaryotes. FEMS Microbiol. Rev.25, 39–67 (2001). [DOI] [PubMed] [Google Scholar]
19.Simon, C., Roth, V. N., Dittmar, T. & Gleixner, G. Molecular signals of heterogeneous terrestrial environments identified in dissolved organic matter: a comparative analysis of orbitrap and ion cyclotron resonance mass spectrometers. Front. Earth Sci.6, 16 (2018). [Google Scholar]
20.Kellerman, A. M., Dittmar, T., Kothawala, D. N. & Tranvik, L. J. Chemodiversity of dissolved organic matter in lakes driven by climate and hydrology. Nat. Commun.5, 3804 (2014). [DOI] [PubMed] [Google Scholar]
21.Li, W. et al. Chemodiversity of dissolved organic matter is governed by microbial biogeography in inland waters. Environ. Sci. Technol.57, 7753–7763 (2023). [DOI] [PubMed] [Google Scholar]
22.Qu, L. et al. Spatial variation of anthropogenic disturbances within watersheds determines dissolved organic matter composition exported to oceans. Water Res.262, 122084 (2024). [DOI] [PubMed] [Google Scholar]
23.Pan, B., Liu, S., Wang, Y., Li, D. & Li, M. FT-ICR-MS combined with fluorescent spectroscopy reveals the driving mechanism of the spatial variation in molecular composition of DOM in 22 plateau lakes. Environ. Res.232, 116272 (2023). [DOI] [PubMed] [Google Scholar]
24.Liu, Z. et al. Enrichment of geogenic phosphorus in a coastal groundwater system: new insights from dissolved organic matter characterization. Chemosphere322, 138214 (2023). [DOI] [PubMed] [Google Scholar]
25.Cai, S. et al. Dissolved organic matter transformation mechanisms and process optimization of wastewater sludge hydrothermal humification treatment for producing plant biostimulants. Water Res.235, 119910 (2023). [DOI] [PubMed] [Google Scholar]
26.Li, X., Guo, M., Wang, Y., Liu, G. & Fu, Q. Molecular insight into the release of phosphate from dissolved organic phosphorus photo-mineralization in shallow lakes based on FT-ICR MS analysis. Water Res.222, 118859 (2022). [DOI] [PubMed] [Google Scholar]
27.Raznikov, V. V. & Raznikova, M. O. Assignment of the mass number to peaks in high-resolution mass spectra acquired via the counting of ions. Int. J. Mass Spectrom.85, 1–21 (1988). [Google Scholar]
28.Fievre, A., Solouki, T., Marshall, A. G. & Cooper, W. T. High-resolution Fourier transform ion cyclotron resonance mass spectrometry of humic and fulvic acids by laser desorption/ionization and electrospray ionization. Energy Fuels11, 554–560 (1997). [Google Scholar]
29.Kujawinski, E. B. & Behn, M. D. Automated analysis of electrospray ionization fourier transform ion cyclotron resonance mass spectra of natural organic matter. Anal. Chem.78, 4363 (2006). [DOI] [PubMed] [Google Scholar]
30.Kind, T. & Fiehn, O. Seven Golden rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinform.8, 105 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]
31.Herzsprung, P. et al. Understanding molecular formula assignment of Fourier transform ion cyclotron resonance mass spectrometry data of natural organic matter from a chemical point of view. Anal. Bioanal. Chem.406, 7977–7987 (2014). [DOI] [PubMed] [Google Scholar]
32.Green, N. W. & Perdue, E. M. Fast graphically inspired algorithm for assignment of molecular formulae in ultrahigh resolution mass spectrometry. Anal. Chem.87, 5086–5094 (2015). [DOI] [PubMed] [Google Scholar]
33.Léon, A. et al. HaloSeeker 1.0: a user-friendly software to highlight halogenated chemicals in nontargeted high-resolution mass spectrometry data sets. Anal. Chem.91, 3500–3507 (2019). [DOI] [PubMed] [Google Scholar]
34.Ceolotto, N., Zangrando, R., Orlando, S. & Gambaro, A. QualAnalysis, a new tool in environmental chemistry for a faster qualitative analysis. Chemometr. Intell. Lab. Syst.1, 1 (2022). [Google Scholar]
35.Fu, Q. L., Fujii, M. & Ma, R. Development of a gaussian-based alignment algorithm for the ultrahigh-resolution mass spectra of dissolved organic matter. Anal. Chem.95, 2796–2803 (2023). [DOI] [PubMed] [Google Scholar]
36.Sarycheva, A., Perminova, I. V., Nikolaev, E. N. & Zherebker, A. Formulae differences commence a database for interlaboratory studies of natural organic matter. Environ. Sci. Technol.57, 6238–6247 (2023). [DOI] [PubMed] [Google Scholar]
37.He, C., He, D., Chen, C. & Shi, Q. Application of Fourier transform ion cyclotron resonance mass spectrometry in molecular characterization of dissolved organic matter. Sci. China Earth Sci.65, 2219–2236 (2022). [Google Scholar]
38.Potemkin, A. A., Proskurnin, M. A. & Volkov, D. S. Noise filtering algorithm using gaussian mixture models for high-resolution mass spectra of natural organic matter. Anal. Chem.96, 5455–5461 (2024). [DOI] [PubMed] [Google Scholar]
39.DiDonato, N. et al. Improved characterization of soil organic matter by integrating FT-ICR MS, liquid chromatography tandem mass spectrometry, and molecular networking: a case study of root litter decay under drought conditions. Anal. Chem.96, 11699–11706 (2024). [DOI] [PubMed] [Google Scholar]
40.Lv, L. et al. Unique dissolved organic matter molecules and microbial communities in rhizosphere of three typical crop soils and their significant associations based on FT-ICR-MS and high-throughput sequencing analysis. Sci. Total Environ.919, 170904 (2024). [DOI] [PubMed] [Google Scholar]
41.Liu, H. et al. Effects of reservoir construction on optical and molecular characteristics of dissolved organic matter in a typical P-contaminated river. Chemosphere341, 140037 (2023). [DOI] [PubMed] [Google Scholar]
42.Catalán, N. et al. Treeline displacement may affect lake dissolved organic matter processing at high latitudes and altitudes. Nat. Commun.15, 2640 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Material 1^{(615.5KB, docx)}

Data Availability Statement

Data are available upon request to the corresponding author.

[CR1] 1.Zhang, X. et al. Application of Fourier transform ion cyclotron resonance mass spectrometry in deciphering molecular composition of soil organic matter: a review. Sci. Total Environ.756, 144140 (2021). [DOI] [PubMed] [Google Scholar]

[CR2] 2.Panneer Selvam, B. et al. Photo-reactivity of dissolved organic carbon in the freshwater continuum. Aquat. Sci.81, 57 (2019). [Google Scholar]

[CR3] 3.Jiao, N. et al. Microbial production of recalcitrant dissolved organic matter: long-term carbon storage in the global ocean. Nat. Rev. Microbiol.8, 593–599 (2010). [DOI] [PubMed] [Google Scholar]

[CR4] 4.Shi, W., Zhuang, W. E., Hur, J. & Yang, L. Monitoring dissolved organic matter in wastewater and drinking water treatments using spectroscopic analysis and ultra-high resolution mass spectrometry. Water Res.188, 116406 (2020). [DOI] [PubMed] [Google Scholar]

[CR5] 5.Hertkorn, N. et al. Natural organic matter and the event horizon of mass spectrometry. Anal. Chem.80, 8908–8919 (2008). [DOI] [PubMed] [Google Scholar]

[CR6] 6.Qi, Y. & O’Connor, P. B. Data processing in Fourier transform ion cyclotron resonance mass spectrometry. Mass. Spectrom. Rev.33, 333–352 (2014). [DOI] [PubMed] [Google Scholar]

[CR7] 7.Zubarev, R. A. & Makarov, A. Orbitrap mass spectrometry. Anal. Chem.85, 5288–5296 (2013). [DOI] [PubMed] [Google Scholar]

[CR8] 8.Pailler, L., Renard, P., Nicol, E., Deguillaume, L. & Bianco, A. How well do we handle the sample preparation, FT-ICR mass spectrometry analysis, and data treatment of atmospheric waters? Molecules27, 1 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR9] 9.Tolić, N. et al. Formularity: Software for automated formula assignment of natural and other organic matter from ultrahigh-resolution mass spectra. Anal. Chem.89, 12659–12665 (2017). [DOI] [PubMed] [Google Scholar]

[CR10] 10.Fu, Q. L., Fujii, M. & Riedel, T. Development and comparison of formula assignment algorithms for ultrahigh-resolution mass spectra of natural organic matter. Anal. Chim. Acta1125, 247–257 (2020). [DOI] [PubMed] [Google Scholar]

[CR11] 11.Schum, S. K., Brown, L. E., Mazzoleni, L. R. & MFAssignR Molecular formula assignment software for ultrahigh resolution mass spectrometry analysis of environmental complex mixtures. Environ. Res.191, 110114 (2020). [DOI] [PubMed] [Google Scholar]

[CR12] 12.Merder, J. et al. ICBM-OCEAN: Processing ultrahigh-resolution mass spectrometry data of complex molecular mixtures. Anal. Chem.92, 6832–6838 (2020). [DOI] [PubMed] [Google Scholar]

[CR13] 13.Goranov, A. I., Sleighter, R. L. & Yordanov, D. A. TEnvR: MATLAB-based toolbox for environmental research. Anal. Methods15, 5390–5400 (2023). [DOI] [PubMed] [Google Scholar]

[CR14] 14.Volikov, A. B., Rukhovich, G. D., Perminova, I. V. & NOMspectra An open-source python package for processing high resolution mass spectrometry data on natural organic matter. J. Am. Soc. Mass. Spectrom.1, 1 (2023). [DOI] [PubMed] [Google Scholar]

[CR15] 15.Boiko, D. A., Kozlov, K. S., Burykina, J. V., Ilyushenkova, V. V. & Ananikov, V. P. Fully automated unconstrained analysis of high-resolution mass spectrometry data with machine learning. J. Am. Chem. Soc.144, 14590–14606 (2022). [DOI] [PubMed] [Google Scholar]

[CR16] 16.Fakouri Baygi, S. et al. Comparison between automated and user-interactive non-targeted screening tools: isotopic profile deconvoluted chromatogram (IPDC) algorithm and HaloSeeker 1.0. Int. J. Environ. Sci. Technol.19, 10535–10546 (2022). [Google Scholar]

[CR17] 17.Gao, S. et al. Detection and exclusion of false-positive molecular formula assignments via mass error distributions in UHR mass spectra of natural organic matter. Anal. Chem.96, 10210–10218 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR18] 18.Rosselló-Mora, R. & Amann, R. The species concept for prokaryotes. FEMS Microbiol. Rev.25, 39–67 (2001). [DOI] [PubMed] [Google Scholar]

[CR19] 19.Simon, C., Roth, V. N., Dittmar, T. & Gleixner, G. Molecular signals of heterogeneous terrestrial environments identified in dissolved organic matter: a comparative analysis of orbitrap and ion cyclotron resonance mass spectrometers. Front. Earth Sci.6, 16 (2018). [Google Scholar]

[CR20] 20.Kellerman, A. M., Dittmar, T., Kothawala, D. N. & Tranvik, L. J. Chemodiversity of dissolved organic matter in lakes driven by climate and hydrology. Nat. Commun.5, 3804 (2014). [DOI] [PubMed] [Google Scholar]

[CR21] 21.Li, W. et al. Chemodiversity of dissolved organic matter is governed by microbial biogeography in inland waters. Environ. Sci. Technol.57, 7753–7763 (2023). [DOI] [PubMed] [Google Scholar]

[CR22] 22.Qu, L. et al. Spatial variation of anthropogenic disturbances within watersheds determines dissolved organic matter composition exported to oceans. Water Res.262, 122084 (2024). [DOI] [PubMed] [Google Scholar]

[CR23] 23.Pan, B., Liu, S., Wang, Y., Li, D. & Li, M. FT-ICR-MS combined with fluorescent spectroscopy reveals the driving mechanism of the spatial variation in molecular composition of DOM in 22 plateau lakes. Environ. Res.232, 116272 (2023). [DOI] [PubMed] [Google Scholar]

[CR24] 24.Liu, Z. et al. Enrichment of geogenic phosphorus in a coastal groundwater system: new insights from dissolved organic matter characterization. Chemosphere322, 138214 (2023). [DOI] [PubMed] [Google Scholar]

[CR25] 25.Cai, S. et al. Dissolved organic matter transformation mechanisms and process optimization of wastewater sludge hydrothermal humification treatment for producing plant biostimulants. Water Res.235, 119910 (2023). [DOI] [PubMed] [Google Scholar]

[CR26] 26.Li, X., Guo, M., Wang, Y., Liu, G. & Fu, Q. Molecular insight into the release of phosphate from dissolved organic phosphorus photo-mineralization in shallow lakes based on FT-ICR MS analysis. Water Res.222, 118859 (2022). [DOI] [PubMed] [Google Scholar]

[CR27] 27.Raznikov, V. V. & Raznikova, M. O. Assignment of the mass number to peaks in high-resolution mass spectra acquired via the counting of ions. Int. J. Mass Spectrom.85, 1–21 (1988). [Google Scholar]

[CR28] 28.Fievre, A., Solouki, T., Marshall, A. G. & Cooper, W. T. High-resolution Fourier transform ion cyclotron resonance mass spectrometry of humic and fulvic acids by laser desorption/ionization and electrospray ionization. Energy Fuels11, 554–560 (1997). [Google Scholar]

[CR29] 29.Kujawinski, E. B. & Behn, M. D. Automated analysis of electrospray ionization fourier transform ion cyclotron resonance mass spectra of natural organic matter. Anal. Chem.78, 4363 (2006). [DOI] [PubMed] [Google Scholar]

[CR30] 30.Kind, T. & Fiehn, O. Seven Golden rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinform.8, 105 (2007). [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR31] 31.Herzsprung, P. et al. Understanding molecular formula assignment of Fourier transform ion cyclotron resonance mass spectrometry data of natural organic matter from a chemical point of view. Anal. Bioanal. Chem.406, 7977–7987 (2014). [DOI] [PubMed] [Google Scholar]

[CR32] 32.Green, N. W. & Perdue, E. M. Fast graphically inspired algorithm for assignment of molecular formulae in ultrahigh resolution mass spectrometry. Anal. Chem.87, 5086–5094 (2015). [DOI] [PubMed] [Google Scholar]

[CR33] 33.Léon, A. et al. HaloSeeker 1.0: a user-friendly software to highlight halogenated chemicals in nontargeted high-resolution mass spectrometry data sets. Anal. Chem.91, 3500–3507 (2019). [DOI] [PubMed] [Google Scholar]

[CR34] 34.Ceolotto, N., Zangrando, R., Orlando, S. & Gambaro, A. QualAnalysis, a new tool in environmental chemistry for a faster qualitative analysis. Chemometr. Intell. Lab. Syst.1, 1 (2022). [Google Scholar]

[CR35] 35.Fu, Q. L., Fujii, M. & Ma, R. Development of a gaussian-based alignment algorithm for the ultrahigh-resolution mass spectra of dissolved organic matter. Anal. Chem.95, 2796–2803 (2023). [DOI] [PubMed] [Google Scholar]

[CR36] 36.Sarycheva, A., Perminova, I. V., Nikolaev, E. N. & Zherebker, A. Formulae differences commence a database for interlaboratory studies of natural organic matter. Environ. Sci. Technol.57, 6238–6247 (2023). [DOI] [PubMed] [Google Scholar]

[CR37] 37.He, C., He, D., Chen, C. & Shi, Q. Application of Fourier transform ion cyclotron resonance mass spectrometry in molecular characterization of dissolved organic matter. Sci. China Earth Sci.65, 2219–2236 (2022). [Google Scholar]

[CR38] 38.Potemkin, A. A., Proskurnin, M. A. & Volkov, D. S. Noise filtering algorithm using gaussian mixture models for high-resolution mass spectra of natural organic matter. Anal. Chem.96, 5455–5461 (2024). [DOI] [PubMed] [Google Scholar]

[CR39] 39.DiDonato, N. et al. Improved characterization of soil organic matter by integrating FT-ICR MS, liquid chromatography tandem mass spectrometry, and molecular networking: a case study of root litter decay under drought conditions. Anal. Chem.96, 11699–11706 (2024). [DOI] [PubMed] [Google Scholar]

[CR40] 40.Lv, L. et al. Unique dissolved organic matter molecules and microbial communities in rhizosphere of three typical crop soils and their significant associations based on FT-ICR-MS and high-throughput sequencing analysis. Sci. Total Environ.919, 170904 (2024). [DOI] [PubMed] [Google Scholar]

[CR41] 41.Liu, H. et al. Effects of reservoir construction on optical and molecular characteristics of dissolved organic matter in a typical P-contaminated river. Chemosphere341, 140037 (2023). [DOI] [PubMed] [Google Scholar]

[CR42] 42.Catalán, N. et al. Treeline displacement may affect lake dissolved organic matter processing at high latitudes and altitudes. Nat. Commun.15, 2640 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Evaluation of the current methods for assigning molecular formulas to dissolved organic matter using high resolution mass spectrometry

Zhiyuan Deng

Xinxia He

Wenjuan Ouyang

Hao Ying

Qiong Tang

Dianchang Wang

Chong Li

Yao Chen

Zhe Li

Abstract

Supplementary Information

Introduction

Materials and methods

Selecting process of assignment methods

Data preparation

Table 1.

Establishment of evaluation framework

Setting of impact factors

Table 2.

Results and discussion

Determination of assignment methods

Fig. 1.

Table 3.

Variability of assignment results among different methods

Fig. 2.

Table 4.

Comparison of evaluation parameters

Fig. 3.

Table 5.

Table 6.

Fig. 4.

Fig. 5.

Fig. 6.

Conclusions

Electronic supplementary material

Acknowledgements

Author contributions

Data availability

Declarations

Competing interests

Footnotes

Contributor Information

References

Associated Data

Supplementary Materials

Data Availability Statement

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases