. 2022 May 24;50(W1):W165–W174. doi: 10.1093/nar/gkac383

Table 1.

Summary of compound identification results obtained with the CFM-ID 4.0 web server under different scoring options. Searches were performed against the HMDB 5.0 database (∼250,000 compounds) and a subset of PubChem (∼2.1 million compounds). The median number of candidates with the same parent ion mass is listed under ‘Candidate Median’. The Rank indicates the position in the list of spectral hits at which the correct compound was found. The percentage of compounds for which the top-ranked hit has the correct molecular formula (even if its structure is incorrect) is given in ‘% Correct formula for the first hit’.

| Scoring option | Metric | HMDB 5.0 [M + H]+ | HMDB 5.0 [M-H]- | PubChem [M + H]+ | PubChem [M-H]- |
|---|---|---|---|---|---|
| | Candidate Median | 9 | 10 | 2764 | 1624 |
| CFM-ID 4.0 (dot product) | Rank = 1 | 48.10% | 38.40% | 7.01% | 6.30% |
| | Rank ≤ 5 | 89.50% | 82.30% | 20.70% | 19.50% |
| | Rank ≤ 10 | 95.80% | 93.70% | 29.94% | 30.10% |
| | % Correct formula for the first hit | 98.20% | 97.80% | 90.76% | 82.70% |
| CFM-ID 4.0 (Dice) | Rank = 1 | 48.13% | 32.49% | 5.41% | 5.50% |
| | Rank ≤ 5 | 87.53% | 82.70% | 16.24% | 19.50% |
| | Rank ≤ 10 | 95.01% | 93.67% | 24.20% | 27.90% |
| | % Correct formula for the first hit | 99.25% | 98.73% | 93.31% | 84.90% |
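
The rank-based rows in the table are cumulative hit rates: the share of query spectra whose correct compound appears at or above a given position in the candidate list. The sketch below shows one way such percentages could be computed from a list of per-query ranks. It is an illustration only, not the CFM-ID evaluation code; the function name `top_k_rate` and the example ranks are hypothetical, and it assumes that a query whose correct compound was never retrieved is recorded as `None` and counted as a miss.

```python
from typing import Iterable, Optional


def top_k_rate(ranks: Iterable[Optional[int]], k: int) -> float:
    """Return the percentage of queries whose correct compound has rank <= k.

    `ranks` holds one 1-based rank per query spectrum; `None` means the
    correct compound was not found among the candidates (assumed a miss).
    """
    ranks = list(ranks)
    if not ranks:
        raise ValueError("at least one query rank is required")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return 100.0 * hits / len(ranks)


# Hypothetical ranks for eight query spectra (not data from the paper).
example_ranks = [1, 3, 1, 12, 7, None, 2, 1]

for k in (1, 5, 10):
    label = "Rank = 1" if k == 1 else f"Rank <= {k}"
    print(f"{label}: {top_k_rate(example_ranks, k):.2f}%")
```

Because each threshold includes all hits counted at the smaller thresholds, the percentages can only stay equal or increase from Rank = 1 to Rank ≤ 5 to Rank ≤ 10, which matches the pattern in every column of the table.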