Table 1.
Clustering performance at different similarity thresholds. Results are shown for a single run from the human dataset (793000 spectra searched against the human IPI sequence database). For each similarity threshold we report the number of spectra or clusters searched and the numbers of spectra, peptides, and proteins identified. Each value is followed by its percentage difference from the corresponding value obtained in a regular non-clustered search of the same dataset.
Similarity threshold | Spectra/clusters searched | Change | Spectra identified | Change | Peptides identified | Change | Proteins identified | Change |
---|---|---|---|---|---|---|---|---|
Non-clustered | 793000 | | 86682 | | 21090 | | 6191 | |
0.30 | 167407 | −78.9% | 116571 | +34.5% | 18352 | −13.0% | 5772 | −6.8% |
0.35 | 204851 | −74.2% | 114196 | +31.7% | 19503 | −7.5% | 5991 | −3.2% |
0.40 | 241489 | −69.5% | 111309 | +28.4% | 20142 | −4.5% | 6096 | −1.5% |
0.45 | 276059 | −65.2% | 104983 | +21.1% | 20592 | −2.4% | 6178 | −0.2% |
0.50 | 309501 | −61.0% | 102859 | +18.7% | 20978 | −0.5% | 6229 | +0.6% |
0.55 | 340847 | −57.0% | 99488 | +14.8% | 21142 | +0.2% | 6282 | +1.5% |
0.60 | 369159 | −53.4% | 95764 | +10.5% | 21163 | +0.3% | 6275 | +1.4% |
0.65 | 394990 | −50.2% | 93511 | +7.9% | 21224 | +0.6% | 6266 | +1.2% |
0.70 | 417576 | −47.3% | 92666 | +6.9% | 21349 | +1.2% | 6300 | +1.8% |
0.75 | 436973 | −44.9% | 91269 | +5.3% | 21412 | +1.5% | 6310 | +1.9% |
0.80 | 452294 | −43.0% | 90018 | +3.8% | 21386 | +1.4% | 6289 | +1.6% |
0.85 | 467361 | −41.1% | 89137 | +2.8% | 21414 | +1.5% | 6286 | +1.5% |
0.90 | 478978 | −39.6% | 88406 | +2.0% | 21367 | +1.3% | 6268 | +1.2% |
0.95 | 487833 | −38.5% | 87689 | +1.2% | 21276 | +0.9% | 6245 | +0.9% |
1.00 (only filtering) | 493023 | −37.8% | 87276 | +0.7% | 21239 | +0.7% | 6242 | +0.8% |
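
The percentage columns can be reproduced from the raw counts. The following is a minimal sketch, assuming the percentages are signed differences relative to the non-clustered baseline, rounded to one decimal place. The names `percent_change`, `format_changes`, `BASELINE`, and `ROW_030` are hypothetical and introduced only for illustration; the numbers are copied from the non-clustered row and the 0.30 row of Table 1.

```python
def percent_change(value, baseline):
    """Signed percentage difference of `value` relative to `baseline`."""
    return 100.0 * (value - baseline) / baseline


# Non-clustered baseline counts from Table 1.
BASELINE = {"searched": 793000, "spectra": 86682, "peptides": 21090, "proteins": 6191}

# Counts for similarity threshold 0.30 from Table 1.
ROW_030 = {"searched": 167407, "spectra": 116571, "peptides": 18352, "proteins": 5772}


def format_changes(row, baseline):
    """Format each count's change against the baseline as a signed percentage string."""
    return {key: f"{percent_change(row[key], baseline[key]):+.1f}%" for key in row}


if __name__ == "__main__":
    # Prints the four percentage changes for the 0.30 threshold
    # (with an ASCII minus sign rather than the typographic one used in the table).
    print(format_changes(ROW_030, BASELINE))
```

Under these assumptions the 0.30 row gives −78.9%, +34.5%, −13.0%, and −6.8%, matching the values printed in the table.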