Figure - PMC

Skip to main content

An official website of the United States government

Here's how you know

Here's how you know

Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

View full-text article in PMC

. 2016 Oct 27;15(12):4686–4695. doi: 10.1021/acs.jproteome.6b00765

Search in PMC
Search in PubMed
View in NLM Catalog
Add to search

Copyright © 2016 American Chemical Society

This is an open access article published under a Creative Commons Attribution (CC-BY) License, which permits unrestricted use, distribution and reproduction in any medium, provided the author and source are cited.

PMC Copyright notice

Data retention throughout the pipeline. The bars show the numbers of “data elements” (spectra, PMSs, and peptides) under consideration as these numbers decrease from the start (left) to the end (right) of the proteogenomics pipeline. In detail, the bars represent the following (node numbers refer to the TOPPAS workflow in Figure 2): “MS2 spectra”, input MS2 spectra in the C-HPP testis data set; “Mascot/MS-GF+ PSMs (all)”, spectra that generated PSMs using either search engine; “Mascot/MS-GF+ (1% PEP)”, PSMs after PSM-level filtering (node 15); “Consensus”, PSMs after ConsensusID (node 16); “Filter: contaminants”, PSMs after filtering for contaminants (node 18); “Filter: known proteins (exact)”, PSMs after filtering for exact matches to known proteins (node 20); “Filter: known proteins (approx.)”, PSMs after filtering for approximate matches to known proteins (final set; node 22); and “Novel peptides”, distinct novel peptides identified by the final set of PSMs.