2016 Jun 2;7:11778. doi: 10.1038/ncomms11778

Copyright © 2016, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

This workflow lays out the extensive filtering applied to search results before they are passed on for genome annotation. (a) Results from multiple search engines are merged, requiring peptide-spectrum matches (PSMs) to agree between the different engines. The worst posterior error probability (PEP) among the matching search engines is carried forward. PSMs are then filtered by false discovery rate (FDR), PEP and peptide length. Contaminants are removed prior to protein inference. Peptides with coding sequence (CDS) matches are mapped to Ensembl and stored, while novel and non-CDS-only peptides are further filtered. (b) Non-CDS peptides are first filtered to increase confidence in their identification. (c) These peptides are then examined for alternative explanations or existing annotation. (d) The remaining non-CDS peptides are assigned a priority annotation score and inferred into novel or non-CDS genes. (e) The final set of ranked proteins undergoes manual inspection of spectra and validation against RNAseq data sets, and is then passed to manual annotators for review.
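
Step (a) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `PSM` record, the function names, and every threshold (`max_fdr`, `max_pep`, `min_len`, `max_len`) are hypothetical placeholders, since the caption gives no cut-off values. The sketch also assumes that a PSM must be reported by every engine to count as a match, and it estimates FDR as the running mean of the sorted PEPs, a common approximation that may differ from the procedure used in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical record for one peptide-spectrum match."""
    spectrum_id: str
    peptide: str
    pep: float  # posterior error probability reported by the engine


def merge_engines(results_by_engine):
    """Keep (spectrum, peptide) pairs reported by every engine.

    Assumption: "matching" means all engines agree on the peptide for a
    spectrum. The worst (largest) PEP across engines is carried forward.
    """
    per_engine = []
    for psms in results_by_engine.values():
        best = {}
        for psm in psms:
            key = (psm.spectrum_id, psm.peptide)
            # If an engine lists the same pair twice, keep its best PEP.
            best[key] = min(psm.pep, best.get(key, 1.0))
        per_engine.append(best)
    if not per_engine:
        return {}
    shared = set(per_engine[0]).intersection(*per_engine[1:])
    return {key: max(scores[key] for scores in per_engine) for key in shared}


def filter_psms(merged, max_fdr=0.01, max_pep=0.01, min_len=7, max_len=29):
    """Filter merged PSMs by estimated FDR, PEP and peptide length.

    All thresholds are illustrative assumptions, not values from the article.
    """
    ranked = sorted(merged.items(), key=lambda item: item[1])
    kept = []
    pep_sum = 0.0
    for count, ((spectrum_id, peptide), pep) in enumerate(ranked, start=1):
        pep_sum += pep
        # Running mean of ascending PEPs approximates the FDR of the top hits.
        if pep_sum / count > max_fdr:
            break
        if pep <= max_pep and min_len <= len(peptide) <= max_len:
            kept.append((spectrum_id, peptide, pep))
    return kept


results = {
    "engine_a": [PSM("s1", "PEPTIDEK", 0.001), PSM("s2", "SHORT", 0.002),
                 PSM("s3", "LONGPEPTIDER", 0.2)],
    "engine_b": [PSM("s1", "PEPTIDEK", 0.004), PSM("s2", "SHORT", 0.001),
                 PSM("s3", "OTHERPEPTIDE", 0.001)],
}
merged = merge_engines(results)
confident = filter_psms(merged)
```

In this toy input, spectrum `s3` is dropped at the merge because the two engines disagree on its peptide, and `SHORT` is dropped by the length filter. Contaminant removal, protein inference and the later non-CDS steps (b) to (e) are outside the scope of the sketch.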