2019 Apr 30;18(8 Suppl 1):S126–S140. doi: 10.1074/mcp.RA118.001218

© 2019 Verbruggen et al.

Published by The American Society for Biochemistry and Molecular Biology, Inc.

Author's Choice—Final version open access under the terms of the Creative Commons CC-BY license.

Fig. 1. Most important parts of the PROTEOFORMER pipeline workflow. The pipeline starts with raw reads from a ribosome profiling experiment, provided in FASTQ format. The quality of these raw reads is checked with FastQC (40). Next, the reads are preprocessed, filtered and mapped to the reference genome. By using P-site offsets (calculated with Plastid (28)), these alignments can be pinpointed at the base level. The quality of the alignments and the general appearance of the data are checked with the help of FastQC (40) and mQC (29). If the user is satisfied with the output, they can continue with the pipeline. PROTEOFORMER then searches for the transcript isoforms with translation evidence. Based on these, the translated proteoforms can be deduced. Either the workflow of the previous PROTEOFORMER version (11) can be applied (TIS calling, SNP calling and proteoform assembly), or PRICE (12) or SPECtre (13) can be used to determine these proteoforms. All results of these earlier steps are saved in an SQLite results database. For MS-based validation, the results can be exported, combined and even merged with canonical information from UniProt. The end result is a FASTA file that can be used for database searching of MS/MS spectra with tools like MaxQuant (19), SearchGUI-PeptideShaker (63, 64) or Prosit (24) in combination with Percolator (25). Several novel scripts were added to the pipeline that use these search results to count database hits and to classify new proteoforms and novel translation events in a semi-automated fashion. Identifications can also be inspected manually at both the ribosome level (e.g. by browsing the PROTEOFORMER BedGraph files in a genome browser environment) and the MS level (in the MS software interface, or as converted MS identification files in proBAM/proBED format (57, 58) in the same genome browser session as the BedGraph ribosome profiling files).
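
The step in which P-site offsets pinpoint alignments at the base level can be illustrated with a minimal Python sketch. It is not PROTEOFORMER or Plastid code: the offset table, the function names and the example alignments are hypothetical, and the sketch assumes ungapped alignments given as 0-based, end-exclusive intervals (spliced reads would need their exon blocks walked instead).

```python
from collections import Counter

# Hypothetical offsets: read length -> distance (nt) from the read's 5' end
# to the first base of the P-site. Real values come from an offset
# calculation on the data set at hand.
PSITE_OFFSETS = {28: 12, 29: 12, 30: 13}


def psite_position(start, end, strand, offset):
    """Return the genomic coordinate of the P-site for one ungapped alignment.

    start is inclusive and end is exclusive (0-based). On the minus strand
    the read's 5' end is the rightmost aligned base, so the offset is
    counted leftwards from there.
    """
    if strand == "+":
        return start + offset
    return end - 1 - offset


def psite_counts(alignments, offsets):
    """Collapse alignments into per-base P-site counts.

    Reads whose length has no entry in the offset table are skipped.
    """
    counts = Counter()
    for chrom, start, end, strand in alignments:
        offset = offsets.get(end - start)
        if offset is None:
            continue
        counts[(chrom, strand, psite_position(start, end, strand, offset))] += 1
    return counts


# Illustrative alignments: (chromosome, start, end, strand)
alignments = [
    ("chr1", 100, 128, "+"),  # 28 nt, plus strand
    ("chr1", 101, 130, "+"),  # 29 nt, plus strand
    ("chr1", 200, 228, "-"),  # 28 nt, minus strand
    ("chr1", 300, 335, "+"),  # 35 nt, no offset known, skipped
]
counts = psite_counts(alignments, PSITE_OFFSETS)
for key in sorted(counts):
    print(key, counts[key])
```

Per-base counts of this kind are what a BedGraph track conveys when the ribosome profiling data are browsed in a genome browser.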