2021 Apr 3;20:100076. doi: 10.1016/j.mcpro.2021.100076

Table 1.

MaxQuant identification statistics of searches against several search spaces from differing sources and sizes

Sequencing technique	UniProt	Search space (entries)	Search space (amino acids)	Identified protein groups (MaxQuant)	Identified peptides (MaxQuant)	Identified PSMs (MaxQuant)	Identified PSMs (MaxQuant+Percolator)
None	Canonical	71,356	24,055,511	4294	28,443	180,526	186,937
Ribosome profiling	Canonical	176,202	40,603,175	4333	28,402	177,473	185,767
Ribosome profiling	Spliced	186,627	46,830,033	4347	28,372	176,978	184,578
RNA-Seq	Spliced	4,988,183	757,075,232	3669	15,820	91,232	175,775

The size of each search space is given both as the number of sequence entries and as the total number of amino acids. Information from ribosome profiling and RNA-Seq was combined with reference information from UniProt (either canonical proteins only or with additional splicing isoforms included). The resulting proteogenomic search spaces were then searched with MaxQuant. The numbers of identified PSMs, peptides, and inferred protein groups clearly differ with the size of the search space. The effect is most pronounced for the RNA-Seq–based search space, where MaxQuant alone identifies roughly half as many PSMs as with the canonical UniProt search space (91,232 versus 180,526). Percolator recovers a large part of this loss in identifications (175,775 PSMs). “MaxQuant+Percolator” is used as the baseline in the rest of the article.
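The comparison described in the caption can be made explicit with a little arithmetic. The following is a minimal sketch, not part of the original analysis: it copies the entry counts and PSM counts from Table 1 and expresses each search space relative to the canonical UniProt search. The names `TABLE_1` and `relative_to_baseline`, and the short row labels, are hypothetical and chosen only for illustration.

```python
# Values copied from Table 1:
# (sequence entries, PSMs with MaxQuant, PSMs with MaxQuant+Percolator)
# The row labels are shorthand introduced here, not names from the article.
TABLE_1 = {
    "UniProt canonical": (71_356, 180_526, 186_937),
    "Ribosome profiling + canonical": (176_202, 177_473, 185_767),
    "Ribosome profiling + spliced": (186_627, 176_978, 184_578),
    "RNA-Seq + spliced": (4_988_183, 91_232, 175_775),
}


def relative_to_baseline(table, baseline="UniProt canonical"):
    """Express every search space relative to the baseline row.

    Returns, per search space, the fold change in the number of entries
    and the percentage change in identified PSMs for both pipelines.
    """
    base_entries, base_mq, base_perc = table[baseline]
    result = {}
    for name, (entries, mq, perc) in table.items():
        result[name] = {
            "size_fold": entries / base_entries,
            "psm_change_maxquant_pct": 100 * (mq - base_mq) / base_mq,
            "psm_change_percolator_pct": 100 * (perc - base_perc) / base_perc,
        }
    return result


if __name__ == "__main__":
    for name, row in relative_to_baseline(TABLE_1).items():
        print(
            f"{name}: {row['size_fold']:.1f}x entries, "
            f"MaxQuant PSMs {row['psm_change_maxquant_pct']:+.1f}%, "
            f"MaxQuant+Percolator PSMs {row['psm_change_percolator_pct']:+.1f}%"
        )
```

Expressing each row as a change against the canonical search keeps the growth of the search space and the loss of identifications on the same footing, which makes the difference between the two PSM columns easy to read off.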