2022 Jan 18;204(1):e00353-21. doi: 10.1128/jb.00353-21

TABLE 1.

A selection of small protein discovery studies for bacteria and archaea using shotgun (bottom-up) proteomics and top-down approaches without proteolytic digest^a

| Organism(s) | Taxonomy | Approach | Notes | Sample | Reference(s) |
| --- | --- | --- | --- | --- | --- |
| Mycoplasma pneumoniae | Bacteria | Shotgun | Term "proteogenomics" introduced; searched six-frame translated genome | Whole cell lysate | 56 |
| Mycoplasma mobile | Bacteria | Shotgun | Used proteomics data in initial genome annotation of an organism | Whole cell lysate | 126 |
| Shewanella oneidensis | Bacteria | Shotgun | MS-based proteomics to improve genome annotation; used PTM data; studied several conditions | Whole cell lysate; subcellular fractionation | 127 |
| Methanosarcina acetivorans | Archaea | Top down | Top-down approach identified five unannotated small proteins (40–76 aa) | Whole cell lysate | 19 |
| Staphylococcus aureus | Bacteria | Shotgun | Effort to analyze the entire expressed proteome, combining different conditions and proteomics approaches | Subcellular fractionation | 128 |
| Yersinia pestis | Bacteria | Shotgun | Required 2 peptides to identify a novel protein | Subcellular fractionation | 57 |
| Mycobacterium tuberculosis complex | Bacteria | Shotgun | Custom database approach that merges information from different strains | Various | 129 |
| 46 species (bacteria and archaea) | Bacteria, Archaea | Shotgun | Large proteogenomic study; use of stringent PSM-level FDR advocated | Various | 30 |
| Escherichia coli | Bacteria | Shotgun | High FDR among peptides implying novel proteins; trypsin + Lys-C | Whole cell lysate | 31 |
| Helicobacter pylori | Bacteria | Shotgun | Required 2 peptides to identify a novel protein; also used size exclusion chromatography | Whole cell lysate | 130 |
| 57 bacterial species | Bacteria | Shotgun | Large proteogenomics study on N-terminal methionine excision and PTM (N-terminal acetylation) | Various | 131 |
| Saccharopolyspora erythraea | Bacteria | Shotgun | Reannotated genome of organism with high GC content (transcriptomics, shotgun proteomics) | Whole cell lysate | 58 |
| Bradyrhizobium japonicum | Bacteria | Shotgun | Custom databases to find longer (84)/shorter proteoforms (132) (integration with dRNA-seq data) | Whole cell lysate | 84, 132 |
| Synechococcus sp. | Bacteria | Shotgun | Proteogenomic study of model cyanobacterium (8 conditions); global profiling for PTMs (1% PSM FDR) | Whole cell lysate | 133 |
| Roseobacter denitrificans | Bacteria | Shotgun | N-terminomics combined with six-frame translation database to validate/correct N termini (+ alternative proteases: Glu-C, chymotrypsin) | Whole cell lysate | 27 |
| Mycoplasma pneumoniae | Bacteria | Shotgun | Integrated analysis to reannotate genome (>300 transcriptome, >70 proteome datasets); evidence for internal starts, new small proteins | Various | 103 |
| Pseudomonas stutzeri | Bacteria | Top down (MALDI) | Small transmembrane subunit of cbb3 oxidase | Purified protein complex | 23 |
| Metagenomic study of grassland soil | Bacteria, Archaea | Shotgun | Metagenome-assembled genomes as basis for meta-proteomics (custom database); integrated metabolomics; beyond culturable strains | Soil extract | 61 |
| Listeria monocytogenes | Bacteria | Shotgun | N-terminal enrichment (COFRADIC approach) (26); 2nd study (92) used spectral libraries and combined DDA and DIA | Whole cell lysate | 26, 29, 92 |
| Xanthomonas euvesicatoria | Bacteria | Shotgun | Reannotation of a plant pathogen with shotgun data; confirmed expression of 5 novel proteins with immunoblot (c-Myc tag) | Whole cell lysate | 134 |
| Bartonella henselae | Bacteria | Shotgun | Broadly applicable proteogenomic approach, custom databases; validated 107/138 peptides with PRM (97) | Subcellular fractionation | 81, 97, 141 |
| Methylobacterium extorquens | Bacteria | Shotgun | N-terminome study of a strain used for microbial rehabilitation and degradation of industrial pollutants | Whole cell lysate | 28 |
| Escherichia coli | Bacteria | Top down (MALDI) | Small transmembrane subunit of bd oxidase | Purified protein complex | 43 |
| Bacillus subtilis | Bacteria | Shotgun | Explored small protein enrichment strategies, different proteases, database searches; validation by PRM and spectral matching | Small protein enrichment | 12 |
| Salmonella Typhimurium, Deinococcus radiodurans | Bacteria | Shotgun | Broadly applicable custom peptide DB; integrated Ribo-seq data and peptide fragmentation prediction | Whole cell lysates | 88 |
| Salmonella Typhimurium | Bacteria | Shotgun | Integrated small protein prediction with Ribo-seq, shotgun, and other omics data; elucidated function of novel small proteins | Various | 104 |
| Prokaryotes from human microbiota, Bacteroides thetaiotaomicron | Bacteria, Archaea | Shotgun | Prediction of ∼4500 small protein families <50 aa; experimental evidence for selected examples (meta-transcriptomics/proteomics); identified several small proteins in Bacteroides thetaiotaomicron | Various | 1 |
| Intestinal microbiota model system | Bacteria | Shotgun | Extended custom iPtgxDB to multispecies model (8 strains); meta-transcriptomics + meta-proteomics; spectral matching | Small protein enrichment | 60 |
| Thermosynechococcus elongatus | Bacteria | Top down (MALDI) | Small transmembrane subunit of photosystem II | Purified protein complex | 44 |
| Staphylococcus aureus | Bacteria | Shotgun | Broadly applicable approach; integrated shotgun and Ribo-seq data; automated evaluation of MS/MS spectra quality | Cytoplasmic extract | 87 |
| Methanosarcina mazei | Archaea | Shotgun/top down | Characterization of 36 proteoforms mapping to 12 small proteins with top down (2D-LC-MS) | Small protein enrichment | 89 |
| Methanosarcina mazei | Archaea | Shotgun | Multiprotease approach (SDS-PAGE) | Small protein enrichment | 35 |
^a We apologize to authors of the many important studies we could not reference due to space restrictions. Preference was given to more recent studies, many of which have used higher accuracy MS instruments and small protein enrichment strategies, carried out validation of novel small proteins (e.g., by PRM or spectral matching), and put a larger emphasis on integration of RNA-seq, Ribo-seq or computational small protein prediction algorithms. Proteogenomic studies prior to 2014 are more thoroughly listed by Kucharova and Wiker (125).
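
Several studies in the table searched MS/MS spectra against a six-frame translated genome. As a rough illustration of what such a search database contains, the following Python sketch translates a DNA sequence in all six reading frames and splits each frame into stop-to-stop segments. It is not taken from any of the cited studies: the function names and the `min_len` cutoff are hypothetical, and the codon table uses the standard amino acid assignments (an assumption that would not hold for Mycoplasma species, where TGA encodes tryptophan).

```python
from itertools import product

# Standard codon assignments, built in TCAG order (assumption: not valid
# for organisms with a deviating genetic code, e.g. Mycoplasma).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate codon by codon; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(genome):
    """Map frame labels (+1..+3, -1..-3) to translated sequences; * marks stops."""
    genome = genome.upper()
    frames = {}
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def stop_to_stop_segments(protein, min_len=10):
    """Split a translated frame at stop codons, keeping segments >= min_len.

    min_len is a hypothetical cutoff; a low value keeps short candidates
    but also enlarges the search space.
    """
    return [seg for seg in protein.split("*") if len(seg) >= min_len]


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGAAATAG").items():
        print(label, protein)
```

Lowering `min_len` retains more candidate small proteins but increases the database size, which is one reason the table notes stringent FDR control and multi-peptide requirements for novel identifications.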