TABLE 1.
Organism(s) | Taxonomy | Approach | Notes | Sample | Reference(s) |
---|---|---|---|---|---|
Mycoplasma pneumoniae | Bacteria | Shotgun | Term proteogenomics introduced; search six-frame translated genome | Whole cell lysate | 56 |
Mycoplasma mobile | Bacteria | Shotgun | Used proteomics data in initial genome annotation of an organism | Whole cell lysate | 126 |
Shewanella oneidensis | Bacteria | Shotgun | MS-based proteomics to improve genome annotation, used PTM data, studied several conditions | Whole cell lysate; subcellular fractionation | 127 |
Methanosarcina acetivorans | Archaea | Top down | Top-down approach identified five unannotated small proteins (40–76 aa) | Whole cell lysate | 19 |
Staphylococcus aureus | Bacteria | Shotgun | Effort to analyze the entire expressed proteome, combining different conditions and proteomics approaches | Subcellular fractionation | 128 |
Yersinia pestis | Bacteria | Shotgun | Required 2 peptides to identify a novel protein | Subcellular fractionation | 57 |
Mycobacterium tuberculosis complex | Bacteria | Shotgun | Custom database approach that merges information from different strains | Various | 129 |
46 species (bacteria and archaea) | Bacteria, Archaea | Shotgun | Large proteogenomic study; use of stringent PSM level FDR advocated | Various | 30 |
Escherichia coli | Bacteria | Shotgun | High FDR among peptides implying novel proteins; trypsin + Lys-C | Whole cell lysate | 31 |
Helicobacter pylori | Bacteria | Shotgun | Required 2 peptides to identify a novel protein; also used size exclusion chromatography | Whole cell lysate | 130 |
57 bacterial species | Bacteria | Shotgun | Large proteogenomics study on N-terminal methionine excision and PTM (N-terminal acetylation) | Various | 131 |
Saccharopolyspora erythraea | Bacteria | Shotgun | Reannotated genome of organism with high GC content (transcriptomics, shotgun proteomics) | Whole cell lysate | 58 |
Bradyrhizobium japonicum | Bacteria | Shotgun | Custom databases to find longer (84)/shorter proteoforms (132) (integration with dRNA-seq data) | Whole cell lysate | 84, 132 |
Synechococcus sp. | Bacteria | Shotgun | Proteogenomic study of model cyanobacterium (8 conditions); global profiling for PTMs (1% PSM FDR) | Whole cell lysate | 133 |
Roseobacter denitrificans | Bacteria | Shotgun | N terminomics combined with a six-frame translation database to validate/correct N termini (+ alternative proteases: Glu-C, chymotrypsin) | Whole cell lysate | 27 |
Mycoplasma pneumoniae | Bacteria | Shotgun | Integrated analysis to re-annotate genome (>300 transcriptome, >70 proteome datasets); evidence for internal starts, new small proteins | Various | 103 |
Pseudomonas stutzeri | Bacteria | Top down (MALDI) | Small transmembrane subunit of cbb3 oxidase | Purified protein complex | 23 |
Metagenomic study of grassland soil | Bacteria, Archaea | Shotgun | Metagenome-assembled genomes as basis for meta-proteomics (custom database); integrate metabolomics; beyond culturable strains | Soil extract | 61 |
Listeria monocytogenes | Bacteria | Shotgun | N-terminal enrichment (COFRADIC approach) (26); 2nd study (92) used spectral libraries and combined DDA and DIA | Whole cell lysate | 26, 29, 92 |
Xanthomonas euvesicatoria | Bacteria | Shotgun | Reannotation of a plant pathogen with shotgun data; confirmed expression of 5 novel proteins by immunoblot (c-Myc tag) | Whole cell lysate | 134 |
Bartonella henselae | Bacteria | Shotgun | Broadly applicable proteogenomic approach, custom databases; validated 107/138 peptides with PRM (97) | Subcellular fractionation | 81, 97, 141 |
Methylobacterium extorquens | Bacteria | Shotgun | N terminome study of a strain used for microbial rehabilitation and degradation of industrial pollutants | Whole cell lysate | 28 |
Escherichia coli | Bacteria | Top down (MALDI) | Small transmembrane subunit of bd oxidase | Purified protein complex | 43 |
Bacillus subtilis | Bacteria | Shotgun | Explored small protein enrichment strategies, different proteases, database searches; validation by PRM and spectral matching | Small protein enrichment | 12 |
Salmonella Typhimurium, Deinococcus radiodurans | Bacteria | Shotgun | Broadly applicable custom peptide DB; integrated Ribo-seq data and peptide fragmentation prediction | Whole cell lysates | 88 |
Salmonella Typhimurium | Bacteria | Shotgun | Integrated small protein prediction with Ribo-seq, shotgun, and other omics data; elucidated function of novel small proteins | Various | 104 |
Prokaryotes from human microbiota, Bacteroides thetaiotaomicron | Bacteria, Archaea | Shotgun | Prediction of ∼4500 small protein families <50 aa; experimental evidence for selected examples (meta-transcriptomics/proteomics); identified several small proteins in Bacteroides thetaiotaomicron | Various | 1 |
Intestinal microbiota model system | Bacteria | Shotgun | Extended custom iPtgxDB to multispecies model (8 strains); meta-transcriptomics + meta-proteomics; spectral matching | Small protein enrichment | 60 |
Thermosynechococcus elongatus | Bacteria | Top down (MALDI) | Small transmembrane subunit of photosystem II | Purified protein complex | 44 |
Staphylococcus aureus | Bacteria | Shotgun | Broadly applicable approach; integrated shotgun and Ribo-seq data; automated evaluation of MS/MS spectra quality | Cytoplasmic extract | 87 |
Methanosarcina mazei | Archaea | Shotgun/top down | Characterization of 36 proteoforms mapping to 12 small proteins with top down (2D-LC-MS) | Small protein enrichment | 89 |
Methanosarcina mazei | Archaea | Shotgun | Multiprotease approach (SDS-PAGE) | Small protein enrichment | 35 |
We apologize to authors of the many important studies we could not reference due to space restrictions. Preference was given to more recent studies, many of which have used higher accuracy MS instruments and small protein enrichment strategies, carried out validation of novel small proteins (e.g., by PRM or spectral matching), and put a larger emphasis on integration of RNA-seq, Ribo-seq or computational small protein prediction algorithms. Proteogenomic studies prior to 2014 are more thoroughly listed by Kucharova and Wiker (125).