2021 Sep 15;19(4):602–610. doi: 10.1016/j.gpb.2021.09.002

Table 1.

Statistics of unique small proteins in SmProt

| Source | Start codon | Human | Mouse | Fruit fly | Rat | C. elegans | Yeast | E. coli | Zebrafish | All species examined |
|---|---|---|---|---|---|---|---|---|---|---|
| Ribo-seq | ATG | 70,931 | 48,909 | 5269 | 3560 | 4334 | 4535 | 1881 | 1924 | 141,343 |
| Ribo-seq | Near-cognate codons | 229,653 | 133,037 | 29,679 | 9910 | 9894 | 12,339 | 10,004 | 1347 | 435,863 |
| Literature | ATG and near-cognate codons | 38,157 | 8875 | 22,228 | 163 | 4 | 355 | 296 | 3612 | 73,690 |
| Databases | ATG and near-cognate codons | 786 | 797 | 100 | 271 | 120 | 336 | 955 | 64 | 3429 |
| MS | ATG and near-cognate codons | 768 | 51 | 66 | 38 | 0 | 3 | 0 | 1 | 927 |
| All IDs examined | ATG and near-cognate codons | 327,995 | 189,433 | 56,574 | 13,829 | 14,255 | 17,312 | 12,881 | 6679 | 638,958 |

Note: Small protein families from human microbiomes are not included. Near-cognate codons refer to non-ATG start codons that differ from the canonical ATG start codon by a single base but are able to initiate translation, such as TTG, GTG, CTG, AAG, AGG, ACG, ATA, ATT, and ATC. ID refers to a unique entry with identical genomic loci in one species. Ribo-seq, ribosome profiling; MS, mass spectrometry.
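The near-cognate definition in the note can be illustrated with a minimal Python sketch that enumerates every codon differing from ATG by exactly one base. The function names `near_cognate_codons` and `classify_start_codon` are hypothetical and do not come from SmProt. The sketch covers only the sequence-level criterion; whether a given codon actually initiates translation is an experimental question that this code does not address.

```python
CANONICAL_START = "ATG"
BASES = "ACGT"


def near_cognate_codons(canonical: str = CANONICAL_START) -> list[str]:
    """Return all codons that differ from `canonical` at exactly one position."""
    variants = []
    for pos, original in enumerate(canonical):
        for base in BASES:
            if base != original:
                # Substitute a single base, leaving the other two positions unchanged.
                variants.append(canonical[:pos] + base + canonical[pos + 1:])
    return variants


def classify_start_codon(codon: str) -> str:
    """Label a codon as 'ATG', 'near-cognate', or 'other' (sequence-based only)."""
    codon = codon.upper()
    if codon == CANONICAL_START:
        return "ATG"
    if codon in near_cognate_codons():
        return "near-cognate"
    return "other"


if __name__ == "__main__":
    print(sorted(near_cognate_codons()))
    print(classify_start_codon("CTG"))
```

Three positions with three alternative bases each give nine single-base variants, which are the nine codons listed in the note.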