Table 1. Breakdown of the eukaryotic protein benchmark dataset derived from the Swiss-Prot database (release 55.3) according to the procedures described in the Materials section.
| Subset^a | Subcellular location | Number of proteins |
| | Acrosome | 14 |
| | Cell membrane | 697 |
| | Cell wall | 49 |
| | Centrosome | 96 |
| | Chloroplast | 385 |
| | Cyanelle | 79 |
| | Cytoplasm | 2,186 |
| | Cytoskeleton | 139 |
| | Endoplasmic reticulum | 457 |
| | Endosome | 41 |
| | Extracell | 1,048 |
| | Golgi apparatus | 254 |
| | Hydrogenosome | 10 |
| | Lysosome | 57 |
| | Melanosome | 47 |
| | Microsome | 13 |
| | Mitochondrion | 610 |
| | Nucleus | 2,320 |
| | Peroxisome | 110 |
| | Spindle pole body | 68 |
| | Synapse | 47 |
| | Vacuole | 170 |
| | Number of total virtual proteins | 8,897^b |
| | Number of total different proteins | 7,766^c |
None of the proteins included here has sequence identity to any other in the same subcellular location.
^a See Fig. 1 and Eq. 1 as well as the relevant text for the definitions of the subsets listed in this table.
^b See Eqs. 2–3 for the definition of the number of virtual proteins and its relation to the number of different proteins.
^c Of the 7,766 different proteins, 6,687 belong to one subcellular location, 1,029 to two locations, 48 to three locations, and 2 to four locations. See Online Supporting Information S1 for the protein sequences.
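
The totals in the table can be cross-checked with a few lines of arithmetic. The sketch below assumes that a protein belonging to m subcellular locations is counted m times among the virtual proteins; this is a reading of the relation referred to in Eqs. 2–3, which are not reproduced here. The variable names are illustrative only.

```python
# Per-location protein counts as listed in Table 1.
location_counts = {
    "Acrosome": 14, "Cell membrane": 697, "Cell wall": 49, "Centrosome": 96,
    "Chloroplast": 385, "Cyanelle": 79, "Cytoplasm": 2186, "Cytoskeleton": 139,
    "Endoplasmic reticulum": 457, "Endosome": 41, "Extracell": 1048,
    "Golgi apparatus": 254, "Hydrogenosome": 10, "Lysosome": 57,
    "Melanosome": 47, "Microsome": 13, "Mitochondrion": 610, "Nucleus": 2320,
    "Peroxisome": 110, "Spindle pole body": 68, "Synapse": 47, "Vacuole": 170,
}

# Number of different proteins by how many locations each belongs to
# (taken from the last footnote of the table).
proteins_by_multiplicity = {1: 6687, 2: 1029, 3: 48, 4: 2}

# Each different protein is counted exactly once.
different = sum(proteins_by_multiplicity.values())

# Assumption: a protein with m locations contributes m virtual proteins.
virtual = sum(m * n for m, n in proteins_by_multiplicity.items())

# Sum of the per-location column of the table.
table_total = sum(location_counts.values())

print(different, virtual, table_total)  # 7766 8897 8897
```

Under this assumption the per-location column sums to the stated 8,897 virtual proteins, and the multiplicity breakdown reproduces both the 7,766 different proteins and the 8,897 virtual proteins.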