Table I. Summary of previously existing subcellular annotations and comparison with STEPdb. The Uniprot and EchoLOCATION databases contributed 2050 and 3979 protein annotations, respectively, and together with the theoretical IM proteome (33) gave a combined 4111 initially annotated proteins. Comparison of the three resources revealed some matches in proposed subcellular locations (“Matching annotation”) but also some differences (“Conflicting annotation”). IM proteins with matching annotations in only two of the three resources are referred to as “Unique IM proteome.” All proteins with existing annotations in only one source are referred to as “Unique (total).” The STEPdb multicombinatorial analysis contributed 36 newly classified proteins that were of unknown location (“STEPdb de novo annotated”), 674 proteins that were reassigned to locations other than those previously proposed (“STEPdb revised”), and 601 proteins with contradicting subcellular annotations that had been unresolved (“STEPdb unresolved”).
 | Uniprot | EchoLOCATION | Bernsel & Daley (2009) (33) | Total |
---|---|---|---|---|
Reference proteome (E. coli K-12) | 4303 | 4345a | 1133 | 4303 |
Matching annotations | 1613 | 1646 | 850 | 1652 |
Unique (IM proteome) | 11 | 29 | 4 | 44 |
Unique (total) | 12 | 1998 | 4 | 2014 |
Contradicting annotations | 425 | 599 | 254 | 601 |
Existing annotations | 2050 | 4243 | 1108 | 4267 |
% of reference proteome | 48% | 98% | 26% | 99% |
Missing annotations | 2253 | 60 | – | 36 |
Missing and unresolved annotations | 2678 | 659 | 254 | 637 |
% of reference proteome | 62% | 15% | 6% | 18% |
STEPdb total contribution over previous annotations | 3352 | 1333 | 560 | 1311 |
de novo annotated | 2253 | 60 | 84 | 36 |
Revised | 674 | 674 | 222 | 674 |
Resolved | 425 | 599 | 254 | 601 |
Experimental validations | 1205 | | | |
References added | 118 | | | |
% of reference proteome | 76.89% | 35.37% | 7.87% | 32.81% |
a This is the estimated total proteome of EchoLOCATION.
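The count rows of the table appear to be additive. The following minimal Python sketch checks that reading by recomputing two rows from their apparent components: "STEPdb total contribution over previous annotations" as the sum of the de novo annotated, revised, and resolved rows, and "Missing and unresolved annotations" as the sum of the missing and contradicting rows. These relationships are inferred from the numbers and are not stated in the source. The variable names are hypothetical, and treating the "–" entry as 0 is an assumption.

```python
# Values transcribed from Table I. Column order:
# Uniprot, EchoLOCATION, Bernsel & Daley (2009), Total.
COLUMNS = ["Uniprot", "EchoLOCATION", "Bernsel & Daley (2009)", "Total"]

# Rows assumed to add up to the STEPdb total contribution.
de_novo = [2253, 60, 84, 36]
revised = [674, 674, 222, 674]
resolved = [425, 599, 254, 601]
reported_contribution = [3352, 1333, 560, 1311]

# Rows assumed to add up to "Missing and unresolved annotations".
missing = [2253, 60, 0, 36]  # assumption: the "–" entry is treated as 0
contradicting = [425, 599, 254, 601]
reported_missing_unresolved = [2678, 659, 254, 637]


def column_sums(*rows):
    """Add the given rows column by column."""
    return [sum(values) for values in zip(*rows)]


contribution = column_sums(de_novo, revised, resolved)
missing_unresolved = column_sums(missing, contradicting)

# Print computed and reported values side by side for each column.
for name, computed, reported in zip(COLUMNS, contribution, reported_contribution):
    print(f"Contribution, {name}: computed {computed}, reported {reported}")

for name, computed, reported in zip(
    COLUMNS, missing_unresolved, reported_missing_unresolved
):
    print(f"Missing and unresolved, {name}: computed {computed}, reported {reported}")
```

Under these assumptions, the computed sums match the reported values in all four columns for both rows.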