Abstract
The sub-cellular localization of a native protein constitutes one coarse-grained aspect of its function. Transport between compartments is often regulated through short sequence motifs. Here, we analyzed experimentally characterized endoplasmic reticulum (ER)/Golgi retrieval motifs and investigated the accuracy of homology-transfer. Only the C-terminal ER retrieval motifs KDEL, HDEL and AIAKE were sufficiently specific. However, even non-specific motifs may help, provided that the probability of localization given the motif is known; we provided such estimates. We also rigorously estimated the accuracy and coverage of inferring ER and Golgi localization through homology-transfer by sequence similarity. In entire proteomes, we could thereby annotate 3304 ER (3182 membrane) and 1853 Golgi (759 membrane) proteins. We identified another putative 5157 globular and 3941 membrane ER or Golgi proteins. Each experimental annotation yielded, on average, one to three high-accuracy and five to six low-accuracy homology-transfers in the six proteomes. These numbers will increase with each new experimental annotation.
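The idea of a probability for localization given a motif can be illustrated with a minimal Python sketch. It is not the procedure used in this work: the function names `cterm_motif` and `localization_given_motif`, the plain frequency estimate, and the toy sequences are all assumptions made for illustration. Only the three motif strings are taken from the text above.

```python
# C-terminal ER retrieval motifs named in the abstract as sufficiently specific.
ER_CTERM_MOTIFS = ("KDEL", "HDEL", "AIAKE")


def cterm_motif(sequence, motifs=ER_CTERM_MOTIFS):
    """Return the motif that terminates the sequence, or None if none matches."""
    seq = sequence.strip().upper()
    for motif in motifs:
        if seq.endswith(motif):
            return motif
    return None


def localization_given_motif(proteins, motif):
    """Estimate P(ER | motif) as the fraction of motif-carrying proteins in the ER.

    proteins: iterable of (sequence, is_er) pairs with known localization.
    Returns None when no protein ends with the motif (hypothetical helper).
    """
    hits = [is_er for seq, is_er in proteins
            if seq.strip().upper().endswith(motif)]
    if not hits:
        return None
    return sum(hits) / len(hits)


# Toy, made-up sequences and labels; not real annotations.
toy = [
    ("MKTLLAVLKDEL", True),
    ("MSSAAPQRKDEL", True),
    ("MGGHWKDEL", False),
    ("MKKLLPPHDEL", True),
    ("MAAAGGTT", False),
]

p_kdel = localization_given_motif(toy, "KDEL")  # two of three KDEL proteins are ER
```

With real data, the labelled pairs would come from experimentally annotated proteins, and a motif with a low estimated probability could still be used as weak evidence rather than discarded.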
Keywords: Endoplasmic reticulum, Golgi apparatus, genome sequence analysis, sub-cellular localization, protein sequence motifs
Footnotes
Received 6 January 2004; received after revision 10 March 2004; accepted 29 March 2004