. 2008 Sep 30;9:405. doi: 10.1186/1471-2105-9-405

Table 4.

Number of protein references successfully assigned to ROGs, broken down by assignment score.

| Score type | Total number with this score type (%) | Example: ROG assignment score | Example: number of cases | Example: details for one case |
|---|---|---|---|---|
| 1 | 598590 (77.43) | P | 512650 | UniProt:Q15118 is cited in the interaction record as the primary reference (P). |
| | | S | 52738 | UniProt:P94102 is cited in the interaction record as the secondary reference (S). |
| | | PD | 14166 | "protein accession" is cited as the source database for accession Q9Z2F5 (D). |
| | | SM | 2154 | Accession NP 191913 is cited in a modified form (M) without the underscore. |
| | | SVGO+ | 262 | EntrezGeneId:26207 (G) encodes multiple proteins (+), but only one matches the original (O) sequence given in the interaction record (RefSeq:NP_858057.1). |
| 2 | 24664 (3.19) | PU | 18542 | UniProt:O95686 is cited and updated (U) to UniProt/KB:Q9UQK1. |
| | | PE | 264 | GenBank GI:12962935 is cited and updated to RefSeq:NP_002458.2 using eUtils (E). |
| | | PUO+ | 6 | UniProt:P38706 is cited. Two updates are possible (+), but only one matches the original (O) sequence in the interaction record (P0C2H6). |
| 3 | 121540 (15.72) | PT | 52074 | Protein reference cites taxon id 9534 (African green monkey), but the sequence record cites taxon id 9606. |
| | | ST | 60205 | Protein reference cites taxon id 40674 (Mammalia), but the sequence record cites taxon id 9606 (human). |
| 4 | 2803 (0.36) | PTUO+ | 15 | UniProt:O04063 is cited with taxon id 4530 (rice). More than one updated accession exists (+U). Only one possibility (P0C5B0) has the same sequence as cited in the interaction record, with taxon id 39947 (a specific strain of rice). |
| 5 | 9840 (1.27) | SL+ | 9090 | The primary reference cited is not found. 49 secondary references are cited (S). 15 of these were found to map to 8 distinct proteins (+). The protein with the largest (L) SEGUID is arbitrarily chosen. |
| | | PUTL+ | 187 | UniProt:Q9MAY7 is cited with taxon id 4530 (rice). Two updated accessions are available (+U). Neither has the expected sequence or taxon id (T) given in the interaction record. The accession with the largest (L) SEGUID is arbitrarily chosen. |
| | | SVGL+ | 303 | EntrezGene:9912 is cited (G). This gene encodes two proteins (+). Neither has the sequence expected from the interaction record. The one with the largest (L) SEGUID is selected. |
| | | PTQ | 21 | Primary accession P84244 is cited as a "see also" (Q) reference with taxon id 9606. The sequence record cites taxon id 10090 (T). |
| 6 | 15649 (2.02) | PN | 8909 | Q95Q01 is an obsolete accession. The sequence is retrieved from the interaction record. The SEGUID and ROGID are calculated and stored locally as a new entry (N). |
| | | SEN | 5561 | RefSeq:NP_010441 is an obsolete accession. The sequence is retrieved using eUtils (E). The SEGUID and ROGID are calculated and stored locally as a new entry (N). |
| | | STGOEN+ | 2 | EntrezGene 196549 (G) is cited and encodes two proteins (+). The protein accessions cited by EntrezGene are retired. Sequences are retrieved using eUtils (E). One matches the sequence cited in the interaction record (O). The SEGUID and ROGID are calculated and stored locally as a new entry (N). |
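
The assignment scores in Table 4 read as strings of single-character flags, so a score such as PTUO+ can be unpacked one character at a time. The following is a minimal sketch of such a decoder, not the authors' implementation. The names `FLAG_MEANINGS` and `decode_score` are hypothetical, and the flag descriptions are paraphrased from the example column only. The flag V occurs in SVGO+ and SVGL+ but is not explained in this table, so the sketch reports it as undefined rather than guessing.

```python
from typing import List

# Flag meanings paraphrased from the example column of Table 4.
# Assumption: each character of a score is an independent flag.
# "V" is intentionally absent because this table does not define it.
FLAG_MEANINGS = {
    "P": "primary reference cited",
    "S": "secondary reference cited",
    "D": "source database cited for the accession",
    "M": "accession cited in a modified form",
    "G": "gene identifier (EntrezGene) cited",
    "+": "more than one candidate protein or update",
    "O": "candidate matches the original sequence in the interaction record",
    "U": "accession updated",
    "E": "resolved or retrieved using eUtils",
    "T": "taxon id differs between reference and sequence record",
    "L": "candidate with the largest SEGUID chosen arbitrarily",
    "Q": "cited as a 'see also' reference",
    "N": "stored locally as a new entry",
}


def decode_score(score: str) -> List[str]:
    """Return one 'flag: meaning' string per character of the score."""
    return [
        f"{flag}: {FLAG_MEANINGS.get(flag, 'not defined in Table 4')}"
        for flag in score
    ]


# Example: unpack the score from the fourth score type.
for line in decode_score("PTUO+"):
    print(line)
```

A lookup table keeps the flag vocabulary in one place, so a flag that the table does not define shows up explicitly in the output instead of being silently dropped.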