2003 Jul 1;31(13):3625–3630. doi: 10.1093/nar/gkg545

Table 1. Distribution of selected nuclear ELM matches within SWISS-PROT release 40.41 (121 515 sequence entries).

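| ELM_ID (a) | Regular expression | Total hits (b) | Taxonomy (c) | Matches (c) | Nuclear (d) | Non-nuclear (d) | Unknown (d) | Non-globular (e) |
|---|---|---|---|---|---|---|---|---|
| LIG_WRPW | [WFY]RP[WFY].{0,7}$ | 54 | Metazoa | 42 | 34 (f) | 7 | 1 | 34 |
| | | | Human | 10 | 7 | 2 | 1 | 7 |
| LIG_RBBD | [LI].C.E | 6 127 | Metazoa | 2784 | 487 | 1305 | 992 | |
| | | | Human | 813 | 185 | 347 | 281 | 87 (g) |
| LIG_NRBOX | [^P]L[^P][^P]LL[^P] | 44 902 | Metazoa | 19 963 | 2003 | 11 752 | 6208 | |
| | | | Human | 6138 | 775 | 3641 | 1722 | 458 |
| MOD_SUMO | [VILAFP]K.[EDNGP] | 255 048 | Metazoa | 81 329 | 16 094 | 37 502 | 27 733 | |
| | | | Human | 24 319 | 5968 | 10 428 | 7923 | 4059 (h) |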

(a) LIG_WRPW: ligand motif for transcriptional cofactors; LIG_RBBD: ligand motif for Rb-interacting proteins; LIG_NRBOX: ligand motif for nuclear receptors; MOD_SUMO: modification motif for sumoylation.

(b) The total number of regular expression matches. One sequence may have more than one hit.
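As a minimal sketch of how hits like those in Table 1 could be counted, the Python below applies the four regular expressions from the table to protein sequences. The function names, the toy sequences, and the decision to count overlapping matches as separate hits are assumptions made for illustration; the source does not say how overlaps were treated.

```python
import re

# Regular expressions as listed in Table 1.
ELM_PATTERNS = {
    "LIG_WRPW": r"[WFY]RP[WFY].{0,7}$",
    "LIG_RBBD": r"[LI].C.E",
    "LIG_NRBOX": r"[^P]L[^P][^P]LL[^P]",
    "MOD_SUMO": r"[VILAFP]K.[EDNGP]",
}


def find_hits(sequence, pattern):
    """Return 0-based start positions of every match of pattern in sequence.

    Assumption: overlapping matches count as separate hits, so the pattern
    is wrapped in a zero-width lookahead.
    """
    lookahead = re.compile(f"(?=(?:{pattern}))")
    return [m.start() for m in lookahead.finditer(sequence)]


def count_hits(sequences):
    """Total hits per ELM over an iterable of sequences (hypothetical helper)."""
    totals = dict.fromkeys(ELM_PATTERNS, 0)
    for seq in sequences:
        for name, pattern in ELM_PATTERNS.items():
            totals[name] += len(find_hits(seq, pattern))
    return totals


if __name__ == "__main__":
    # Toy sequences, not taken from SWISS-PROT.
    toy = ["LKDEAKTE", "MAAWRPWGG"]
    print(count_hits(toy))
```

Because a single sequence can contribute several positions to the list returned by `find_hits`, the totals count hits rather than sequences, in line with footnote (b).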

(c) The taxonomy range for each ELM is given along with the number of matches within that taxonomy range. In addition, the corresponding numbers for Homo sapiens are shown.

(d) Subcellular location was evaluated from the SWISS-PROT comment line 'subcellular location'. Nuclear: the comment contains the word 'nuclear' or 'nucleus'. Non-nuclear: the comment contains neither word. Unknown: the comment line is missing.
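The three-way rule in this footnote can be sketched in Python as follows. The function name is hypothetical, and case-insensitive whole-word matching is an assumption; the source only says that the comment "contains" one of the two words.

```python
import re

# Assumption: whole-word, case-insensitive matching of "nuclear" or "nucleus".
_NUCLEAR_WORD = re.compile(r"\b(?:nuclear|nucleus)\b", re.IGNORECASE)


def classify_location(comment):
    """Classify a 'subcellular location' comment following the footnote's rule.

    comment is the comment text, or None when the entry has no such line.
    """
    if comment is None:
        return "Unknown"
    if _NUCLEAR_WORD.search(comment):
        return "Nuclear"
    return "Non-nuclear"
```

With whole-word matching, a term such as "perinuclear" would fall under Non-nuclear; a plain substring test would classify it as Nuclear instead.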

(e) Globularity of the human nuclear sequences with ELM predictions was evaluated by the SMART server (including Pfam domains). All ELMs that are within SMART/Pfam domains were excluded.
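A rough sketch of such a domain filter, assuming the domain boundaries have already been obtained (for example from SMART output parsed elsewhere), might look like the Python below. The function name, the 0-based half-open coordinates, and the rule that a hit is dropped only when it lies entirely inside a domain are all assumptions; the source does not specify how partial overlaps were handled.

```python
def outside_domains(hits, domains):
    """Keep hits that do not lie entirely inside any annotated domain.

    hits and domains are lists of (start, end) pairs, 0-based, end exclusive.
    Assumption: a hit is excluded only when fully contained in a domain.
    """
    kept = []
    for h_start, h_end in hits:
        inside = any(d_start <= h_start and h_end <= d_end
                     for d_start, d_end in domains)
        if not inside:
            kept.append((h_start, h_end))
    return kept
```

For example, with domains at (0, 20) and (90, 100), a hit at (5, 9) would be removed, while hits at (50, 54) and (98, 102) would be kept.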

(f) All but one of the predicted nuclear LIG_WRPW matches are presumptive true positives.

(g) Eleven of 19 experimentally verified instances of LIG_RBBD in human sequences are in this set. The missing instances include three that are known to reside in globular domains.

(h) Because of the large number of sequences containing predicted MOD_SUMO matches, 200 randomly chosen sequences were subjected to the SMART/Pfam filtering. The resulting ELM count was scaled to reflect the theoretical number of MOD_SUMO matches in non-globular regions of human nuclear sequences. MOD_SUMO is known to occur in globular domains as well as in non-globular regions, so some true positives are likely to have been filtered out by the crude SMART/Pfam filter.
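The source does not give the scaling formula or the counts obtained from the 200-sequence sample. As a sketch only, the Python below assumes simple proportional scaling by hit counts; the function and parameter names are hypothetical.

```python
def scale_sample_count(kept_in_sample, hits_in_sample, total_hits):
    """Extrapolate a filtered hit count from a random sample to the full set.

    Assumption: the fraction of hits surviving the filter in the sample
    also applies to the full set (simple proportional scaling).
    """
    if hits_in_sample <= 0:
        raise ValueError("sample must contain at least one hit")
    return round(total_hits * kept_in_sample / hits_in_sample)
```

Under this assumption, a sample in which 1 of every 4 hits survives filtering would scale a full set of 100 hits to an estimate of 25. These numbers are illustrative and are not taken from the paper.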
