2007 Dec 24;2:16. doi: 10.1186/1748-7188-2-16

Table 3.

Evaluation of contiguous motifs on Prosite data.

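| PS entry | Motif | NumSeqs | DiffNGrams | Rel. Supp (%) | Supp Rank | ZScore | LogOdd | Pratt | IG | Info |
|---|---|---|---|---|---|---|---|---|---|---|
| PS00341 | IPCCPV | 9 | 702 | 77.8 | 9 | 21 | 65 | 166 | 13 | 217 |
| PS00415 | LRRRLSDS | 12 | 3582 | 91.6 | 9 | 503 | 1058 | 2103 | 11 | 1784 |
| PS00047 | GAKRH | 105 | 653 | 93.3 | 21 | 61 | 109 | 216 | 27 | 460 |
| PS00984 | CFWKYC | 19 | 1256 | 100 | 1 | 1 | 1 | 785 | 1 | 5 |
| PS00541 | SKRKYRK | 6 | 144 | 100 | 1 | 85 | 110 | 131 | 3 | 134 |
| PS00822 | PFDRHDW | 9 | 2251 | 100 | 1 | 1 | 5 | 204 | 1 | 400 |
| PS00419 | CDGPGRGGTC | 207 | 32936 | 100 | 1 | 1 | 1 | 3 | 1 | 158 |
| PS00349 | RKRKYFKKHEKR | 18 | 2929 | 100 | 1 | 38 | 86 | 2884 | 19 | 310 |
| PS00861 | GWTLNSAGYLLGP | 32 | 888 | 100 | 1 | 66 | 301 | 179 | 1 | 569 |
| PS01024 | EFDYLKSLEIEEKIN | 60 | 5527 | 100 | 1 | 620 | 2427 | 5266 | 1 | 5244 |
| PS00291 | AGAAAAGAVVGGLGGY | 136 | 2423 | 100 | 1 | 1033 | 1770 | 184 | 3 | 1984 |
| Rm | | | | | 0.2340 | 4.526E-3 | 1.854E-3 | 9.075E-4 | 0.1358 | 9.764E-4 |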

Ranking results for eleven Prosite datasets, each identified by its Prosite (PS) entry. For each dataset, the table gives the number of protein sequences, the number of different n-grams (Diff NGrams), where n equals the motif length, and the relative support of the target motif (Rel. Supp). Motifs are ranked with information-theoretic measures. The ranks obtained by support (Supp Rank) and information gain (Info) are also provided for comparison. The last row gives the Rm value of each measure; the best results are obtained by support and IG.
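
To make the two descriptive columns concrete, the following minimal Python sketch computes them for a toy dataset. It assumes that Diff NGrams counts the distinct contiguous substrings of length n over all sequences of a dataset, and that relative support is the percentage of sequences containing the motif at least once (consistent with the table, for example 7 of 9 sequences giving 77.8%). The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
def count_distinct_ngrams(sequences, n):
    """Count the different contiguous substrings of length n over all sequences."""
    ngrams = set()
    for seq in sequences:
        # A sequence shorter than n contributes no n-grams (empty range).
        for i in range(len(seq) - n + 1):
            ngrams.add(seq[i:i + n])
    return len(ngrams)


def relative_support(sequences, motif):
    """Percentage of sequences that contain the motif at least once."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if motif in seq)
    return 100.0 * hits / len(sequences)


# Hypothetical toy data: two sequences contain the motif, one does not.
toy_sequences = ["AAIPCCPVKK", "GGIPCCPVTT", "MKLLSDSRRA"]
toy_motif = "IPCCPV"

n_distinct = count_distinct_ngrams(toy_sequences, len(toy_motif))
rel_supp = relative_support(toy_sequences, toy_motif)
print(f"Diff NGrams: {n_distinct}, Rel. Supp: {rel_supp:.1f}%")
```

A set is used so that an n-gram occurring in several sequences is counted only once, which is the assumed meaning of "different" n-grams.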