2000 Jul;10(7):1011–1019. doi: 10.1101/gr.10.7.1011

Table 2.

Distribution of Hammerhead and Hammerhead-like Motifs in the Different Sections of the GenBank: Mutants of the Single Stranded Regions

Motif pri rod mam vrt inv pln bct rna vrl phg syn una est pat sts gss htg Total Total-over-expected Expected(A=C=G=T)
HH-I 51 21 0 14 176 119 130 0 37 3 5 0 186 45 4 17 108 916 0.56 1635
HH-I-3 182 72 12 45 997 560 553 2 159 16 6 0 702 125 10 119 486 4046 0.57 7081
HH-I-4 270 29 0 32 365 228 226 10 69 3 8 1 358 55 4 33 272 1963 0.55 3576
HH-I-5 310 70 14 97 506 400 309 2 77 10 7 1 641 60 13 92 382 2991 0.46 6488
HH-I-6 197 60 17 37 516 312 399 3 107 5 8 1 700 93 11 67 288 2821 0.49 6973
HH-I-8 171 44 11 22 790 407 455 10 138 11 5 0 521 86 10 61 420 3162 0.46 6883
HH-I-9 247 56 12 29 508 260 454 0 107 9 7 3 536 99 9 65 344 2745 0.38 7314
HH-I-12 946 224 41 103 754 613 454 0 180 13 8 2 1856 137 48 263 794 6436 0.94 6865
HH-I-13 209 76 16 41 428 308 462 5 93 10 11 0 739 143 12 66 226 2845 0.42 6775
HH-I-14 317 94 30 36 404 377 487 7 104 5 30 5 737 102 8 77 327 3147 0.50 6362
HH-I-17 65 25 0 18 201 144 162 0 50 4 11 0 225 53 4 19 122 1103 0.54 2031

HH-II 83 33 1 5 175 126 141 1 80 1 1 3 435 30 4 8 91 1218 0.54 2246
HH-II-3 180 78 8 29 994 645 715 3 204 14 25 23 1022 102 11 82 502 4637 0.66 7063
HH-II-4 133 56 7 11 387 261 278 1 100 1 2 5 658 43 7 39 188 2177 0.51 4277
HH-II-5 244 82 16 37 573 477 372 7 141 10 4 4 910 70 12 71 362 3392 0.39 8788
HH-II-6 234 81 23 17 522 348 977 38 153 7 7 9 856 74 8 59 311 3724 0.49 7674
HH-II-8 209 64 11 22 921 385 519 16 130 3 10 4 1265 53 7 65 452 4136 0.54 7710
HH-II-9 255 58 8 24 457 319 540 6 140 9 11 5 884 82 10 51 281 3140 0.39 7979
HH-II-12 1290 253 61 60 879 716 534 1 209 14 18 3 2491 125 56 273 1027 8010 0.97 8285
HH-II-13 214 91 45 26 421 326 534 7 162 8 23 3 1031 71 7 53 228 3250 0.44 7404
HH-II-14 281 81 15 37 414 524 493 6 142 7 2 4 1038 87 9 67 256 3463 0.43 8069
HH-II-17 100 35 1 7 220 152 173 1 97 1 1 4 504 36 7 16 108 1463 0.53 2786

HH-III 42 6 0 2 93 67 65 0 21 1 2 0 96 57 3 8 64 527 0.55 952
HH-III-3 96 32 8 18 625 305 245 0 74 6 3 0 310 72 5 74 291 2164 0.64 3397
HH-III-4 295 12 1 21 192 128 98 0 36 3 5 0 204 63 3 23 229 1313 0.76 1725
HH-III-5 144 38 16 35 301 205 231 0 34 2 2 0 388 76 11 32 219 1734 0.48 3612
HH-III-6 134 29 5 18 290 183 197 9 58 2 3 0 426 66 10 33 180 1643 0.46 3540
HH-III-8 92 26 4 6 528 203 260 0 722 1 2 0 294 154 4 29 255 2580 0.68 3774
HH-III-9 109 22 0 12 239 143 196 0 87 3 4 0 462 63 7 38 164 1549 0.49 3918
HH-III-12 658 97 40 32 369 318 217 0 98 3 6 0 1071 135 20 121 477 3662 1.03 3540
HH-III-13 119 29 8 14 226 176 246 3 53 6 2 0 345 70 6 28 142 1473 0.41 3576
HH-III-14 133 33 10 11 242 184 195 1 65 2 21 0 408 86 5 39 146 1581 0.45 3504
HH-III-17 49 6 0 3 123 81 86 0 23 2 2 0 154 67 4 11 74 685 0.49 1402

GenBank relative size 0.14 0.03 0.01 0.01 0.07 0.07 0.07 <0.01 0.03 <0.01 <0.01 <0.01 0.37 0.02 0.01 0.06 0.10 1.00

The motifs are named as explained in Methods and in Figure 1. The “expected” number of occurrences was obtained by searching for each motif in a database of 1000 random sequences of 100,000 nucleotides each (with A, C, G, and T equally represented) and scaling the resulting frequency to the size of GenBank. The “total over expected” column gives the ratio of the occurrences found in GenBank to the occurrences expected from a search of a random database the size of GenBank.
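The arithmetic behind the last two columns can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `scale_expected` and `total_over_expected` are hypothetical, and the example only recomputes ratios from totals and expected counts already listed in the table. The size of GenBank in nucleotides is not stated here, so the scaling helper takes it as a parameter.

```python
# Size of the random database described in the table note:
# 1000 random sequences of 100,000 nucleotides each.
RANDOM_DB_NT = 1000 * 100_000


def scale_expected(random_hits, target_nt, random_nt=RANDOM_DB_NT):
    """Scale a hit count from the random database to a database of target_nt nucleotides.

    Hypothetical helper: assumes hits grow linearly with database size.
    """
    return random_hits * target_nt / random_nt


def total_over_expected(observed, expected):
    """Ratio of observed occurrences to the number expected by chance."""
    return observed / expected


# (total, expected) pairs copied from three rows of the table.
rows = {
    "HH-I": (916, 1635),
    "HH-II-12": (8010, 8285),
    "HH-III-12": (3662, 3540),
}

# Recompute the "total over expected" column, rounded to two decimals as in the table.
ratios = {name: round(total_over_expected(obs, exp), 2) for name, (obs, exp) in rows.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

A ratio below 1 means the motif occurs less often in GenBank than in random sequence of the same size. A ratio near 1, as for the -12 mutants, means it occurs about as often as chance predicts.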

(pri) Primate sequence entries (from the two GenBank files); (rod) rodent sequence entries; (mam) other mammalian sequence entries; (vrt) other vertebrate sequence entries; (inv) invertebrate sequence entries; (pln) plant sequence entries (including fungi and algae); (bct) bacterial sequence entries; (rna) structural RNA sequence entries; (vrl) viral sequence entries; (phg) phage sequence entries; (syn) synthetic and chimeric sequence entries; (una) unannotated sequence entries; (est) EST (expressed sequence tag) sequence entries (from 23 GenBank files); (pat) patent sequence entries; (sts) STS (sequence tagged site) sequence entries; (gss) GSS (genome survey sequence) sequence entries; (htg) HTGS (high throughput genomic sequencing) sequence entries.