Table 2.
Motif | pri | rod | mam | vrt | inv | pln | bct | rna | vrl | phg | syn | una | est | pat | sts | gss | htg | Total | Total over expected | Expected (A = C = G = T) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
HH-I | 51 | 21 | 0 | 14 | 176 | 119 | 130 | 0 | 37 | 3 | 5 | 0 | 186 | 45 | 4 | 17 | 108 | 916 | 0.56 | 1635 |
HH-I-3 | 182 | 72 | 12 | 45 | 997 | 560 | 553 | 2 | 159 | 16 | 6 | 0 | 702 | 125 | 10 | 119 | 486 | 4046 | 0.57 | 7081 |
HH-I-4 | 270 | 29 | 0 | 32 | 365 | 228 | 226 | 10 | 69 | 3 | 8 | 1 | 358 | 55 | 4 | 33 | 272 | 1963 | 0.55 | 3576 |
HH-I-5 | 310 | 70 | 14 | 97 | 506 | 400 | 309 | 2 | 77 | 10 | 7 | 1 | 641 | 60 | 13 | 92 | 382 | 2991 | 0.46 | 6488 |
HH-I-6 | 197 | 60 | 17 | 37 | 516 | 312 | 399 | 3 | 107 | 5 | 8 | 1 | 700 | 93 | 11 | 67 | 288 | 2821 | 0.49 | 6973 |
HH-I-8 | 171 | 44 | 11 | 22 | 790 | 407 | 455 | 10 | 138 | 11 | 5 | 0 | 521 | 86 | 10 | 61 | 420 | 3162 | 0.46 | 6883 |
HH-I-9 | 247 | 56 | 12 | 29 | 508 | 260 | 454 | 0 | 107 | 9 | 7 | 3 | 536 | 99 | 9 | 65 | 344 | 2745 | 0.38 | 7314 |
HH-I-12 | 946 | 224 | 41 | 103 | 754 | 613 | 454 | 0 | 180 | 13 | 8 | 2 | 1856 | 137 | 48 | 263 | 794 | 6436 | 0.94 | 6865 |
HH-I-13 | 209 | 76 | 16 | 41 | 428 | 308 | 462 | 5 | 93 | 10 | 11 | 0 | 739 | 143 | 12 | 66 | 226 | 2845 | 0.42 | 6775 |
HH-I-14 | 317 | 94 | 30 | 36 | 404 | 377 | 487 | 7 | 104 | 5 | 30 | 5 | 737 | 102 | 8 | 77 | 327 | 3147 | 0.50 | 6362 |
HH-I-17 | 65 | 25 | 0 | 18 | 201 | 144 | 162 | 0 | 50 | 4 | 11 | 0 | 225 | 53 | 4 | 19 | 122 | 1103 | 0.54 | 2031 |
HH-II | 83 | 33 | 1 | 5 | 175 | 126 | 141 | 1 | 80 | 1 | 1 | 3 | 435 | 30 | 4 | 8 | 91 | 1218 | 0.54 | 2246 |
HH-II-3 | 180 | 78 | 8 | 29 | 994 | 645 | 715 | 3 | 204 | 14 | 25 | 23 | 1022 | 102 | 11 | 82 | 502 | 4637 | 0.66 | 7063 |
HH-II-4 | 133 | 56 | 7 | 11 | 387 | 261 | 278 | 1 | 100 | 1 | 2 | 5 | 658 | 43 | 7 | 39 | 188 | 2177 | 0.51 | 4277 |
HH-II-5 | 244 | 82 | 16 | 37 | 573 | 477 | 372 | 7 | 141 | 10 | 4 | 4 | 910 | 70 | 12 | 71 | 362 | 3392 | 0.39 | 8788 |
HH-II-6 | 234 | 81 | 23 | 17 | 522 | 348 | 977 | 38 | 153 | 7 | 7 | 9 | 856 | 74 | 8 | 59 | 311 | 3724 | 0.49 | 7674 |
HH-II-8 | 209 | 64 | 11 | 22 | 921 | 385 | 519 | 16 | 130 | 3 | 10 | 4 | 1265 | 53 | 7 | 65 | 452 | 4136 | 0.54 | 7710 |
HH-II-9 | 255 | 58 | 8 | 24 | 457 | 319 | 540 | 6 | 140 | 9 | 11 | 5 | 884 | 82 | 10 | 51 | 281 | 3140 | 0.39 | 7979 |
HH-II-12 | 1290 | 253 | 61 | 60 | 879 | 716 | 534 | 1 | 209 | 14 | 18 | 3 | 2491 | 125 | 56 | 273 | 1027 | 8010 | 0.97 | 8285 |
HH-II-13 | 214 | 91 | 45 | 26 | 421 | 326 | 534 | 7 | 162 | 8 | 23 | 3 | 1031 | 71 | 7 | 53 | 228 | 3250 | 0.44 | 7404 |
HH-II-14 | 281 | 81 | 15 | 37 | 414 | 524 | 493 | 6 | 142 | 7 | 2 | 4 | 1038 | 87 | 9 | 67 | 256 | 3463 | 0.43 | 8069 |
HH-II-17 | 100 | 35 | 1 | 7 | 220 | 152 | 173 | 1 | 97 | 1 | 1 | 4 | 504 | 36 | 7 | 16 | 108 | 1463 | 0.53 | 2786 |
HH-III | 42 | 6 | 0 | 2 | 93 | 67 | 65 | 0 | 21 | 1 | 2 | 0 | 96 | 57 | 3 | 8 | 64 | 527 | 0.55 | 952 |
HH-III-3 | 96 | 32 | 8 | 18 | 625 | 305 | 245 | 0 | 74 | 6 | 3 | 0 | 310 | 72 | 5 | 74 | 291 | 2164 | 0.64 | 3397 |
HH-III-4 | 295 | 12 | 1 | 21 | 192 | 128 | 98 | 0 | 36 | 3 | 5 | 0 | 204 | 63 | 3 | 23 | 229 | 1313 | 0.76 | 1725 |
HH-III-5 | 144 | 38 | 16 | 35 | 301 | 205 | 231 | 0 | 34 | 2 | 2 | 0 | 388 | 76 | 11 | 32 | 219 | 1734 | 0.48 | 3612 |
HH-III-6 | 134 | 29 | 5 | 18 | 290 | 183 | 197 | 9 | 58 | 2 | 3 | 0 | 426 | 66 | 10 | 33 | 180 | 1643 | 0.46 | 3540 |
HH-III-8 | 92 | 26 | 4 | 6 | 528 | 203 | 260 | 0 | 722 | 1 | 2 | 0 | 294 | 154 | 4 | 29 | 255 | 2580 | 0.68 | 3774 |
HH-III-9 | 109 | 22 | 0 | 12 | 239 | 143 | 196 | 0 | 87 | 3 | 4 | 0 | 462 | 63 | 7 | 38 | 164 | 1549 | 0.49 | 3918 |
HH-III-12 | 658 | 97 | 40 | 32 | 369 | 318 | 217 | 0 | 98 | 3 | 6 | 0 | 1071 | 135 | 20 | 121 | 477 | 3662 | 1.03 | 3540 |
HH-III-13 | 119 | 29 | 8 | 14 | 226 | 176 | 246 | 3 | 53 | 6 | 2 | 0 | 345 | 70 | 6 | 28 | 142 | 1473 | 0.41 | 3576 |
HH-III-14 | 133 | 33 | 10 | 11 | 242 | 184 | 195 | 1 | 65 | 2 | 21 | 0 | 408 | 86 | 5 | 39 | 146 | 1581 | 0.45 | 3504 |
HH-III-17 | 49 | 6 | 0 | 3 | 123 | 81 | 86 | 0 | 23 | 2 | 2 | 0 | 154 | 67 | 4 | 11 | 74 | 685 | 0.49 | 1402 |
GenBank relative size | 0.14 | 0.03 | 0.01 | 0.01 | 0.07 | 0.07 | 0.07 | <0.01 | 0.03 | <0.01 | <0.01 | <0.01 | 0.37 | 0.02 | 0.01 | 0.06 | 0.10 | 1.00 |
The motifs are named as explained in Methods and in Figure 1. The “expected” number of occurrences was obtained by searching for each motif in a database of 1000 random sequences of 100,000 nucleotides each (equal representation of A, C, G, and T) and scaling the resulting frequency to the size of GenBank. The “total over expected” column gives the ratio of the occurrences found in GenBank to the number expected from the random-database search after scaling to the size of GenBank.
(pri) Primate sequence entries (from the two GenBank files); (rod) rodent sequence entries; (mam) other mammalian sequence entries; (vrt) other vertebrate sequence entries; (inv) invertebrate sequence entries; (pln) plant sequence entries (including fungi and algae); (bct) bacterial sequence entries; (rna) structural RNA sequence entries; (vrl) viral sequence entries; (phg) phage sequence entries; (syn) synthetic and chimeric sequence entries; (una) unannotated sequence entries; (est) EST (expressed sequence tag) sequence entries (from 23 GenBank files); (pat) patent sequence entries; (sts) STS (sequence tagged site) sequence entries; (gss) GSS (genome survey sequence) sequence entries; (htg) HTGS (high-throughput genomic sequencing) sequence entries.
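
The way the “expected” and “total over expected” columns are derived can be illustrated with a minimal Python sketch. It is not the search program used for the table: the function names, the toy regular-expression pattern, the scaled-down database, and the target size are all assumptions made for illustration. The only values taken from the table are the HH-I total (916) and its expected count (1635).

```python
import random
import re


def random_database(n_seqs, seq_len, seed=0):
    """Build random DNA sequences with A, C, G and T equally likely."""
    rng = random.Random(seed)
    return ["".join(rng.choices("ACGT", k=seq_len)) for _ in range(n_seqs)]


def count_hits(pattern, sequences):
    """Count non-overlapping regex matches of `pattern` over all sequences."""
    regex = re.compile(pattern)
    return sum(len(regex.findall(seq)) for seq in sequences)


def expected_hits(random_hits, random_db_size, target_size):
    """Scale the hit count in the random database to a database of target_size."""
    return random_hits * target_size / random_db_size


def total_over_expected(observed, expected):
    """Ratio of observed occurrences to the expected number."""
    return observed / expected


if __name__ == "__main__":
    # Scaled-down stand-in for the 1000 x 100,000 nt random database.
    db = random_database(n_seqs=100, seq_len=10_000)
    db_size = sum(len(seq) for seq in db)

    # Toy pattern, NOT one of the HH descriptors listed in the table.
    toy_pattern = "CTGA[ACGT]GA"
    hits = count_hits(toy_pattern, db)

    # Hypothetical target size; the real GenBank size is not given here.
    target_size_nt = 10 * db_size
    print("hits in random database:", hits)
    print("expected at target size:", expected_hits(hits, db_size, target_size_nt))

    # Ratio for the HH-I row, using the Total and Expected values from the table.
    print("HH-I total over expected:", round(total_over_expected(916, 1635), 2))
```

A plain regular expression is used only to keep the sketch short; the motifs in the table are named descriptors (see Methods and Figure 1) that a simple sequence pattern may not capture.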