TABLE 3.
| Virus and motif length | 100 nt upstream | | | | 200 nt upstream | | | |
|---|---|---|---|---|---|---|---|---|
| | Motif | Occurrence in genome<sup>b</sup> (% of expected occurrence<sup>c</sup>) | Occurrence in upstream regions (% of expected occurrence) | Relative enrichment in upstream regions | Motif | Occurrence in genome (% of expected occurrence) | Occurrence in upstream regions (% of expected occurrence) | Relative enrichment in upstream regions |
| AcMNPV | TAAG<sup>d</sup> | 393 (29) | 90 (114) | 4 | TAAG<sup>d</sup> | 393 (29) | 137 (87) | 3 |
| | TATA | 1,314 (66) | 172 (149) | 2.3 | TATA | 1,314 (66) | 255 (111) | 1.7 |
| | ATAA | 1,973 (101) | 222 (198) | 1.9 | ATAA | 1,973 (101) | 363 (162) | 1.6 |
| | ATAT | 1,616 (81) | 170 (147) | 1.8 | ATAT | 1,616 (81) | 268 (116) | 1.4 |
| | AGTA | 671 (49) | 70 (89) | 1.8 | AGTA | 671 (49) | 109 (69) | 1.4 |
| | AAGG | 473 (51) | 41 (76) | 1.5 | GATA<sup>e</sup> | 867 (63) | 137 (87) | 1.4 |
| | GATA<sup>e</sup> | 867 (63) | 74 (94) | 1.5 | AATA | 2,230 (115) | 346 (154) | 1.3 |
| | CACT<sup>f</sup> | 612 (64) | 52 (95) | 1.5 | CAGT<sup>g</sup> | 698 (73) | 106 (96) | 1.3 |
| | AATA | 2,230 (115) | 186 (166) | 1.4 | CTTA | 393 (28) | 59 (37) | 1.3 |
| | ATTA | 1,957 (98) | 163 (141) | 1.4 | TCAC<sup>f</sup> | 669 (70) | 99 (90) | 1.3 |
| | GTAT | 949 (68) | 79 (98) | 1.4 | GCTA | 541 (57) | 77 (70) | 1.2 |
| | CAGT<sup>g</sup> | 698 (73) | 58 (105) | 1.4 | CCCC | 190 (42) | 27 (52) | 1.2 |
| | TAAA | 2,716 (140) | 222 (198) | 1.4 | TACC | 444 (47) | 63 (57) | 1.2 |
| | AGGG | 233 (35) | 19 (50) | 1.4 | TAGT | 737 (53) | 104 (64) | 1.2 |
| | TAGT | 737 (53) | 58 (72) | 1.4 | CACT<sup>f</sup> | 612 (64) | 86 (78) | 1.2 |
| SGHV | | | | | | | | |
| 4-mer | TAAG | 848 (32) | 106 (96) | 3.0 | TAAG | 848 (32) | 154 (69) | 2.1 |
| | AGTC | 506 (53) | 36 (88) | 1.7 | AGTC | 506 (53) | 67 (82) | 1.6 |
| | AGGT | 536 (52) | 38 (87) | 1.7 | GTAG | 676 (66) | 88 (101) | 1.5 |
| | TAGG | 458 (44) | 32 (73) | 1.6 | AGGG | 245 (58) | 30 (84) | 1.4 |
| | AGTA | 1,515 (58) | 103 (93) | 1.6 | AAGT | 1,703 (65) | 208 (94) | 1.4 |
| | CAGT | 814 (85) | 55 (135) | 1.6 | AGTA | 1,515 (58) | 183 (83) | 1.4 |
| | GCGC | 178 (123) | 12 (195) | 1.6 | GCAG | 418 (106) | 50 (149) | 1.4 |
| | AAGT | 1,703 (65) | 113 (102) | 1.6 | AGCC | 347 (94) | 41 (131) | 1.4 |
| | TAGT | 1,451 (58) | 95 (89) | 1.5 | TAGG | 458 (44) | 53 (61) | 1.4 |
| | GTAG | 676 (66) | 44 (101) | 1.5 | AAGC | 631 (63) | 71 (84) | 1.3 |
| | GTAA | 1,989 (76) | 128 (115) | 1.5 | ATAG | 1,655 (63) | 186 (84) | 1.3 |
| | CTTA | 848 (36) | 54 (54) | 1.5 | TAGT | 1,451 (58) | 163 (77) | 1.3 |
| | ATAA | 6,696 (101) | 420 (149) | 1.5 | CTTA | 848 (36) | 95 (48) | 1.3 |
| | AAGA | 2,215 (81) | 135 (117) | 1.4 | AGGT | 536 (52) | 60 (69) | 1.3 |
| | AGCC | 347 (94) | 21 (135) | 1.4 | CAGT | 814 (85) | 91 (112) | 1.3 |
| | AGTT | 1,764 (70) | 106 (100) | 1.4 | AAGA | 2,215 (81) | 243 (105) | 1.3 |
| 5-mer | ATAAG | 338 (35) | 70 (172) | 4.9 | ATAAG | 338 (35) | 89 (109) | 3.1 |
| | TAAGA | 297 (31) | 46 (113) | 3.7 | TAAGA | 297 (31) | 64 (79) | 2.5 |
| | GTCAG | 80 (57) | 11 (187) | 3.2 | CAGTC | 93 (72) | 20 (182) | 2.5 |
| | GTAAG | 128 (34) | 16 (100) | 3.0 | AGGGC | 57 (100) | 12 (248) | 2.5 |
| | AGGGC | 57 (100) | 7 (289) | 2.9 | TCCGC | 52 (109) | 10 (248) | 2.3 |
| | TCCGC | 52 (109) | 6 (297) | 2.7 | GTAAG | 128 (34) | 24 (75) | 2.2 |
| | CCTTA | 83 (26) | 9 (67) | 2.6 | CTGGG | 67 (122) | 12 (258) | 2.1 |
| | CGCGC | 28 (143) | 3 (362) | 2.5 | CGCGC | 28 (143) | 5 (301) | 2.1 |
| | GCGCA | 79 (148) | 8 (354) | 2.4 | TAGCC | 96 (74) | 17 (155) | 2.1 |
| | TAAGC | 99 (28) | 10 (67) | 2.4 | GGGCG | 30 (133) | 5 (262) | 2.0 |
| | TAAGT | 369 (40) | 37 (95) | 2.4 | CCTAA | 139 (42) | 23 (82) | 2.0 |
| | CGCCC | 30 (164) | 3 (388) | 2.4 | TAAGT | 369 (40) | 61 (78) | 2.0 |
| | TAAGG | 83 (22) | 8 (50) | 2.3 | GTCAG | 80 (57) | 13 (110) | 1.9 |
| | AAGGG | 94 (60) | 9 (136) | 2.3 | AGTAG | 200 (53) | 32 (100) | 1.9 |
| | AGCGC | 42 (79) | 4 (177) | 2.2 | CCCTG | 25 (52) | 4 (99) | 1.9 |
| | AGGTC | 42 (30) | 4 (68) | 2.2 | CAGGG | 25 (44) | 4 (83) | 1.9 |

<sup>a</sup> Only the 15 motifs with the highest relative enrichment are shown for each virus. For AcMNPV, sequences that are part of the consensus TATA box [TATA(a/t)A] are underlined. Bold indicates a P value of ≤0.05.
<sup>b</sup> Both strands, excluding homologous repeats (present for AcMNPV and SGHV).
<sup>c</sup> Occurrence of a 4-mer or 5-mer based on random distribution of nucleotides in the complete genome.
<sup>d</sup> Part of the AcMNPV late initiator sequence (a/g/t)TAAG.
<sup>e</sup> Part of the AcMNPV upstream activating element with sequence (a/t)GATA(a/t).
<sup>f</sup> Part of the AcMNPV downstream activating element with sequence (a/t)CACNG.
<sup>g</sup> Sequence of the AcMNPV early initiator sequence CAGT.
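
The quantities tabulated above can be related to one another with a short Python sketch. It rests on two assumptions that are not stated in the table itself: that the "random distribution of nucleotides" means an independent-base model using the genome's single-nucleotide frequencies, and that relative enrichment is the ratio of the upstream percentage of expected occurrence to the genome percentage. The second assumption is consistent with the tabulated values (for AcMNPV TAAG at 100 nt, 114/29 ≈ 3.9, reported as 4). All function names are hypothetical and chosen for illustration only.

```python
from collections import Counter
from math import prod

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def _count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


def count_motif(seq, motif, both_strands=True):
    """Count a motif on the given strand, optionally adding the opposite strand.

    A hit on the opposite strand is a hit of the reverse complement on the
    given strand, so palindromic motifs (e.g. TATA) are counted twice per site.
    """
    n = _count_overlapping(seq, motif)
    if both_strands:
        n += _count_overlapping(seq, reverse_complement(motif))
    return n


def base_frequencies(seq):
    """Single-nucleotide frequencies over A, C, G, T (other symbols ignored)."""
    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def expected_occurrence(motif, seq_len, freqs, both_strands=True):
    """Expected motif count under an independent-base model (assumed here)."""
    positions = max(seq_len - len(motif) + 1, 0)
    expected = positions * prod(freqs[b] for b in motif)
    if both_strands:
        expected += positions * prod(freqs[b] for b in reverse_complement(motif))
    return expected


def percent_of_expected(observed, expected):
    """Observed count as a percentage of the expected count."""
    return 100.0 * observed / expected


def relative_enrichment(pct_upstream, pct_genome):
    """Ratio of upstream to genome-wide percent-of-expected (assumed definition)."""
    return pct_upstream / pct_genome


if __name__ == "__main__":
    # Check against the AcMNPV TAAG row (100 nt upstream): 114% vs 29%.
    print(round(relative_enrichment(114, 29), 1))  # 3.9, tabulated as 4
```

Counting on both strands in this way gives a motif and its reverse complement identical genome counts, which matches the table: TAAG and CTTA both occur 393 times in AcMNPV and 848 times in SGHV.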