TABLE 3.
Frequencies of 4- or 5-nt motifs in the 5′ upstream regions of the SGHV ORFs compared to the complete genome for the baculovirus AcMNPV<sup>a</sup>
| Virus and motif length | Motif (100 nt upstream) | Occurrence in genome<sup>b</sup> (% of expected occurrence<sup>c</sup>) | Occurrence in 100 nt upstream regions (% of expected occurrence) | Relative enrichment in 100 nt upstream regions | Motif (200 nt upstream) | Occurrence in genome (% of expected occurrence) | Occurrence in 200 nt upstream regions (% of expected occurrence) | Relative enrichment in 200 nt upstream regions |
|---|---|---|---|---|---|---|---|---|
| AcMNPV | TAAG<sup>d</sup> | 393 (29) | 90 (114) | 4 | TAAG<sup>d</sup> | 393 (29) | 137 (87) | 3 |
| | TATA | 1,314 (66) | 172 (149) | 2.3 | TATA | 1,314 (66) | 255 (111) | 1.7 |
| | ATAA | 1,973 (101) | 222 (198) | 1.9 | ATAA | 1,973 (101) | 363 (162) | 1.6 |
| | ATAT | 1,616 (81) | 170 (147) | 1.8 | ATAT | 1,616 (81) | 268 (116) | 1.4 |
| | AGTA | 671 (49) | 70 (89) | 1.8 | AGTA | 671 (49) | 109 (69) | 1.4 |
| | AAGG | 473 (51) | 41 (76) | 1.5 | GATA<sup>e</sup> | 867 (63) | 137 (87) | 1.4 |
| | GATA<sup>e</sup> | 867 (63) | 74 (94) | 1.5 | AATA | 2,230 (115) | 346 (154) | 1.3 |
| | CACT<sup>f</sup> | 612 (64) | 52 (95) | 1.5 | CAGT<sup>g</sup> | 698 (73) | 106 (96) | 1.3 |
| | AATA | 2,230 (115) | 186 (166) | 1.4 | CTTA | 393 (28) | 59 (37) | 1.3 |
| | ATTA | 1,957 (98) | 163 (141) | 1.4 | TCAC<sup>f</sup> | 669 (70) | 99 (90) | 1.3 |
| | GTAT | 949 (68) | 79 (98) | 1.4 | GCTA | 541 (57) | 77 (70) | 1.2 |
| | CAGT<sup>g</sup> | 698 (73) | 58 (105) | 1.4 | CCCC | 190 (42) | 27 (52) | 1.2 |
| | TAAA | 2,716 (140) | 222 (198) | 1.4 | TACC | 444 (47) | 63 (57) | 1.2 |
| | AGGG | 233 (35) | 19 (50) | 1.4 | TAGT | 737 (53) | 104 (64) | 1.2 |
| | TAGT | 737 (53) | 58 (72) | 1.4 | CACT<sup>f</sup> | 612 (64) | 86 (78) | 1.2 |
| SGHV | | | | | | | | |
| 4-mer | TAAG | 848 (32) | 106 (96) | 3.0 | TAAG | 848 (32) | 154 (69) | 2.1 |
| | AGTC | 506 (53) | 36 (88) | 1.7 | AGTC | 506 (53) | 67 (82) | 1.6 |
| | AGGT | 536 (52) | 38 (87) | 1.7 | GTAG | 676 (66) | 88 (101) | 1.5 |
| | TAGG | 458 (44) | 32 (73) | 1.6 | AGGG | 245 (58) | 30 (84) | 1.4 |
| | AGTA | 1,515 (58) | 103 (93) | 1.6 | AAGT | 1,703 (65) | 208 (94) | 1.4 |
| | CAGT | 814 (85) | 55 (135) | 1.6 | AGTA | 1,515 (58) | 183 (83) | 1.4 |
| | GCGC | 178 (123) | 12 (195) | 1.6 | GCAG | 418 (106) | 50 (149) | 1.4 |
| | AAGT | 1,703 (65) | 113 (102) | 1.6 | AGCC | 347 (94) | 41 (131) | 1.4 |
| | TAGT | 1,451 (58) | 95 (89) | 1.5 | TAGG | 458 (44) | 53 (61) | 1.4 |
| | GTAG | 676 (66) | 44 (101) | 1.5 | AAGC | 631 (63) | 71 (84) | 1.3 |
| | GTAA | 1,989 (76) | 128 (115) | 1.5 | ATAG | 1,655 (63) | 186 (84) | 1.3 |
| | CTTA | 848 (36) | 54 (54) | 1.5 | TAGT | 1,451 (58) | 163 (77) | 1.3 |
| | ATAA | 6,696 (101) | 420 (149) | 1.5 | CTTA | 848 (36) | 95 (48) | 1.3 |
| | AAGA | 2,215 (81) | 135 (117) | 1.4 | AGGT | 536 (52) | 60 (69) | 1.3 |
| | AGCC | 347 (94) | 21 (135) | 1.4 | CAGT | 814 (85) | 91 (112) | 1.3 |
| | AGTT | 1,764 (70) | 106 (100) | 1.4 | AAGA | 2,215 (81) | 243 (105) | 1.3 |
| 5-mer | ATAAG | 338 (35) | 70 (172) | 4.9 | ATAAG | 338 (35) | 89 (109) | 3.1 |
| | TAAGA | 297 (31) | 46 (113) | 3.7 | TAAGA | 297 (31) | 64 (79) | 2.5 |
| | GTCAG | 80 (57) | 11 (187) | 3.2 | CAGTC | 93 (72) | 20 (182) | 2.5 |
| | GTAAG | 128 (34) | 16 (100) | 3.0 | AGGGC | 57 (100) | 12 (248) | 2.5 |
| | AGGGC | 57 (100) | 7 (289) | 2.9 | TCCGC | 52 (109) | 10 (248) | 2.3 |
| | TCCGC | 52 (109) | 6 (297) | 2.7 | GTAAG | 128 (34) | 24 (75) | 2.2 |
| | CCTTA | 83 (26) | 9 (67) | 2.6 | CTGGG | 67 (122) | 12 (258) | 2.1 |
| | CGCGC | 28 (143) | 3 (362) | 2.5 | CGCGC | 28 (143) | 5 (301) | 2.1 |
| | GCGCA | 79 (148) | 8 (354) | 2.4 | TAGCC | 96 (74) | 17 (155) | 2.1 |
| | TAAGC | 99 (28) | 10 (67) | 2.4 | GGGCG | 30 (133) | 5 (262) | 2.0 |
| | TAAGT | 369 (40) | 37 (95) | 2.4 | CCTAA | 139 (42) | 23 (82) | 2.0 |
| | CGCCC | 30 (164) | 3 (388) | 2.4 | TAAGT | 369 (40) | 61 (78) | 2.0 |
| | TAAGG | 83 (22) | 8 (50) | 2.3 | GTCAG | 80 (57) | 13 (110) | 1.9 |
| | AAGGG | 94 (60) | 9 (136) | 2.3 | AGTAG | 200 (53) | 32 (100) | 1.9 |
| | AGCGC | 42 (79) | 4 (177) | 2.2 | CCCTG | 25 (52) | 4 (99) | 1.9 |
| | AGGTC | 42 (30) | 4 (68) | 2.2 | CAGGG | 25 (44) | 4 (83) | 1.9 |
<sup>a</sup> Only the 15 motifs with the highest relative enrichment are shown for each virus. For AcMNPV, sequences that are part of the consensus TATA box [TATA(a/t)A] are underlined. Bold indicates a P value of ≤0.05.
<sup>b</sup> Both strands, excluding homologous repeats (present for AcMNPV and SGHV).
<sup>c</sup> Expected occurrence of a 4-mer or 5-mer, based on a random distribution of nucleotides in the complete genome.
<sup>d</sup> Part of the AcMNPV late initiator sequence (a/g/t)TAAG.
<sup>e</sup> Part of the AcMNPV upstream activating element with sequence (a/t)GATA(a/t).
<sup>f</sup> Part of the AcMNPV downstream activating element with sequence (a/t)CACNG.
<sup>g</sup> Sequence of the AcMNPV early initiator sequence CAGT.
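
The quantities in the table can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, overlapping matches are assumed to be counted, homologous repeats are not excluded, and no P values are computed. The expected count assumes independent nucleotides at the genome-wide base frequencies, and the relative enrichment is taken as the ratio of the two "% of expected" values, which is consistent with the tabulated numbers (for AcMNPV TAAG at 100 nt, 114/29 ≈ 4).

```python
from collections import Counter

# Translation table used to complement a DNA sequence.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps (an assumption)."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def base_frequencies(genome):
    """Nucleotide frequencies over both strands of the genome."""
    counts = Counter(genome + reverse_complement(genome))
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def percent_of_expected(seqs, motif, freq):
    """Return (observed count, observed as % of the random expectation).

    The expectation is the product of the base frequencies of the motif
    multiplied by the number of k-length windows in the given sequences.
    """
    p = 1.0
    for base in motif:
        p *= freq[base]
    observed = sum(count_motif(s, motif) for s in seqs)
    windows = sum(max(len(s) - len(motif) + 1, 0) for s in seqs)
    expected = p * windows
    if expected == 0:
        return observed, float("nan")
    return observed, 100.0 * observed / expected


def relative_enrichment(upstream_pct, genome_pct):
    """Ratio of the upstream '% of expected' to the genome '% of expected'."""
    return upstream_pct / genome_pct


if __name__ == "__main__":
    # Toy sequences for illustration only; not real viral data.
    genome = "TTAAGCGCGTAAGATCGGCTTAACGGATCCGTAAGTTGCA"
    upstream_regions = ["ATAAGA", "GTAAGT"]  # one region per ORF, ORF strand only
    freq = base_frequencies(genome)
    _, genome_pct = percent_of_expected(
        [genome, reverse_complement(genome)], "TAAG", freq)
    _, upstream_pct = percent_of_expected(upstream_regions, "TAAG", freq)
    print(relative_enrichment(upstream_pct, genome_pct))
```

The genome is scanned on both strands while the upstream regions are scanned only on the strand of their ORF. This reading is suggested by the table itself, where TAAG and its reverse complement CTTA share the same genome count but differ in their upstream counts.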