Abstract
We studied the frequency distribution of 10-bp oligonucleotides (decamers) in a 620-kb sample of viral genomes comprising 102 sequences from GenBank, with the aim of detecting transcription control signals. A total of 2300 decamers had a frequency 10 times higher than the mean and were subjected to further statistical analysis. For each of these 2300 decamers (parents), we counted the individual frequencies of the 30 decamers that differ from the parent by a single base substitution (progeny), and then calculated two variance/mean chi-square statistics for the progeny, one including the parent and one excluding it. We then studied the distribution of the ratio between the two chi-squares. Of the 2300 overrepresented decamers, 479 had a chi-square ratio of 1.9 or larger. In this final set, which corresponds to less than 0.05% of all possible decamers, 58 decamers were found to contain viral and eukaryotic transcription control elements, such as NF-κB, Sp1 and others. Furthermore, this set contains an excess of signals of lengths 5 to 10 when compared with 150 random sets bootstrapped from the same viral genomes.
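The parent/progeny statistic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, the variance/mean chi-square is assumed to be the usual index-of-dispersion form, sum((x - mean)^2) / mean, and the ratio is assumed to be the statistic with the parent divided by the statistic without it; the abstract does not state either detail.

```python
from collections import Counter

BASES = "ACGT"

def count_kmers(sequences, k=10):
    """Count every overlapping k-mer made only of A, C, G, T."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set(BASES):  # skip windows with ambiguous bases
                counts[kmer] += 1
    return counts

def progeny(parent):
    """All sequences differing from the parent by one substitution.

    A decamer has 10 positions x 3 alternative bases = 30 progeny.
    """
    return [parent[:i] + b + parent[i + 1:]
            for i in range(len(parent))
            for b in BASES if b != parent[i]]

def dispersion_chi2(values):
    """Variance/mean chi-square (assumed form): sum((x - mean)^2) / mean."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return sum((v - mean) ** 2 for v in values) / mean

def chi2_ratio(parent, counts):
    """Ratio of the chi-square with the parent to the one without it.

    The direction of the ratio is an assumption made for this sketch.
    """
    prog = [counts[p] for p in progeny(parent)]
    without_parent = dispersion_chi2(prog)
    with_parent = dispersion_chi2(prog + [counts[parent]])
    if without_parent == 0:
        return float("inf") if with_parent > 0 else 0.0
    return with_parent / without_parent
```

Under these assumptions, a parent that is far more frequent than its 30 one-mismatch neighbours inflates the chi-square that includes it, so the ratio rises above 1; the abstract's cutoff of 1.9 would then be applied to `chi2_ratio(parent, counts)` for each overrepresented decamer. As a side check, 479 out of the 4**10 = 1,048,576 possible decamers is about 0.046%, consistent with the "less than 0.05%" figure.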
