Abstract
We have written a computer program, BIGPROBE, which facilitates the design of long nucleic acid probes from the partial or complete amino acid sequence of a protein. BIGPROBE relies upon information on codon usage, intercodon dinucleotide frequency, and potential probe self-complementarity. We have examined the accuracy with which the program predicts coding sequences using sample human and rat genes and probe lengths of 30-60 nucleotides. Rat probe sequences selected by BIGPROBE using either codon usage or dinucleotide frequency data alone averaged 86-92% homology with the known exons of the corresponding gene sequences. Predictive accuracy with rat gene probes could be improved to 89-94%, depending upon probe length, by applying codon usage and dinucleotide frequency data in combination. Similar accuracy was achieved for human genes.
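The codon-usage part of this approach can be illustrated with a minimal sketch. This is not the BIGPROBE algorithm itself, which the abstract does not specify in detail. It only shows the general idea of picking the most frequently used codon for each amino acid and then scoring the guess against a known sequence. The table values, the function names `best_guess_probe` and `percent_identity`, and the example sequences are all hypothetical placeholders, not data from the paper. Intercodon dinucleotide frequencies and self-complementarity checks are left out.

```python
# Hypothetical codon usage table: amino acid -> {codon: relative frequency}.
# These numbers are illustrative placeholders, NOT measured human or rat data.
CODON_USAGE = {
    "M": {"ATG": 1.0},
    "W": {"TGG": 1.0},
    "K": {"AAG": 0.6, "AAA": 0.4},
    "F": {"TTC": 0.55, "TTT": 0.45},
    "D": {"GAC": 0.55, "GAT": 0.45},
}


def best_guess_probe(peptide, usage=CODON_USAGE):
    """Back-translate a peptide by taking the most frequent codon per residue."""
    codons = []
    for aa in peptide:
        table = usage[aa]  # raises KeyError for residues missing from the table
        codons.append(max(table, key=table.get))
    return "".join(codons)


def percent_identity(probe, target):
    """Percentage of positions at which two equal-length sequences match."""
    if len(probe) != len(target):
        raise ValueError("sequences must have the same length")
    matches = sum(p == t for p, t in zip(probe, target))
    return 100.0 * matches / len(probe)


# Assumed example: a five-residue peptide and a made-up "known" coding sequence.
probe = best_guess_probe("MKFDW")
known = "ATGAAATTCGATTGG"
print(probe, round(percent_identity(probe, known), 1))
```

In this toy case the guess differs from the made-up target at two third-codon positions out of 15 nucleotides. A fuller version would also weight each candidate codon by the dinucleotide it forms with its neighbor and reject probes that can pair with themselves.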
