Proceedings of the National Academy of Sciences of the United States of America. 1985 Jun;82(12):4090–4094. doi: 10.1073/pnas.82.12.4090

Significance of similarities in patterns: an application to beta interferon-related DNA on human chromosome 2.

L T May, F R Landsberger, M Inouye, P B Sehgal
PMCID: PMC397940  PMID: 3858866

Abstract

The nucleotide sequence of a 14-kilobase (kb) region of the human beta interferon (IFN-beta)-related DNA locus on chromosome 2 (genomic DNA clone lambda B3) was determined and compared to that of the IFN-beta 1 gene by using the Sellers TT algorithm. This algorithm aligns segments of one sequence with similar segments in a second sequence. A strategy was developed for assessing the significance of similarities between DNA sequences based on a scheme that recognizes patterns or runs of identities within an alignment. The pattern score (II) thus obtained is an entropy-like measure. Numerically it reflects the length of the longest run of identity in an alignment plus a correction factor due to the other, shorter identity runs in the alignment. When the IFN-beta 1 gene is compared to random nucleotide sequences, the distribution of II scores fits a Gaussian function. This strategy has been used to identify seven segments along one strand of lambda B3 DNA that are related to segments in IFN-beta 1; these seven alignments have II scores greater than or equal to 3 standard deviations above the mean score obtained in comparisons between IFN-beta 1 and random nucleotide sequences. One of these alignments (section 7) has a II score 8.02 standard deviations above this mean score. The likelihood of finding an alignment statement as good as that in section 7 in a random sequence the length of the human genome is approximately 10^-7. Furthermore, the lambda B3 DNA sequence in section 7 selects the human IFN-beta 1 gene as the most significant alignment in computer searches of mammalian nucleotide sequence data bases.
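
The general idea of scoring an alignment by its runs of identities, and expressing that score in standard deviations above a random-sequence baseline, can be illustrated with a minimal Python sketch. The abstract does not give the formula for the pattern score, so this sketch does not compute it: it uses the length of the longest identity run alone as a stand-in statistic, omits the correction factor for shorter runs, and compares ungapped, position-by-position sequences rather than alignments produced by the Sellers TT algorithm. All function names, the example sequences, and the use of `-` as a gap character are assumptions made for illustration.

```python
import math
import random
import statistics


def identity_runs(a, b):
    """Lengths of maximal runs of identical positions in two aligned strings.

    Assumes equal-length strings; '-' is treated as a gap (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    runs, current = [], 0
    for x, y in zip(a, b):
        if x == y and x != "-":
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def longest_run(a, b):
    """Stand-in statistic: longest identity run (NOT the paper's pattern score)."""
    return max(identity_runs(a, b), default=0)


def z_score(observed, null_scores):
    """Number of sample standard deviations by which `observed` exceeds the null mean."""
    mean = statistics.mean(null_scores)
    sd = statistics.stdev(null_scores)
    return (observed - mean) / sd


def gaussian_upper_tail(z):
    """P(Z >= z) for a standard normal variable, for a single comparison."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def random_dna(n, rng):
    """Uniform random nucleotide sequence of length n."""
    return "".join(rng.choice("ACGT") for _ in range(n))


# Hypothetical example sequences, not taken from the paper.
rng = random.Random(0)
query = "ACGTTGCAAGGCTTAC"
target = "ACGTTGCATGGCTAAC"

observed = longest_run(query, target)
null = [longest_run(query, random_dna(len(query), rng)) for _ in range(1000)]
z = z_score(observed, null)
print("identity runs:", identity_runs(query, target))
print("z-score of longest run:", round(z, 2))
print("single-comparison Gaussian tail:", gaussian_upper_tail(z))
```

The tail probability here applies to one comparison only. Relating it to a genome-length search, as the abstract does, would also require accounting for the number of positions examined, which this sketch does not attempt.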



Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
