OR Spectrum. 2007 Jun 1;30(2):249–268. doi: 10.1007/s00291-007-0089-0

A LAD-based method for selecting short oligo probes for genotyping applications

Kwangsoo Kim, Hong Seo Ryoo
PMCID: PMC7080176  PMID: 32214570

Abstract

Specializing the general framework of logical analysis of data (LAD) to handle large-scale genomic data efficiently, we develop a probe design method for selecting short oligo probes for genotyping applications. When tested on genomic sequences obtained from the National Center for Biotechnology Information in various monospecific and polyspecific in silico experiments, the proposed method selected a small number of oligo probes of length 7 or 8 nucleotides that perfectly classified all unseen testing sequences. These results demonstrate the efficacy of the proposed method and illustrate the usefulness and potential of a well-designed optimization-based probe selection method in genotyping applications.

Keywords: Oligo probes, Microarrays, LAD, Set covering, Classification, Optimization, SARS, AI
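
The idea summarized in the abstract, choosing a small set of short probes whose presence or absence separates sequences of different classes, can be illustrated with a toy set covering sketch. This is not the authors' LAD procedure: it assumes that a probe "hits" a sequence when it occurs as an exact substring (hybridization thermodynamics are ignored), and it swaps in a simple greedy heuristic for the set covering step. The function names `kmers`, `select_probes`, and `signature` are hypothetical and chosen only for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (candidate probes) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def select_probes(sequences, labels, k):
    """Greedy set cover over pairs of sequences with different labels.

    A candidate probe covers a pair (i, j) if it occurs in exactly one of
    the two sequences. Probes are added until every such pair is covered.
    Assumption: exact substring match stands in for hybridization.
    """
    profiles = [kmers(s, k) for s in sequences]
    n = len(sequences)
    # Every pair of sequences from different classes must be separated.
    uncovered = {(i, j) for i in range(n) for j in range(i + 1, n)
                 if labels[i] != labels[j]}
    # Sort candidates so that ties are broken deterministically.
    candidates = sorted(set().union(*profiles))
    chosen = []
    while uncovered:
        best, best_cov = None, set()
        for p in candidates:
            cov = {(i, j) for (i, j) in uncovered
                   if (p in profiles[i]) != (p in profiles[j])}
            if len(cov) > len(best_cov):
                best, best_cov = p, cov
        if best is None:
            raise ValueError("classes cannot be separated with probes of length %d" % k)
        chosen.append(best)
        uncovered -= best_cov
    return chosen


def signature(seq, probes):
    """Binary presence/absence vector of the chosen probes in seq."""
    return tuple(int(p in seq) for p in probes)


if __name__ == "__main__":
    # Made-up toy sequences, not real genomic data.
    seqs = ["ACGTACGT", "ACGTTTGA", "GGCATCCA", "GGCAAGCA"]
    labs = ["typeA", "typeA", "typeB", "typeB"]
    probes = select_probes(seqs, labs, k=3)
    for s, lab in zip(seqs, labs):
        print(lab, s, signature(s, probes))
```

Covering pairs of differently labeled sequences guarantees only that the training sequences receive distinct signatures across classes. Classifying unseen sequences, as reported in the abstract, would need a decision rule built on top of these signatures, which this sketch does not attempt.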


Articles from OR Spectrum are provided here courtesy of Nature Publishing Group
