Advances in Natural Computation. 2005;3610:179–186. doi: 10.1007/11539087_20

Bio-kernel Self-organizing Map for HIV Drug Resistance Classification

Zheng Rong Yang, Natasha Young
Editors: Lipo Wang, Ke Chen, Yew Soon Ong
PMCID: PMC7122014

Abstract

The kernel self-organizing map has recently been studied by Fyfe and his colleagues [1]. This paper investigates the use of a novel bio-kernel function for the kernel self-organizing map. For verification, the proposed kernel self-organizing map is applied to HIV drug resistance classification using mutation patterns in protease sequences. The original self-organizing map combined with the distributed encoding method is used for comparison. The kernel self-organizing map with the novel bio-kernel function is found to give better classification and a faster convergence rate ...
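As a rough illustration of the two input representations contrasted in the abstract, the sketch below places a distributed (one-hot) encoding of peptides next to a kernel-induced distance computed directly on sequences. It is not the bio-kernel function of this paper, whose definition is not given here: `match_kernel` is a toy stand-in that counts identical residues, the prototypes are fixed example peptides rather than trained map weights, and every function name and sequence is hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def distributed_encode(seq):
    """Distributed (one-hot) encoding: 20 binary inputs per residue."""
    vec = np.zeros(len(seq) * len(AMINO_ACIDS))
    for pos, aa in enumerate(seq):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec


def match_kernel(a, b):
    """Toy sequence kernel (an assumption, NOT the paper's bio-kernel):
    the number of aligned positions holding the same residue."""
    return float(sum(x == y for x, y in zip(a, b)))


def kernel_distance_sq(a, b, kernel):
    """Squared feature-space distance via the kernel trick:
    d^2(a, b) = k(a, a) - 2 k(a, b) + k(b, b)."""
    return kernel(a, a) - 2.0 * kernel(a, b) + kernel(b, b)


def best_matching_unit(seq, prototypes, kernel):
    """Index of the prototype sequence nearest to seq in feature space."""
    dists = [kernel_distance_sq(seq, p, kernel) for p in prototypes]
    return int(np.argmin(dists))


# Hypothetical equal-length peptides used only for illustration.
prototypes = ["PQITLW", "PQVTLW", "AAAAAA"]
query = "PQITLF"
bmu = best_matching_unit(query, prototypes, match_kernel)
```

With this toy kernel, `match_kernel(a, b)` equals the dot product of the two one-hot vectors, so the kernel distance coincides with the squared Euclidean distance between distributed encodings. A kernel built from an amino acid substitution matrix would instead treat some mismatches as closer than others, which the one-hot encoding cannot express.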

Keywords: Protease Cleavage Site, Support Vector Machine Approach, Signal Peptide Cleavage Site, Regularization Factor, Protein Secondary Structure Prediction

Contributor Information

Lipo Wang, Email: elpwang@ntu.edu.sg.

Ke Chen, Email: chenk@cs.zju.edu.cn.

Yew Soon Ong, Email: asysong@ntu.edu.sg.

Zheng Rong Yang, Email: z.r.yang@ex.ac.uk, http://www.dcs.ex.ac.uk/~zryang.

Natasha Young, http://www.dcs.ex.ac.uk/~zryang.

References

  • 1. Corchado E., Fyfe C. Relevance and kernel self-organising maps. In: International Conference on Artificial Neural Networks; 2003.
  • 2. Qian N., Sejnowski T.J. Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol. 1988;202:865–884. doi: 10.1016/0022-2836(88)90564-5.
  • 3. Thompson T.B., Chou K.C., Zhang C. Neural network prediction of the HIV-1 protease cleavage sites. Journal of Theoretical Biology. 1995;177:369–379. doi: 10.1006/jtbi.1995.0254.
  • 4. Nielsen M., Lundegaard C., Worning P., Lauemoller S.L., Lamberth K., Buus S., Brunak S., Lund O. Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Science. 2003;12:1007–1017. doi: 10.1110/ps.0239403.
  • 5. Hansen J.E., Lund O., Engelbrecht J., Bohr H., Nielsen J.O. Prediction of O-glycosylation of mammalian proteins: specificity patterns of UDP-GalNAc: polypeptide N-acetylgalactosaminyltransferase. Biochem. J. 1995;308:801–813. doi: 10.1042/bj3080801.
  • 6. Gutteridge A., Bartlett G.J., Thornton J.M. Using a neural network and spatial clustering to predict the location of active sites in enzymes. Journal of Molecular Biology. 2003;330:719–734. doi: 10.1016/S0022-2836(03)00515-1.
  • 7. Blom N., Gammeltoft S., Brunak S. Sequence and structure based prediction of eukaryotic protein phosphorylation sites. J. Mol. Biol. 1999;294:1351–1362. doi: 10.1006/jmbi.1999.3310.
  • 8. Ehrlich L., Reczko M., Bohr H., Wade R.C. Prediction of protein hydration sites from sequence by modular neural networks. Protein Eng. 1998;11:11–19. doi: 10.1093/protein/11.1.11.
  • 9. Thomson R., Hodgman T.C., Yang Z.R., Doyle A.K. Characterising proteolytic cleavage site activity using bio-basis function neural networks. Bioinformatics. 2003;19:1741–1747. doi: 10.1093/bioinformatics/btg237.
  • 10. Yang Z.R., Thomson R. A novel neural network method in mining molecular sequence data. IEEE Trans. on Neural Networks. 2005;16:263–274. doi: 10.1109/TNN.2004.836196.
  • 11. Yang Z.R. Orthogonal kernel machine in prediction of functional sites in proteins. IEEE Trans. on Systems, Man and Cybernetics. 2005;35:100–106. doi: 10.1109/TSMCB.2004.840723.
  • 12. Cai Y.D., Ricardo P.W., Jen C.H., Chou K.C. Application of SVMs to predict membrane protein types. Journal of Theoretical Biology. 2004;226:373–376. doi: 10.1016/j.jtbi.2003.08.015.
  • 13. Cai Y.D., Lin X.J., Xu X.B., Chou K.C. Prediction of protein structural classes by support vector machines. Computers & Chemistry. 2002;26:293–296. doi: 10.1016/S0097-8485(01)00113-9.
  • 14. Hua S., Sun Z. Support vector machine approach for protein subcellular localization prediction. Bioinformatics. 2001;17:721–728. doi: 10.1093/bioinformatics/17.8.721.
  • 15. Chu F., Jin G., Wang L. Cancer diagnosis and protein secondary structure prediction using support vector machines. In: Wang L., editor. Support Vector Machines. Heidelberg: Springer; 2004.
  • 16. Park K., Kanehisa M. Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs. Bioinformatics. 2003;19:1656–1663. doi: 10.1093/bioinformatics/btg222.
  • 17. Carter R.J., Dubchak I., Holbrook S.R. A computational approach to identify genes for functional RNAs in genomic sequences. Nucleic Acids Res. 2001;29:3928–3938. doi: 10.1093/nar/29.19.3928.
  • 18. Ding C.H.Q., Dubchak I. Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics. 2001;17:349–358. doi: 10.1093/bioinformatics/17.4.349.
  • 19. Cai C.Z., Wang W.L., Sun L.Z., Chen Y.Z. Protein function classification via support vector machine approach. Mathematical Biosciences. 2003;185:111–122. doi: 10.1016/S0025-5564(03)00096-8.
  • 20. Cai Y.D., Lin S.L. Support vector machines for predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence. Biochimica et Biophysica Acta (BBA) - Proteins & Proteomics. 2003;1648:127–133. doi: 10.1016/S1570-9639(03)00112-2.
  • 21. Lin K., Kuang Y., Joseph J.S., Kolatkar P.R. Conserved codon composition of ribosomal protein coding genes in Escherichia coli, Mycobacterium tuberculosis and Saccharomyces cerevisiae: lessons from supervised machine learning in functional genomics. Nucleic Acids Res. 2002;30:2599–2607. doi: 10.1093/nar/30.11.2599.
  • 22. Jaakkola T., Diekhans M., Haussler D. Using the Fisher kernel method to detect remote protein homologies. In: Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology; 1999. pp. 149–158.
  • 23. Jaakkola T., Diekhans M., Haussler D. A Discriminative Framework for Detecting Remote Protein Homologies. Journal of Computational Biology. 2000;7:95–114. doi: 10.1089/10665270050081405.
  • 24. Karchin R., Karplus K., Haussler D. Classifying G-protein coupled receptors with support vector machines. Bioinformatics. 2002;18:147–159. doi: 10.1093/bioinformatics/18.1.147.
  • 25. Guermeur Y., Pollastri G., Elisseeff A., Zelus D., Paugam-Moisy H., Baldi P. Combining protein secondary structure prediction models with ensemble methods of optimal complexity. Neurocomputing. 2004;56:305–327. doi: 10.1016/j.neucom.2003.10.004.
  • 26. Kohonen T. Self-Organization and Associative Memory. 3rd ed. Berlin: Springer; 1989.
  • 27. Arrigo P., Giuliano F., Scalia F., Rapallo A., Damiani G. Identification of a new motif on nucleic acid sequence data using Kohonen’s self-organising map. CABIOS. 1991;7:353–357.
  • 28. Bengio Y., Pouliot Y. Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network. CABIOS. 1990;6:319–324.
  • 29. Ferran E.A., Ferrara P. Topological maps of protein sequences. Biological Cybernetics. 1991;65:451–458. doi: 10.1007/BF00204658.
  • 30. Wang H.C., Dopazo J., Carazo J.M. Self-organising tree growing network for classifying amino acids. Bioinformatics. 1998;14:376–377. doi: 10.1093/bioinformatics/14.4.376.
  • 31. Ferran E.A., Pflugfelder B. A hybrid method to cluster protein sequences based on statistics and artificial neural networks. CABIOS. 1993;9:671–680.
  • 32. Tamayo P., Slonim D., Mesirov J., Zhu Q., Kitareewan S., Dmitrovsky E., Lander E.S., Golub T.R. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. PNAS. 1999;96:2907–2912.
  • 33. Scholkopf B. The kernel trick for distances. Technical Report. Microsoft Research; May 2000.
  • 34. MacDonald D., Koetsier J., Corchado E., Fyfe C. A kernel method for classification. In: Monroy R., Arroyo-Figueroa G., Sucar L.E., Sossa H., editors. MICAI 2004: Advances in Artificial Intelligence. Heidelberg: Springer; 2004.
  • 35. Fyfe C., MacDonald D. Epsilon-insensitive Hebbian learning. Neurocomputing. 2002;47:35–57. doi: 10.1016/S0925-2312(01)00579-3.
  • 36. Dayhoff M.O., Schwartz R.M., Orcutt B.C. A model of evolutionary change in proteins. Matrices for detecting distant relationships. Atlas of Protein Sequence and Structure. 1978;5:345–358.
  • 37. Johnson M.S., Overington J.P. A structural basis for sequence comparisons-an evaluation of scoring methodologies. J. Molec. Biol. 1993;233:716–738. doi: 10.1006/jmbi.1993.1548.
  • 38. Yang Z.R., Berry E. Reduced bio-basis function neural networks for protease cleavage site prediction. Journal of Bioinformatics and Computational Biology. 2004;2:511–531. doi: 10.1142/S0219720004000715.
  • 39. Thomson R., Esnouf R. Predict disordered proteins using bio-basis function neural networks. In: Yang Z.R., Yin H., Everson R.M., editors. Intelligent Data Engineering and Automated Learning – IDEAL 2004. Heidelberg: Springer; 2004. pp. 19–27.
  • 40. Yang Z.R., Thomson R., Esnouf R. RONN: use of the bio-basis function neural network technique for the detection of natively disordered regions in proteins. Bioinformatics (accepted).
  • 41. Berry E., Dalby A., Yang Z.R. Reduced bio basis function neural network for identification of protein phosphorylation sites: Comparison with pattern recognition algorithms. Computational Biology and Chemistry. 2004;28:75–85. doi: 10.1016/j.compbiolchem.2003.11.005.
  • 42. Yang Z.R., Chou K.C. Bio-basis function neural networks for the prediction of the O-linkage sites in glyco-proteins. Bioinformatics. 2004;20:903–908. doi: 10.1093/bioinformatics/bth001.
  • 43. Yang Z.R. Prediction of Caspase Cleavage Sites Using Bayesian Bio-Basis Function Neural Networks. Bioinformatics (in press).
  • 44. Yang Z.R. Mining SARS-CoV protease cleavage data using decision trees, a novel method for decisive template searching. Bioinformatics (accepted).
  • 45. Sidhu A., Yang Z.R. Predict signal peptides using bio-basis function neural networks. Applied Bioinformatics (accepted).
  • 46. Draghici S., Potter R.B. Predicting HIV drug resistance with neural networks. Bioinformatics. 2003;19:98–107. doi: 10.1093/bioinformatics/19.1.98.
  • 47. Beerenwinkel N., Daumer M., Oette M., Korn K., Hoffmann D., Kaiser R., Lengauer T., Selbig J., Walter H. Geno2pheno: estimating phenotypic drug resistance from HIV-1 genotypes. Nucleic Acids Res. 2003;31:3850–3855. doi: 10.1093/nar/gkg575.
  • 48. Beerenwinkel N., Schmidt B., Walter H., Kaiser R., Lengauer T., Hoffmann D., Korn K., Selbig J. Diversity and complexity of HIV-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype. PNAS. 2002;99:8271–8276. doi: 10.1073/pnas.112177799.
  • 49. Zazzi M., Romano L., Giulietta V., Shafer R.W., Reid C., Bello F., Parolin C., Palu G., Valensin P. Comparative evaluation of three computerized algorithms for prediction of antiretroviral susceptibility from HIV type 1 genotype. Journal of Antimicrobial Chemotherapy. 2004;53:356–360. doi: 10.1093/jac/dkh021.
  • 50. Sa-Filho D.J., Costa L.J., de Oliceira C.F., Guimaraes A.P.C., Accetturi C.A., Tanuri A., Diaz R.S. Analysis of the protease sequences of HIV-1 infected individuals after Indinavir monotherapy. Journal of Clinical Virology. 2003;28:186–202. doi: 10.1016/S1386-6532(03)00007-6.

Articles from Advances in Natural Computation are provided here courtesy of Nature Publishing Group
