Abstract
Rotavirus (RV) diarrhoea causes a huge number of deaths in children less than 5 years of age. In spite of available vaccines, it has been difficult to combat RV because of the large number of antigenically distinct genotypes, high mutation rates and the generation of reassortant viruses made possible by the segmented genome. RV is a eukaryotic virus which utilizes host cell machinery for its propagation. Since RV encodes only 12 proteins, post-translational modification (PTM) is an important mechanism for modifying these proteins and consequently altering their function. A single protein that exhibits different functions in different locations or subcellular sites is said to be 'moonlighting'. There is therefore a possibility that viral proteins moonlight at separate locations and at different times to exert diverse cellular effects. Based on the primary sequence, the putative behaviour of proteins in the cellular environment can be predicted, which helps to classify them into different functional families with a high reliability score. In this study, sites for phosphorylation, glycosylation and SUMOylation of the six RV structural proteins (VP1, VP2, VP3, VP4, VP6 and VP7) and five nonstructural proteins (NSP1, NSP2, NSP3, NSP4 and NSP5), together with their functional families, were predicted. As NSP6 is a very small protein and is not required for virus growth and replication, it was not included in the study. Classification of RV proteins revealed multiple putative functions for each structural protein and a varied number of PTM sites, indicating that RV proteins may also moonlight depending on requirements during the viral life cycle. Targeting the crucial PTM sites on RV structural proteins may have implications for developing future antirotaviral strategies.
Background
Rotavirus, a member of the family Reoviridae, is one of the major diarrheal agents, causing a huge number of deaths worldwide, especially among children and dairy animals. An estimated 527,000 children aged <5 years die from rotavirus diarrhea each year, with >85% of these deaths occurring in low-income countries of Africa and Asia [1]. There are seven groups of rotavirus, designated A, B, C, D, E, F and G. Humans are primarily infected by groups A, B and C; of these, group A is the most pathogenic. The viral genome consists of eleven double-stranded RNA segments encoding twelve proteins. Of these twelve proteins, NSP1-NSP6 are nonstructural and the remaining six (VP1-VP4, VP6 and VP7) comprise the viral RNA polymerase and the other structural components of the virus [2]. Rotaviruses primarily infect the enterocytes of the small intestine, though there are a few recent reports of extraintestinal infections. From 1975 through 2000 there were 12 case reports in which rotavirus infection was found in association with CNS complications [3].
Viruses are the major cause of several newly emerged diseases throughout the world. In spite of their small genomes, viruses are capable of capturing host machinery for replication and propagation. Viruses in general express multifunctional proteins with various sites for post-translational modification (PTM). This flexibility gives them the chance to participate in cellular signaling events, either to increase their pathogenicity or to escape the robust host immune system. Modification of this kind is most probably mediated by host factors. Some modifications of rotavirus proteins are known, such as the hyperphosphorylation of the NSP5 protein and the glycosylation of the VP7 protein, but the overall post-translational modification status of the rotavirus proteome is yet to be elucidated. As rotavirus is a eukaryotic virus and utilizes the host machinery to replicate and propagate, it can be hypothesized that rotaviral proteins are also modified in host cells.
The phenomenon by which a single protein exhibits different roles in different environments or in different subcellular locations is known as 'moonlighting' [4]. As stated above, viruses with their small repertoire of proteins are able to capture host machinery, so it is possible that viral proteins moonlight in the host environment, either by interacting with cellular multifunctional protein complexes or with the help of post-translational modification. Several multifunctional viral proteins are known. The NS1 protein of influenza virus inhibits host immune responses by limiting both interferon (IFN) production and the downstream effects of IFN-induced proteins, and also acts directly to modulate other important aspects of the virus replication cycle, including viral RNA replication, viral protein synthesis and general host-cell physiology [5]. The nucleocapsid (N) protein of hantaviruses is associated with viral RNA, but it also interacts with the virus polymerase and with one of the glycoproteins, and interferes with important regulatory pathways in infected cells [6]. Similarly, the hepatitis B virus (HBV) protein HBx is a multifunctional viral protein that modulates transcription, cell responses to genotoxic stress, protein degradation and signaling pathways [7], and NS5A of hepatitis C virus (HCV) has roles in viral replication and in modulation of the cellular signaling apparatus and cell cycle-regulatory kinases [8]. Functional families of Japanese encephalitis virus (JEV) proteins have been assigned previously using Support Vector Machine based software (SVMProt), which categorizes every viral protein into different families [9].
Examples of major protein modifications that might help a virus during its life cycle include phosphorylation of the PB1-F2 protein, which promotes apoptosis in influenza [10], and phosphorylation of HCV NS5A, which modulates the host interferon-induced antiviral response [11]. Similarly, the glycosylated West Nile virus (WNV) envelope protein plays an important role in replication and maturation; glycosylation of the E1 protein of HCV causes translocation of the protein to the cell surface, whereas glycosylation decreases the virulence of influenza virus H3N2 [12]. SUMO modification of the adenoviral oncoprotein E1B-55kDa relocalizes the protein from the cytoplasm to the nucleus [13], and the SUMOylated Epstein-Barr virus (EBV) protein BZLF1 disrupts PML bodies [14]. Methylation has mostly been studied in histones, which are associated with DNA; their methylation can activate or repress gene expression [15]. Nuclear proteins and various cytoplasmic regulators are subject to lysine acetylation, which can confer diverse functional roles on proteins [16]. In proteins with one or a few sites, it generally acts as an on-off switch, or it may interact with other PTMs to exert its effects. Understanding the functional roles of viral proteins is therefore a very important aspect of studying virus-mediated pathogenesis.
Methodology
Sequence retrieval
Rotavirus protein sequences were downloaded from the NCBI database (http://www.ncbi.nlm.nih.gov/). Sequences of human rotavirus strain PA169 were used for the study. As the NSP6 protein is dispensable for the rotavirus life cycle in the cell, it was not included in the study. Accession numbers for the viral sequences are as follows: NSP1 (EF554132), NSP2 (EF554133), NSP3 (EF554134), NSP4 (EF554135), NSP5 (EF554136), VP1 (EF554126), VP2 (EF554127), VP3 (EF554128), VP4 (EF554129), VP6 (EF554130) and VP7 (EF554131).
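The retrieval step can be scripted. The following is a minimal sketch, not the procedure used in the study: it assumes the listed accessions are GenBank nucleotide records and that the NCBI E-utilities `efetch` endpoint is queried with `rettype=fasta_cds_aa` to obtain the translated coding sequence. The function names `efetch_url` and `fetch_protein_fasta` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Accession numbers exactly as listed in the text (human rotavirus strain PA169).
ACCESSIONS = {
    "NSP1": "EF554132", "NSP2": "EF554133", "NSP3": "EF554134",
    "NSP4": "EF554135", "NSP5": "EF554136",
    "VP1": "EF554126", "VP2": "EF554127", "VP3": "EF554128",
    "VP4": "EF554129", "VP6": "EF554130", "VP7": "EF554131",
}

# Assumption: the paper cites only the NCBI home page; the E-utilities
# endpoint below is one possible programmatic route to the same records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession: str) -> str:
    """Build an efetch URL requesting the translated CDS of a nucleotide record."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta_cds_aa",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def fetch_protein_fasta(accession: str) -> str:
    """Download the protein FASTA text for one accession (needs network access)."""
    with urlopen(efetch_url(accession)) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_protein_fasta(ACCESSIONS["NSP5"])` would return FASTA text that can be pasted into the prediction servers described below.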
Prediction of protein functional family by web-based Support Vector Machine software SVMProt
SVMProt (http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi) classifies a protein into different functional families from its primary sequence information. SVMProt uses a slightly modified reliability score, the R-value, which is a scoring function for estimating the accuracy of support vector machine classification. The P-value is the expected classification accuracy (probability of correct classification). It is derived from the statistical relationship between the R-value and the actual classification accuracy, based on the analysis of 9,932 positive and 45,999 negative samples of proteins [17].
Prediction of phosphorylation sites in proteins using NetPhos 2.0 server
The serine, threonine and tyrosine phosphorylation status of the rotavirus proteins was predicted using the online open-access server NetPhos 2.0 (http://www.cbs.dtu.dk/services/NetPhos/). The NetPhos 2.0 server predicts serine, threonine and tyrosine phosphorylation sites in eukaryotic proteins using artificial neural networks. The threshold value for the prediction is 0.5 [18].
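The thresholding step can be illustrated with a short sketch. It does not reproduce the NetPhos neural network; it only lists the candidate Ser/Thr/Tyr residues in a sequence and keeps those whose score passes the 0.5 cutoff. The toy sequence, the scores and the function names are hypothetical, and treating the cutoff as strictly greater than 0.5 is an assumption.

```python
from typing import Dict, List, Tuple

THRESHOLD = 0.5  # cutoff stated in the text


def candidate_residues(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position, residue) for every Ser, Thr and Tyr."""
    return [
        (position, residue)
        for position, residue in enumerate(sequence.upper(), start=1)
        if residue in "STY"
    ]


def predicted_sites(scores: Dict[int, float], threshold: float = THRESHOLD) -> List[int]:
    """Keep positions whose score is above the threshold (assumed strict)."""
    return sorted(pos for pos, score in scores.items() if score > threshold)


# Hypothetical data for illustration only; real scores come from the server output.
toy_sequence = "MSTAYKLS"
toy_scores = {2: 0.91, 3: 0.42, 5: 0.50, 8: 0.77}

print(candidate_residues(toy_sequence))  # [(2, 'S'), (3, 'T'), (5, 'Y'), (8, 'S')]
print(predicted_sites(toy_scores))       # [2, 8]
```

Counting the entries returned by `predicted_sites` for each protein gives per-protein totals of the kind reported in the Discussion.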
Prediction of N-linked and O-linked glycosylation profile for the rotavirus proteins
The NetNGlyc 1.0 server was used to find the N-glycosylated proteins of the virus. NetNGlyc 1.0 (http://www.cbs.dtu.dk/services/NetNGlyc/) predicts N-glycosylation sites in human proteins using artificial neural networks that examine the sequence context of Asn-Xaa-Ser/Thr sequons. An N-glycosylation potential of >0.5 was used as the cutoff value [19].
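The Asn-Xaa-Ser/Thr sequon that NetNGlyc evaluates can be located with a simple pattern scan. The sketch below only finds candidate sequons; it does not compute the neural-network glycosylation potential. Excluding proline at the Xaa position is a widely used refinement that is assumed here and is not stated in the text, and the function name `find_sequons` is hypothetical.

```python
import re
from typing import List

# Lookahead so that overlapping sequons (e.g. N-N-T-S) are all reported.
# Assumption: Xaa is any residue except proline.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence: N2 (N-G-T), N9 (N-N-T) and N10 (N-T-S) qualify,
# while N6 is followed by proline and is skipped.
print(find_sequons("MNGTANPSNNTS"))  # [2, 9, 10]
```

Only the sequons whose predicted potential exceeds 0.5 on the server would then be counted as N-glycosylation sites.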
The NetOGlyc 3.1 server (http://www.cbs.dtu.dk/services/NetOGlyc/) was used to predict O-glycosylation of the viral proteins during infection. The NetOGlyc server produces neural network based predictions of mucin-type GalNAc O-glycosylation sites in mammalian proteins. The G-score is the score from the best general predictor; the I-score is the score from the best isolated-site predictor. If the G-score is >0.5 the residue is predicted as glycosylated; the higher the score, the more confident the prediction. For threonines an additional score is used [20].
Prediction of potential SUMOylation sites in viral proteins
SUMOylation sites were predicted using the SUMOsp 2.0 server (http://sumosp.biocuckoo.org/online.php). The low threshold setting was chosen for the prediction. GPS and MotifX, two powerful prediction strategies previously used for phosphorylation prediction, were used to create this algorithm [21].
Prediction of a few other post-translational modifications using the AutoMotif 2.0 Server
The AutoMotif 2.0 Server (http://ams2.bioinfo.pl/) allows the identification of post-translational modification (PTM) sites in proteins based only on local sequence information. The Support Vector Machine (SVM) search type was used for the screening. The performance of the SVM models is described by the recall R and the precision P. The precision P measures the percentage of predictions that are correct, whereas the recall R gives the percentage of observed positives that are correctly predicted [22]. The AutoMotif 2.0 search was carried out for methylation and acetylation.
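The two performance measures can be made concrete with a small worked example. This sketch uses the standard definitions (precision as the fraction of predicted sites that are true, recall as the fraction of true sites that are recovered); the site positions are hypothetical and the function is not part of AutoMotif.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[int], observed: Set[int]) -> Tuple[float, float]:
    """Return (precision, recall) for predicted versus experimentally observed sites."""
    true_positives = len(predicted & observed)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(observed) if observed else 0.0
    return precision, recall


# Hypothetical positions: four predicted sites, three observed sites, two shared.
predicted_sites = {10, 25, 40, 77}
observed_sites = {10, 40, 90}

precision, recall = precision_recall(predicted_sites, observed_sites)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
# precision = 0.50, recall = 0.67
```

A model with high precision but low recall would report few false sites but miss many real ones, which matters when predicted site counts are compared between proteins.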
Discussion
The types and numbers of PTM sites and the functional classification of rotavirus proteins are given in Table 1 (see supplementary material). NSP1, the viral IRF3 antagonist, contains 26 phosphorylation sites. It also has a total of 5 glycosylation sites, 4 of which are N-linked. The SUMOylation prediction revealed 3 type II sites but no type I site. NSP1 has no methylation site but has one acetylation site. SVMProt classified NSP1 into 6 functional families, with the highest P-value for the Zinc-binding family, followed in order of P-value by RNA-binding, Metal-binding, Nuclear receptor, DNA repair and Magnesium-binding. These results are consistent with previous wet-lab experiments showing that NSP1 has a zinc-binding domain and RNA-binding activity.
NSP2 is a smaller protein than NSP1 and contains 12 phosphorylation sites. It has only one N-linked glycosylation site. NSP2 has 5 type II and type I SUMOylation sites. It has one methylation site but no acetylation sites. NSP2 may function in the Zinc-binding, Actin-binding, Calcium-binding, Type II secretory pathway and Pore-forming toxins families.
NSP3 has 23 phosphorylation sites and no predicted glycosylation or acetylation sites. It has 4 SUMOylation sites and one methylation site. NSP3 belongs to seven families, which in order of P-value are Zinc-binding, All lipid-binding, Metal-binding, Chlorophyll biosynthesis, Actin biosynthesis, DNA repair and DNA condensation.
NSP4, the putative viral enterotoxin, has 9 phosphorylation sites. The NSP4 sequence contains 3 glycosylation sites and one type II SUMOylation site. It also possesses one methylation site. NSP4 is predicted to belong to three families: Transmembrane, ABC family and Magnesium-binding.
NSP5, previously reported to be hyperphosphorylated, was predicted to have 31 phosphorylation sites. Although VP1 and VP4 contain more phosphorylation sites, the extent of phosphorylation relative to protein size is much higher in NSP5. As NSP5 is involved in replication of the virus, phosphorylation might play a role in that process. Glycosylation prediction showed a total of 9 sites but, in contrast to the other proteins, 8 of them are O-glycosylated. NSP5 has 5 SUMOylation sites and 1 acetylation site but no methylation sites. According to SVMProt, NSP5 belongs to the Coat protein family, which was not known previously.
VP1, the largest protein of the virus and the viral polymerase, contains 64 phosphorylation sites, a high number compared with the other viral proteins. VP1 also has 5 glycosylation sites, 3 SUMOylation sites, 2 methylation sites and 3 acetylation sites. VP1 belongs to seven families: Transferase, Structural, DNA replication, DNA-binding, Zinc-binding, Photosystem I and Pore-forming toxins.
VP2 has 26 phosphorylation sites and 4 glycosylation sites. It has 13 SUMOylation sites, so SUMOylation may play a large role in the function of VP2 during the virus life cycle. VP2 has 4 methylation sites, more than any other rotaviral protein, but lacks an acetylation site. VP2 is known to remain associated with viral RNA and to act as a scaffolding protein, so it may control the expression of viral RNAs after infection. VP2 may function as an RNA-binding, Oxidoreductase, Type II secretory pathway and Magnesium-binding protein.
VP3 has 46 phosphorylation sites. It is also one of the most glycosylated proteins, with 9 N-linked and 1 O-linked glycosylation sites. It is also highly SUMOylated, with 5 type I and 5 type II sites. VP3 has only 2 methylation sites and 1 acetylation site. VP3 is classified into the Zinc-binding and Pore-forming toxins families.
VP4 also contains 64 phosphorylation sites, the same number as VP1. It is the most glycosylated protein, with 13 sites. VP4 is a surface protein, so the number of glycosylation sites may modulate the virulence of the virus, although this can only be confirmed by in vivo studies. It has 3 SUMOylation sites and 1 methylation site. VP4 is predicted to be a member of two families, Coat protein and Structural protein.
VP6 has 18 phosphorylation and 4 glycosylation sites. It contains 1 SUMOylation site and 1 methylation site but no acetylation site. VP6 belongs to the Coat protein family, which is consistent with its actual role in the virus.
VP7 contains 13 phosphorylation sites, 1 glycosylation site and 2 SUMOylation sites. VP7 has neither methylation nor acetylation sites. According to SVMProt, VP7 belongs to the Transmembrane, Coat protein and All lipid-binding families.
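The size-normalised comparison made above for NSP5, VP1 and VP4 can be sketched as a simple density calculation. The site counts are those reported in the text. The protein lengths are not given in the paper; the values below are approximate lengths typical of group A rotavirus proteins and are assumptions used for illustration only, so the exact lengths of the strain PA169 sequences should be substituted.

```python
# Predicted phosphorylation site counts as reported in the text.
PHOSPHO_SITES = {"NSP5": 31, "VP1": 64, "VP4": 64}

# Assumption: approximate lengths in amino acids, typical of group A
# rotavirus; these values are NOT taken from the paper.
APPROX_LENGTH = {"NSP5": 198, "VP1": 1088, "VP4": 776}


def sites_per_100_residues(sites: int, length: int) -> float:
    """Number of predicted sites per 100 amino acids."""
    return 100.0 * sites / length


density = {
    protein: sites_per_100_residues(PHOSPHO_SITES[protein], APPROX_LENGTH[protein])
    for protein in PHOSPHO_SITES
}

# Rank proteins from highest to lowest site density.
for protein, value in sorted(density.items(), key=lambda item: item[1], reverse=True):
    print(f"{protein}: {value:.1f} predicted sites per 100 residues")
```

Under these assumed lengths NSP5 has the highest density of predicted sites, even though VP1 and VP4 have roughly twice as many sites in absolute terms.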
Conclusion
Based on the sequence analysis, rotavirus proteins were found to have many potential sites which may undergo PTM. In addition, the functional family analysis categorized each protein into several different groups. Together, these two aspects of the present study raise the possibility of moonlighting behavior of rotavirus proteins. Protein phosphorylation can modulate protein activity, affect replication or change the specificity of a protein for interaction with other proteins. Phosphorylation prediction for rotavirus revealed a moderate to large number of phosphorylation sites on different proteins. Further experimental studies analyzing the role of phosphorylation in different viral proteins will uncover the function of this kind of modification. Glycosylation is generally correlated with the virulence of viruses, so glycosylated rotavirus proteins may have roles in virulence or cellular effects. Similarly, SUMOylation, acetylation and methylation of the proteins may have different roles, depending on the context of their action or on their interaction with other cellular factors.
Functional family analysis of rotavirus proteins revealed different predicted roles, some of which were known previously, while others are completely novel; proper investigation of these functions in the wet laboratory may open a new window for better understanding of rotavirus pathogenesis. Understanding the role of post-translational modifications in the biology of viral infection is one step towards developing successful treatment strategies. The importance of glycosylation, phosphorylation and SUMOylation in viral protein function and virus replication is well established. This knowledge should be exploited in future when designing novel therapeutics which specifically target one or more crucial PTMs in the virus life cycle.
Supplementary material
Acknowledgments
The study was supported by financial assistance from the Indian Council of Medical Research (ICMR), New Delhi, and the Program of funding research centers for emerging and reemerging infectious diseases (Okayama University-NICED, India) from the Ministry of Education, Culture, Sports, Science and Technology, Japan.
Footnotes
Citation: Chattopadhyay et al., Bioinformation 4(10): 448-451 (2010)
References
- 1. Centers for Disease Control and Prevention (CDC). MMWR Morb Mortal Wkly Rep. 2008;57:1255–1257.
- 2. Graff JW, et al. J Gen Virol. 2007;88:613–620. doi: 10.1099/vir.0.82255-0.
- 3. Lynch M, et al. Clin Infect Dis. 2001;33:932–938. doi: 10.1086/322650.
- 4. Jeffery CJ. Trends Biochem Sci. 1999;24(1):8–11. doi: 10.1016/s0968-0004(98)01335-8.
- 5. Hale BG, et al. J Gen Virol. 2008;89:2359–2376. doi: 10.1099/vir.0.2008/004606-0.
- 6. Kaukinen P, et al. Arch Virol. 2005;150:1693–1713. doi: 10.1007/s00705-005-0555-4.
- 7. Murakami S. J Gastroenterol. 2001;36:651–660. doi: 10.1007/s005350170027.
- 8. Reyes GR, et al. J Biomed Sci. 2002;9:187–197. doi: 10.1007/BF02256065.
- 9. Sahoo GC, et al. Bioinformation. 2008;3:1–7. doi: 10.6026/97320630003001.
- 10. Mitzner D, et al. Cell Microbiol. 2009;11:1502–1516. doi: 10.1111/j.1462-5822.2009.01343.x.
- 11. Huang Y, et al. Virology. 2007;364:1–9. doi: 10.1016/j.virol.2007.01.042.
- 12. Vigerust DJ, et al. Trends Microbiol. 2007;15:211–218. doi: 10.1016/j.tim.2007.03.003.
- 13. Endter C, et al. Proc Natl Acad Sci U S A. 2001;98:11312–11317. doi: 10.1073/pnas.191361798.
- 14. Adamson AL, et al. J Virol. 2001;75:2388–2399. doi: 10.1128/JVI.75.5.2388-2399.2001.
- 15. Kouzarides T. Cell. 2007;128:693–705. doi: 10.1016/j.cell.2007.02.005.
- 16. Eberharter A, et al. EMBO Rep. 2002;3:224–229. doi: 10.1093/embo-reports/kvf053.
- 17. Cai CZ, et al. Nucleic Acids Res. 2003;31:3692–3697. doi: 10.1093/nar/gkg600.
- 18. Blom N, et al. J Mol Biol. 1999;294:1351–1362. doi: 10.1006/jmbi.1999.3310.
- 19. http://www.cbs.dtu.dk/services/NetNGlyc/
- 20. Julenius K, et al. Glycobiology. 2005;15:153–164. doi: 10.1093/glycob/cwh151.
- 21. Ren J, et al. Proteomics. 2009;9:3409–3412. doi: 10.1002/pmic.200800646.
- 22. Plewczynski D, et al. Bioinformatics. 2005;21(10):2525–2527. doi: 10.1093/bioinformatics/bti333.