Protein & Cell. 2012 Aug 4;3(8):602–608. doi: 10.1007/s13238-012-2914-8

Functional annotation from the genome sequence of the giant panda

Tong Huo 1,3, Yinjie Zhang 1,3, Jianping Lin 2,3,
PMCID: PMC4875358  PMID: 22865348

Abstract

The giant panda is one of the most critically endangered species owing to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially proteins related to specific traits, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database (GPPD), in which the proteins were divided into two groups: the 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas the 822 proteins whose functions cannot be predicted by GeneScan formed the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins, on the assumption that the unknown-function group contains proteins related to panda-specific traits. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) that are specific to the giant panda genome when compared with the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting-protein prediction using the Database of Interacting Proteins. Nineteen of the 32 proteins were predicted to be zinc-binding proteins and may thereby affect the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.
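The filtering step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify how cross-Blast was run, so the sketch assumes that each unknown-function protein was searched against the dog and horse proteins, that the hits are available in standard BLAST tabular output (12 tab-separated columns, with the query ID in column 1 and the E-value in column 11), and that a protein counts as panda-specific when it has no hit at or below an E-value cutoff in either species. The function names, the protein IDs, and the cutoff of 1e-5 are hypothetical.

```python
def queries_with_hits(blast_tabular, max_evalue=1e-5):
    """Return the query IDs that have at least one hit with E-value <= max_evalue.

    `blast_tabular` is text in standard BLAST tabular format:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    The default cutoff is an assumption, not a value taken from the paper.
    """
    hits = set()
    for line in blast_tabular.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits


def species_specific(candidate_ids, *hit_sets):
    """Keep the candidates that have no accepted hit in any comparison species."""
    matched = set().union(*hit_sets)
    return sorted(pid for pid in candidate_ids if pid not in matched)


# Toy data with hypothetical protein IDs (not taken from the GPPD).
unknown_group = ["P1", "P2", "P3", "P4"]
dog_blast = (
    "P1\tdogA\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-40\t200\n"
    "P2\tdogB\t30.0\t50\t35\t2\t1\t50\t1\t50\t0.5\t25\n"
)
horse_blast = "P2\thorseA\t85.0\t120\t18\t0\t1\t120\t1\t120\t1e-30\t180\n"

dog_hits = queries_with_hits(dog_blast)
horse_hits = queries_with_hits(horse_blast)
panda_specific = species_specific(unknown_group, dog_hits, horse_hits)
print(panda_specific)
```

In the toy data, P1 has a strong dog hit and P2 has a strong horse hit (its dog hit is rejected by the cutoff), so only P3 and P4 remain. Under the same assumptions, applying such a filter to the 822 unknown-function proteins would correspond to the step that produced the 32 panda-specific proteins.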

Electronic Supplementary Material

The online version of this article (doi:10.1007/s13238-012-2914-8) contains supplementary material, which is available to authorized users.

Keywords: Giant panda, GPPD, cross-Blast

Electronic Supplementary Material

Table S1 (PDF, 153 kB)

Articles from Protein & Cell are provided here courtesy of Oxford University Press
