Abstract
Papillomaviruses (PVs) infect a wide range of vertebrates and have diversified into multiple genetic types, some of which have serious consequences for human health. Although PVs have to date only been characterized as exogenous viral forms, here we report the observation of an endogenous viral element (EPVLoa) in the genome of the platypus (Ornithorhynchus anatinus) that is related to PVs. Further data mining for endogenous PV-like elements is therefore warranted.
Papillomaviruses (PVs; family Papillomaviridae) are small viruses with a circular dsDNA genome approximately 8000 bp in length. PVs have been identified in a wide range of vertebrate species, particularly mammals (de Villiers et al., 2004; Antonsson & McMillan, 2006; Herbst et al., 2009; Lange et al., 2011), and >30 genera and 189 genetically distinct viral types have been described to date (de Villiers et al., 2004; Bernard et al., 2010). Depending on the viral type in question, human infection by PVs can either be asymptomatic or ultimately result in cancerous tumours (Bernard et al., 2006; Muñoz et al., 2006; Munday & Kiupel, 2010). Importantly, because PVs possess a dsDNA genome and enter the cell nucleus, they have a number of characteristics that might facilitate endogenization (Holmes, 2011). In particular, although the replication of PVs does not involve integration into the host genome, many genomic-integration events have been characterized in mammalian cells (Wentzensen et al., 2004).
We mined 74 chordate genomes (Table S1, available in JGV Online) held in the NCBI database (http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi) to screen for endogenous PVs. We used L1 (major capsid) protein sequences of various vertebrate PVs as queries in a genomic BLAST analysis, employing an e-value cut-off of 10e−10 to signify a positive hit. The following representative PV sequences, which cover the full phylogenetic diversity of PVs (Bernard et al., 2010), were used as queries: bovine PV 8 (GenBank accession no. NC_009752, host = cow), canine PV 2 (NC_006564, dog), Caretta caretta PV 1 (NC_011530, loggerhead sea turtle), common chimpanzee PV 1 (NC_001838, chimpanzee), human PV 1 (NC_001356, human), human PV type 6b (NC_001355, human), Mastomys coucha PV 2 (NC_008519, Southern multimammate mouse), Phocoena spinipinnis PV (NC_003348, Burmeister’s porpoise), Psittacus erithacus timneh PV (NC_003973, African grey parrot), Rousettus aegyptiacus PV type 1 (NC_008298, Egyptian fruit bat) and Tursiops truncatus PV 2 (NC_008184, bottlenose dolphin).
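The screening rule described above (keep a hit only if its e-value passes the cut-off) can be sketched in a few lines of Python. This is an illustrative sketch, not the pipeline used in the study: it assumes the hits are available as 12-column tabular BLAST output, takes the stated cut-off literally as the float `10e-10`, and treats a hit at or below the cut-off as positive. The row contents other than the e-value and the 93/325 identity figures reported in the text are placeholders, and the second row is entirely hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float


def parse_tabular_hits(text: str) -> list[Hit]:
    """Parse 12-column tabular BLAST output; the e-value is the 11th column."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[10])))
    return hits


def positive_hits(hits: list[Hit], cutoff: float = 10e-10) -> list[Hit]:
    """Keep hits whose e-value is at or below the cut-off (assumed rule)."""
    return [h for h in hits if h.evalue <= cutoff]


# Columns: query, subject, %identity, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore.
# Only the e-value 1e-25 and the 93/325 identity come from the text;
# every other value here is a placeholder.
EXAMPLE_ROWS = [
    ["CcPV1_L1", "contig_159295.1", "28.6", "325", "0", "0",
     "1", "325", "1", "975", "1e-25", "0"],
    ["CcPV1_L1", "hypothetical_contig", "25.0", "40", "0", "0",
     "1", "40", "1", "120", "0.5", "0"],
]
EXAMPLE_TABLE = "\n".join("\t".join(row) for row in EXAMPLE_ROWS)

if __name__ == "__main__":
    for hit in positive_hits(parse_tabular_hits(EXAMPLE_TABLE)):
        print(hit.query, hit.subject, hit.evalue)
```

Keeping the cut-off as a parameter makes it easy to check how sensitive the list of positive hits is to the chosen threshold.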
Surprisingly, all of these queries produced strongly positive hits to sequences in the genome of the platypus (Ornithorhynchus anatinus). For example, the Caretta caretta PV 1 query matched contig 159295.1 with an e-value of 1e−25 and a sequence identity of 93/325 (29 %). We term this sequence ‘endogenous PV-like element of O. anatinus’ (EPVLoa), although we only observed two copies with a strong match to L1 in the platypus genome: contig 159295.1, 1465 bp in length, containing one premature stop codon, and contig 9789.3, 20 736 bp in length, containing six premature stop codons. In addition, only partial L1 sequences were recovered. A reciprocal BLAST using the EPVLoa L1 sequence as the query confirmed its relationship to exogenous PVs, in this case exhibiting the closest match to human PV type 45 (e-value 5e−35, sequence identity = 91/286 = 32 %; Fig. 1). A BLAST analysis using other PV proteins revealed no positive hits, nor did an equivalent analysis using Polyomaviridae, the viral family related most closely to the Papillomaviridae (Woolford et al., 2007). That EPVLoa is very rare and that we were unable to identify a complete PV genome may be a function of the relatively low quality of the platypus genome, which is currently only available at sixfold sequencing coverage and is therefore of uncertain nature (Lewin et al., 2009). Alternatively, it may be that these endogenization events are extremely rare, and/or that integration events only involve L1 sequences. With respect to the latter, it is notable that integration events involving partial PV genomes, as well as single PV genes, have been documented (zur Hausen, 2002). Finally, that EPVLoa is divergent from any known extant PV, and contains multiple premature stop codons, argues strongly against contamination by exogenous viruses.
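The two quantities quoted in this paragraph, percentage identity and the number of premature stop codons, are simple to compute. The sketch below is illustrative only. It assumes the standard genetic code (stop codons TAA, TAG and TGA), a known reading frame, and that a stop in the final codon counts as a normal terminator rather than a premature one. The function names and the toy nucleotide sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def percent_identity(identical: int, aligned: int) -> float:
    """Identical positions as a percentage of the aligned length."""
    return 100.0 * identical / aligned


def count_premature_stops(nt_seq: str, frame: int = 0) -> int:
    """Count in-frame stop codons, ignoring one in the final codon."""
    seq = nt_seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    # The last complete codon may be a genuine terminator, so skip it.
    return sum(1 for codon in codons[:-1] if codon in STOP_CODONS)


if __name__ == "__main__":
    # Identity figures reported in the text.
    print(round(percent_identity(93, 325)))  # Caretta caretta PV 1 query
    print(round(percent_identity(91, 286)))  # reciprocal hit to HPV45
    # Toy sequence: ATG AAA TAG GGC TGA has one internal stop (TAG).
    print(count_premature_stops("ATGAAATAGGGCTGA"))
```

Rounding the two ratios reproduces the 29 % and 32 % given in the text.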
Fig. 1.
Sequence alignment of partial L1 protein sequences from EPVLoa and exogenous papillomaviruses. Asterisks above the alignment denote amino acids shared among all sequences; diamonds (⧫) mark sites at which EPVLoa possesses an amino acid residue shared with ≥80 % of the exogenous PVs in the alignment. Amino acids shared by EPVLoa and HPV45 (GenBank accession no. ABP99855) are highlighted by double underlining at the bottom of the alignment. Note that this is not the alignment used in the phylogenetic analysis (Fig. 2); for that analysis, all highly divergent regions, including insertions and deletions, were removed using Gblocks (Talavera & Castresana, 2007; Fig. S1). Virus abbreviations are given in Table 1.
To determine the evolutionary relationships between EPVLoa and exogenous PVs, we collected representatives of the full phylogenetic diversity of exogenous PVs from GenBank (n = 44; Table 1). We then aligned these sequences with EPVLoa using Clustal X (Larkin et al., 2007), followed by manual adjustment using Se-Al (http://tree.bio.ed.ac.uk/software/seal/). This resulted in an L1 protein alignment of 312 aa in length, of which 15 % of amino acid sites (47/312) were conserved among all sequences including EPVLoa (Fig. 1). Although EPVLoa is clearly divergent from the exogenous PVs, all of these sequences share a number of relatively conserved regions (such as residues 47–52, 129–133, 140–147 and 154–174; Fig. 1). Next, we used the Gblocks program (Talavera & Castresana, 2007) to remove the divergent and ambiguously aligned regions, including all those containing insertions and deletions (Fig. S1). This resulted in a final sequence alignment of 196 aa (including 19 invariant amino acid residues) from which evolutionary relationships could be inferred. Phylogenetic analysis of this 196-residue alignment was performed using the maximum-likelihood method available in PhyML 3.0 (Guindon et al., 2010), incorporating the WAG+Γ model of amino acid substitution, with the robustness of each node determined using 1000 bootstrap replicates. The resulting phylogenetic tree placed EPVLoa as more divergent than all known exogenous PVs (Fig. 2), indicative of an ancient divergence event, and hence our designation that it is derived from a ‘PV-like’ virus.
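The conservation statistics quoted above can be illustrated with a short Python sketch. It counts alignment columns that are identical across all sequences, and, following the ≥80 % criterion in the legend to Fig. 1, columns where the residue in one target sequence is shared by at least 80 % of the remaining sequences. This is a sketch under stated assumptions rather than the procedure used in the study: gap characters are written as `-`, gapped columns are never counted as conserved, and the toy alignment and function names are hypothetical.

```python
def invariant_columns(alignment: list[str]) -> list[int]:
    """Indices of columns where every sequence has the same residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for i in range(length):
        residues = {seq[i] for seq in alignment}
        if len(residues) == 1 and "-" not in residues:
            columns.append(i)
    return columns


def shared_with_majority(target: str, others: list[str],
                         threshold: float = 0.8) -> list[int]:
    """Columns where the target residue occurs in >= threshold of others."""
    columns = []
    for i, residue in enumerate(target):
        if residue == "-":
            continue  # a gap in the target is never counted as shared
        matches = sum(1 for seq in others if seq[i] == residue)
        if matches / len(others) >= threshold:
            columns.append(i)
    return columns


if __name__ == "__main__":
    # Toy alignment: one target sequence and five others (all hypothetical).
    target = "MKV-L"
    others = ["MKVAL", "MRVAL", "MKIAL", "MKVAL", "MKVCL"]
    print(invariant_columns([target] + others))
    print(shared_with_majority(target, others))
    # Fraction of conserved sites reported in the text: 47 of 312.
    print(round(100 * 47 / 312))
```

In the toy alignment only the first and last columns are invariant, whereas the 80 % rule also picks up the two columns in which a single sequence differs from the target.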
Table 1.
GenBank accession numbers of L1 protein sequences of exogenous PVs used in this analysis
Virus | Abbreviation | GenBank accession no. |
Common chimpanzee papillomavirus 1 | CCPV | NP_045018 |
Human papillomavirus type 6b | HPV6b | NP_040304 |
Rhesus monkey papillomavirus | RMPV | NP_043338 |
Colobus guereza papillomavirus type 2 | CGPV2 | YP_004646343 |
Human papillomavirus type 92 | HPV92 | NP_775311 |
Macaca fascicularis papillomavirus type 2 | MFPV2 | YP_004646337 |
Canine papillomavirus 5 | CPV5 | YP_003204674 |
Canine papillomavirus 4 | CPV4 | YP_001648805 |
Capreolus capreolus papillomavirus 1 | CaCPV1 | YP_002004574 |
Deer papillomavirus | DPV | NP_041300 |
European elk papillomavirus | EElPV | NP_041313 |
Ovine papillomavirus 1 | OPV1 | NP_044438 |
Sus scrofa papillomavirus type 1 | SSPV1 | YP_002235542 |
Francolinus leucoscepus papillomavirus 1 | FLPV1 | YP_003104804 |
Erinaceus europaeus papillomavirus | EEuPV | YP_002427696 |
Equine papillomavirus 2 | EPV2 | YP_002635574 |
Caretta caretta papillomavirus 1 | CCPV1 | YP_002308363 |
Chelonia mydas papillomavirus 1 | CMPV1 | YP_002308370 |
Bovine papillomavirus 8 | BPV8 | YP_001429551 |
Fringilla coelebs papillomavirus | FCPV | NP_663767 |
Human papillomavirus 116 | HPV116 | YP_003084352 |
Mastomys natalensis papillomavirus | MNPV | NP_042019 |
Cottontail rabbit papillomavirus | CRPV | NP_077113 |
Bovine papillomavirus 1 | BPV1 | NP_056744 |
Canine oral papillomavirus | COPV | NP_056819 |
Felis domesticus papillomavirus type 1 | FDPV1 | NP_848025 |
Procyon lotor papillomavirus 1 | PLPV1 | YP_249604 |
Human papillomavirus 1 | HPV1 | NP_040309 |
Human papillomavirus type 41 | HPV41 | NP_040294 |
Ursus maritimus papillomavirus 1 | UMPV1 | YP_001931973 |
Phocoena spinipinnis papillomavirus | PSPV | NP_542623 |
Capra hircus papillomavirus type 1 | CHPV1 | YP_610959 |
Mastomys coucha papillomavirus 2 | MCPV2 | YP_803393 |
Mus musculus papillomavirus type 1 | MMPV1 | YP_003778198 |
Old World harvest mouse papillomavirus | WHMPV | YP_873945 |
Rattus norvegicus papillomavirus 1 | RNPV1 | YP_003169705 |
Rousettus aegyptiacus papillomavirus type 1 | RAPV1 | YP_717913 |
Trichechus manatus latirostris papillomavirus 1 | TMLPV1 | YP_164627 |
Erethizon dorsatum papillomavirus type 1 | EDPV1 | YP_224227 |
Canine papillomavirus 2 | CPV2 | YP_164635 |
Psittacus erithacus timneh papillomavirus | PETPV | NP_647590 |
Tursiops truncatus papillomavirus 1 | TTPV1 | YP_002117846 |
Bovine papillomavirus 3 | BPV3 | NP_694451 |
Equinus papillomavirus | EPV | NP_694429 |
Fig. 2.
Phylogenetic relationships of EPVLoa and exogenous PVs. Bootstrap values (>70 %) are shown for key nodes. The tree is midpoint-rooted for purposes of clarity only. Host species information is shown in parentheses; the PV genus of each sequence is shown in square brackets. Bar, 0.3 amino acid substitutions per site.
Despite the apparent rarity of EPVLoa in the platypus genome, its discovery is important for two reasons: not only is this the first observation of an endogenous PV-like element, but it also means that PVs, or viruses very closely related to PVs, must be capable of infecting germ-line cells. As a consequence, we suggest that more attention be given to the possibility that the endogenization of viruses of this kind has occurred during vertebrate evolution.
Footnotes
A supplementary table and figure are available with the online version of this paper.
References
- Antonsson A., McMillan N. A. J. Papillomavirus in healthy skin of Australian animals. J Gen Virol. 2006;87:3195–3200. doi: 10.1099/vir.0.82195-0.
- Bernard H.-U., Calleja-Macias I. E., Dunn S. T. Genome variation of human papillomavirus types: phylogenetic and medical implications. Int J Cancer. 2006;118:1071–1076. doi: 10.1002/ijc.21655.
- Bernard H.-U., Burk R. D., Chen Z., van Doorslaer K., zur Hausen H., de Villiers E. M. Classification of papillomaviruses (PVs) based on 189 PV types and proposal of taxonomic amendments. Virology. 2010;401:70–79. doi: 10.1016/j.virol.2010.02.002.
- de Villiers E. M., Fauquet C., Broker T. R., Bernard H.-U., zur Hausen H. Classification of papillomaviruses. Virology. 2004;324:17–27. doi: 10.1016/j.virol.2004.03.033.
- Guindon S., Dufayard J. F., Lefort V., Anisimova M., Hordijk W., Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–321. doi: 10.1093/sysbio/syq010.
- Herbst L. H., Lenz J., Van Doorslaer K., Chen Z., Stacy B. A., Wellehan J. F., Jr, Manire C. A., Burk R. D. Genomic characterization of two novel reptilian papillomaviruses, Chelonia mydas papillomavirus 1 and Caretta caretta papillomavirus 1. Virology. 2009;383:131–135. doi: 10.1016/j.virol.2008.09.022.
- Holmes E. C. The evolution of endogenous viral elements. Cell Host Microbe. 2011;10:368–377. doi: 10.1016/j.chom.2011.09.002.
- Lange C. E., Zollinger S., Tobler K., Ackermann M., Favrot C. Clinically healthy skin of dogs is a potential reservoir for canine papillomaviruses. J Clin Microbiol. 2011;49:707–709. doi: 10.1128/JCM.02047-10.
- Larkin M. A., Blackshields G., Brown N. P., Chenna R., McGettigan P. A., McWilliam H., Valentin F., Wallace I. M., Wilm A., other authors Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–2948. doi: 10.1093/bioinformatics/btm404.
- Lewin H. A., Larkin D. M., Pontius J., O’Brien S. J. Every genome sequence needs a good map. Genome Res. 2009;19:1925–1928. doi: 10.1101/gr.094557.109.
- Munday J. S., Kiupel M. Papillomavirus-associated cutaneous neoplasia in mammals. Vet Pathol. 2010;47:254–264. doi: 10.1177/0300985809358604.
- Muñoz N., Castellsagué X., de González A. B., Gissmann L. Chapter 1: HPV in the etiology of human cancer. Vaccine. 2006;24(Suppl. 3):S3/1–S3/10. doi: 10.1016/j.vaccine.2006.05.115.
- Talavera G., Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007;56:564–577. doi: 10.1080/10635150701472164.
- Wentzensen N., Vinokurova S., von Knebel Doeberitz M. Systematic review of genomic integration sites of human papillomavirus genomes in epithelial dysplasia and invasive cancer of the female lower genital tract. Cancer Res. 2004;64:3878–3884. doi: 10.1158/0008-5472.CAN-04-0009.
- Woolford L., Rector A., Van Ranst M., Ducki A., Bennett M. D., Nicholls P. K., Warren K. S., Swan R. A., Wilcox G. E., O’Hara A. J. A novel virus detected in papillomas and carcinomas of the endangered western barred bandicoot (Perameles bougainville) exhibits genomic features of both the Papillomaviridae and Polyomaviridae. J Virol. 2007;81:13280–13290. doi: 10.1128/JVI.01662-07.
- zur Hausen H. Papillomaviruses and cancer: from basic studies to clinical application. Nat Rev Cancer. 2002;2:342–350. doi: 10.1038/nrc798.