Abstract
Human papillomavirus (HPV) causes virtually all cervical cancers and many cancers at other anatomical sites in both men and women. However, only 12 of 448 known HPV types are currently classified as carcinogens, and even the most carcinogenic type — HPV16 — only rarely leads to cancer. HPV is therefore necessary but insufficient for cervical cancer, with other contributing factors including host and viral genetics. Over the last decade, HPV whole genome sequencing has established that even fine-scale within-type HPV variation influences precancer/cancer risks, and that these risks vary by histology and host race/ethnicity. In this review, we place these findings in the context of the HPV life cycle and evolution at various levels of viral diversity: between-type, within-type, and within-host. We also discuss key concepts necessary for interpreting HPV genomic data, including features of the viral genome; events leading to carcinogenesis; the role of APOBEC3 in HPV infection and evolution; and methodologies that use deep (high-coverage) sequencing to characterize within-host variation, as opposed to relying on a single representative (consensus) sequence. Given the continued high burden of HPV-associated cancers, understanding HPV carcinogenicity remains important for better understanding, preventing, and treating cancers attributable to infection.
Keywords: Cervical cancer, HPV16, HPV evolution, HPV genomics, Next-generation sequencing (NGS), Within-host (intrahost) diversity
1. Introduction
Human papillomavirus (HPV) causes ∼4.5% of all human cancers [1], including tumours of the cervix, anus, vagina, penis, oropharynx, vulva, oral cavity, and larynx [2]. Cervical cancer is the most common of these, with 604,000 new cases and 342,000 deaths per year, virtually all attributable to HPV [3]. HPV is one of the most consequential human carcinogens [4,5]. However, the majority of HPV types (genotypes) do not cause cancer; of 448 types that have been documented [6,7], only 12 are currently classified as carcinogenic: types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, and 59 [8]. Infections by these carcinogenic HPV types are extremely common [9], but ∼80% are cleared by the immune system within three years, and only ∼3% progress to cervical precancer/cancer within 7 years [10]. HPV is therefore necessary but insufficient for cervical cancer. Further, because infectious virus particles are not produced in tumours [11], cancer cannot provide an evolutionary benefit to the virus [12]. Cancer is therefore a rare and inadvertent consequence — not an objective — of HPV infection.
HPV16 and HPV18 are the most common carcinogenic types [1], together responsible for ∼71% of cervical cancers [13,14] and virtually all HPV-associated cancers in males [2]. However, despite advances in genomics [15], pinpointing genetic variants that confer differences in carcinogenicity has remained elusive, due in part to a lack of sufficiently abundant HPV whole genome sequences [16]. Even when genomes are available, data interpretation can be complex. For example, there is a poor correlation between HPV genetic relatedness and carcinogenicity: HPV31 and HPV35 are the types most closely related (genetically similar) to HPV16, but they are much less carcinogenic. At the same time, HPV18 is highly carcinogenic, but it is relatively distantly related to HPV16 and preferentially causes glandular lesions.
In this review, we discuss HPV-related carcinogenesis from the perspective of genomics, focusing on HPV16, cervical cancer, and key concepts necessary for the interpretation of genomics data (see Box 1 for Glossary of bold terms). We also document how next-generation sequencing (NGS) has dramatically increased the number of HPV genome sequences over the last decade, leading to new discoveries about genetic differences between HPV types, within the same HPV type, and even among HPV genomes that infect a single host individual.
Box 1. Glossary of Key Concepts.
Between-host (interhost): genetic differences between viruses infecting different individuals. Contrast within-host.
Between-type (intertypic): genetic differences between HPV types, with patterns of relatedness typically determined using the L1 ORF. Contrast within-type.
Consensus: a summary nucleotide sequence wherein each genome position has been assigned the most common (major) nucleotide detected in the sequencing reads; generally does not represent within-host polymorphism.
Deep sequencing: next-generation sequencing specifically aimed at producing high sequence coverage (read depth). When applied to a sample containing multiple genomes, can be used to estimate variant allele frequencies within the sample or source population.
Dinucleotide: two contiguous nucleotides (sequence positions) on the same strand of DNA or RNA. Often represented with a ‘p’ to denote the intervening phosphate group (e.g., TpC).
Divergence (d): the rate of evolutionary substitution (fixation) between lineages, such as HPV types or lineages. Can be estimated separately at sites that are nonsynonymous (dN) and synonymous (dS) to detect natural selection, typically long after selection has acted. Contrast nucleotide diversity (π).
Epitope: a molecular pattern that may be recognized as foreign by the host and stimulate an immune response. B cells (antibodies) typically recognize conformational epitopes such as the viral capsid, whereas T cells typically recognize short (e.g., 9–12 amino acid) MHC-bound peptide fragments derived from surveillance of intracellularly translated proteins.
Fitness: generally refers to the reproductive success of a self-replicating entity. Numerous factors influence the fitness of a virus, including genome viability, evasion of immunity, and successful transmission to a new host.
Genetic drift: chance evolution in which allele frequencies fluctuate randomly in a population. Dominates evolution unless overcome by a directional force like natural selection.
Host: an individual organism or population that is infected by a pathogen such as a virus.
Integration: the insertion of full or partial HPV genome sequences into the host (somatic) genome.
iSNV: intrahost single nucleotide variant. Refers specifically to within-host virus polymorphism. By contrast, between-host single nucleotide differences (i.e., different samples or isolates) are often referred to as SNVs or SNPs (single nucleotide polymorphisms). Contrast somatic.
Lineage: an evolutionary line of descent from an ancestor, often visualized as a branch on a tree. In the context of HPV nomenclature, the term refers to distinct groups of related isolates within a single type, denoted by a capitalized letter (e.g., A). HPV lineages typically differ from one another by ∼1.0–10.0% at the whole genome level.
Major allele: the most common allele at a given genome position in a sample or population.
Minor allele: the least common allele(s) at a given genome position in a sample or population.
Mutation: a change at one or more nucleotide positions in an individual genome, before the influence of natural selection.
Natural selection: the differential replication success of certain phenotypes. When phenotypes have a genetic basis, selection can shape allele frequencies in a directional manner over time.
Neutral: a nucleotide or amino acid change that produces no change in fitness, whose fate is therefore determined by genetic drift.
Nonsynonymous: a nucleotide change in a protein-coding region that changes the amino acid encoded. More likely than synonymous changes to alter fitness and experience natural selection.
Nucleotide diversity (π): the mean number of differences per site for a randomly chosen pair of sequences in a population. Can be estimated separately at sites that are nonsynonymous (πN) and synonymous (πS) to detect natural selection, typically while it is still acting. Contrast divergence (d).
Open reading frame (ORF): a stretch of contiguous codons that begins with a START codon, ends with a STOP codon, and is free of mid-sequence STOP codons. Capable of encoding a complete peptide. Distinct from ‘gene’, as an ORF may not have its own promoter, may be present in multiple transcripts, and may be co-located in the same transcript with other ORFs (i.e., polycistronic mRNA).
Overlapping ORF: two or more ORFs encoded by the same genome positions, such that distinct protein products are translated from the same nucleotides by using different reading frames. Also called ‘overlapping genes’ or ‘out-of-frame ORFs’.
Positive selection: natural selection that favors an increase in frequency (directional selection) or maintenance at a non-zero frequency (balancing selection) of a particular allele.
Purifying selection: negative natural selection that favors the decrease in frequency and extinction of a particular allele.
Prevalence: the frequency of a virus in a host population.
Quasispecies: a network (‘mutant cloud’ or ‘swarm’) of interrelated genotypes produced by very high mutation rates and large population sizes, such that individual genome sequences are unstable.
Somatic: generally refers to body cells in organisms that have a soma vs. germline distinction; in relation to mutations, refers to genetic changes acquired during an individual's lifetime.
Sublineage: an evolutionarily related group nested within a larger lineage. In the context of HPV nomenclature, the term refers to a distinct group of related isolates that form a subset within a single type/lineage, denoted by appending a number to the lineage's capitalized letter (e.g., A1). Sublineages typically differ from one another by 0.5–1.0% at the whole genome level.
Substitution: the evolutionary replacement of one allele by another in a population, resulting in the new allele's fixation (frequency of 100%). Unlike mutation, substitution is the result of evolution after forces like selection have acted. May also refer to point mutations (e.g., single base substitutions).
Synonymous: a nucleotide change in a protein-coding region that does not change the amino acid encoded.
Trinucleotide: three contiguous nucleotides (sequence positions) on the same strand of DNA or RNA; a trimer. Often represented with two ‘p’ letters to denote the intervening phosphate groups (e.g., TpCpA).
Type: genotype. In HPV genomics, refers to a distinct group of antigenically similar, evolutionarily related viral genomes, denoted with a number (e.g., HPV16). Using current classification criteria, one type differs from all other types by ≥ 10% in its L1 nucleotide sequence.
Variant Allele Fraction (VAF): the frequency of a particular allele among all sequencing reads at a given genome position. Refers exclusively to a single sequenced sample. In the case of viral variants, it is an estimate of a variant's allele frequency in the within-host virus population.
Within-host (intrahost): genetic changes occurring within the population of viruses infecting a single host during a single infection, including iSNVs. Contrast between-host and quasispecies.
Within-type (intratypic): genetic changes occurring within one HPV type, with patterns of relatedness categorized as lineages, sublineages, and single nucleotide variants. Includes both between-host and within-host variation.
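The glossary entries for consensus, major allele, minor allele, VAF, and iSNV can be tied together with a small sketch. The Python example below is illustrative only: the function names, the toy pileup, and the 1% reporting threshold (`min_vaf`) are assumptions, not values or tools taken from this review.

```python
from collections import Counter


def summarize_site(read_bases, min_vaf=0.01):
    """Summarise the bases observed at one genome position in one sample.

    read_bases: iterable of base calls from reads covering the position.
    min_vaf: hypothetical reporting threshold for minor alleles (assumption).
    """
    counts = Counter(read_bases)
    depth = sum(counts.values())
    if depth == 0:
        raise ValueError("no reads cover this position")
    # Variant allele fraction: allele count divided by read depth at the site.
    vaf = {base: n / depth for base, n in counts.items()}
    # Major allele: the most common base (ties go to the first one seen).
    major = max(counts, key=counts.get)
    # Candidate iSNVs: minor alleles at or above the reporting threshold.
    minors = {b: f for b, f in vaf.items() if b != major and f >= min_vaf}
    return major, vaf, minors


def consensus(pileup_columns):
    """Consensus sequence: the major allele at every position."""
    return "".join(summarize_site(col)[0] for col in pileup_columns)


# Toy pileup: three positions, 100 reads each (made-up data).
columns = ["A" * 100, "C" * 95 + "T" * 5, "G" * 60 + "A" * 40]
print(consensus(columns))             # the consensus hides both minor alleles
print(summarize_site(columns[1])[2])  # minor allele(s) at the second position
```

The point of the sketch is that a consensus sequence retains only the major allele, so within-host polymorphism is visible only when per-position read counts from deep sequencing are kept.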
2. HPV genome and life cycle
2.1. Open reading frames
HPVs have circular, ∼7.9 kb double-stranded DNA genomes consisting of an upstream regulatory region (URR), an intergenic noncoding region (NCR) with simple (AT)n and poly-T repeats, and eight main expressed protein-coding open reading frames (ORFs). The ORFs are named according to their approximate timing of expression during the viral life cycle, where ‘E’ denotes early and ‘L’ denotes late: E6, E7, E1, E2, E4, E5, L2, and L1 (listed 5′–3′) (Fig. 1; Fig. 2; Table 1). In addition to the main ORFs, E8 — a sequence often 12 ⅔ codons in length — is spliced to E2 to form E8^E2 at certain stages of infection. All ORFs occupy the sense (forward) strand and are expressed as polycistronic (multi-ORF) mRNAs [17].
Fig. 1.
Human papillomavirus type 16 genome diagram. The circular, ∼7.9 kb double-stranded DNA genome of HPV16 is depicted as three sense-strand trinucleotide (codon) reading frames: 1 (outside track), 2 (middle track), and 3 (inside track), where frame 1 begins at position 1 of the genome. Protein-coding open reading frames (ORFs) are depicted as coloured rectangles in the appropriate reading frame. E8 (frame 2) is encoded entirely within E1 (frame 1), and E4 (frame 3) is encoded entirely within E2 (frame 2). E4 is the only ORF occupying frame 3 in HPV16. Early and late promoters (p) are denoted p97 and p670, respectively; early and late polyadenylation sites (polyA) are denoted polyAE and polyAL, respectively. Black and grey circles denote E1 and E2 binding sites, respectively, where the E1 binding site occurs within the origin of replication (ori) that overlaps position 1. The 3′ terminus of E6 does not overlap the 5′ terminus of E7, in contrast to the overlap observed in non-carcinogenic HPV types. All coordinates correspond to reference genome HPV16REF from PaVE [6,7]. See Table 1 for additional details. Figure made in R [188] (ggplot2; tidyverse; scales; RColorBrewer) and modified in PowerPoint.
Fig. 2.
Life cycle of carcinogenic HPVs in stratified squamous epithelia. Infection is thought to require a microtear exposing the basal (lowermost) layers of the epithelium, which is where host cells susceptible to infection reside. The time required for a basal cell to differentiate and migrate to the epithelial surface is ∼3 weeks, placing a lower limit on the length of time required for the viral life cycle [46]. During this time, virus genome copy numbers increase from reservoir levels by at least an order of magnitude. Different viral proteins dominate expression at different levels of the epithelium, in coordination with host cell differentiation. Virus particle formation takes place only in the upper layers, where the capsid proteins L1 and L2 are expressed; no virus particles are formed in the basal layer. Virus genomes are shown as extrachromosomal circular episomes, but integration into the host genome may also occur, effectively ending the virus life cycle by preventing viral genome encapsidation. Figure reflects a synthesis of information presented in the text, primarily refs. [11,41–44,46–48]. Figure made in PowerPoint; virus capsid image modified from Protein Data Bank record 3J6R [189–191].
Table 1.
Human papillomavirus open reading frames in three closely related carcinogenic types: HPV16, HPV31, and HPV35.
| ORF (5′–3′) | Key protein functions | Key references | Type | Reading frameᵃ | Amino acid length | Nucleotide length | CDS start | CDS end | Overlapping ORFs (overlap type) | Overlapping nucleotides (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| E6 | Oncoprotein; targets p53 for degradation; offsets E7's antiviral effects by blocking apoptosis; necessary for maintaining cancer | Vande Pol and Klingelhutz 2013; Vats et al., 2021 | HPV16 | 2 | 151 | 456 | 104ᵇ | 559 | – | 0 |
| E6 | | | HPV31 | 3 | 149 | 450 | 108 | 557 | – | 0 |
| E6 | | | HPV35 | 2 | 149 | 450 | 110 | 559 | – | 0 |
| E7 | Oncoprotein; targets pRb for degradation; increases DNA replication; stabilises APOBEC; necessary for maintaining cancer | Roman and Munger 2013; Vats et al., 2021 | HPV16 | 1 | 98 | 297 | 562 | 858 | – | 0 |
| E7 | | | HPV31 | 2 | 98 | 297 | 560 | 856 | – | 0 |
| E7 | | | HPV35 | 1 | 99 | 300 | 562 | 861 | – | 0 |
| E1 | Helicase; essential for replication; interacts with host replication factors; the only HPV enzyme | Bergvall et al., 2013 | HPV16 | 1 | 649 | 1950 | 865 | 2814 | E8 (internal), E2 (terminal) | 97 (5%) |
| E1 | | | HPV31 | 1 | 629 | 1890 | 862 | 2751 | E8 (internal), E2 (terminal) | 97 (5%) |
| E1 | | | HPV35 | 1 | 637 | 1914 | 868 | 2781 | E8 (internal), E2 (terminal) | 103 (5%) |
| E8 (E8^E2) | Suppresses viral replication in the basal epithelium; spliced to E2 to form E8^E2 | McBride 2013; Kuehner and Stubenrauch 2022 | HPV16 | 2 | 12 ⅔ᶜ | 38 | 1265 | 1302 | E1 (full) | 38 (100%) |
| E8 (E8^E2) | | | HPV31 | 2 | 12 ⅔ᶜ | 38 | 1259 | 1296 | E1 (full) | 38 (100%) |
| E8 (E8^E2) | | | HPV35 | 2 | 12 ⅔ᶜ | 38 | 1268 | 1305 | E1 (full) | 38 (100%) |
| E2 | DNA binding protein; downregulates E6 and E7; partitions viral genomes to daughter cells upon division | McBride 2013; Kuehner and Stubenrauch 2022 | HPV16 | 2 | 365 | 1098 | 2756 | 3853 | E1 (terminal), E4 (internal), E5 (terminal) | 326 (30%) |
| E2 | | | HPV31 | 2 | 372 | 1119 | 2693 | 3811 | E1 (terminal), E4 (internal) | 343 (31%) |
| E2 | | | HPV35 | 2 | 366 | 1101 | 2717 | 3817 | E1 (terminal), E4 (internal), E5 (terminal) | 335 (30%) |
| E4 (E1^E4) | May assist in virus synthesis and release by disrupting cellular keratin in the upper epithelium; highly expressed biomarker of infection; lacks a conserved start codon; usually spliced to E1 (E1^E4) | Doorbar 2013 | HPV16 | 3ᵈ | 86 ⅔ᵈ | 263 | 3358 | 3620 | E2 (full) | 263 (100%) |
| E4 (E1^E4) | | | HPV31 | 3ᵈ | 93 ⅔ᵈ | 284 | 3295 | 3578 | E2 (full) | 284 (100%) |
| E4 (E1^E4) | | | HPV35 | 3ᵈ | 87 ⅔ᵈ | 266 | 3319 | 3584 | E2 (full) | 266 (100%) |
| E5α | Accessory oncoprotein; hydrophobic transmembrane protein; downregulates MHC expression and disrupts presentation of virus T-cell epitopes | DiMiao and Petti 2013; Willemsen et al., 2019 | HPV16 | 1 | 83 | 252 | 3850 | 4101 | E2 (terminal) | 4 (2%) |
| E5α | | | HPV31 | 3 | 84 | 255 | 3816 | 4070 | – | 0 |
| E5α | | | HPV35 | 1 | 83 | 252 | 3814 | 4065 | E2 (terminal) | 4 (2%) |
| L2 | Minor capsid; guides virus genomes to nucleus upon infection; up to 72 copies per virus particle | Wang and Roden 2013 | HPV16 | 1 | 473 | 1422 | 4237 | 5658 | L1 (terminal) | 20 (1%) |
| L2 | | | HPV31 | 1 | 466 | 1401 | 4171 | 5571 | L1 (terminal) | 20 (1%) |
| L2 | | | HPV35 | 2 | 469 | 1410 | 4211 | 5620 | L1 (terminal) | 20 (1%) |
| L1 | Major capsid; mediates virus attachment and entry; 360 copies per virus particle; self-assembles into virus-like particles (VLPs) used for vaccines | Buck et al., 2013 | HPV16 | 2 | 505 | 1518 | 5639 | 7156 | L2 (terminal) | 20 (1%) |
| L1 | | | HPV31 | 2 | 504 | 1515 | 5552 | 7066 | L2 (terminal) | 20 (1%) |
| L1 | | | HPV35 | 3 | 502 | 1509 | 5601 | 7109 | L2 (terminal) | 20 (1%) |
ORF lengths and positions are given for HPV reference genomes found at PaVE: HPV16REF (7906 bp), HPV31REF (7912 bp), and HPV35REF (7879 bp) [6,7]. Overlapping ORFs refer to out-of-frame protein-coding ORFs.
ᵃ Reading frames 1, 2, and 3 refer to the trinucleotide (codon) frames beginning at positions 1, 2, and 3, respectively, of the reference genome for each type.
ᵇ In HPV16, E6 is sometimes annotated as beginning at position 83 of the genome, i.e., 7 additional codons at its 5′ terminus (start); for consistency, we instead employ numbering based on the start site annotated in PaVE.
ᶜ E8 encodes 12 codons, plus the first two nucleotides of one additional codon at its 3′ terminus (end). The final nucleotide of the additional codon is spliced from, and maintains the reading frame of, E2.
ᵈ E4 encodes the end (3′ portion) of E1^E4, beginning with the last 2 nucleotides of a codon; the first nucleotide of that codon is spliced from E1.
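The reading frames and overlap lengths reported in Table 1 follow from the CDS coordinates by simple arithmetic. The sketch below recomputes them for HPV16, assuming 1-based inclusive coordinates as listed in the table; the function names and the `leading_partial_nt` parameter (used for E4, whose annotated sequence begins with the last 2 nucleotides of a codon) are illustrative choices, not part of PaVE.

```python
def reading_frame(cds_start, leading_partial_nt=0):
    """Frame (1, 2, or 3) of the first complete codon.

    cds_start: 1-based genome coordinate of the first annotated nucleotide.
    leading_partial_nt: nucleotides belonging to a spliced, incomplete codon
        at the 5' end (assumption: 2 for E4, 0 for every other ORF).
    """
    first_full_codon = cds_start + leading_partial_nt
    return (first_full_codon - 1) % 3 + 1


def overlap_length(start_a, end_a, start_b, end_b):
    """Number of genome positions shared by two inclusive intervals."""
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


# HPV16 CDS coordinates copied from Table 1 (start, end).
hpv16 = {
    "E6": (104, 559), "E7": (562, 858), "E1": (865, 2814),
    "E8": (1265, 1302), "E2": (2756, 3853), "E4": (3358, 3620),
    "E5": (3850, 4101), "L2": (4237, 5658), "L1": (5639, 7156),
}

for name, (start, end) in hpv16.items():
    partial = 2 if name == "E4" else 0
    print(name, "frame", reading_frame(start, partial), "length", end - start + 1)

# Total E1 and E2 nucleotides shared with other ORFs.
e1_overlap = sum(overlap_length(*hpv16["E1"], *hpv16[o]) for o in ("E8", "E2"))
e2_overlap = sum(overlap_length(*hpv16["E2"], *hpv16[o]) for o in ("E1", "E4", "E5"))
print(e1_overlap, e2_overlap)
```

The sketch ignores the circularity of the genome, which is acceptable here only because none of the listed ORFs spans position 1.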
E6 and E7 are the primary HPV oncoproteins. In carcinogenic types, E6 and E7 degrade p53 and pRb, respectively [18]. They also interact with numerous other host cell proteins to delay differentiation, promote DNA replication, and evade host immunity [12]. Continued expression of both E6 and E7 is thought to be required for the maintenance of cervical cancer [19,20].
E5 is an accessory oncoprotein that plays a supportive role in, but is not necessary for, oncogenesis [21]. E5s are characterised by high hydrophobicity, transmembrane regions, and downregulation of MHC/HLA (major histocompatibility complex/human leukocyte antigen) class I molecules, thereby disrupting peptide presentation to cytotoxic (CD8+) T cells [22–25]. There are at least four distinct evolutionary groups of E5 ORFs (E5α, E5β, E5γ, E5δ) interspersed among HPV types lacking E5 [26], which is important to consider in comparative analyses. E5α is the group present in carcinogenic HPVs [22].
E1 (helicase) and E2 (DNA binding protein) are the core viral proteins involved in replication and genome maintenance [27]. Full-length E2 tethers virus genomes to host chromosomes for distribution to daughter cells [28,29]. E2 also downregulates E6 and E7 at certain points during the viral life cycle [30]. The shorter E8^E2 splice product is expressed in the basal epithelium to suppress viral replication and maintain low virus copy numbers [31], and this is suggested to play a role in avoiding immune detection [11,32].
E4 is thought to assist in genome amplification and virion release, and is one of the most highly expressed ORFs [33]. Both E8 (38 nucleotides within E1) and E4 (263–284 nucleotides within E2) are out-of-frame overlapping ORFs, i.e., their full sequences are encoded in alternative reading frames of other ORFs (Fig. 1; Table 1).
L1 and L2 are the major and minor structural proteins of the viral icosahedral capsid, respectively. Because L1 is generally the most conserved (least variable) ORF, its sequence is used to define HPV types. Specifically, a new HPV type is designated if an isolate's L1 nucleotide sequence differs by ≥ 10% from any previously defined type [34]. Nevertheless, L1 does contain five highly variable stretches, ∼10–30 codons each, that encode outward-facing loops [35]. These loops contain L1's neutralising antibody epitopes, necessary for vaccine-induced immunity [36,37]. Thus, the genetic differences in L1 — used to define different types — correspond to antigenic differences [38], and may reflect natural selection for immune escape [9,39].
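The ≥ 10% L1 criterion can be expressed as a short sketch. The example assumes that the L1 nucleotide sequences have already been aligned to equal length; the toy 20-nucleotide sequences, the function names, and the choice to skip gapped columns are assumptions made for illustration and do not describe an official classification pipeline.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped
    (an assumption; other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def is_new_type(candidate_l1, known_l1, threshold=0.10):
    """True if the candidate differs by >= threshold from every known type."""
    return all(p_distance(candidate_l1, ref) >= threshold
               for ref in known_l1.values())


# Toy aligned 'L1' fragments (made-up sequences, 20 nt each).
known = {"typeA": "ATGTCTCTTTGGCTGCCTAG"}
two_changes = "ATGACTCTTCGGCTGCCTAG"  # 2/20 sites differ from typeA
one_change = "ATGACTCTTTGGCTGCCTAG"   # 1/20 sites differ from typeA

print(p_distance(two_changes, known["typeA"]), is_new_type(two_changes, known))
print(p_distance(one_change, known["typeA"]), is_new_type(one_change, known))
```

An isolate that falls below the threshold relative to any existing type would instead be treated as within-type variation (a lineage, sublineage, or single nucleotide variant of that type).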
2.2. Infection
HPV infection of stratified cutaneous and mucosal epithelia (e.g., skin and cervix) is thought to require exposure of long-lived basal (lowermost) cells, including stem or stem-like cells [40]. HPV maintains a stable copy number in this reservoir set of (initially infected) cells, and only later produces infectious virus particles in the upper epithelial layers in coordination with cell differentiation (Fig. 2). Thus, under normal circumstances, no lateral (side-to-side) infection of neighbouring cells occurs in the basal layer; these cells contain virus genomes but not virus particles.
Upon successful infection of the basal layer, viral genomes localise to the nucleus and replicate to a stable number, thought to be an average of ∼50–200 copies per cell [41–44]. These genomes persist as virion-free episomes (extrachromosomal circular plasmids) that replicate an average of once per cell cycle [40], but occasionally integrate into the host genome [45]. In the basal layer, gene expression remains low, which limits the probability of immune detection [32,40]. However, when daughter cells migrate toward the epithelial surface and differentiate, viral intermediate and late gene expression commences, virus genome copy numbers increase to >10³ [11,44,46,47], and virus particles are formed (Fig. 2). This is all accomplished with no viraemia, no virus-induced cell death, and no inflammation, making the virus practically invisible to the host immune system [46].
Given the above, it is likely that the state and abundance of viral genomes obtained for sequencing depend on the anatomic site and time of sampling. Samples obtained from the epithelial surface during productive infection may include fully viable circular genomes encapsidated within infectious virus particles. On the other hand, samples obtained from cancerous tissue may include partial viral genomes, some or all of which may be integrated into the host cell genome, and which may have incurred deleterious mutations or deletions.
2.3. Cancer
HPVs replicate their genomes using host polymerases that are normally expressed before differentiation, but virion production requires host transcription factors that are expressed during differentiation [48]. Both requirements must be met without triggering an immune response that would lead to apoptosis [49]. Strategies used by HPV to achieve these conflicting goals can inadvertently lead to cancer because there is substantial overlap between the cellular functions required for viral success and those which increase the susceptibility of host cells to oncogenesis. Thus, cancer is not the objective of HPV, but rather a ‘rare byproduct’ [48] that has been termed ‘collateral damage’ [50] of infection.
Rather than being required for virus propagation, cancer is usually a dead end for HPV: once precancerous lesions form, infectious viral particles are no longer produced [11,12,51]. Thus, cancer — which typically occurs decades after initial infection [52] — does not contribute to viral evolutionary fitness. Additionally, HPV-associated ‘driver’ mutations, most notably integration events, may themselves bring an end to the viral life cycle. Most sequencing methodologies do not distinguish between HPV genomes that are viable or nonviable; or between HPV genomes that exist as virion-encapsidated copies (ready to be transmitted), free episomal copies (may transmit if encapsidated), or integrated copies (unlikely to transmit) [53,54]. Critically, such factors determine how viral variation may be interpreted, e.g., mutations in integrated HPV copies are unlikely to experience onward transmission and contribute to viral evolution.
3. HPV evolution and diversity
3.1. Fitness
Evolutionary fitness refers to reproductive success. Because viruses are self-replicating entities with a genotype/phenotype connection, they can undergo evolution via natural selection to maximise their fitness. The fitness of an HPV type can be estimated by its prevalence (frequency in the host population), which is itself a function of persistence (length of productive infection) and incidence (rate of successful transmission to new hosts) [16,50,55]. However, it is important to note that prevalence may also be influenced by chance factors, e.g., a founder effect in which a given viral genotype happens to enter a host population at an earlier date than other genotype(s).
HPV16 is both the most prevalent carcinogenic HPV type and the most prevalent type in cancer. This implies that the replication strategies it employs (or niches it occupies) potentiate cancer. However, some non-carcinogenic types are more or equally prevalent in the general population [56] and likely have even higher fitness than carcinogenic types. Thus, an HPV type may have high fitness without causing cancer — as is true of most viruses.
3.2. Mutation
At least four distinct mechanisms give rise to HPV mutations at different stages of its life cycle. First, in the basal epithelial layer, copy numbers are maintained using bidirectional replication, which may disproportionately lead to mutation and recombination in the region between E2 and L2 where replication forks meet [26,32,57,58]. Second, when viral copy numbers increase in differentiating cells destined for the surface, HPV switches to unidirectional (rolling circle or recombination-dependent) replication [27,32,57], which may involve distinct mutational processes. Third, when DNA enters the single-stranded state during either transcription or replication, host APOBEC3 enzymes can target TpC dinucleotides to induce C➞T (G➞A) mutations [59] (section 4.4.1). Finally, deamination of methylated CpG dinucleotides, which also occurs in the single-stranded state [60], may also cause C➞T (G➞A) HPV mutations [61,62].
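The TpC context described above can be checked directly for any observed variant. The following sketch flags C➞T changes preceded by T on the sense strand, and G➞A changes followed by A (the same TpC event on the opposite strand). The function name, the 0-based indexing, and the toy genome are assumptions; a variant that passes this check is merely consistent with APOBEC3 activity and is not proof of it.

```python
def is_apobec3_like(genome, pos, alt):
    """Check whether a variant matches the TpC -> TpT pattern on either strand.

    genome: sense-strand sequence of a circular genome (uppercase string).
    pos: 0-based position of the variant (assumption of this sketch).
    alt: the variant base observed at that position.
    """
    n = len(genome)
    ref = genome[pos]
    if ref == "C" and alt == "T":
        # Sense strand: the mutated C must be preceded (5') by T.
        return genome[(pos - 1) % n] == "T"
    if ref == "G" and alt == "A":
        # Antisense strand: a TpC there reads as GpA on the sense strand,
        # so the mutated G must be followed (3') by A.
        return genome[(pos + 1) % n] == "A"
    return False


toy = "ATCGGA"  # made-up sequence: A0 T1 C2 G3 G4 A5
print(is_apobec3_like(toy, 2, "T"))  # C preceded by T
print(is_apobec3_like(toy, 4, "A"))  # G followed by A
print(is_apobec3_like(toy, 3, "A"))  # G followed by G
```

Modular indexing is used so that positions adjacent to coordinate 1 of the circular genome are handled without special cases.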
Because HPV genomes use host DNA polymerases to replicate, they have low mutation rates. Direct estimates of the HPV mutation rate are hindered by the difficulty of growing HPV in cell culture [63–65] and its high replication fidelity. Thus, evolutionary comparisons are used, where mutation rates can be inferred from substitution rates. Across the whole papillomavirus genome, evolutionary substitution rates are ∼5 times higher than in the genomes of their mammalian hosts [66]. However, mutation and substitution rates are equal only at sites that are neutral, i.e., lack functional constraint and therefore evolve predominantly by random genetic drift rather than natural selection [67,68].
Two candidates for neutral sites in HPV are the upstream regulatory region (URR) and synonymous positions in protein-coding regions. A comparison between the URR of HPV18 and HPV45 yields a single nucleotide substitution rate of ∼4.5 × 10⁻⁷ per site per year [69]. Similarly, an analysis of feline papillomaviruses yields a rate of ∼2.69 × 10⁻⁸ per site per year [70]. Assuming URR neutrality, these serve as estimates of the papillomavirus mutation rate per unit time. However, because the URR encodes regulatory elements that make it subject to purifying selection, even these are likely to be underestimates. To our knowledge, no estimates based on synonymous protein-coding sites are available.
The above HPV mutation rate estimates are >1000 times lower than those of RNA viruses (∼10⁻⁴ to 10⁻³ per site per year estimated from synonymous sites [71,72]), but ∼500 times higher than the human germline mutation rate (∼4.27 × 10⁻¹⁰ per site per year estimated from father/mother/child trios [73]). Numerous factors contribute to these differences, including 1) different generation times; 2) different numbers of genome replications per generation; 3) selection acting on sites assumed to be neutral; 4) acute vs. persistent life cycles, along with any associated latency or replication throughout time; 5) mutagenesis of viral genomes by host enzymes such as APOBEC3; and 6) the enzymes and specific activities involved in DNA replication.
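The order of magnitude of these fold differences can be checked with a few lines of arithmetic. The sketch below uses only the rates quoted in the text; how the estimates are combined (an arithmetic mean of the two papillomavirus rates and a geometric midpoint of the RNA virus range) is our assumption for illustration, and other summaries would give different ratios.

```python
from math import sqrt

# Rates quoted in the text (substitutions per site per year).
pv_rates = {"HPV18 vs HPV45 URR": 4.5e-7, "feline papillomaviruses": 2.69e-8}
rna_virus_range = (1e-4, 1e-3)
human_germline = 4.27e-10

# Assumption: summarise the two papillomavirus estimates by their mean.
pv_mean = sum(pv_rates.values()) / len(pv_rates)
# Assumption: summarise the RNA virus range by its geometric midpoint.
rna_mid = sqrt(rna_virus_range[0] * rna_virus_range[1])

fold_vs_human = pv_mean / human_germline
fold_vs_rna = rna_mid / pv_mean
per_estimate_vs_human = {k: v / human_germline for k, v in pv_rates.items()}

print(f"papillomavirus / human germline: {fold_vs_human:.0f}x")
print(f"RNA virus / papillomavirus:      {fold_vs_rna:.0f}x")
for label, fold in per_estimate_vs_human.items():
    print(f"  {label}: {fold:.0f}x human germline")
```

The per-estimate ratios differ by more than an order of magnitude, which underlines that such comparisons are approximate and sensitive to which estimate is used.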
3.3. Nucleotide diversity (π)
Nucleotide diversity (π) [74] is an unbiased metric ideal for measuring the diversity of virus populations [75]. In protein-coding regions, a significant difference between π at nonsynonymous (πN) and synonymous (πS) sites is evidence for ongoing positive (πN/πS > 1) or purifying (πN/πS < 1) selection [76,77]. Within-population selection is expected to influence substitution rates and therefore divergence (d; dN/dS) among HPV lineages and types over time [78,79]. Thus, πN/πS and dN/dS are routinely used for detecting functionally important genome regions (Fig. 3).
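As a concrete illustration of π as the mean number of pairwise differences per site, the following sketch computes it for a toy alignment. It is a minimal example under stated assumptions (equal-length aligned sequences, no gaps or ambiguous bases, made-up data) and does not reproduce SNPGenie; in particular, estimating πN and πS additionally requires counting nonsynonymous and synonymous sites, which is not shown.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of differences per site over all pairs of sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    n_pairs = 0
    n_diffs = 0
    for s1, s2 in combinations(seqs, 2):
        n_pairs += 1
        n_diffs += sum(a != b for a, b in zip(s1, s2))
    return n_diffs / (n_pairs * length)


# Toy alignment: three sequences, four sites, one variable site.
alignment = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(alignment)
print(round(pi, 4))  # 2 differences / (3 pairs x 4 sites)
```

Applied to one consensus sequence per sample, this measures between-host diversity; applied to genomes or reads from a single sample, the same quantity describes within-host diversity.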
Fig. 3.
Human papillomavirus nucleotide diversity and natural selection in three closely related carcinogenic types: HPV16, HPV31, and HPV35. (A) Nonsynonymous (amino acid changing; πN) and synonymous (not amino acid changing; πS) nucleotide diversities were calculated as the mean number of pairwise differences per site [74] using SNPGenie [192] based on whole genome consensus sequences (one representative sequence per sample) for HPV16 (n = 3220; [113]), HPV31 (n = 1577; [114]), and HPV35 (n = 512; [115]). Sequences were derived from samples obtained from the NCI-Kaiser Permanente Persistence and Progression (PaP) cohort. The null hypothesis of πN = πS was evaluated using a Z-test (1000 bootstrap replicates, codon unit) [193]. Significance is indicated as *P < 0.05; **P < 0.01; ***P < 0.001. (B) The ratio of πN to πS can provide evidence for positive selection (πN/πS > 1) or purifying selection (πN/πS < 1). ‘Overlapping regions’ refers to protein-coding sites that overlap a second protein-coding ORF in another reading frame. Results are only shown for the E2/E4 overlap (∼24% of E2 and 100% of E4); sites involved in shorter overlaps yielded highly variable estimates and were excluded (E1/E8, E1/E2, E2/E5, L2/L1). Standard πN/πS and dN/dS methods were used to analyse non-overlapping regions [76,192]. Revised methods that account for a variant's effects in two proteins were used to analyse overlapping regions, specifically by limiting to sites that are nonsynonymous in the overlapping frame, i.e., the πNN/πSN ratio in OLGenie [81,82]. Positive selection is not significant for any whole ORF, a result that may reflect a counterbalance between sites under positive and purifying selection. The tree topology is that inferred from the L1 ORF (PaVE [6,7]). Figure made in R [188] (ggplot2; tidyverse; scales; RColorBrewer) and modified in PowerPoint. Source data: Supplementary File 1.
Standard methods for estimating πN/πS and dN/dS are not applicable to overlapping ORFs, such as E2 and E4 in HPV genomes. Because the genome positions encoding E4 also encode E2, the corresponding nucleotides are subject to selective constraints acting on both proteins. Specifically, because the majority of random mutations are nonsynonymous, synonymous changes in E2 are likely to be nonsynonymous in E4 — and therefore subject to purifying selection. As a consequence, standard πN/πS methods [76] tend to underestimate πS (overestimate πN/πS) in such regions, leading to a spurious inference of positive selection [80,81]. Overlapping ORFs therefore require more sophisticated πN/πS methods that account for a variant's effects in two proteins [82].
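The dependence between overlapping frames can be made concrete by classifying a single nucleotide change in two reading frames at once. The sketch below is illustrative only and is not the OLGenie method: the function name and toy sequence are hypothetical (not an HPV sequence), the standard genetic code is assumed, and the second frame is assumed to start one nucleotide after the first.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def effect_in_frame(seq, pos, alt, frame_start):
    """Classify a substitution at 0-based `pos` in the frame starting at `frame_start`."""
    codon_start = pos - ((pos - frame_start) % 3)
    if codon_start < frame_start or codon_start + 3 > len(seq):
        raise ValueError("position is not inside a complete codon of this frame")
    ref_codon = seq[codon_start:codon_start + 3]
    offset = pos - codon_start
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    same_residue = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_residue else "nonsynonymous"


# Hypothetical toy sequence with two overlapping frames (starts 0 and 1).
toy_seq = "GCTGAAGCTG"
# T>C at position 2 is a third-position change in frame 0 (GCT>GCC, Ala>Ala)
# but a second-position change in frame 1 (CTG>CCG, Leu>Pro).
effect_frame0 = effect_in_frame(toy_seq, 2, "C", frame_start=0)
effect_frame1 = effect_in_frame(toy_seq, 2, "C", frame_start=1)
```

In this toy case the change is silent in one protein yet alters the other, which is the situation that causes standard methods to underestimate πS in overlapping regions.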
In HPV16, the region of E2 overlapping E4 exhibits πN/πS > 1 when analysed using standard methods, suggestive of positive selection (Fig. 3). This has been attributed to the presence of B and T cell epitopes in E2 [83,84], which may experience selection for immune escape. As an alternative explanation, E4 is very highly expressed [33,40] and does not match human codon usage preferences [85], suggesting it may be subject to especially strong purifying selection [86]. Using publicly available HPV16 sequence data and a πN/πS method developed for overlapping ORFs [81], we show that E4 maintains a strong signal of purifying selection, whereas the πN/πS ratio of E2 drops from 1.7 to 0.6 when limiting to sites that are nonsynonymous in E4 (Fig. 3B). Other evidence indicates that amino acid changes in E2 tend to be tolerated only when they occur in such a way as to produce synonymous changes in E4 [87,88]. Taken together, these results suggest that the functional constraint of E4 outweighs positive selection on E2.
3.4. Recombination
Recombination can produce new combinations of pre-existing mutations, potentially linking adaptive (or maladaptive) variants in the same genome. Although some evidence exists for recombination in HPV [89], it is thought to be very rare. Recent HPV16 genome sequence data were suggested to provide evidence of recombination [90], but the observed patterns were subsequently attributed to co-infection by multiple sublineages of the same HPV type — events that can be hard to distinguish given a consensus sequence alone.
One obstacle to detecting recombination is that it requires enough dissimilarity between two sequences to infer a breakpoint and rule out sequencing error. Because mutation during the course of a single HPV infection is unlikely to introduce sufficient variation, detectable (and biologically meaningful) recombination would likely require co- or super-infection of the same basal cell by distinct viral types, lineages, or variants. This is expected to be rare. Nevertheless, important recombination events may have occurred at key moments in HPV evolution. Most notably, evolutionary trees inferred from early (E) ORFs cluster carcinogenic HPV species together, whereas those inferred from late (L) ORFs do not (Fig. 4) (section 4.2). This suggests a recombination event between the E5 and L2 ORFs near the root of the Alphapapillomavirus genus [91]. However, convergent evolution cannot be ruled out to explain this pattern.
Fig. 4.
Evolutionary relationships between and within carcinogenic HPV types. Trees of multiple types are typically inferred using the L1 ORF, reflecting how types are classified, whereas trees of within-type (intratypic) variation are inferred using whole genomes. (A) Subtree of the Alphapapillomavirus genus including all carcinogenic species (Alpha-5, -6, -7, and -9) and types (red dots), as determined by the L1 ORF (modified from PaVE [6,7]). Note that trees inferred from the early (E) ORFs would instead place the carcinogenic species into one clade [91,194], possibly due to convergence or recombination early in Alphapapillomavirus evolution. (B) Lineages (A, B, C, D) and sublineages (A1-4, B1-4, C1-4, D1-4) of HPV16, as reported in Ref. [50]. Key characteristics as determined by HPV genomic and epidemiologic data are noted. Odds ratio estimates for cancer are shown for specific sublineages compared to the most common A1/A2 sublineages, as reported in Mirabello et al. [149] using data from a large U.S. case-control study (colour denotes risk). The odds ratio for B1 is reported for precancer/cancer for statistical power (small sample size). Old sublineage names are shown for reference, but their use is discouraged. ADC = adenocarcinoma. Figure made in PowerPoint. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
3.5. Natural selection and immunity
Mutations that affect viral persistence and transmission are subject to natural selection. One important selective pressure is host immunity [92,93]. For HPV, it is thought that B cells (antibodies) contribute primarily to the prevention of infection, made possible by the long lag time between virion binding and eventual cell entry [94]. On the other hand, T cells contribute primarily to the control of already-established infections, in part through the presentation of viral epitopes by the MHC [47,95]. Because MHC presentation is determined by an individual's human leukocyte antigen (HLA) genotype [96], hosts likely differ in their ability to control infection for a given viral type, lineage, or variant. Indeed, genome-wide association studies have pointed to specific HLA class I (presentation to CD8+ T cells) and class II (presentation to CD4+ T cells) alleles that confer higher or lower risk of cervical cancer [97–102], including in an HPV type- or variant-specific manner [97–99].
Within a host, HPV genomes are divided into small ‘islands’ — distinct subpopulations infecting distinct cells. This limits the power of natural selection, because variants in isolated compartments are subject to chance extinction [103], e.g., via host cell death. Nevertheless, within-host (intrahost) selection may still occur between genomes infecting the same basal cell, or between genomes infecting different basal cells. Within the same basal cell, selection must be very weak owing to low copy numbers and lack of a genotype/phenotype connection (i.e., no virus particles). Between cells, there is opportunity for HPV genomes to compete via group selection. Specifically, if a virus mutation confers a growth advantage to its host cell, all genomes in the cell will benefit. As a result of such cell-to-cell competition, certain virus genotypes may drive others to extinction as their host cells replace one another. This selection among somatic cells may allow viral genome persistence within the host, potentiating clonal expansion and progression to cancer in rare instances [104].
3.6. Interpreting diversity
Not all evolutionary change in the HPV genome benefits the virus. As mentioned in section 2.3, infectious virions are not produced in cancer tissues [11]. This implies that many viral functions will freely accumulate mutations in cancers — even if they would render normal virus nonviable. Such relaxation of purifying selection may apply to specific genome positions, specific protein residues, or even whole ORFs. Furthermore, because each peptide produced by the virus may potentially encode an epitope that stimulates an immune response, unnecessary protein products may constitute an ‘antigenic liability’ for the virus and be selected against.
There is a salient historical example of the relaxation of purifying selection: HPV vaccines rely on self-assembling L1 proteins to form virus-like particles (VLPs). However, the first attempts to generate VLPs in vitro failed to achieve full-size particles [105] or high yield [106]. It was soon recognized that the L1 sequence being used had been obtained from a cervical cancer, and that this sequence differed from wild type (non-mutated) infectious virions by one amino acid (H202D). When the mutated L1 was reverted to its wild type, full-size VLPs were generated with a 10³-fold increase in yield [107]. Thus, relaxation of purifying selection had allowed a deleterious nonsynonymous mutation in L1 to freely accumulate in the cancer tissue.
In summary, the HPV life cycle informs interpretation of evolutionary genomics data, and vice versa.
4. Key HPV genomics advances
4.1. HPV whole genome sequencing
Sanger sequencing yields high-quality HPV genome sequences but is slow, costly, and labour-intensive. As a consequence, only ∼100 HPV16 whole genomes were publicly available by the early 2010s [108,109]. Additionally, Sanger's dependency on primers limits its application to known HPV types, and it is not amenable to studying within-host variation, i.e., minor alleles of HPV within a single host. Nevertheless, it remains the ‘gold standard’ and continues to produce important insights [110,111].
Next-generation sequencing (NGS) approaches have potentiated an enormous jump in the number of HPV whole genomes available. An amplicon-based Ion Torrent assay introduced in 2015 [112] is responsible for most of this increase, providing over 5000 HPV16 genomes by 2017 [113], as well as large numbers for other carcinogenic types including HPV31 [114] and HPV35 [115].
Ion Torrent yields fewer single base errors than Illumina, but more indel errors due to the difficulty of sequencing homopolymers (e.g., AAAAA) [116,117]. Improvements to Ion chemistry (Hi-Q) and chips yield single base substitution error rates of only ∼0.000129 per base [116], compared to first generation error rates of ∼0.00431–0.0110 per base [117]. Nevertheless, even using the older chemistry, the HPV Ion Torrent sequencing assay shows 99.97% (standard deviation [sd] = 0.07%) concordance between sample duplicates, as well as 99.97% (sd = 0.13%) concordance with Sanger sequencing [112]. This compares favourably to concordance between Illumina and Sanger, which has been reported at >99.8% for HPV [110]. Further, whereas ∼80% of HPV16 isolates from different women have ≥2 nucleotide differences, ∼80% of isolates from the same woman have ≤1 differences (72% are identical) using the Ion assay, confirming its robustness [113]. In fact, the quality of Ion Torrent data is likely sufficient to allow deep (high-coverage) sequencing for detection of within-host HPV polymorphisms (see section 4.4).
Another NGS approach uses full-circle PCR followed by sequencing with Illumina [53,54]. This technology has yielded hundreds of genomes to date, and gives a low error rate of ∼0.000076 per base; it is typically used to deep sequence within-host samples [54]. Shotgun metagenomics with Illumina sequencing has also been used to reveal hundreds of new skin HPVs in immunodeficient individuals, approximately doubling the number of known HPV types in recent years [118,119].
4.2. Between-type (intertypic) HPV divergence
Classification of papillomaviruses (family Papillomaviridae) follows guidelines from the International Committee on Taxonomy of Viruses (ICTV) [120]. The current criteria rely on an empirical distribution of pairwise L1 nucleotide sequence identity between HPV isolates that suggests natural groupings into the same genus (>60% identity), species (>70% identity), and type (>90% identity) [120]. However, it was recently noted that bias may have been introduced into this distribution by an overrepresentation of types from the Alphapapillomavirus genus, and that these groupings may not hold up for recent genome data (see Ref. [121] for more details).
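The identity thresholds above can be expressed as a simple decision rule. The following sketch is an illustration under stated assumptions rather than an official classification procedure: the function names are hypothetical, the two L1 sequences are assumed to be already aligned, and the choice to skip columns containing a gap is an assumption made here for simplicity.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, gap-free columns at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue  # assumption: columns with a gap are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def shared_rank(l1_identity_pct):
    """Finest taxonomic rank shared by two isolates, from L1 identity alone."""
    if l1_identity_pct > 90:
        return "same type"
    if l1_identity_pct > 70:
        return "same species"
    if l1_identity_pct > 60:
        return "same genus"
    return "different genera"


# Hypothetical toy alignment: 4 of 5 gap-free columns match.
toy_identity = percent_identity("ACGT-A", "ACGA-A")
toy_rank = shared_rank(toy_identity)
```

Because the thresholds derive from an empirical distribution that may be biased, output from such a rule is best treated as a first approximation.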
Five genera contain HPVs: Alpha-, Beta-, Gamma-, Mu-, and Nu-papillomavirus. These genera roughly reflect tissue tropism, e.g., Alphapapillomavirus members tend to infect mucosal/genital epithelia, while Betapapillomavirus members tend to infect cutaneous/skin epithelia [40,122], with exceptions [9,61]. All 12 carcinogenic HPV types are present in the Alphapapillomavirus genus and are limited to four species: Alpha-5 (HPV51), Alpha-6 (HPV56), Alpha-7 (HPV18, 39, 45, and 59), and Alpha-9 (HPV16, 31, 33, 35, 52, and 58). Because types are substantially diverged (i.e., ≥10% nucleotide difference in L1), immunity to one type offers only limited cross-immunity to closely related types (e.g., HPV16 and HPV31) [47]. Phylogenetic trees built on early ORFs (E6, E7, E1, E2, E5) cluster the four carcinogenic species (Alpha-5, -6, -7, and -9) with a single carcinogenic ancestor [26,55], while trees built on just the late ORFs (L2, L1) instead yield two separate carcinogenic clusters (Alpha-9 vs. Alpha-5/6/7) [91] (Fig. 4). The genetic changes underlying this phylogenetic incongruence (different tree topologies) are concentrated in E6 and L2 (5′-terminal portion), suggesting important between-type (intertypic) differences may fall in these regions. Interestingly, each carcinogenic species contains at least one type that does not cause cancer (e.g., HPV67 in Alpha-9; but see Ref. [123]).
It has been noted since the discovery of HPV16 that a type's prevalence differs by geography [124]. For example, globally, HPV16 accounts for ∼60% of cervical cancers while one of its closest relatives, HPV35, accounts for only ∼2% [13,14]. However, HPV35 is especially prevalent in women with African ancestry, in whom it accounts for 4.9–10.4% of cancers [115,125–129]. Interestingly, HPV35 exhibits low π values in the early (E) genes but extremely high (albeit non-significant) πN/πS ratios in E2 and E4 (Fig. 3), a striking difference from its HPV16 and HPV31 relatives. HPV35 is not included in any current vaccines, and its addition might confer better protection than relying on cross-protection from HPV16 and HPV31.
HPV types also differ in their frequencies of integration into the host genome (see section 4.5). Although integration is observed in ∼83% of HPV-positive cervical cancers overall, it is seen in only ∼76% of cervical cancers caused by HPV16 but virtually all those caused by HPV18 [130]. Thus, data from evolutionary, epidemiologic, and molecular studies imply that the precise mechanisms of infection and carcinogenesis may differ even among closely related types.
Key features specific to carcinogenic HPV types include (1) the ability to degrade p53 and pRb members; (2) regulation of E6 and E7 expression through differential mRNA splicing rather than separate promoters; (3) the ability to immortalise keratinocytes in cell culture; and (4) a propensity for dysregulated gene expression (reviewed in Refs. [9,18,40,131]). Recently, it has been further noted that, in non-carcinogenic types, the end of E6 overlaps the beginning of E7, similar to other ORFs that overlap at their termini (Fig. 1; Table 1). In contrast, in carcinogenic types, these ORFs no longer overlap due to an insertion that has extended the end of E6 [132]. This region of E6 encodes the protein's PDZ binding motif, which is central to numerous host protein interactions [18]. Thus, the genomic decoupling of E6 and E7 in this region may have potentiated oncogenesis and deserves further attention.
4.3. Within-type (intratypic) HPV diversity
The recent explosion of HPV whole genome sequences has allowed genetic variation within each HPV type to be analysed in unprecedented detail. The simplest way to study this within-type (intratypic) diversity is the consensus sequence approach, i.e., one representative HPV sequence per sample. Comparing consensus sequences between hosts (interhost) has worked particularly well for studying within-type HPV diversity owing to the virus's low mutation rate. However, it is important to recognize that each host is infected not by a single virus but by a population of viruses, within which consequential variation may exist (see section 4.4).
4.3.1. Lineages, sublineages, and SNPs
Classification of HPV sequences within a type employs an alphanumeric nomenclature. At the highest level, lineages differ from one another by ∼1.0–10% across the whole genome and are denoted with an uppercase letter (e.g., A). Within each lineage, sublineages differ from one another by ∼0.5–1.0% and are further denoted with a number (e.g., A1) [34]. For example, HPV16 has a total of four lineages (A, B, C, D) that are divided into 16 sublineages (A1-4, B1-4, C1-4, D1-4) (Fig. 4). Lower levels of classification (e.g., A1.1) have not yet been utilised. The reference sequence for a type is preferentially assigned to lineage A or, if defined, sublineage A1 (e.g., HPV16REF in A1) [34]. Note that lineages and sublineages are best classified using the whole genome rather than L1, because L1 is not sufficiently variable to resolve within-type differences [15,34].
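The approximate whole-genome difference ranges above can likewise be written as a lookup. This is a rough sketch only: the function name and labels are hypothetical, the cut-offs are the approximate values quoted in the text, and how exact boundary values are assigned is an assumption. Formal assignment of lineages and sublineages also relies on phylogenetic placement, which is not modelled here.

```python
# Approximate whole-genome nucleotide difference bounds (percent) from the text.
SUBLINEAGE_MIN_PCT = 0.5
LINEAGE_MIN_PCT = 1.0
LINEAGE_MAX_PCT = 10.0


def within_type_relationship(genome_difference_pct):
    """Label the relationship implied by a pairwise whole-genome difference."""
    if genome_difference_pct < 0:
        raise ValueError("difference cannot be negative")
    if genome_difference_pct < SUBLINEAGE_MIN_PCT:
        return "same sublineage"
    if genome_difference_pct < LINEAGE_MIN_PCT:
        return "different sublineages"
    if genome_difference_pct <= LINEAGE_MAX_PCT:
        return "different lineages"
    return "outside within-type range"


# Hypothetical pairwise differences, in percent.
labels = [within_type_relationship(d) for d in (0.2, 0.7, 3.0, 15.0)]
```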
4.3.1.1. HPV16
For HPV16, the existence, geographic clustering, and potential clinical importance of within-type sequence variation has been recognized since at least 1991 [133]. A1 is by far the most common sublineage and is relatively evenly dispersed across the globe. Other sublineages exhibit sometimes extreme clustering by geographic region, often being prevalent where they evolved, most notably A3 and A4 in East Asia; B1-4 and C1-4 in Africa; D2 and D3 in the Americas; and B4, C4, and D4 in North Africa [134,135]. Remarkably, this peculiar distribution is due at least in part to an ancient host split ∼500 thousand years ago, when the lineage giving rise to A was carried by the Neanderthals/Denisovans, and BCD by the ancestors of modern humans. After a period of separation, the A lineage was then sexually transmitted to modern humans — at the same time as introgression of host nuclear alleles [136,137].
Before the availability of large numbers of HPV16 whole genomes, it was necessary to group the rarer BCD (previously ‘non-European’) lineages together for statistical power (Fig. 4B). These earlier pioneering studies revealed an increased risk of cancer for BCD compared to the A lineage [138–148]. Since that time, more fine-scale evaluations of individual sublineages have become possible with the availability of large numbers of cervical samples for sequencing, e.g., the exfoliated cervical cell samples from the Kaiser Permanente Northern California PaP (Persistence and Progression) cohort [149].
Compared to the most common sublineages (A1 and A2, reference), certain sublineages were shown to be significantly associated with increased risks of cervical precancer and cancer: A4 (odds ratio [OR] for cancer = 3.2), C1 (OR = 2.1), D2 (OR = 28.5), and D3 (OR = 13.9) (Fig. 4B). In contrast, D1 and D4 are not associated with precancer/cancer, and B1 is significantly associated with a lower risk of precancer/cancer (OR = 0.6) [149]. Sublineage risks of precancer and cancer also vary by histologic subtype [147,149–153] (but see Refs. [154,155]). This was most strikingly observed in the U.S., with a strong increased risk of adenocarcinoma conferred by A4 (OR = 9.8), D2 (OR = 137.3), and D3 (OR = 59.5), as compared to A1/A2 [149] (Fig. 4B).
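For readers less familiar with the metric, an unadjusted odds ratio compares the odds of disease in one sublineage with the odds in the reference sublineages. The sketch below computes it from a 2×2 table with a Wald 95% confidence interval. The function name and all counts are hypothetical and are not taken from Ref. [149]; published estimates may also derive from regression models with covariate adjustment, which this sketch does not reproduce.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       ref_cases, ref_controls, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    counts = (exposed_cases, exposed_controls, ref_cases, ref_controls)
    if any(c <= 0 for c in counts):
        raise ValueError("all four cells must be positive for this estimator")
    odds_ratio = (exposed_cases * ref_controls) / (exposed_controls * ref_cases)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 cases and 10 controls carry the sublineage of
# interest; 100 cases and 200 controls carry the reference sublineages.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(30, 10, 100, 200)
```

With these invented counts the odds ratio is 6.0; wide intervals are expected whenever a sublineage is rare, which is one reason earlier studies pooled lineages.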
Precancer and cancer risks associated with sublineages have also been shown to be influenced by host race/ethnicity [140,156] (but see Refs. [157,158]). Specifically, results suggest that precancer/cancer risk is highest when there is a match between a woman's self-reported race/ethnicity and the ancestry in which the infecting sublineage evolved: A1/A2 in whites; A4 in Asians; and D2/D3 in Hispanics [149]. Similarly, there are increased cancer risks for A3, A4, and D sublineages in regions where they are common: A3 in East Asia (OR = 2.2); A4 in East Asia (OR = 6.6) and North America (OR = 3.8); and D in North America (OR = 6.2), where D sublineages are also more frequent in adenocarcinoma [134].
The aforementioned risk differences are particularly remarkable given the relatively small number of genetic differences between the sublineages: in HPV16, the A4 and D2/D3 sublineages differ from A1 by only ∼60 and ∼150 nucleotides, respectively. Early phylogeny-based dN/dS analyses identified evidence for positive selection in E6 [159–161], E5 [159], and L2 [161], suggesting particular codons as candidates for important functional differences between sublineages. More recently, two single nucleotide polymorphisms (SNPs) in the URR were shown to greatly reduce risk of precancer/cancer (ORs ≤0.06) [113]. Additionally, individual SNPs have been linked to differences in HPV16-driven oropharyngeal cancer survival: patients with ≥1 high-risk HPV16 SNP had a median survival of only 4 years compared to 19 years for patients without these SNPs [162]. The relationship between HPV genetic variation and cervical cancer prognosis has not been evaluated.
4.3.1.2. HPV18
Although HPV18 is the second most common type associated with cancer, much less is known about the relationship between its genetic variation and risk of precancer/cancer. It is less prevalent than HPV16, less commonly detected in precancerous lesions, and found in more adenocarcinoma than squamous cell carcinoma, which has likely limited its thorough study.
HPV18 can be classified into three main lineages (A, B, C) and nine sublineages (A1–A5, B1–B3, C). Two small studies suggest that variants may be differentially associated with adenocarcinoma [150,163], but others do not [164]. A dN/dS analysis identified evidence for positive selection in E5 [165]. A worldwide study of HPV18 lineages/sublineages found no major differences in the distribution of lineages between cancer-free controls and cancer cases or histologies; however, when the data were stratified by geographic region, the A1 sublineage was associated with more cancer in Eastern Asia [163]. More studies are needed to understand the role of genetic variation in HPV18-related disease.
4.3.1.3. HPV31
HPV31 has three lineages (A, B, C) that are divided into eight sublineages (A1-2, B1-2, C1-4). Early studies showed that lineages A/B are associated with precancer compared to C [141,166]. A large analysis of 2093 genomes has since revealed that the A1 (OR = 1.7), A2 (OR = 2.5), and B2 (OR = 1.9) sublineages confer higher risk of precancer/cancer than C3 (most common sublineage) [114]. In addition, a single nonsynonymous change in E7 (H23Y) was shown to increase risk of precancer/cancer (OR = 1.6) [114].
4.3.1.4. HPV35
With only two lineages (A, B) and three sublineages (A1, A2, B), HPV35 has less genetic variation and fewer lineages/sublineages than most other carcinogenic types, including its sibling types HPV16 and HPV31 (Fig. 3). One early study suggested that the A1 sublineage is associated with elevated risk of precancer compared to A2 [141]. A subsequent large analysis of 1053 HPV35 genomes has further revealed important differences in risk associated with viral variation by host race/ethnicity. The A2 sublineage confers higher risk (OR = 5.6) of precancer/cancer specifically in African American women, but not other racial/ethnic groups, compared to A1 (most common sublineage) [115]. Consistent with this, A2 is more prevalent among cancers in Africa compared to other world geographic regions [115]. Further, 12 SNPs are associated with precancer/cancer only in women of African ancestry, and women with two or more of these individual SNPs have a strong increased precancer/cancer risk (OR = 69) [115].
4.3.2. E7 constraint in HPV16
Although the genetic basis of HPV16's unique carcinogenicity is far from understood, comparisons between precancer/cancer cases and cancer-free controls in large studies point to E7 as a key factor. Specifically, examination of 5328 HPV16 consensus genomes shows that cancers are characterized by significantly fewer nonsynonymous E7 variants than controls, evidenced by a low odds ratio of 0.16 and a ∼5.6-fold reduction in πN/πS [113]. An in vitro study of these E7 variants showed that the specific variants observed in the controls lead to a reduced level of E7 protein and lower transforming activity [167]. This suggests that E7 may exist in a damaged state in controls, reducing carcinogenicity. However, while E7 conservation appears to be critical for the carcinogenicity of HPV16, this is not necessarily true of other types, e.g., HPV31 [114]. Of note, the increased variation among controls is unlikely to be due to the production of virions early in infection, which would affect all ORFs equally and require new HPV variants to reach high frequencies (i.e., become major alleles) specifically in controls but not cancers.
Although the elevation of nonsynonymous changes in HPV16 controls compared to cases is most pronounced in E7, elevation is also observed in E1 and L1 and somewhat in most ORFs [113]. It is possible that specific amino acid changes may promote viral clearance. Such changes could represent random mutations during replication (see section 3.2), but could also represent an antiviral mechanism, namely the mutagenic activity of human APOBEC3 cytidine deaminases (see section 4.4.1). Specifically, consensus-level nonsynonymous differences in E7 are enriched for C→T at TpC dinucleotides (i.e., TpC→TpT) in controls, a change consistent with APOBEC3 activity [168]. Further, HPV genomes exhibit an overall depletion of TpC, particularly at third codon positions where they would have been most likely to cause tolerable synonymous changes [61,62,169,170]. At those TpC sites that remain, the vast majority of possible C→T changes are nonsynonymous [168]. Thus, synonymous APOBEC3 changes have largely been saturated in the HPV genome. Finally, within HPV16, the D2/D3 sublineage has the fewest remaining TpC sites — as a result of having the largest proportion of TpC→TpT changes in its evolutionary history — compared to A1/A2 sublineages [168]. These changes may have contributed to the lower fitness (prevalence) but enhanced carcinogenicity of D2/D3.
4.3.3. Key considerations for evaluating genetic risk associations
As larger studies are published, it is clear that the grouping of disease outcomes (e.g., precancer and cancer; squamous and glandular lesions) and sublineages (e.g., BCD in HPV16) in smaller studies can conceal important qualitative heterogeneity in lineages and disease outcomes, and mask specific associations. These findings raise the exciting prospect that, as whole genome HPV sequences continue to accumulate, we may gain sufficient resolution to pinpoint more specific variants or combinations of variants that modulate cancer risk even below the sublineage level.
In summary, when evaluating viral genetics and precancer/cancer risk, it is important to consider the genetic variation that exists within individual HPV types with respect to geographic distribution, host race/ethnicity, and histologic subtypes (squamous cell carcinoma vs. adenocarcinoma). For HPV16, findings to date imply that sublineages have adapted to the niches (tissues and cell types) and populations in which they historically evolved — likely including strategies for avoiding immune clearance.
4.4. Within-host (intrahost) HPV diversity
Fine-scale analysis is required to study HPV evolution within a single infected individual and go ‘beyond the consensus’ [171]. Quantification of such within-host viral variation has only recently been made possible by next-generation ‘deep sequencing’, where a very large number of sequencing reads — often thousands — overlap each position being sequenced. This allows the detection of within-host viral variants such as intrahost single nucleotide variants (iSNVs). The relative frequency of a particular variant among the viral sequence reads, often referred to as its variant allele fraction (VAF), can then be used to estimate the allele's frequency in the within-host virus population. For example, a C→T iSNV that is present in 10% of reads is inferred to have a relative frequency of 10% in the virus population infecting the host.
Because sequencing error alone can produce low-frequency false-positive variants, appropriate filtering and quality control metrics are essential for within-host analyses. Filtering usually includes a minimum VAF (e.g., 5%), minimum total read coverage (e.g., 200), minimum absolute number of reads containing the variant (e.g., 10), and elimination of variants displaying strand bias. It has been suggested that the total read coverage should be 10 times the reciprocal of the desired minimum VAF, e.g., 10/0.05 = 200 reads to reliably detect variants at a frequency of 5% [172]. Further, amplicons containing mismatches in PCR primer regions should be eliminated because they can experience amplification biases that invalidate frequency estimates [173].
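The filtering criteria above can be sketched as a single pass over candidate variant calls. This is a minimal illustration, not a published pipeline: the `VariantCall` record and function names are hypothetical, the default thresholds are the example values from the text, and the strand-bias rule (a minimum share of variant reads on the less-represented strand) is an assumption made here, since the text does not specify a criterion; a statistical test on strand counts could be substituted.

```python
import math
from dataclasses import dataclass


@dataclass
class VariantCall:
    pos: int       # genome position of the candidate iSNV
    ref: str       # reference (consensus) base
    alt: str       # variant base
    alt_fwd: int   # variant-supporting reads on the forward strand
    alt_rev: int   # variant-supporting reads on the reverse strand
    depth: int     # total reads covering the position


def variant_allele_fraction(call):
    """Fraction of reads at the position that carry the variant."""
    return (call.alt_fwd + call.alt_rev) / call.depth


def min_coverage_for(min_vaf, multiplier=10):
    """Suggested coverage: `multiplier` times the reciprocal of the minimum VAF."""
    # The small epsilon guards against floating-point noise before rounding up.
    return math.ceil(multiplier / min_vaf - 1e-9)


def passes_filters(call, min_vaf=0.05, min_depth=200,
                   min_alt_reads=10, min_strand_fraction=0.1):
    """Apply coverage, read-count, VAF, and (assumed) strand-bias filters."""
    alt_reads = call.alt_fwd + call.alt_rev
    if call.depth < min_depth or alt_reads < min_alt_reads or alt_reads == 0:
        return False
    if alt_reads / call.depth < min_vaf:
        return False
    # Assumed strand-bias rule: both strands must support the variant.
    return min(call.alt_fwd, call.alt_rev) / alt_reads >= min_strand_fraction


# Hypothetical calls illustrating each filter.
calls = [
    VariantCall(100, "C", "T", alt_fwd=30, alt_rev=20, depth=500),  # kept
    VariantCall(200, "C", "T", alt_fwd=10, alt_rev=10, depth=150),  # low depth
    VariantCall(300, "G", "A", alt_fwd=50, alt_rev=0, depth=500),   # strand bias
    VariantCall(400, "C", "T", alt_fwd=6, alt_rev=6, depth=1000),   # low VAF
]
kept_positions = [c.pos for c in calls if passes_filters(c)]
```

Primer-region mismatches would need to be screened separately at the amplicon level, which is outside the scope of this sketch.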
Not all within-host diversity can be transmitted. Transmission to a new host requires a fully functional virion, and is therefore a major selective event. As a result, within-host diversity often includes transient, potentially deleterious mutations that do not transmit to new hosts, evidenced by the fact that πN/πS or dN/dS ratios are usually higher (closer to 1) within hosts than between hosts [174]. In the case of HPV, this is clearly seen from the accumulation of nonsynonymous changes that, even if they contribute to within-host persistence, would fail to produce infectious virions (e.g., in L1).
HPV types have historically been treated as static or fixed sequences. The major insight provided by genomics over the past decade has been that substantial variation within a type can exist, accumulate, and even modulate cancer risk by orders of magnitude. Ultimately, all such viral variation must have initially arisen as a within-host mutation.
4.4.1. APOBEC3-induced variation
One of the major causes of within-host HPV polymorphism is the interferon-stimulated host APOBEC3 (apolipoprotein B mRNA editing enzyme catalytic polypeptide-like 3) family of cytidine deaminases. APOBEC3 is thought to combat infection by introducing deleterious mutations into the viral genome. This could cause viral clearance either through specific changes (e.g., creation of neoantigens that expose the virus to the immune system) or a sufficiently large number of changes that the viral genomes are rendered nonviable (i.e., lethal mutagenesis [175]). For this to be effective, the mutations likely must occur in the viral reservoir in the basal cell layer.
APOBEC3 specifically acts on single-stranded DNA, such as occurs during transcription and replication, to induce C→U changes predominantly at the C of TpCpW (W = A or T) trinucleotide motifs. This can lead to C→T changes (via lack of repair or base excision repair/Strauss's A rule) and C→G changes (via base excision repair/REV1), accounting for COSMIC single base substitution (SBS) mutational signatures SBS2 and SBS13, respectively [176]. However, while APOBEC signatures SBS2 and SBS13 are both observed in the host (somatic) genome [177], only SBS2 has been observed in the HPV genome during infection [168].
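Whether an observed substitution is compatible with the APOBEC3 motif described above can be checked from its trinucleotide context. The sketch below is illustrative only: the function name and toy sequence are hypothetical, the sequence is assumed to be upper-case and linear (wrap-around at the origin of the circular HPV genome is ignored), and only C→T changes are considered, together with the equivalent G→A change when the deaminated C lies on the opposite strand.

```python
def is_apobec3_compatible(genome, pos, alt):
    """True if the substitution at 0-based `pos` is C>T within a TpCpW motif.

    A deamination on the opposite strand appears on the given strand as
    G>A within WpGpA (the reverse complement of TpCpW).
    """
    if pos < 1 or pos > len(genome) - 2:
        return False  # assumption: no wrap-around for the circular genome
    ref = genome[pos]
    upstream = genome[pos - 1]
    downstream = genome[pos + 1]
    if ref == "C" and alt == "T":
        return upstream == "T" and downstream in "AT"
    if ref == "G" and alt == "A":
        return upstream in "AT" and downstream == "A"
    return False


# Hypothetical toy sequence (not an HPV sequence).
toy_genome = "AATCAGGAGATCG"
checks = {
    "C>T in TCA": is_apobec3_compatible(toy_genome, 3, "T"),
    "C>T in TCG": is_apobec3_compatible(toy_genome, 11, "T"),
    "G>A in AGA": is_apobec3_compatible(toy_genome, 8, "A"),
    "G>A in AGG": is_apobec3_compatible(toy_genome, 5, "A"),
    "C>G in TCA": is_apobec3_compatible(toy_genome, 3, "G"),
}
```

Motif compatibility alone does not establish that APOBEC3 caused a given change; it only identifies changes consistent with its known sequence preference.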
A role for APOBEC3 in HPV infection was first established when Vartanian et al. showed that HPV16 mutations in cervical precancers correlate with changes induced by APOBEC3 expression in vitro [178]. However, it was not yet clear whether or how these variations contribute to carcinogenesis. More recently, deep sequencing of 151 clinical samples with HPV types 16, 52, and 58 has suggested that the frequency of APOBEC3-compatible iSNVs decreases with progression to cancer [54]. For HPV16 specifically, deep sequencing of 5328 samples shows that iSNVs consistent with APOBEC3 activity are enriched in controls compared to cases [168]. These results suggest APOBEC3 may help to reduce viral persistence (and, by extension, progression to cancer) within a host.
Despite APOBEC3's role in controlling viral infection, its mutagenic activity may be a double-edged sword: APOBEC3 signatures are also evident in host (somatic) genomes [177]. Such ‘off-target’ mutagenesis may contribute to carcinogenesis, i.e., the antiviral mechanism may inadvertently cause cancer and play the role of either ‘friend or foe’ [59]. Interestingly, a deletion removing the unique portion of APOBEC3B to create an APOBEC3A/APOBEC3B hybrid was found to be very common in East Asian, Native American, and Oceanic populations [179]. Although the effect of this deletion on HPV clearance or cancer risk is unclear (reviewed in Ref. [59]), it is conceivable that it could modulate clearance of different HPV types or variants in different populations.
Beyond contributing to cancer, APOBEC3 may also help to compensate for HPV's evolutionary limitation of a low mutation rate by providing additional mutational resources, e.g., for immune evasion in a present or future host. This is compatible with the overall saturation of nonsynonymous TpC→TpT changes observed in the virus' evolutionary history [168,169], and the fact that APOBEC-induced mutations have been observed to inadvertently benefit other viruses [180,181].
Finally, it is important to recognize that observed APOBEC3 mutations have likely been biased by natural selection; any mutations that eliminate a viral genome within a host or prevent its transmission to a new host will not persist to be sampled and sequenced.
4.4.2. Neither quasispecies nor invariant
Within-host variation should not be confused with the concept of quasispecies [182]. Briefly, quasispecies theory applies to situations in which mutation rates are so high (typically >1 mutation per genome per replication) that they produce a network (‘cloud’ or ‘swarm’) of interrelated genotypes each replication cycle. Such high mutation rates lead to an approximate steady state of unstable sequences, such that selection no longer acts on individual genomes, but rather on groups of closely related genomes connected by ‘mutational coupling’ [183]. This does not describe the situation with HPV, where mutation rates are too low and within-host viral populations too small to give rise to quasispecies dynamics, and where a single viral genome sequence physically exists and forms a consensus in the majority of samples. For perspective, it is even questionable whether quasispecies theory applies to highly mutable RNA viruses [174]. Quasispecies is not synonymous with the presence of within-host viral polymorphism.
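The threshold mentioned above (>1 mutation per genome per replication) can be expressed as simple arithmetic: the expected number of mutations per genome per replication is the per-site mutation rate multiplied by the genome length. The sketch below assumes mutations are Poisson-distributed, so the fraction of unmutated copies is exp(−U). The function names are hypothetical, and the numerical rates in the usage note are placeholders chosen only to show the calculation, not estimates from this review.

```python
import math


def mutations_per_genome(mu_per_site, genome_length):
    """Expected mutations per genome per replication, U = mu * L."""
    return mu_per_site * genome_length


def fraction_unmutated(mu_per_site, genome_length):
    """Poisson probability that a replicated genome carries no new mutation."""
    return math.exp(-mutations_per_genome(mu_per_site, genome_length))


def exceeds_quasispecies_threshold(mu_per_site, genome_length, threshold=1.0):
    """True if U exceeds the rough threshold of 1 mutation per genome per replication."""
    return mutations_per_genome(mu_per_site, genome_length) > threshold
```

For example, with an approximate HPV genome length of 7900 bp and a placeholder rate of 1e-9 per site per replication, U is about 8e-6, far below the threshold; a placeholder rate of 2e-4 on a 10,000 nt genome gives U of about 2, at which point only about 14% of copies would be unmutated.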
4.5. Integration
Integration into the host genome exists on a continuum, ranging from none to some to all of the HPV genomes in a cell. It is not a normal part of the HPV life cycle, and often occurs in such a way as to disrupt or delete whole ORFs, representing a dead end for the virus [45]. Nevertheless, integration can lead to cancer by conferring a growth advantage on its host cell. Most notably, integration disrupting E1 and E2 (which together regulate the expression of E6 and E7) is thought to be a major path of HPV-driven oncogenesis [45,48]. Less commonly, integration near host genes (e.g., MYC) may cause aberrant expression that promotes cancer [184,185]. However, other mechanisms such as mutations or methylation may lead to similar results, and a substantial proportion of specifically HPV16-associated cancers do not involve integrants [130].
Short-read technologies like Ion Torrent and Illumina can be used to detect integration breakpoints (sites of fusion between host and virus DNA) but may fail to characterize full integration events (complete stretches of HPV DNA flanked on both sides by host DNA). To address this limitation, Nanopore long-read sequencing has recently been used to describe integration events in HPV16-positive cervical cancers. Integration was observed in 15 of 16 tumour samples, with 0–13 breakpoints and 0–5 events per sample, i.e., the same breakpoint was often observed in multiple events in the same sample [186]. Breakpoints were enriched in E1 and E2, and all samples with integration contained at least one event maintaining E6 and E7 DNA, consistent with the dogma that expression of the oncoproteins is important for cervical cancer maintenance [45]. However, RNA expression of E6 and E7 was relatively low in one sample, raising the possibility of an alternative oncogenic pathway [186]. Of note, another study of oropharyngeal cancers did not find an enrichment of E2 breakpoints [187], raising the possibility that cancers at different anatomical sites differ in mechanism.
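The distinction between a breakpoint and a full integration event can be illustrated with a small sketch. Here an event is modelled as the stretch of the circular viral genome between two breakpoints, read in the direction of increasing coordinates and allowed to wrap past the origin. The ORF coordinates below are approximate HPV16 values included only for illustration and should be checked against the reference annotation in use; all function names are hypothetical, and ORFs that themselves span the origin are not handled.

```python
# Approximate HPV16 ORF coordinates (1-based, inclusive); illustrative assumptions only.
HPV16_ORFS = {
    "E6": (83, 559),
    "E7": (562, 858),
    "E1": (865, 2813),
    "E2": (2755, 3852),
}


def covers(start, end, orf_start, orf_end):
    """True if the viral segment start -> end fully contains the ORF.

    If start > end, the segment is taken to wrap around the genome origin.
    """
    if start <= end:
        return start <= orf_start and orf_end <= end
    # Wrapped segment: the ORF lies entirely in either the tail or the head piece.
    return orf_start >= start or orf_end <= end


def summarize_event(start, end, orfs=HPV16_ORFS):
    """Report which ORFs an integrated viral segment retains intact."""
    retained = {name for name, (s, e) in orfs.items() if covers(start, end, s, e)}
    return {
        "retained": retained,
        "lost_or_disrupted": set(orfs) - retained,
        "maintains_E6_E7": {"E6", "E7"} <= retained,
    }
```

Under these assumptions, an event spanning positions 7000 to 900 (wrapping the origin) retains E6 and E7 while losing or disrupting E1 and E2, the pattern described above as a major path of oncogenesis.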
An important caveat applies when interpreting studies of integration. Integration events may occur randomly, but the subset of events maintaining the oncoprotein ORFs may be positively selected because they confer a growth advantage to their host cell. As a result, most studies of viral integration only describe the properties of integration specifically after within-host natural selection of virus and/or host (somatic) genomes has occurred.
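This caveat can be illustrated with a toy simulation. The sketch below assumes, purely for illustration, that breakpoints arise uniformly at random along the genome and that selection acts as a hard filter removing any breakpoint that falls within E6 or E7. The genome length and region boundaries are approximate HPV16 values, and the model is not drawn from any cited study. Even under this uniform mechanism, the breakpoints that remain observable show a higher fraction in E1/E2 than the breakpoints that originally occurred.

```python
import random

GENOME_LENGTH = 7906   # approximate HPV16 genome length (assumption)
E6_E7 = (83, 858)      # approximate span of E6 and E7 (assumption)
E1_E2 = (865, 3852)    # approximate span of E1 and E2 (assumption)


def in_region(pos, region):
    """True if a 1-based position lies within an inclusive region."""
    return region[0] <= pos <= region[1]


def simulate_selection(n_breakpoints=10000, seed=1):
    """Return the fraction of breakpoints in E1/E2 before and after selection."""
    rng = random.Random(seed)
    occurred = [rng.randint(1, GENOME_LENGTH) for _ in range(n_breakpoints)]
    # Toy selection: breakpoints disrupting E6/E7 never reach the sampled tumour.
    observed = [bp for bp in occurred if not in_region(bp, E6_E7)]

    def fraction_in_e1_e2(breakpoints):
        return sum(in_region(bp, E1_E2) for bp in breakpoints) / len(breakpoints)

    return fraction_in_e1_e2(occurred), fraction_in_e1_e2(observed)
```

The difference between the two fractions reflects selection alone, which is why patterns in sampled tumours cannot by themselves reveal where integration initially occurs.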
5. Conclusions
Despite the availability of highly effective VLP-based vaccines against HPV, carcinogenic HPVs still cause ∼604,000 new cervical cancers and ∼124,000 new non-cervical cancers globally each year [2,3]. Thus, it remains important to understand the genetic basis of HPV oncogenicity, particularly that of the uniquely carcinogenic HPV16 type [10].
As with all cancers, the development of HPV-induced cancer likely involves numerous chance events, including transmission, infection by a specific HPV type or variant, mutation, integration, and host (somatic) genetic changes. The stochastic nature of this process suggests there are many unique genetic causes of cervical cancer. Nevertheless, it is hoped that general patterns can be deciphered, and the availability of unprecedented numbers of whole HPV genomes is making this goal increasingly attainable. At the same time, new data are raising a smorgasbord of questions, including the relative contributions of virus and host genetics to cancer, the importance of variability between and within HPV types, and the importance of within-host viral polymorphism (see Box 2). Understanding the genetic basis of HPV carcinogenicity will assist in the fight against morbidity and mortality associated not only with cervical cancer, but also with increasingly prevalent HPV-driven cancers at other anatomical sites in both men and women, as well as with other infection-attributable cancers that may share HPV's mechanisms of oncogenesis.
Box 2. Open Questions in HPV Genomics.
1. What makes HPV16 uniquely carcinogenic at the cervix and particularly at non-cervical sites?
2. Why are HPV16 sublineages A4, D2, and D3 more associated with adenocarcinoma than A1/A2?
3. Can studies of non-carcinogenic HPV types help to inform us about cancer mechanisms? For example, does the loss of carcinogenicity in some HPV types (e.g., HPV97 in the HPV18/HPV45/HPV97 cluster) help to inform about the genetic basis of cancer in related types?
4. What are the relative contributions of virus vs. host genomic changes in the steps leading to carcinogenesis?
5. What are the relative contributions of virus genetics (e.g., E7 genotype) vs. host genetics (e.g., MHC/HLA alleles) in determining infection outcomes?
6. Can synonymous sites in protein-coding regions be used to derive a better estimate of the HPV mutation rate?
7. What are the relative contributions of natural selection (e.g., immune escape), mutation pressure (e.g., APOBEC3), host/pathogen co-divergence, and genetic drift to HPV evolutionary history?
8. Given APOBEC3 signatures are present in both virus and host (somatic) genomes, does this enzyme primarily promote or impede carcinogenesis?
9. Does a mutation or deletion of APOBEC3 modulate risk of cervical cancer and/or clearance of specific HPV types or variants in different populations?
10. Can infectious virus ever be produced from integrated copies, or is integration always a dead end for the virus life cycle? If integration is not always a dead end, is it ever employed as a strategy for immune avoidance or latency?
CRediT author statement
Chase W. Nelson: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization.
Lisa Mirabello: Conceptualization, Validation, Investigation, Resources, Data Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Supervision, Project administration, Funding acquisition.
Funding
This research was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics of the National Cancer Institute (NCI), and by the NCI Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the U.S. Department of Energy (DOE) and the National Institutes of Health (NIH). ORISE is managed by ORAU under DOE contract number DESC0014664. All opinions expressed in this paper are the authors' and do not necessarily reflect the policies and views of NIH, NCBI, DOE, or ORAU/ORISE.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
We thank Meredith Yeager for extensive discussion and feedback; Ming-Hsueh Lin for feedback on figures; Leonardo Varuzza and Felipe Luiz Pereira for feedback on Ion Torrent error rates; the members of the DCEG-NCI HPV Genomics Group (Laurie Burdette, Michael Dean, Aimee Koestler, Elisa Lee, Hong Lou, Sambit Mishra, Maisa Pinheiro, Meredith Yeager), Zachary Ardern, Chen-Hao Kuo, and Xinzhu (April) Wei for discussion; and our reviewers for feedback. We express sincere apologies to those researchers whose work could not be cited due to space limitations and the scope of this work.
Footnotes
Supplementary data to this article can be found online at https://doi.org/10.1016/j.tvr.2023.200258.
Contributor Information
Chase W. Nelson, Email: chase.nelson@nih.gov.
Lisa Mirabello, Email: mirabellol@mail.nih.gov.
Data availability
All sequence data are cited and publicly available on GenBank.
References
- 1. de Martel C., Plummer M., Vignat J., Franceschi S. Worldwide burden of cancer attributable to HPV by site, country and HPV type. Int. J. Cancer. 2017;141:664–670. doi: 10.1002/ijc.30716.
- 2. de Martel C., Georges D., Bray F., Ferlay J., Clifford G.M. Global burden of cancer attributable to infections in 2018: a worldwide incidence analysis. Lancet Global Health. 2020;8:e180–e190. doi: 10.1016/S2214-109X(19)30488-7.
- 3. GLOBOCAN, The Global Cancer Observatory. 2020. Cancer Today. https://gco.iarc.fr/ (accessed June 21, 2022)
- 4. Bouvard V., Baan R., Straif K., Grosse Y., Secretan B., El Ghissassi F., Benbrahim-Tallaa L., Guha N., Freeman C., Galichet L. A review of human carcinogens—Part B: biological agents. Lancet Oncol. 2009;10:321–322. doi: 10.1016/s1470-2045(09)70096-8.
- 5. IARC Working Group on the Evaluation of Carcinogenic Risks to Humans. Biological agents. Volume 100 B. A review of human carcinogens. IARC Monogr. Eval. Carcinog. Risks Hum. 2012;100:1–441.
- 6. Van Doorslaer K., Li Z., Xirasagar S., Maes P., Kaminsky D., Liou D., Sun Q., Kaur R., Huyen Y., McBride A.A. The Papillomavirus Episteme: a major update to the papillomavirus sequence database. Nucleic Acids Res. 2017;45:D499–D506. doi: 10.1093/nar/gkw879.
- 7. PaVE. 2022. The Papillomavirus Episteme. https://pave.niaid.nih.gov/ (accessed June 21, 2022)
- 8. IARC. IARC monographs on the identification of carcinogenic hazards to humans, online database. 2022. https://monographs.iarc.who.int/ (accessed June 21, 2022)
- 9. Doorbar J., Egawa N., Griffin H., Kranjec C., Murakami I. Human papillomavirus molecular biology and disease association. Rev. Med. Virol. 2015;25:2–23. doi: 10.1002/rmv.1822.
- 10. Demarco M., Hyun N., Carter-Pokras O., Raine-Bennett T.R., Cheung L., Chen X., Hammer A., Campos N., Kinney W., Gage J.C., Befano B., Perkins R.B., He X., Dallal C., Chen J., Poitras N., Mayrand M.-H., Coutlee F., Burk R.D., Lorey T., Castle P.E., Wentzensen N., Schiffman M. A study of type-specific HPV natural history and implications for contemporary cervical cancer screening programs. EClinicalMedicine. 2020;22. doi: 10.1016/j.eclinm.2020.100293.
- 11. Doorbar J., Quint W., Banks L., Bravo I.G., Stoler M., Broker T.R., Stanley M.A. The biology and life-cycle of human papillomaviruses. Vaccine. 2012;30:F55–F70. doi: 10.1016/j.vaccine.2012.06.083.
- 12. Krump N.A., You J. Molecular mechanisms of viral oncogenesis in humans. Nat. Rev. Microbiol. 2018;16:684–698. doi: 10.1038/s41579-018-0064-6.
- 13. de Sanjose S., Quint W.G., Alemany L., Geraets D.T., Klaustermeier J.E., Lloveras B., Tous S., Felix A., Bravo L.E., Shin H.-R., Vallejos C.S., de Ruiz P.A., Lima M.A., Guimera N., Clavero O., Alejo M., Llombart-Bosch A., Cheng-Yang C., Tatti S.A., Kasamatsu E., Iljazovic E., Odida M., Prado R., Seoud M., Grce M., Usubutun A., Jain A., Suarez G.A.H., Lombardi L.E., Banjo A., Menéndez C., Domingo E.J., Velasco J., Nessa A., Chichareon S.C.B., Qiao Y.L., Lerma E., Garland S.M., Sasagawa T., Ferrera A., Hammouda D., Mariani L., Pelayo A., Steiner I., Oliva E., Meijer C.J., Al-Jassar W.F., Cruz E., Wright T.C., Puras A., Llave C.L., Tzardi M., Agorastos T., Garcia-Barriola V., Clavel C., Ordi J., Andújar M., Castellsagué X., Sánchez G.I., Nowakowski A.M., Bornstein J., Muñoz N., Bosch F.X. Human papillomavirus genotype attribution in invasive cervical cancer: a retrospective cross-sectional worldwide study. Lancet Oncol. 2010;11:1048–1056. doi: 10.1016/S1470-2045(10)70230-8.
- 14. Arbyn M., Tommasino M., Depuydt C., Dillner J. Are 20 human papillomavirus types causing cervical cancer? J. Pathol. 2014;234:431–435. doi: 10.1002/path.4424.
- 15. Harari A., Chen Z., Burk R.D. In: Current Problems in Dermatology. Ramírez-Fort M.K., Khan F., Rady P.L., Tyring S.K., editors. S. KARGER AG; Basel: 2014. Human papillomavirus genomics: past, present and future; pp. 1–18.
- 16. Burk R.D., Chen Z., Van Doorslaer K. Human papillomaviruses: genetic basis of carcinogenicity. Public Health Genomics. 2009;12:281–290. doi: 10.1159/000214919.
- 17. Yu L., Majerciak V., Zheng Z.-M. HPV16 and HPV18 genome structure, expression, and post-transcriptional regulation. Int. J. Mol. Sci. 2022;23:4943. doi: 10.3390/ijms23094943.
- 18. Vats A., Trejo-Cerro O., Thomas M., Banks L. Human papillomavirus E6 and E7: what remains? Tumour Virus Res. 2021;11. doi: 10.1016/j.tvr.2021.200213.
- 19. Goodwin E.C., Yang E., Lee C.-J., Lee H.-W., DiMaio D., Hwang E.-S. Rapid induction of senescence in human cervical carcinoma cells. Proc. Natl. Acad. Sci. U.S.A. 2000;97:10978–10983. doi: 10.1073/pnas.97.20.10978.
- 20. Mesri E.A., Feitelson M.A., Munger K. Human viral oncogenesis: a cancer hallmarks analysis. Cell Host Microbe. 2014;15:266–282. doi: 10.1016/j.chom.2014.02.011.
- 21. Moody C.A., Laimins L.A. Human papillomavirus oncoproteins: pathways to transformation. Nat. Rev. Cancer. 2010;10:550–560. doi: 10.1038/nrc2886.
- 22. Bravo I.G., Alonso Á. Mucosal human papillomaviruses encode four different E5 proteins whose chemistry and phylogeny correlate with malignant or benign growth. J. Virol. 2004;78:13613–13626. doi: 10.1128/JVI.78.24.13613-13626.2004.
- 23. Campo M.S., Graham S.V., Cortese M.S., Ashrafi G.H., Araibi E.H., Dornan E.S., Miners K., Nunes C., Man S. HPV-16 E5 down-regulates expression of surface HLA class I and reduces recognition by CD8 T cells. Virology. 2010;407:137–142. doi: 10.1016/j.virol.2010.07.044.
- 24. DiMaio D., Petti L.M. The E5 proteins. Virology. 2013;445:99–114. doi: 10.1016/j.virol.2013.05.006.
- 25. Ashrafi G.H., Tsirimonaki E., Marchetti B., O'Brien P.M., Sibbet G.J., Andrew L., Campo M.S. Down-regulation of MHC class I by bovine papillomavirus E5 oncoproteins. Oncogene. 2002;21:248–259. doi: 10.1038/sj.onc.1205008.
- 26. Willemsen A., Félez-Sánchez M., Bravo I.G. Genome plasticity in papillomaviruses and de novo emergence of E5 oncogenes. Genome Biol. Evol. 2019;11:1602–1617. doi: 10.1093/gbe/evz095.
- 27. McBride A.A. Mechanisms and strategies of papillomavirus replication. Biol. Chem. 2017;398:919–927. doi: 10.1515/hsz-2017-0113.
- 28. McBride A.A. The Papillomavirus E2 proteins. Virology. 2013;445:57–79. doi: 10.1016/j.virol.2013.06.006.
- 29. Coursey T.L., McBride A.A. Hitchhiking of viral genomes on cellular chromosomes. Annu. Rev. Virol. 2019;6:275–296. doi: 10.1146/annurev-virology-092818-015716.
- 30. Smith J.A., White E.A., Sowa M.E., Powell M.L.C., Ottinger M., Harper J.W., Howley P.M. Genome-wide siRNA screen identifies SMCX, EP400, and Brd4 as E2-dependent regulators of human papillomavirus oncogene expression. Proc. Natl. Acad. Sci. U.S.A. 2010;107:3752–3757. doi: 10.1073/pnas.0914818107.
- 31. Dreer M., Fertey J., van de Poel S., Straub E., Madlung J., Macek B., Iftner T., Stubenrauch F. Interaction of NCOR/SMRT repressor complexes with papillomavirus E8^E2C proteins inhibits viral replication. PLoS Pathog. 2016;12. doi: 10.1371/journal.ppat.1005556.
- 32. Sakakibara N., Chen D., McBride A.A. Papillomaviruses use recombination-dependent replication to vegetatively amplify their genomes in differentiated cells. PLoS Pathog. 2013;9. doi: 10.1371/journal.ppat.1003321.
- 33. Doorbar J. The E4 protein; structure, function and patterns of expression. Virology. 2013;445:80–98. doi: 10.1016/j.virol.2013.07.008.
- 34. Burk R.D., Harari A., Chen Z. Human papillomavirus genome variants. Virology. 2013;445:232–243. doi: 10.1016/j.virol.2013.07.018.
- 35. Chen X.S., Garcea R.L., Goldberg I., Casini G., Harrison S.C. Structure of small virus-like particles assembled from the L1 protein of human papillomavirus 16. Mol. Cell. 2000;5:557–567. doi: 10.1016/S1097-2765(00)80449-9.
- 36. Stanley M., Lowy D.R., Frazer I. Chapter 12: prophylactic HPV vaccines: underlying mechanisms. Vaccine. 2006;24:S106–S113. doi: 10.1016/j.vaccine.2006.05.110.
- 37. Prabhu P.R., Carter J.J., Galloway D.A. B cell responses upon human papillomavirus (HPV) infection and vaccination. Vaccines. 2022;10:837. doi: 10.3390/vaccines10060837.
- 38. Olcese V.A., Chen Y., Schlegel R., Yuan H. Characterization of HPV16 L1 loop domains in the formation of a type-specific, conformational epitope. BMC Microbiol. 2004;4:29. doi: 10.1186/1471-2180-4-29.
- 39. Shah S.D., Doorbar J., Goldstein R.A. Analysis of host–parasite incongruence in papillomavirus evolution using importance sampling. Mol. Biol. Evol. 2010;27:1301–1314. doi: 10.1093/molbev/msq015.
- 40. Egawa N., Doorbar J. The low-risk papillomaviruses. Virus Res. 2017;231:119–127. doi: 10.1016/j.virusres.2016.12.017.
- 41. Frattini M.G., Lim H.B., Laimins L.A. In vitro synthesis of oncogenic human papillomaviruses requires episomal genomes for differentiation-dependent late expression. Proc. Natl. Acad. Sci. U.S.A. 1996;93:3062–3067. doi: 10.1073/pnas.93.7.3062.
- 42. Bedell M.A., Hudson J.B., Golub T.R., Turyk M.E., Hosken M., Wilbanks G.D., Laimins L.A. Amplification of human papillomavirus genomes in vitro is dependent on epithelial differentiation. J. Virol. 1991;65:2254–2260. doi: 10.1128/jvi.65.5.2254-2260.1991.
- 43. Stanley M.A., Browne H.M., Appleby M., Minson A.C. Properties of a non-tumorigenic human cervical keratinocyte cell line. Int. J. Cancer. 1989;43:672–676. doi: 10.1002/ijc.2910430422.
- 44. Maglennon G.A., McIntosh P., Doorbar J. Persistence of viral DNA in the epithelial basal layer suggests a model for papillomavirus latency following immune regression. Virology. 2011;414:153–163. doi: 10.1016/j.virol.2011.03.019.
- 45. McBride A.A., Warburton A. The role of integration in oncogenic progression of HPV-associated cancers. PLoS Pathog. 2017;13. doi: 10.1371/journal.ppat.1006211.
- 46. Stanley M.A. Epithelial cell responses to infection with human papillomavirus. Clin. Microbiol. Rev. 2012;25:215–222. doi: 10.1128/CMR.05028-11.
- 47. Roden R.B.S., Stern P.L. Opportunities and challenges for human papillomavirus vaccination in cancer. Nat. Rev. Cancer. 2018;18:240–254. doi: 10.1038/nrc.2018.13.
- 48. Stanley M.A., Pett M.R., Coleman N. HPV: from infection to cancer. Biochem. Soc. Trans. 2007;35:1456–1460. doi: 10.1042/BST0351456.
- 49. Moore P.S., Chang Y. Why do viruses cause cancer? Highlights of the first century of human tumour virology. Nat. Rev. Cancer. 2010;10:878–889. doi: 10.1038/nrc2961.
- 50. Mirabello L., Clarke M.A., Nelson C.W., Dean M., Wentzensen N., Yeager M., Cullen M., Boland J., NCI HPV Workshop, Schiffman M., Burk R.D. The intersection of HPV epidemiology, genomics and mechanistic studies of HPV-mediated carcinogenesis. Viruses. 2018;10:80. doi: 10.3390/v10020080.
- 51. Syrjänen K.J., Pyrhönen S. Immunoperoxidase demonstration of human papilloma virus (HPV) in dysplastic lesions of the uterine cervix. Arch. Gynecol. 1982;233:53–61. doi: 10.1007/BF02110679.
- 52. Schiffman M., Doorbar J., Wentzensen N., de Sanjosé S., Fakhry C., Monk B.J., Stanley M.A., Franceschi S. Carcinogenic human papillomavirus infection. Nat. Rev. Dis. Prim. 2016;2. doi: 10.1038/nrdp.2016.86.
- 53. Kukimoto I., Maehama T., Sekizuka T., Ogasawara Y., Kondo K., Kusumoto-Matsuo R., Mori S., Ishii Y., Takeuchi T., Yamaji T., Takeuchi F., Hanada K., Kuroda M. Genetic variation of human papillomavirus type 16 in individual clinical specimens revealed by deep sequencing. PLoS One. 2013;8. doi: 10.1371/journal.pone.0080583.
- 54. Hirose Y., Onuki M., Tenjimbayashi Y., Mori S., Ishii Y., Takeuchi T., Tasaka N., Satoh T., Morisada T., Iwata T., Miyamoto S., Matsumoto K., Sekizawa A., Kukimoto I. Within-host variations of human papillomavirus reveal APOBEC signature mutagenesis in the viral genome. J. Virol. 2018;92:e00017–e00018. doi: 10.1128/JVI.00017-18.
- 55. Schiffman M., Herrero R., DeSalle R., Hildesheim A., Wacholder S., Cecilia Rodriguez A., Bratti M.C., Sherman M.E., Morales J., Guillen D., Alfaro M., Hutchinson M., Wright T.C., Solomon D., Chen Z., Schussler J., Castle P.E., Burk R.D. The carcinogenicity of human papillomavirus types reflects viral evolution. Virology. 2005;337:76–84. doi: 10.1016/j.virol.2005.04.002.
- 56. Bzhalava D., Guan P., Franceschi S., Dillner J., Clifford G. A systematic review of the prevalence of mucosal and cutaneous human papillomavirus types. Virology. 2013;445:224–231. doi: 10.1016/j.virol.2013.07.015.
- 57. Flores E.R., Lambert P.F. Evidence for a switch in the mode of human papillomavirus type 16 DNA replication during the viral life cycle. J. Virol. 1997;71:7167–7179. doi: 10.1128/jvi.71.10.7167-7179.1997.
- 58. Roerink S.F., van Schendel R., Tijsterman M. Polymerase theta-mediated end joining of replication-associated DNA breaks in C. elegans. Genome Res. 2014;24:954–962. doi: 10.1101/gr.170431.113.
- 59. Warren C.J., Santiago M.L., Pyeon D. APOBEC3: friend or foe in human papillomavirus infection and oncogenesis? Annu. Rev. Virol. 2022;9:16.1–16.21. doi: 10.1146/annurev-virology-092920-030354.
- 60. Fryxell K.J., Zuckerkandl E. Cytosine deamination plays a primary role in the evolution of mammalian isochores. Mol. Biol. Evol. 2000;17:1371–1383. doi: 10.1093/oxfordjournals.molbev.a026420.
- 61. Chen Z., Utro F., Platt D., DeSalle R., Parida L., Chan P.K.S., Burk R.D. K-mer analyses reveal different evolutionary histories of Alpha, Beta, and Gamma papillomaviruses. Int. J. Mol. Sci. 2021;22:9657. doi: 10.3390/ijms22179657.
- 62. King K.M., Rajadhyaksha E.V., Tobey I.G., Van Doorslaer K. Synonymous nucleotide changes drive papillomavirus evolution. Tumour Virus Res. 2022;14. doi: 10.1016/j.tvr.2022.200248.
- 63. Rowson K.E.K., Mahy B.W.J. Human papova (wart) virus. Bacteriol. Rev. 1967;31:110–131. doi: 10.1128/br.31.2.110-131.1967.
- 64. Meyers C., Frattini M.G., Hudson J.B., Laimins L.A. Biosynthesis of human papillomavirus from a continuous cell line upon epithelial differentiation. Science. 1992;257:971–973. doi: 10.1126/science.1323879.
- 65. Fausch S.C., Da Silva D.M., Eiben G.L., Le Poole C., Kast W.M. HPV protein/peptide vaccines: from animal models to clinical trials. Front. Biosci. 2003;8:s81–s91. doi: 10.2741/1009.
- 66. Van Doorslaer K. Evolution of the Papillomaviridae. Virology. 2013;445:11–20. doi: 10.1016/j.virol.2013.05.012.
- 67. Kimura M. Evolutionary rate at the molecular level. Nature. 1968;217:624–626. doi: 10.1038/217624a0.
- 68. Kimura M. Cambridge University Press; 1983. The Neutral Theory of Molecular Evolution.
- 69. Ong C.-K., Chan S.-Y., Campo M.S., Fujinaga K., Mavromara-Nazos P., Labropoulou V., Pfister H., Tay S.-K., ter Meulen J., Villa L.L., Bernard H.-U. Evolution of human papillomavirus type 18: an ancient phylogenetic root in Africa and intratype diversity reflect coevolution with human ethnic groups. J. Virol. 1993;67:6424–6431. doi: 10.1128/jvi.67.11.6424-6431.1993.
- 70. Rector A., Lemey P., Tachezy R., Mostmans S., Ghim S.-J., Van Doorslaer K., Roelke M., Bush M., Montali R.J., Joslin J., Burk R.D., Jenson A.B., Sundberg J.P., Shapiro B., Van Ranst M. Ancient papillomavirus-host co-speciation in Felidae. Genome Biol. 2007;8:R57. doi: 10.1186/gb-2007-8-4-r57.
- 71. Jenkins G.M., Rambaut A., Pybus O.G., Holmes E.C. Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis. J. Mol. Evol. 2002;54:156–165. doi: 10.1007/s00239-001-0064-3.
- 72. Hanada K., Suzuki Y., Gojobori T. A large variation in the rates of synonymous substitution for RNA viruses and its relationship to a diversity of viral infection and transmission modes. Mol. Biol. Evol. 2004;21:1074–1080. doi: 10.1093/molbev/msh109.
- 73. Jónsson H., Sulem P., Kehr B., Kristmundsdottir S., Zink F., Hjartarson E., Hardarson M.T., Hjorleifsson K.E., Eggertsson H.P., Gudjonsson S.A., Ward L.D., Arnadottir G.A., Helgason E.A., Helgason H., Gylfason A., Jonasdottir A., Jonasdottir A., Rafnar T., Frigge M., Stacey S.N., Magnusson O. Th, Thorsteinsdottir U., Masson G., Kong A., Halldorsson B.V., Helgason A., Gudbjartsson D.F., Stefansson K. Parental influence on human germline de novo mutations in 1,548 trios from Iceland. Nature. 2017;549:519–522. doi: 10.1038/nature24018.
- 74. Nei M., Li W.-H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl. Acad. Sci. U.S.A. 1979;76:5269–5273. doi: 10.1073/pnas.76.10.5269.
- 75. Zhao L., Illingworth C.J.R. Measurements of intrahost viral diversity require an unbiased diversity metric. Virus Evolution. 2019;5:vey041. doi: 10.1093/ve/vey041.
- 76. Nei M., Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 1986;3:418–426. doi: 10.1093/oxfordjournals.molbev.a040410.
- 77. Nelson C.W., Hughes A.L. Within-host nucleotide diversity of virus populations: insights from next-generation sequencing. Infect. Genet. Evol. 2015;30:1–7. doi: 10.1016/j.meegid.2014.11.026.
- 78. Hughes A.L. Oxford University Press; New York, NY: 1999. Adaptive Evolution of Genes and Genomes.
- 79. Kryazhimskiy S., Plotkin J.B. The Population Genetics of dN/dS. PLoS Genet. 2008;4. doi: 10.1371/journal.pgen.1000304.
- 80. Holmes E.C., Lipman D.J., Zamarin D., Yewdell J.W. Comment on ‘large-scale sequence analysis of avian influenza isolates’. Science. 2006;313:1573b–1573b. doi: 10.1126/science.1131729.
- 81. Nelson C.W., Ardern Z., Wei X. OLGenie: estimating natural selection to predict functional overlapping genes. Mol. Biol. Evol. 2020;37:2440–2449. doi: 10.1093/molbev/msaa087.
- 82. Wei X., Zhang J. A simple method for estimating the strength of natural selection on overlapping genes. Genome Biol. Evol. 2015;7:381–390. doi: 10.1093/gbe/evu294.
- 83. Dillner J. Mapping of linear epitopes of human papillomavirus type 16: the E1, E2, E4, E5, E6 and E7 open reading frames. Int. J. Cancer. 1990;46:703–711. doi: 10.1002/ijc.2910460426.
- 84. Lehtinen M., Hibma M.H., Stellato G., Kuoppala T., Paavonen J. Human T helper cell epitopes overlap B cell and putative cytotoxic T cell epitopes in the E2 protein of human papillomavirus type 16. Biochem. Biophys. Res. Commun. 1995;209:541–546. doi: 10.1006/bbrc.1995.1535.
- 85.Félez-Sánchez M., Trösemeier J.-H., Bedhomme S., González-Bravo M.I., Kamp C., Bravo I.G. Cancer, warts, or asymptomatic infections: clinical presentation matches codon usage preferences in human papillomaviruses. Genome Biol. Evol. 2015;7:2117–2135. doi: 10.1093/gbe/evv129. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 86.Zhang J., Yang J.-R. Determinants of the rate of protein sequence evolution. Nat. Rev. Genet. 2015;16:409–420. doi: 10.1038/nrg3950. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 87.Hughes A.L., Hughes M.A.K. Patterns of nucleotide difference in overlapping and non-overlapping reading frames of papillomavirus genomes. Virus Res. 2005;113:81–88. doi: 10.1016/j.virusres.2005.03.030. [DOI] [PubMed] [Google Scholar]
- 88.Narechania A., Terai M., Burk R.D. Overlapping reading frames in closely related human papillomaviruses result in modular rates of selection within E2. J. Gen. Virol. 2005;86:1307–1313. doi: 10.1099/vir.0.80747-0. [DOI] [PubMed] [Google Scholar]
- 89.Jiang M., Xi L.F., Edelstein Z.R., Galloway D.A., Olsen G.J., Lin W.C.-C., Kiviat N.B. Identification of recombinant human papillomavirus type 16 variants. Virology. 2009;394:8–11. doi: 10.1016/j.virol.2009.08.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 90.Nikolaidis M., Tsakogiannis D., Bletsa G., Mossialos D., Kottaridi C., Iliopoulos I., Markoulatos P., Amoutzias G.D. HPV16-Genotyper: a computational tool for risk-assessment, lineage genotyping and recombination detection in HPV16 sequences, based on a large-scale evolutionary analysis. Diversity. 2021;13:497. doi: 10.3390/d13100497. [DOI] [Google Scholar]
- 91.Narechania A., Chen Z., DeSalle R., Burk R.D. Phylogenetic incongruence among oncogenic genital alpha human papillomaviruses. J. Virol. 2005;79:15503–15510. doi: 10.1128/JVI.79.24.15503-15510.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 92.Daugherty M.D., Malik H.S. Rules of engagement: molecular insights from host-virus arms races. Annu. Rev. Genet. 2012;46:677–700. doi: 10.1146/annurev-genet-110711-155522. [DOI] [PubMed] [Google Scholar]
- 93.Tenthorey J.L., Emerman M., Malik H.S. Evolutionary landscapes of host-virus arms races. Annu. Rev. Immunol. 2022;40:271–294. doi: 10.1146/annurev-immunol-072621-084422. [DOI] [PubMed] [Google Scholar]
- 94.Schiller J.T., Lowy D.R. Understanding and learning from the success of prophylactic human papillomavirus vaccines. Nat. Rev. Microbiol. 2012;10:681–692. doi: 10.1038/nrmicro2872. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 95.Zinkernagel R.M., Hengartner H. Antiviral immunity. Immunol. Today. 1997;18:258–260. doi: 10.1016/S0167-5699(97)80017-5. [DOI] [PubMed] [Google Scholar]
- 96.Trottier H., Franco E.L. The epidemiology of genital human papillomavirus infection. Vaccine. 2006;24:S4–S15. doi: 10.1016/j.vaccine.2005.09.054. [DOI] [PubMed] [Google Scholar]
- 97.Leo P.J., Madeleine M.M., Wang S., Schwartz S.M., Newell F., Pettersson-Kymmer U., Hemminki K., Hallmans G., Tiews S., Steinberg W., Rader J.S., Castro F., Safaeian M., Franco E.L., Coutlée F., Ohlsson C., Cortes A., Marshall M., Mukhopadhyay P., Cremin K., Johnson L.G., Garland S., Tabrizi S.N., Wentzensen N., Sitas F., Little J., Cruickshank M., Frazer I.H., Hildesheim A., Brown M.A. Defining the genetic susceptibility to cervical neoplasia—a genome-wide association study. PLoS Genet. 2017;13 doi: 10.1371/journal.pgen.1006866. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 98.Zehbe I., Tachezy R., Mytilineos J., Voglino G., Mikyskova I., Delius H., Marongiu A., Gissmann L., Wilander E., Tommasino M. Human papillomavirus 16 E6 polymorphisms in cervical lesions from different European populations and their correlation with human leukocyte antigen class II haplotypes. Int. J. Cancer. 2001;94:711–716. doi: 10.1002/ijc.1520. [DOI] [PubMed] [Google Scholar]
- 99.Zehbe I., Mytilineos J., Wikström I., Henriksen R., Edler L., Tommasino M. Association between human papillomavirus 16 E6 variants and human leukocyte antigen class I polymorphism in cervical cancer of Swedish women. Hum. Immunol. 2003;64:538–542. doi: 10.1016/S0198-8859(03)00033-8. [DOI] [PubMed] [Google Scholar]
- 100.Ivansson E., Juko-Pecirep I., Erlich H., Gyllensten U. Pathway-based analysis of genetic susceptibility to cervical cancer in situ: HLA-DPB1 affects risk in Swedish women. Genes Immun. 2011;12:605–614. doi: 10.1038/gene.2011.40. [DOI] [PubMed] [Google Scholar]
- 101.Shi Y., Li L., Hu Z., Li S., Wang S., Liu J., Wu C., He L., Zhou J., Li Z., Hu T., Chen Y., Jia Y., Wang S., Wu L., Cheng X., Yang Z., Yang R., Li X., Huang K., Zhang Q., Zhou H., Tang F., Chen Z., Shen J., Jiang J., Ding H., Xing H., Zhang S., Qu P., Song X., Lin Z., Deng D., Xi L., Lv W., Han X., Tao G., Yan L., Han Z., Li Z., Miao X., Pan S., Shen Y., Wang H., Liu D., Gong E., Li Z., Zhou L., Luan X., Wang C., Song Q., Wu S., Xu H., Shen J., Qiang F., Ma G., Liu L., Chen X., Liu J., Wu J., Shen Y., Wen Y., Chu M., Yu J., Hu X., Fan Y., He H., Jiang Y., Lei Z., Liu C., Chen J., Zhang Y., Yi C., Chen S., Li W., Wang D., Wang Z., Di W., Shen K., Lin D., Shen H., Feng Y., Xie X., Ma D. A genome-wide association study identifies two new cervical cancer susceptibility loci at 4q12 and 17q12. Nat. Genet. 2013;45:918–922. doi: 10.1038/ng.2687. [DOI] [PubMed] [Google Scholar]
- 102.Chen D., Juko-Pecirep I., Hammer J., Ivansson E., Enroth S., Gustavsson I., Feuk L., Magnusson P.K.E., McKay J.D., Wilander E., Gyllensten U. Genome-wide association study of susceptibility loci for cervical cancer. JNCI J. Nat. Cancer Instit. 2013;105:624–633. doi: 10.1093/jnci/djt051. [DOI] [PubMed] [Google Scholar]
- 103.Lynch M. Sinauer Associates, Inc. Publishers; Sunderland, MA: 2007. The Origins of Genome Architecture. [Google Scholar]
- 104.Nowak M.A. Belknap/Harvard; Canada: 2006. Evolutionary Dynamics. [Google Scholar]
- 105.Zhou J., Sun X.Y., Stenzel D.J., Frazer I.H. Expression of vaccinia recombinant HPV 16 L1 and L2 ORF proteins in epithelial cells is sufficient for assembly of HPV virion-like particles. Virology. 1991;185:251–257. doi: 10.1016/0042-6822(91)90772-4. [DOI] [PubMed] [Google Scholar]
- 106.Kirnbauer R., Booy F., Cheng N., Lowy D.R., Schiller J.T. Papillomavirus L1 major capsid protein self-assembles into virus-like particles that are highly immunogenic. Proc. Natl. Acad. Sci. U.S.A. 1992;89:12180–12184. doi: 10.1073/pnas.89.24.12180. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 107.Kirnbauer R., Taub J., Greenstone H., Roden R., Dürst M., Gissmann L., Lowy D.R., Schiller J.T. Efficient self-assembly of human papillomavirus type 16 L1 and L1-L2 into virus-like particles. J. Virol. 1993;67:6929–6936. doi: 10.1128/JVI.67.12.6929-6936.1993. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 108.Smith B., Chen Z., Reimers L., van Doorslaer K., Schiffman M., DeSalle R., Herrero R., Yu K., Wacholder S., Wang T., Burk R.D. Sequence imputation of HPV16 genomes for genetic association studies. PLoS One. 2011;6 doi: 10.1371/journal.pone.0021375. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 109.Sun M., Gao L., Liu Y., Zhao Y., Wang X., Pan Y., Ning T., Cai H., Yang H., Zhai W., Ke Y. Whole genome sequencing and evolutionary analysis of human papillomavirus type 16 in Central China. PLoS One. 2012;7 doi: 10.1371/journal.pone.0036577. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 110.van der Weele P., Meijer C.J.L.M., King A.J. Whole-genome sequencing and variant analysis of human papillomavirus 16 infections. J. Virol. 2017;91:e00844-17. doi: 10.1128/JVI.00844-17. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 111.van der Weele P., Meijer C.J.L.M., King A.J. High whole-genome sequence diversity of human papillomavirus type 18 isolates. Viruses. 2018;10:68. doi: 10.3390/v10020068. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 112.Cullen M., Boland J.F., Schiffman M., Zhang X., Wentzensen N., Yang Q., Chen Z., Yu K., Mitchell J., Roberson D., Bass S., Burdette L., Machado M., Ravichandran S., Luke B., Machiela M.J., Andersen M., Osentoski M., Laptewicz M., Wacholder S., Feldman A., Raine-Bennett T., Lorey T., Castle P.E., Yeager M., Burk R.D., Mirabello L. Deep sequencing of HPV16 genomes: a new high-throughput tool for exploring the carcinogenicity and natural history of HPV16 infection. Papillomavirus Res. 2015;1:3–11. doi: 10.1016/j.pvr.2015.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 113.Mirabello L., Yeager M., Yu K., Clifford G.M., Xiao Y., Zhu B., Cullen M., Boland J.F., Wentzensen N., Nelson C.W., Raine-Bennett T., Chen Z., Bass S., Song L., Yang Q., Steinberg M., Burdett L., Dean M., Roberson D., Mitchell J., Lorey T., Franceschi S., Castle P.E., Walker J., Zuna R., Kreimer A.R., Beachler D.C., Hildesheim A., Gonzalez P., Porras C., Burk R.D., Schiffman M. HPV16 E7 genetic conservation is critical to carcinogenesis. Cell. 2017;170:1164–1174. doi: 10.1016/j.cell.2017.08.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 114.Pinheiro M., Harari A., Schiffman M., Clifford G.M., Chen Z., Yeager M., Cullen M., Boland J.F., Raine-Bennett T., Steinberg M., Bass S., Xiao Y., Tenet V., Yu K., Zhu B., Burdett L., Turan S., Lorey T., Castle P.E., Wentzensen N., Burk R.D., Mirabello L. Phylogenomic analysis of human papillomavirus type 31 and cervical carcinogenesis: a study of 2093 viral genomes. Viruses. 2021;13:1948. doi: 10.3390/v13101948. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 115.Pinheiro M., Gage J.C., Clifford G.M., Demarco M., Cheung L.C., Chen Z., Yeager M., Cullen M., Boland J.F., Chen X., Raine‐Bennett T., Steinberg M., Bass S., Befano B., Xiao Y., Tenet V., Walker J., Zuna R., Poitras N.E., Gold M.A., Dunn T., Yu K., Zhu B., Burdett L., Turan S., Lorey T., Castle P.E., Wentzensen N., Burk R.D., Schiffman M., Mirabello L. Association of HPV35 with cervical carcinogenesis among women of African ancestry: evidence of viral‐host interaction with implications for disease intervention. Int. J. Cancer. 2020;147:2677–2686. doi: 10.1002/ijc.33033. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 116.Pereira F.L., Soares S.C., Dorella F.A., Leal C.A.G., Figueiredo H.C.P. Evaluating the efficacy of the new Ion PGM Hi-Q Sequencing Kit applied to bacterial genomes. Genomics. 2016;107:189–198. doi: 10.1016/j.ygeno.2016.03.004. [DOI] [PubMed] [Google Scholar]
- 117.Rothberg J.M., Hinz W., Rearick T.M., Schultz J., Mileski W., Davey M., Leamon J.H., Johnson K., Milgrew M.J., Edwards M., Hoon J., Simons J.F., Marran D., Myers J.W., Davidson J.F., Branting A., Nobile J.R., Puc B.P., Light D., Clark T.A., Huber M., Branciforte J.T., Stoner I.B., Cawley S.E., Lyons M., Fu Y., Homer N., Sedova M., Miao X., Reed B., Sabina J., Feierstein E., Schorn M., Alanjary M., Dimalanta E., Dressman D., Kasinskas R., Sokolsky T., Fidanza J.A., Namsaraev E., McKernan K.J., Williams A., Roth G.T., Bustillo J. An integrated semiconductor device enabling non-optical genome sequencing. Nature. 2011;475:348–352. doi: 10.1038/nature10242. [DOI] [PubMed] [Google Scholar]
- 118.Pastrana D.V., Peretti A., Welch N.L., Borgogna C., Olivero C., Badolato R., Notarangelo L.D., Gariglio M., FitzGerald P.C., McIntosh C.E., Reeves J., Starrett G.J., Bliskovsky V., Velez D., Brownell I., Yarchoan R., Wyvill K.M., Uldrick T.S., Maldarelli F., Lisco A., Sereti I., Gonzalez C.M., Androphy E.J., McBride A.A., Van Doorslaer K., Garcia F., Dvoretzky I., Liu J.S., Han J., Murphy P.M., McDermott D.H., Buck C.B. Metagenomic discovery of 83 new human papillomavirus types in patients with immunodeficiency. mSphere. 2018;3:e00645-18. doi: 10.1128/mSphereDirect.00645-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 119.Tirosh O., Conlan S., Deming C., Lee-Lin S.-Q., Huang X., NISC Comparative Sequencing Program, Su H.C., Freeman A.F., Segre J.A., Kong H.H. Expanded skin virome in DOCK8-deficient patients. Nat. Med. 2018;24:1815–1821. doi: 10.1038/s41591-018-0211-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 120.de Villiers E.-M., Fauquet C., Broker T.R., Bernard H.-U., zur Hausen H. Classification of papillomaviruses. Virology. 2004;324:17–27. doi: 10.1016/j.virol.2004.03.033. [DOI] [PubMed] [Google Scholar]
- 121.Van Doorslaer K. Revisiting papillomavirus taxonomy: a proposal for updating the current classification in line with evolutionary evidence. Viruses. 2022;14:2308. doi: 10.3390/v14102308. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 122.Chan S.Y., Delius H., Halpern A.L., Bernard H.U. Analysis of genomic sequences of 95 papillomavirus types: uniting typing, phylogeny, and taxonomy. J. Virol. 1995;69:3074–3083. doi: 10.1128/jvi.69.5.3074-3083.1995. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 123.Kogure G., Onuki M., Hirose Y., Yamaguchi-Naka M., Mori S., Iwata T., Kondo K., Sekizawa A., Matsumoto K., Kukimoto I. Whole-genome analysis of human papillomavirus 67 isolated from Japanese women with cervical lesions. Virol. J. 2022;19:157. doi: 10.1186/s12985-022-01894-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 124.Dürst M., Gissmann L., Ikenberg H., zur Hausen H. A papillomavirus DNA from a cervical carcinoma and its prevalence in cancer biopsy samples from different geographic regions. Proc. Natl. Acad. Sci. U.S.A. 1983;80:3812–3815. doi: 10.1073/pnas.80.12.3812. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 125.Castellsagué X., Klaustermeier J., Carrilho C., Albero G., Sacarlal J., Quint W., Kleter B., Lloveras B., Ismail M.R., de Sanjosé S., Bosch F.X., Alonso P., Menéndez C. Vaccine-related HPV genotypes in women with and without cervical cancer in Mozambique: burden and potential for prevention. Int. J. Cancer. 2007;122:1901–1904. doi: 10.1002/ijc.23292. [DOI] [PubMed] [Google Scholar]
- 126.Okolo C., Franceschi S., Adewole I., Thomas J.O., Follen M., Snijders P.J., Meijer C.J., Clifford G.M. Human papillomavirus infection in women with and without cervical cancer in Ibadan, Nigeria. Infect. Agents Cancer. 2010;5:24. doi: 10.1186/1750-9378-5-24. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 127.Guan P., Howell-Jones R., Li N., Bruni L., de Sanjosé S., Franceschi S., Clifford G.M. Human papillomavirus types in 115,789 HPV-positive women: a meta-analysis from cervical infection to cancer. Int. J. Cancer. 2012;131:2349–2359. doi: 10.1002/ijc.27485. [DOI] [PubMed] [Google Scholar]
- 128.Denny L., Adewole I., Anorlu R., Dreyer G., Moodley M., Smith T., Snyman L., Wiredu E., Molijn A., Quint W., Ramakrishnan G., Schmidt J. Human papillomavirus prevalence and type distribution in invasive cervical cancer in sub-Saharan Africa: cervical cancer in sub-Saharan Africa. Int. J. Cancer. 2014;134:1389–1398. doi: 10.1002/ijc.28425. [DOI] [PubMed] [Google Scholar]
- 129.Clifford G.M., de Vuyst H., Tenet V., Plummer M., Tully S., Franceschi S. Effect of HIV infection on human papillomavirus types causing invasive cervical cancer in Africa. JAIDS J. Acquir. Immune Defic. Syndr. 2016;73:332–339. doi: 10.1097/QAI.0000000000001113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 130.The Cancer Genome Atlas Research Network. Integrated genomic and molecular characterization of cervical cancer. Nature. 2017;543:378–384. doi: 10.1038/nature21386. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 131.Klingelhutz A.J., Roman A. Cellular transformation by human papillomaviruses: lessons learned by comparing high- and low-risk viruses. Virology. 2012;424:77–98. doi: 10.1016/j.virol.2011.12.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 132.Auslander N., Wolf Y.I., Shabalina S.A., Koonin E.V. A unique insert in the genomes of high-risk human papillomaviruses with a predicted dual role in conferring oncogenic risk. F1000Res. 2019;8:1000. doi: 10.12688/f1000research.19590.2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 133.Ho L., Chan S.-Y., Chow V., Chong T., Tay S.-K., Villa L.L., Bernard H.-U. Sequence variants of human papillomavirus type 16 in clinical samples permit verification and extension of epidemiological studies and construction of a phylogenetic tree. J. Clin. Microbiol. 1991;29:1765–1772. doi: 10.1128/jcm.29.9.1765-1772.1991. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 134.Clifford G.M., Tenet V., Georges D., Alemany L., Pavón M.A., Chen Z., Yeager M., Cullen M., Boland J.F., Bass S., Steinberg M., Raine-Bennett T., Lorey T., Wentzensen N., Walker J., Zuna R., Schiffman M., Mirabello L. Human papillomavirus 16 sub-lineage dispersal and cervical cancer risk worldwide: whole viral genome sequences from 7116 HPV16-positive women. Papillomavirus Res. 2019;7:67–74. doi: 10.1016/j.pvr.2019.02.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 135.Nicolás-Párraga S., Alemany L., de Sanjosé S., Bosch F.X., Bravo I.G., RIS HPV TT and HPV VVAP Study Groups. Differential HPV16 variant distribution in squamous cell carcinoma, adenocarcinoma and adenosquamous cell carcinoma: HPV16 variants in different cervical cancer histologies. Int. J. Cancer. 2017;140:2092–2100. doi: 10.1002/ijc.30636. [DOI] [PubMed] [Google Scholar]
- 136.Pimenoff V.N., de Oliveira C.M., Bravo I.G. Transmission between archaic and modern human ancestors during the evolution of the oncogenic human papillomavirus 16. Mol. Biol. Evol. 2017;34:4–19. doi: 10.1093/molbev/msw214. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 137.Chen Z., DeSalle R., Schiffman M., Herrero R., Wood C.E., Ruiz J.C., Clifford G.M., Chan P.K.S., Burk R.D. Niche adaptation and viral transmission of human papillomaviruses from archaic hominins to modern humans. PLoS Pathog. 2018;14 doi: 10.1371/journal.ppat.1007352. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 138.Hildesheim A., Schiffman M., Bromley C., Wacholder S., Herrero R., Rodriguez A.C., Bratti M.C., Sherman M.E., Scarpidis U., Lin Q.-Q., Terai M., Bromley R.L., Buetow K., Apple R.J., Burk R.D. Human papillomavirus type 16 variants and risk of cervical cancer. JNCI J. Nat. Cancer Instit. 2001;93:315–318. doi: 10.1093/jnci/93.4.315. [DOI] [PubMed] [Google Scholar]
- 139.Pientong C., Wongwarissara P., Ekalaksananan T., Swangphon P., Kleebkaow P., Kongyingyoes B., Siriaunkgul S., Tungsinmunkong K., Suthipintawong C. Association of human papillomavirus type 16 long control region mutation and cervical cancer. Virol. J. 2013;10:30. doi: 10.1186/1743-422X-10-30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 140.Xi L.F., Koutsky L.A., Hildesheim A., Galloway D.A., Wheeler C.M., Winer R.L., Ho J., Kiviat N.B. Risk for high-grade cervical intraepithelial neoplasia associated with variants of human papillomavirus types 16 and 18. Cancer Epidemiol. Biomarkers Prev. 2007;16:4–10. [DOI] [PubMed] [Google Scholar]
- 141.Schiffman M., Rodriguez A.C., Chen Z., Wacholder S., Herrero R., Hildesheim A., Desalle R., Befano B., Yu K., Safaeian M., Sherman M.E., Morales J., Guillen D., Alfaro M., Hutchinson M., Solomon D., Castle P.E., Burk R.D. A population-based prospective study of carcinogenic human papillomavirus variant lineages, viral persistence, and cervical neoplasia. Cancer Res. 2010;70:3159–3169. doi: 10.1158/0008-5472.CAN-09-4179. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 142.Cornet I., Gheit T., Iannacone M.R., Vignat J., Sylla B.S., Del Mistro A., Franceschi S., Tommasino M., Clifford G.M., IARC HPV Variant Study Group. HPV16 genetic variation and the development of cervical cancer worldwide. Br. J. Cancer. 2013;108:240–244. doi: 10.1038/bjc.2012.508. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 143.Gheit T., Cornet I., Clifford G.M., Iftner T., Munk C., Tommasino M., Kjaer S.K. Risks for persistence and progression by human papillomavirus type 16 variant lineages among a population-based sample of Danish women. Cancer Epidemiol. Biomarkers Prev. 2011;20:1315–1321. doi: 10.1158/1055-9965.EPI-10-1187. [DOI] [PubMed] [Google Scholar]
- 144.Zehbe I., Voglino G., Delius H., Wilander E., Tommasino M. Risk of cervical cancer and geographical variations of human papillomavirus 16 E6 polymorphisms. Lancet. 1998;352:1441–1442. doi: 10.1016/S0140-6736(05)61263-9. [DOI] [PubMed] [Google Scholar]
- 145.Zuna R.E., Moore W.E., Shanesmith R.P., Dunn S.T., Wang S.S., Schiffman M., Blakey G.L., Teel T. Association of HPV16 E6 variants with diagnostic severity in cervical cytology samples of 354 women in a US population. Int. J. Cancer. 2009;125:2609–2613. doi: 10.1002/ijc.24706. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 146.Sichero L., Ferreira S., Trottier H., Duarte-Franco E., Ferenczy A., Franco E.L., Villa L.L. High grade cervical lesions are caused preferentially by non-European variants of HPVs 16 and 18. Int. J. Cancer. 2007;120:1763–1768. doi: 10.1002/ijc.22481. [DOI] [PubMed] [Google Scholar]
- 147.Berumen J., Ordoñez R.M., Lazcano E., Salmeron J., Galvan S.C., Estrada R.A., Yunes E., Garcia-Carranca A., Gonzalez-Lira G., Madrigal-de la Campa A. Asian-American variants of human papillomavirus 16 and risk for cervical cancer: a case-control study. JNCI J. Nat. Cancer Instit. 2001;93:1325–1330. doi: 10.1093/jnci/93.17.1325. [DOI] [PubMed] [Google Scholar]
- 148.Freitas L.B., Chen Z., Muqui E.F., Boldrini N.A.T., Miranda A.E., Spano L.C., Burk R.D. Human papillomavirus 16 non-European variants are preferentially associated with high-grade cervical lesions. PLoS One. 2014;9 doi: 10.1371/journal.pone.0100746. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 149.Mirabello L., Yeager M., Cullen M., Boland J.F., Chen Z., Wentzensen N., Zhang X., Yu K., Yang Q., Mitchell J., Roberson D., Bass S., Xiao Y., Burdett L., Raine-Bennett T., Lorey T., Castle P.E., Burk R.D., Schiffman M. HPV16 sublineage associations with histology-specific cancer risk using HPV whole-genome sequences in 3200 women. JNCI J. Nat. Cancer Instit. 2016;108:djw100. doi: 10.1093/jnci/djw100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 150.Burk R.D., Terai M., Gravitt P.E., Brinton L.A., Kurman R.J., Barnes W.A., Greenberg M.D., Hadjimichael O.C., Fu L., McGowan L., Mortel R., Schwartz P.E., Hildesheim A. Distribution of human papillomavirus types 16 and 18 variants in squamous cell carcinomas and adenocarcinomas of the cervix. Cancer Res. 2003;63:7215–7220. [PubMed] [Google Scholar]
- 151.Quint K.D., de Koning M.N.C., van Doorn L.-J., Quint W.G.V., Pirog E.C. HPV genotyping and HPV16 variant analysis in glandular and squamous neoplastic lesions of the uterine cervix. Gynecol. Oncol. 2010;117:297–301. doi: 10.1016/j.ygyno.2010.02.003. [DOI] [PubMed] [Google Scholar]
- 152.Rabelo-Santos S.H., Villa L.L., Derchain S.F., Ferreira S., Sarian L.O.Z., Ângelo-Andrade L.A.L., do Amaral Westin M.C., Zeferino L.C. Variants of human papillomavirus types 16 and 18: histological findings in women referred for atypical glandular cells or adenocarcinoma in situ in cervical smear. Int. J. Gynecol. Pathol. 2006;25:393–397. doi: 10.1097/01.pgp.0000215302.17029.0c. [DOI] [PubMed] [Google Scholar]
- 153.Nicolás-Párraga S., Alemany L., de Sanjosé S., Bosch F.X., Bravo I.G., RIS HPV TT and HPV VVAP Study Groups. Differential HPV16 variant distribution in squamous cell carcinoma, adenocarcinoma and adenosquamous cell carcinoma: HPV16 variants in different cervical cancer histologies. Int. J. Cancer. 2017;140:2092–2100. doi: 10.1002/ijc.30636. [DOI] [PubMed] [Google Scholar]
- 154.De Boer M.A., Peters L.A.W., Aziz M.F., Siregar B., Cornain S., Vrede M.A., Jordanova E.S., Fleuren G.J. Human papillomavirus type 18 variants: histopathology and E6/E7 polymorphisms in three countries. Int. J. Cancer. 2005;114:422–425. doi: 10.1002/ijc.20727. [DOI] [PubMed] [Google Scholar]
- 155.Lizano M., De la Cruz-Hernández E., Carrillo-García A., García-Carrancá A., Ponce de Leon-Rosales S., Dueñas-González A., Hernández-Hernández D.M., Mohar A. Distribution of HPV16 and 18 intratypic variants in normal cytology, intraepithelial lesions, and cervical cancer in a Mexican population. Gynecol. Oncol. 2006;102:230–235. doi: 10.1016/j.ygyno.2005.12.002. [DOI] [PubMed] [Google Scholar]
- 156.Xi L.F., Kiviat N.B., Hildesheim A., Galloway D.A., Wheeler C.M., Ho J., Koutsky L.A. Human papillomavirus type 16 and 18 variants: race-related distribution and persistence. JNCI J. Nat. Cancer Instit. 2006;98:1045–1052. [DOI] [PubMed] [Google Scholar]
- 157.Lopera E.A., Baena A., Florez V., Montiel J., Duque C., Ramirez T., Borrero M., Cordoba C.M., Rojas F., Pareja R., Bedoya A.M., Bedoya G., Sanchez G.I. Unexpected inverse correlation between Native American ancestry and Asian American variants of HPV16 in admixed Colombian cervical cancer cases. Infect. Genet. Evol. 2014;28:339–348. doi: 10.1016/j.meegid.2014.10.014. [DOI] [PubMed] [Google Scholar]
- 158.Junes-Gill K., Sichero L., Maciag P.C., Mello W., Noronha V., Villa L.L. Human papillomavirus type 16 variants in cervical cancer from an admixtured population in Brazil. J. Med. Virol. 2008;80:1639–1645. doi: 10.1002/jmv.21238. [DOI] [PubMed] [Google Scholar]
- 159.Chen Z., Terai M., Fu L., Herrero R., DeSalle R., Burk R.D. Diversifying selection in human papillomavirus type 16 lineages based on complete genome analyses. J. Virol. 2005;79:7014–7023. doi: 10.1128/JVI.79.11.7014-7023.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 160.DeFilippis V.R., Ayala F.J., Villarreal L.P. Evidence of diversifying selection in human papillomavirus type 16 E6 but not E7 oncogenes. J. Mol. Evol. 2002;55:491–499. doi: 10.1007/s00239-002-2344-y. [DOI] [PubMed] [Google Scholar]
- 161.Carvajal-Rodríguez A. Detecting recombination and diversifying selection in human alpha-papillomavirus. Infect. Genet. Evol. 2008;8:689–692. doi: 10.1016/j.meegid.2008.07.002. [DOI] [PubMed] [Google Scholar]
- 162.Lang Kuhs K.A., Faden D.L., Chen L., Smith D.K., Pinheiro M., Wood C.B., Davis S., Yeager M., Boland J.F., Cullen M., Steinberg M., Bass S., Wang X., Liu P., Mehrad M., Tucker T., Lewis J.S., Ferris R.L., Mirabello L. Genetic variation within the human papillomavirus type 16 genome is associated with oropharyngeal cancer prognosis. Ann. Oncol. 2022;33:638–648. doi: 10.1016/j.annonc.2022.03.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 163.Chen A.A., Gheit T., Franceschi S., Tommasino M., Clifford G.M. Human papillomavirus 18 genetic variation and cervical cancer risk worldwide. J. Virol. 2015;89:10680–10687. doi: 10.1128/JVI.01747-15. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 164.Arias-Pulido H., Peyton C.L., Torrez-Martínez N., Anderson D.N., Wheeler C.M. Human papillomavirus type 18 variant lineages in United States populations characterized by sequence analysis of LCR-E6, E2, and L1 regions. Virology. 2005;338:22–34. doi: 10.1016/j.virol.2005.04.022. [DOI] [PubMed] [Google Scholar]
- 165.Chen Z., DeSalle R., Schiffman M., Herrero R., Burk R.D. Evolutionary dynamics of variant genomes of human papillomavirus types 18, 45, and 97. J. Virol. 2009;83:1443–1455. doi: 10.1128/JVI.02068-08. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 166.Xi L.F., Schiffman M., Koutsky L.A., Hulbert A., Lee S.-K., DeFilippis V., Shen Z., Kiviat N.B. Association of human papillomavirus type 31 variants with risk of cervical intraepithelial neoplasia grades 2-3. Int. J. Cancer. 2012;131:2300–2307. doi: 10.1002/ijc.27520. [DOI] [PubMed] [Google Scholar]
- 167.Lou H., Boland J.F., Li H., Burk R., Yeager M., Anderson S.K., Wentzensen N., Schiffman M., Mirabello L., Dean M. HPV16 E7 nucleotide variants found in cancer-free subjects affect E7 protein expression and transformation. Cancers. 2022;14:4895. doi: 10.3390/cancers14194895. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 168.Zhu B., Xiao Y., Yeager M., Clifford G., Wentzensen N., Cullen M., Boland J.F., Bass S., Steinberg M.K., Raine-Bennett T., Lee D., Burk R.D., Pinheiro M., Song L., Dean M., Nelson C.W., Burdett L., Yu K., Roberson D., Lorey T., Franceschi S., Castle P.E., Walker J., Zuna R., Schiffman M., Mirabello L. Mutations in the HPV16 genome induced by APOBEC3 are associated with viral clearance. Nat. Commun. 2020;11:886. doi: 10.1038/s41467-020-14730-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 169.Warren C.J., Van Doorslaer K., Pandey A., Espinosa J.M., Pyeon D. Role of the host restriction factor APOBEC3 on papillomavirus evolution. Virus Evol. 2015;1:vev015. doi: 10.1093/ve/vev015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 170.Warren C.J., Pyeon D. APOBEC3 in papillomavirus restriction, evolution and cancer progression. Oncotarget. 2015;6:39385–39386. doi: 10.18632/oncotarget.6324. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 171.Holmes E.C., Grenfell B.T. Discovering the phylodynamics of RNA viruses. PLoS Comput. Biol. 2009;5 doi: 10.1371/journal.pcbi.1000505. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 172.Lauring A.S. Within-host viral diversity: a window into viral evolution. Annu. Rev. Virol. 2020;7:63–81. doi: 10.1146/annurev-virology-010320-061642. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 173.Grubaugh N.D., Gangavarapu K., Quick J., Matteson N.L., De Jesus J.G., Main B.J., Tan A.L., Paul L.M., Brackney D.E., Grewal S., Gurfield N., Van Rompay K.K.A., Isern S., Michael S.F., Coffey L.L., Loman N.J., Andersen K.G. An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. Genome Biol. 2019;20:8. doi: 10.1186/s13059-018-1618-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 174.Holmes E.C. Oxford University Press; New York: 2009. The Evolution and Emergence of RNA Viruses. [Google Scholar]
- 175.Loeb L.A., Essigmann J.M., Kazazi F., Zhang J., Rose K.D., Mullins J.I. Lethal mutagenesis of HIV with mutagenic nucleoside analogs. Proc. Natl. Acad. Sci. USA. 1999;96:1492–1497. doi: 10.1073/pnas.96.4.1492. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 176.Koh G., Degasperi A., Zou X., Momen S., Nik-Zainal S. Mutational signatures: emerging concepts, caveats and clinical applications. Nat. Rev. Cancer. 2021;21:619–637. doi: 10.1038/s41568-021-00377-7. [DOI] [PubMed] [Google Scholar]
- 177.Alexandrov L.B., Kim J., Haradhvala N.J., Huang M.N., Tian Ng A.W., Wu Y., Boot A., Covington K.R., Gordenin D.A., Bergstrom E.N., Islam S.M.A., Lopez-Bigas N., Klimczak L.J., McPherson J.R., Morganella S., Sabarinathan R., Wheeler D.A., Mustonen V., Getz G., Rozen S.G., Stratton M.R., PCAWG Consortium. The repertoire of mutational signatures in human cancer. Nature. 2020;578:94–101. doi: 10.1038/s41586-020-1943-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 178.Vartanian J.-P., Guétard D., Henry M., Wain-Hobson S. Evidence for editing of human papillomavirus DNA by APOBEC3 in benign and precancerous lesions. Science. 2008;320:230–233. doi: 10.1126/science.1153201. [DOI] [PubMed] [Google Scholar]
- 179.Kidd J.M., Newman T.L., Tuzun E., Kaul R., Eichler E.E. Population stratification of a common APOBEC gene deletion polymorphism. PLoS Genet. 2007;3:e63. doi: 10.1371/journal.pgen.0030063. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 180.Pillai S.K., Wong J.K., Barbour J.D. Turning up the volume on mutational pressure: is more of a good thing always better? (A case study of HIV-1 Vif and APOBEC3) Retrovirology. 2008;5:26. doi: 10.1186/1742-4690-5-26. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 181.Sadler H.A., Stenglein M.D., Harris R.S., Mansky L.M. APOBEC3G contributes to HIV-1 variation through sublethal mutagenesis. J. Virol. 2010;84:7396–7404. doi: 10.1128/JVI.00056-10. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 182.Eigen M., Schuster P. The hypercycle: a principle of natural self-organization. Part A: emergence of the hypercycle. Naturwissenschaften. 1977;64:541–565. doi: 10.1007/BF00450633. [DOI] [PubMed] [Google Scholar]
- 183.Eigen M. On the nature of virus quasispecies. Trends Microbiol. 1996;4:216–218. doi: 10.1016/0966-842X(96)20011-3. [DOI] [PubMed] [Google Scholar]
- 184.Ferber M.J., Thorland E.C., Brink A.A., Rapp A.K., Phillips L.A., McGovern R., Gostout B.S., Cheung T.H., Chung T.K.H., Fu W.Y., Smith D.I. Preferential integration of human papillomavirus type 18 near the c-myc locus in cervical carcinoma. Oncogene. 2003;22:7233–7242. doi: 10.1038/sj.onc.1207006. [DOI] [PubMed] [Google Scholar]
- 185.Bodelon C., Untereiner M.E., Machiela M.J., Vinokurova S., Wentzensen N. Genomic characterization of viral integration sites in HPV-related cancers. Int. J. Cancer. 2016;139:2001–2011. doi: 10.1002/ijc.30243. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 186.Zhou L., Qiu Q., Zhou Q., Li J., Yu M., Li K., Xu L., Ke X., Xu H., Lu B., Wang H., Lu W., Liu P., Lu Y. Long-read sequencing unveils high-resolution HPV integration and its oncogenic progression in cervical cancer. Nat. Commun. 2022;13:2563. doi: 10.1038/s41467-022-30190-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 187.Symer D.E., Akagi K., Geiger H.M., Song Y., Li G., Emde A.-K., Xiao W., Jiang B., Corvelo A., Toussaint N.C., Li J., Agrawal A., Ozer E., El-Naggar A.K., Du Z., Shewale J.B., Stache-Crain B., Zucker M., Robine N., Coombes K.R., Gillison M.L. Diverse tumorigenic consequences of human papillomavirus integration in primary oropharyngeal cancers. Genome Res. 2022;32:55–70. doi: 10.1101/gr.275911.121. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 188. R Core Team. 2018. A Language and Environment for Statistical Computing. https://www.R-project.org/
- 189. Cardone G., Moyer A.L., Cheng N., Thompson C.D., Dvoretzky I., Lowy D.R., Schiller J.T., Steven A.C., Buck C.B., Trus B.L. 2014. Electron Cryo-Microscopy of Human Papillomavirus Type 16 Capsid.
- 190. Cardone G., Moyer A.L., Cheng N., Thompson C.D., Dvoretzky I., Lowy D.R., Schiller J.T., Steven A.C., Buck C.B., Trus B.L. Maturation of the human papillomavirus 16 capsid. mBio. 2014;5:e01104-14. doi: 10.1128/mBio.01104-14.
- 191. wwPDB consortium. Burley S.K., Berman H.M., Bhikadiya C., Bi C., Chen L., Costanzo L.D., Christie C., Duarte J.M., Dutta S., Feng Z., Ghosh S., Goodsell D.S., Green R.K., Guranovic V., Guzenko D., Hudson B.P., Liang Y., Lowe R., Peisach E., Periskova I., Randle C., Rose A., Sekharan M., Shao C., Tao Y.-P., Valasatava Y., Voigt M., Westbrook J., Young J., Zardecki C., Zhuravleva M., Kurisu G., Nakamura H., Kengaku Y., Cho H., Sato J., Kim J.Y., Ikegawa Y., Nakagawa A., Yamashita R., Kudou T., Bekker G.-J., Suzuki H., Iwata T., Yokochi M., Kobayashi N., Fujiwara T., Velankar S., Kleywegt G.J., Anyango S., Armstrong D.R., Berrisford J.M., Conroy M.J., Dana J.M., Deshpande M., Gane P., Gáborová R., Gupta D., Gutmanas A., Koča J., Mak L., Mir S., Mukhopadhyay A., Nadzirin N., Nair S., Patwardhan A., Paysan-Lafosse T., Pravda L., Salih O., Sehnal D., Varadi M., Vařeková R., Markley J.L., Hoch J.C., Romero P.R., Baskaran K., Maziuk D., Ulrich E.L., Wedell J.R., Yao H., Livny M., Ioannidis Y.E. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Res. 2019;47:D520–D528. doi: 10.1093/nar/gky949.
- 192. Nelson C.W., Moncla L.H., Hughes A.L. SNPGenie: estimating evolutionary parameters to detect natural selection using pooled next-generation sequencing data. Bioinformatics. 2015;31:3709–3711. doi: 10.1093/bioinformatics/btv449.
- 193. Nei M., Kumar S. Oxford University Press; New York, NY: 2000. Molecular Evolution and Phylogenetics.
- 194. García-Vallvé S., Alonso Á., Bravo I.G. Papillomaviruses: different genes have different histories. Trends Microbiol. 2005;13:514–521. doi: 10.1016/j.tim.2005.09.003.
Data Availability Statement
All sequence data are cited and publicly available on GenBank.




