iScience. 2021 Mar 26;24(4):102368. doi: 10.1016/j.isci.2021.102368

Landscape of oncoviral genotype and co-infection via human papilloma and hepatitis B viral tumor in situ profiling

Adrian Bubie 1, Fabien Zoulim 5, Barbara Testoni 5, Brett Miles 3, Marshall Posner 4, Augusto Villanueva 1,2, Bojan Losic 1,2,6
PMCID: PMC8050859  PMID: 33889830

Summary

The role of oncoviral genotype and co-infection driving oncogenesis remains unclear. We have developed a scalable, high throughput tool for sensitive and precise oncoviral genotype deconvolution. Using tumor RNA sequencing data, we applied it to 537 virally infected liver, cervical, and head and neck tumors, providing the first comprehensive integrative landscape of tumor-viral gene expression, viral antigen immunogenicity, patient survival, and mutational profiling organized by tumor oncoviral genotype. We find that HBV and HPV genotype and co-infection serve as significant predictors of patient survival and immune activation. Finally, we demonstrate that HPV genotype is more associated with viral oncogene expression than cancer type, implying that expression may be similar across episomal and stochastic integration-based infections. While oncoviral infections are known risk factors for oncogenesis, viral genotype and co-infection are shown to strongly associate with disease progression, patient survival, mutational signatures, and putative tumor neoantigen immunogenicity, facilitating novel clinical associations with infections.

Subject Areas: Genomics, Virology, Cancer

Highlights

  • ViralMine parses oncoviral genotypes and co-infection from in situ tumor data

  • Oncoviral genotyping of TCGA CESC, HNSC, and LIHC cohorts

  • Tumor fitness, immunogenicity, and mutational signatures associate with oncoviral genotype



Introduction

Chronic infections with hepatitis B virus (HBV) and human papillomavirus (HPV) are well-known oncogenic risk factors, with strong viral genotype associations (Castellsagué, 2008; An et al., 2018). Given that the incidence of HBV-related hepatocellular carcinoma (HCC) and of HPV-related head and neck and cervical cancers is rising globally (Vaccarella et al., 2013; Simard et al., 2014; Zhu et al., 2016), scalably exploiting tumor RNA sequencing (RNA-seq) data to accurately infer detailed viral signatures is clinically urgent. Although averaged infection phenotypes such as viral load and predominant genotype have been previously characterized and shown to be strong prognostic factors in cancer development (Schiffman et al., 2007; Schiffman et al., 2009; Cuzick and Wheeler, 2016; Pazgan-Simon et al., 2018; Zapatka et al., 2020), the effects of more granular measures such as exon-level viral expression or the ratio of expressed viral genotypes (co-infection) have not yet been fully mapped out in the host tumor microenvironment. This leaves key facets of these DNA oncoviral infections unknown, creating a clinical blind spot for the development of potential new anti-oncoviral therapeutic options.

Here we present a new in situ tool to comprehensively characterize DNA oncoviruses, which we applied to 1,230 tumor samples spanning liver, cervical, and head and neck cancers. Although HBV and HPV infect highly disparate cancers via different mechanisms, their strong genotype-specific association with oncogenic risk, relatively unknown co-infection rates (Vermeulen et al., 2007; Chaturvedi et al., 2011; Senapati et al., 2017), and disease progression associations can be naturally combined in an integrative, viral genotype-centric study leveraging tumor RNA-seq data. Indeed, our tool, ViralMine, extracts and quantifies viral RNA from high-depth coverage tumor sequencing with high-fidelity sequence recovery, allowing for precise and accurate viral genotype deconvolution and viral exon-level expression analysis (Figure 1A) within the context of the tumor microenvironment. Previous studies have adopted similar negative selection strategies to extract viral sequences from tumor RNA profiling (Tang et al., 2013; Cao et al., 2016; Cancer Genome Atlas Research Network et al., 2017a; Zapatka et al., 2020), whereas our data-driven method goes considerably further by facilitating scalable viral gene expression characterization and deconvolution of complex viral co-infection patterns.

Figure 1. ViralMine workflow and study design

(A) Workflow summary of our tool, ViralMine. RNA sequencing reads unaligned to the human reference are assembled into contigs and aligned against viral reference databases. Virus genotype, tumor co-infection status, and viral expression are quantified using alignment and contig read support information.

(B) Validation and discovery cohorts for viral genotyping. Sample groups included in each analysis indicated.
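The genotype and co-infection quantification step in (A) can be illustrated with a minimal sketch. The contig records, the `min_fraction` cutoff, and the function name `call_genotypes` are hypothetical and are not taken from ViralMine itself; the sketch only shows one way that read support per matched reference genotype could be aggregated into a primary genotype and a co-infection flag.

```python
from collections import defaultdict

def call_genotypes(contig_hits, min_fraction=0.1):
    """Aggregate contig read support per genotype (illustrative only).

    contig_hits: iterable of (genotype, supporting_reads) pairs, one per
    viral contig alignment. min_fraction is a hypothetical cutoff: a
    secondary genotype is reported only if it holds at least this share
    of all viral read support.
    """
    support = defaultdict(int)
    for genotype, reads in contig_hits:
        support[genotype] += reads
    total = sum(support.values())
    if total == 0:
        return {"primary": None, "genotypes": {}, "co_infected": False}
    fractions = {g: n / total for g, n in support.items()}
    # The best-supported genotype is always reported as the primary call.
    primary = max(fractions, key=fractions.get)
    called = {g: f for g, f in fractions.items()
              if f >= min_fraction or g == primary}
    return {"primary": primary, "genotypes": called,
            "co_infected": len(called) > 1}

# Hypothetical read support for three genotypes in one tumor.
hits = [("HPV18", 600), ("HPV45", 350), ("HPV16", 50)]
result = call_genotypes(hits)
print(result["primary"], sorted(result["genotypes"]), result["co_infected"])
```

In this toy input the low-support third genotype falls under the assumed cutoff, so the tumor is reported as a two-genotype co-infection.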

We found that viral genotype is associated with overall patient survival, cancer-specific expression signaling, and tumor immunogenicity in HCC. Similarly, viral genotype was linked to significant expression differences in host oncogenesis and immune response pathways in HPV-related cervical cancer, and stratified patient survival (see Results). We found that HPV co-infection creates a notable increase in average putative neoantigen immunogenicity, indicating a potential emergent property of co-infection within cervical tumors. Finally, a comparison of HPV gene expression in head and neck cancers and cervical cancers revealed significant variation across viral tumor genotype but not cancer type.

Results

ViralMine: in silico cross-cohort genotyping validation and co-infection discovery

In order to validate the key genotyping capability of ViralMine (Figure 1A, see transparent methods), we obtained tumor RNA-seq data for the “core set” of cervical cancers from The Cancer Genome Atlas (TCGA) with previous HPV infection information (n = 178, 169 HPV+) and a group of 50 Chinese patients with HCC screened for HBV infection to ethnicity-match the Asian-dominated TCGA LIHC cohort (GSE65485, n = 50, 44 HBV+) (Figure 1B) (Dong et al., 2015; Cancer Genome Atlas Research Network et al., 2017a), for a total of 213 virally infected samples. We compared the viral genotypes inferred in both cohorts against the existing genotype information, obtaining perfect sensitivity in both datasets, and specificities of 0.97 and 0.95 in the HPV+ cervical cancers and HBV-associated HCC tumors, respectively (Figure S1). These results demonstrate that the overall performance of ViralMine in the cervical TCGA (CESC) core set cohort nearly perfectly matched consensus calls made using a combination of different assays, including MassArray, BioBloom Tools, and PathSeq RNA-Sequencing inference techniques (Cancer Genome Atlas Research Network et al., 2017a). We also observed robust average performance using read downsampling with patient randomization, indicating that even relatively low viral expression is sufficient for accurate genotyping with ViralMine (Figure S9), although we do note that immunotherapy treatments that stimulate viral clearance may have an adverse effect on viral sequence recovery (Figure S10).
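For readers less familiar with these validation metrics, the following minimal sketch shows how sensitivity and specificity are computed from paired reference and inferred calls. The boolean labels are hypothetical toy data, not the validation cohorts, and treating each call as a binary outcome is a simplifying assumption; the exact per-genotype definition used for Figure S1 is not restated here.

```python
def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) for paired boolean calls."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)          # true positives
    fn = sum(1 for r, p in pairs if r and not p)      # false negatives
    tn = sum(1 for r, p in pairs if not r and not p)  # true negatives
    fp = sum(1 for r, p in pairs if not r and p)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference call")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy calls: every true positive recovered, one false positive.
reference = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, True, False, False, False, True]
sens, spec = sensitivity_specificity(reference, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```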

Applying ViralMine to large-scale tumor expression datasets, especially those with missing or incomplete viral information, we exhaustively screened liver cancer patients from the TCGA (LIHC, n = 334) for HBV infection and found 115 positive patients. This included 71 patients not previously reported as HBV+ (Cancer Genome Atlas Research Network, 2017b) and 44 patients who were, with the majority (85/115, 74%) infected with HBV genotype C (Figures 2A and 2B). We also screened a different dataset of 21 patients with HCC for combined HBV integration and genotyping analysis (Figures 1B and 2B; Table S5) (GSE94660, n = 21) and found 11 patients positive for HBV genotype C, 9 positive for HBV genotype B, and 1 patient HBV- (Yoo et al., 2017). Similarly, we analyzed HPV-infected tumors across the entire set of cervical cancers from the TCGA (CESC, n = 304), finding 285 HPV+ (93.7%), with HPV genotypes for the virally infected tumors indicated in Figure 1C. Among the cervical histological subtypes, neither HPV+ adenocarcinomas nor squamous cell carcinomas were found to be correlated with a specific viral genotype (Figure S2A). We also screened cancers from the head and neck TCGA cohort (HNSC, n = 521), finding 73 HPV+ (14%), of which the vast majority (81%) were HPV16, which restricted inter-tumoral genotype comparisons (Figures 2A and 2B).

Figure 2. Cohort and oncoviral infection overview

(A) Viral genotype, expression, and key clinical phenotype and somatic mutation information for the three TCGA HPV and HBV cohorts. Percentage patients with denoted somatic mutation indicated by barplot.

(B) Summary table of viral infections by genotype across cohorts.

Co-infections of HPV genotypes (see transparent methods for details; Figure S11) were found in 92 of the 285 HPV+ cervical cancers (32.3%), with 82 infected with two HPV genotypes (29%) and 10 with three HPV genotypes (3.5%). These co-infection rates are higher than those reported in previous surveys, which considered much smaller cohorts of cervical lesions (Figures 2B and S3A; Table S5) (Vermeulen et al., 2007; Senapati et al., 2017). As with primary HPV infection genotypes, these surprisingly high co-infection rates among CESC samples were not preferentially associated with either adenocarcinoma or squamous cell carcinoma subtypes (Figure S2B). In the TCGA head and neck HPV-associated tumors, 8 of 73 were co-infected (11%) with two HPV genotypes (Figures 2B and S3B). In HBV-related HCCs in the TCGA cohort, 14 tumors were co-infected with more than one HBV genotype (12%), of which 12 had two genotypes and 2 had three concurrent genotypes (Figures 2B and S3C). Although it is technically possible to conflate a recombinant virus with distinct viral genotypes when looking only at RNA data, typical recovered viral contigs (average of 500–1,000 bp) were long enough to suggest that we are quantifying the latter.
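The co-infection rates above are simple proportions of the virally infected tumors in each TCGA cohort. As a small worked example, the sketch below recomputes them from the counts stated in the text; the dictionary layout and function name are illustrative choices only.

```python
# Counts of infected and co-infected tumors as stated in the text.
cohorts = {
    "CESC (HPV)": {"infected": 285, "co_infected": 92},
    "HNSC (HPV)": {"infected": 73, "co_infected": 8},
    "LIHC (HBV)": {"infected": 115, "co_infected": 14},
}

def co_infection_rate(infected, co_infected):
    """Fraction of infected tumors carrying more than one viral genotype."""
    if infected <= 0:
        raise ValueError("infected must be positive")
    return co_infected / infected

rates = {name: co_infection_rate(**counts) for name, counts in cohorts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```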

We also noted a lack of correlation between patient viral load and somatic tumor mutational burden (TMB), and between viral genotype and viral load, across cancer types (Figures 2A and S4), indicating viral genotype classification as an independent readout. However, we found no significant associations between viral genotype and several expression-based immune activity markers for either HPV or HBV, potentially signaling that viral genotype alone does not have a significant effect on tumor immune activity.

HBV genotype C associated with molecular signatures of aggressive HCC

In order to test the hypothesis that particular viral genotypes are associated with unique downstream onco-expression signaling, we performed gene set enrichment analysis (GSEA) of the differentially expressed genes between HBV genotype C- (n = 77) and HBV genotype B- (n = 18) associated HCCs of TCGA. We found HBV genotype C tumors were enriched in pathways involved in cell proliferation, tumor recurrence, and tumor growth, whereas cell survival pathways and liver-specific gene sets were downregulated compared with patients with HBV B tumors (false discovery rate (FDR) <0.0001) (Figure 3A; Table S1). Interestingly, for tumors with HBV co-infection (n = 14), genes in pathways involved in 3′ UTR translation, peptide chain elongation, and miRNA deregulation were significantly downregulated compared with tumors with a single HBV genotype (n = 100; FDR<0.0001) (Figure S5; Table S1), indicating that regulatory instability may increase with multi-genotypic HBV infections in the tumor. Integration analysis using only expressed transcripts among HBV+ genotype C and B HCC patient groups (n = 172) identified HBV C-preferred integration loci in known HCC driver genes CCNE1 and KMT2B (Huang et al., 2012; Cancer Genome Atlas Research Network, 2017b), whereas total average rate of integration among the two tumor groups (1.81 integrations per HBV C tumor, 1.85 per HBV B tumor) was similar (two-tailed t test, p = 0.9) (Figure S6). Finally, although the APOBEC3 pathway has been shown to activate in response to HBV infection in liver malignancies (Vartanian et al., 2010), we did not find any significant differential expression associated with HBV genotype within HBV+ tumors.

Figure 3. HBV genotype affects HCC expression and outcome

(A) Barcode plots indicating gene up- and downregulation in cancer proliferation and survival pathways given HBV genotype. Cancer progression is significantly upregulated and cancer survival is significantly downregulated in HBV C-like patients compared with HBV B patients. Bars along the bottom of the plot indicate individual pathway gene enrichment scores.

(B) Prediction error curves for Cox proportional hazard models of LIHC patients from the TCGA. Survival prediction error for each model, listed in the upper left corner, is calculated across the number of patients at any given survival time and plotted compared with a reference model using no clinical covariates for prediction. Integrated Brier scores (IBS) for each model calculated for survival times encompassing 80% of patient events are listed in the inset table. Significance values comparing each model against the reference curve are indicated next to each score.

Given the genotype-driven molecular differences found above, we sought to determine whether the predictive impact of viral genotype on patient survival was significant. We constructed Cox proportional hazard models in HBV-related HCC from TCGA and, using bootstrap resampling to control overfitting, computed robust time-dependent Brier scores for models using clinical tumor stage, tumor vascular invasion, and tumor HBV genotype. Comparing the resulting survival models (Figure 3B), we found both tumor stage and HBV genotype significantly reduced prediction error against the naive reference model, whereas vascular invasion did not. Survival prediction was additionally improved using a model including both tumor stage and HBV genotype terms, significantly so over the HBV genotype-only model (likelihood ratio test, p < 0.001) but not significantly so over tumor stage alone (p = 0.101). HBV genotype is thus a remarkably strong predictor of overall patient survival.
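The nested-model comparisons above rely on the likelihood ratio test. The sketch below is a minimal illustration of that test for two nested models differing by a single parameter; the log-likelihood values are hypothetical, and the single degree of freedom is an assumption made so that the chi-square tail probability can be written with the standard library alone (a multi-level genotype term would contribute more degrees of freedom).

```python
import math

def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic 2 * (logL_full - logL_reduced) and its
    p-value under a chi-square distribution with 1 degree of freedom,
    whose survival function equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("the full model cannot fit worse than the nested model")
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical partial log-likelihoods, e.g. stage only vs. stage plus one added term.
stat, p_value = likelihood_ratio_test_1df(loglik_reduced=-100.0, loglik_full=-98.0795)
print(f"LR statistic={stat:.3f}, p={p_value:.3f}")
```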

Cervical cancer HPV genotype and co-infection status differentiate survival and possess unique molecular fingerprint

We surveyed differentially expressed host genes between HPV genotype 16- (HPV16, n = 173) and HPV genotype 18- (HPV18, n = 39) infected cervical tumors of TCGA, representing the two most dominant genotypes. Via GSEA, we found that pathways in tumor vasculature and endothelial growth are enriched in HPV16- over HPV18-infected tumors, whereas TNF-signal-regulated apoptosis is downregulated (FDR<0.01) (Figure 4A; Table S1). Similarly, we tested for the effect of co-infection and found that in contrast to single HPV genotype cervical tumors (n = 193), those infected with multiple HPV genotypes (n = 92) are enriched in non-IRF3 antiviral activation of lymphocytes and B cell antigen activation pathways (FDR<0.05) (Figure 4B), suggesting preferential activation of alternative immune regulatory pathways in HPV co-infected tumors. We also carried out viral integration analysis, finding no significant sites of preferential integration across either HPV genotype (Cao et al., 2016) or co-infection status (Figure S7) (Tang et al., 2013; Hu et al., 2015; Cancer Genome Atlas Research Network et al., 2017a). Finally, whereas it was previously shown that APOBEC3 expression (associated with antiviral activity) is upregulated in HPV-associated head and neck cancers but not cervical cancers (Zapatka et al., 2020), we found that APOBEC3 is significantly upregulated in cervical tumors with HPV16 over HPV18, and in tumors with a single HPV infection over those with multiple HPV genotypes, controlling for HPV genotype (HPV18, HPV45) (Figure 4D). Thus the increase in APOBEC activity seen in HPV+ patients (Zapatka et al., 2020) is actually further dependent on HPV genotype and co-infection rather than viral infection status alone. Increased APOBEC3 activation has been shown to be linked with further tumor mutagenesis (Burns et al., 2013), signaling that HPV16 infection may further drive cervical tumorigenesis.

Figure 4. Cervical cancer expression and outcome varies with HPV co-infection and genotype

(A) Barcode plots indicating pathway enrichment of tumor vasculature and tumor endothelial cell growth in patients with HPV16 cervical tumors compared with HPV18-infected tumors. Bars indicate individual pathway gene up- and downregulation.

(B) Cervical cancers with HPV co-infections are upregulated in non-IRF3 antiviral lymphocyte activation and downregulated in B-cell antigen activation, compared with tumors with a single HPV infection genotype.

(C) Prediction error curves for Cox proportional hazard survival models of patients with HPV+ cervical cancer from the TCGA (CESC), comparing the effect of HPV genotype, viral expression, and co-infection status (HPV_Terms) with tumor mutational burden (TMB) as tumor clinical stage (Tumor_Stg_only, TS) is maintained as a covariate. Brier scores for each model at survival times encompassing 80% of patient events are included in the inset table, with significance values comparing each model with the reference curve.

(D) Normalized APOBEC3 expression between cervical tumors with HPV16 and HPV18 (left) infections, and cervical tumors with HPV18 or HPV45 single genotypes against HPV18/45 co-infections (right). Comparison by two-sided Wilcoxon rank-sum test (pgenotype = 1.4 × 10−4; pco-infection = 0.05).

In order to quantify the predictive effect of viral genotype and co-infection status on patient survival, we built Cox proportional hazard models of overall survival in the CESC TCGA cohort, using as predictors clinical tumor stage, HPV molecular terms (tumor HPV genotype, viral expression, and viral co-infection status), and patient TMB. As before, we computed time-dependent Brier scores for each model to compare prediction error (Figure 4C). Although all models slightly but significantly reduced prediction error with respect to the naive reference model, we note that this is driven by the relatively small number of events (deaths) among patients with cervical cancer. However, the addition of HPV molecular terms to clinical tumor stage (TS + HPV_Terms) significantly reduced prediction error compared with tumor stage alone (Tumor_Stg_only), indicating that additional predictive survival power is encoded by HPV phenotype (likelihood ratio test, p = 0.023). Furthermore, we found this genotype-driven model performs on par with a model using tumor stage and patient TMB (TS + TMB) (likelihood ratio test, p = 1) as predictors. Taken together, our results confirm that HPV16 modestly but significantly associates with poorer survival (Hang et al., 2017) and that HPV phenotype is a reasonable predictor of survival, adding significant predictive power beyond tumor staging and tumor mutation burden alone.

Viral genotypes associate with distinct mutational signatures and phenotypes

To further parse the association of overall tumor mutation burden and viral genotypes by specific mutation type, we derived single base-pair substitution (SBS)-based mutational signatures (Alexandrov et al., 2013) for the LIHC, CESC, and HNSC TCGA cohorts by stratifying patients based on tumor HBV genotype and HPV viral clade, respectively. As shown by Alexandrov et al., these signatures are key descriptors of cancer phenotypes and can serve as prognostic and predictive biomarkers (Alexandrov et al., 2013; Trucco et al., 2019). Using their algorithm (see transparent methods), we found signatures linked to tobacco smoking and those of unknown etiology (see Single Base Substitution (SBS) Signatures, Alexandrov et al., 2013) were preferentially enriched in the HBV B-associated liver cancer mutational profile and were absent from the HBV C profile (Figure 5A; average enrichments in Table S2). Both patient groups were enriched in SBS22 and SBS24, linked to carcinogenic aristolochic acid and aflatoxin B1 exposures, respectively, as previously reported enriched across HBV+ LIHC patients (Cancer Genome Atlas Research Network, 2017b). In cervical cancers, however, we found that tumor mutational profiles associated with HPV a9 infections (including genotypes 16, 31, 33, 35, 52, and 58) are enriched in signatures SBS3, SBS9, SBS26, and SBS29, whereas these signatures are absent from the HPV a7 (including genotypes 18, 39, 45, 59, and 68) mutational profile (Figure 5A; Table S2). Signatures SBS3 and SBS26 are both linked to defective DNA damage repair pathways, whereas SBS29 is associated with exposure to chewing tobacco and SBS9 serves as a signature of hypermutation in lymphoid cells. On the other hand, signature SBS40 (of unknown etiology) is enriched only in HPV a7 tumors. Although no head and neck tumors were associated with HPV genotypes in the a7 clade, we found head and neck tumors associated with HPV a9 infections were actually enriched in the majority of the same signatures as the cervical HPV a9 tumors (SBS3, 9, 29 and 36; Figure 5A; Table S2), with exceptions for signatures SBS36, linked to defective base excision repair, and SBS26.

Figure 5. Mutation profile signatures unique to viral genotypes

(A) Table of single base pair substitution (SBS) COSMIC global mutational signatures present in mutational profiles of patients in viral genotype subgroups. Colored boxes indicate presence of the SBS signature within the de novo mutational profiles for viral genotype groups in the labeled columns. Etiologies for each signature are listed to the left.

(B) Signature contributions across representative patient mutational profiles, organized by viral expression in the tumor. Each bar represents one patient of the given viral genotype group, with color fill corresponding to the proportion of the patient's total mutations attributed to the indicated SBS signature.

In order to clearly visualize any trends in mutational signatures scaling with overall viral burden, we selected representative patients spanning the range of total viral expression from each tumor genotype (Figure 5B). We found no association between viral expression and mutational signature enrichment in patients, across any HBV or HPV cohort (pmin > ~0.7), or across patient TMB (pmin > ~0.5) (Figure S8). Our results suggest that viral genotype does indeed delineate specific mutational signatures, regardless of total viral expression or tumor mutation burden, across both HPV-associated cervical and head and neck cancers and HBV-associated HCC liver cancer. Apart from enriching in mutational signatures with well-established functional associations (SBS1, 4, 5, 26), other strong genotype-specific enrichments are for mutational signatures of unknown etiology (SBS17, 40, 46), suggesting potentially new functional viral associations and hypotheses.

HBV genotype and HPV co-infection drive differential tumor immunogenicity

Given the evidence that HBV genotype serves as a significant predictor of patient survival in HCC, we hypothesized that a lower tumor antigen immunogenicity might drive worse outcomes in patients with HBV C-associated HCC. Using HLA types inferred from HBV+ patient tumor RNA-seq in the TCGA LIHC, we estimated the MHC-I binding affinities for a total of 37,222 and 142,539 unique tumor neoantigens across patients with HBV B and HBV C, respectively (Table S6). Comparing predicted neoantigen MHC-I binding affinities reveals a significantly higher binding affinity bias (lower ic50) for HBV B-related tumor neoantigens over peptides from HBV C-related HCCs (p < 0.0001, Figure 6A). Indeed, we found that the ratio of immunogenic tumor mutations (somatic mutations with at least one predicted neoantigen with ic50 < 500 nM) to total tumor mutations is greater among HBV B-infected tumors than HBV C tumors (p < 0.001; Figure 6A, inset), suggesting that there is a higher occurrence of immunogenic mutations in HBV B HCCs, and, consequently, TIL recruitment likelihood by tumor neoantigens from these tumors is greater on average than in HBV C-related tumors.
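The ratio of immunogenic mutations to total mutations described above can be written out as a short sketch. The input layout (a mapping from mutation identifier to the predicted ic50 values of its candidate neoantigens), the function name, and the toy values are hypothetical; only the 500 nM threshold and the definition of the ratio come from the text.

```python
def immunogenic_mutation_ratio(ic50_by_mutation, total_mutations, threshold_nm=500.0):
    """Fraction of a tumor's mutations with at least one strong-binding neoantigen.

    ic50_by_mutation: dict mapping a mutation identifier to the list of
    predicted MHC-I ic50 values (nM) for the neoantigens it generates.
    total_mutations: total somatic mutation count for the tumor.
    """
    if total_mutations <= 0:
        raise ValueError("total_mutations must be positive")
    immunogenic = sum(
        1 for ic50_values in ic50_by_mutation.values()
        if any(value < threshold_nm for value in ic50_values)
    )
    return immunogenic / total_mutations

# Hypothetical tumor: 5 somatic mutations, 3 of which yield predicted peptides.
predictions = {
    "mut1": [1200.0, 430.0],   # one strong binder -> immunogenic
    "mut2": [800.0, 2500.0],   # no strong binder
    "mut3": [35.0],            # strong binder -> immunogenic
}
ratio = immunogenic_mutation_ratio(predictions, total_mutations=5)
print(ratio)
```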

Figure 6. Viral genotype and co-infection modulate tumor neoantigen immunogenicity

(A) Tumor neoantigen MHC-I binding affinity cumulative density functions for LIHC TCGA patients infected with HBV B (solid line) and HBV C (dashed line). Smaller values of ic50 indicate stronger binding affinity of tumor neoantigen with T cell MHC-I complex. Dotted lines labeled at ic50 of 1,000 and 500 demarcate “weak” and “strong” antigen binding affinity thresholds, respectively. Significance by one-sided KS-test indicates cumulative density function of HBV B neoantigens is significantly greater (more immunogenic) than HBV C neoantigens. Inset box-and-violin plot compares the immunogenic mutation burden frequency (IMBF), the ratio of mutations generating at least one strongly immunogenic neoantigen (ic50 < 500) to total mutational burden, between HBV B-associated tumors and HBV C tumors.

(B) Neoantigen MHC-I binding affinity cumulative density functions for CESC TCGA patients with cervical HPV18 (solid line) and HPV45 (long dash line) infection and HPV18/45 (dotted line) co-infection. ic50 binding affinity thresholds marked as above. Significance by one-sided KS-test indicates co-infected tumor neoantigen cumulative density function is significantly greater than either neoantigen binding affinity distribution for tumors with single HPV infections. Inset box-and-violin plot compares IMBF between the three non- and co-infected groups.

In the TCGA CESC cohort, we tested whether HPV co-infection was associated with a modulated effect on tumor antigen immunogenicity. Controlling for HPV genotype, we compared the predicted tumor neoantigen binding affinities of cervical cancers infected with HPV18 and HPV45 and those with co-infections of both HPV18 and HPV45 (totals of 5,108; 13,289; and 37,995 antigens, respectively; Table S7). Neoantigens from tumors with HPV co-infections had greater average binding affinity than those from tumors with either single HPV type alone (HPV18 versus HPV18/45, p < 0.001; HPV45 versus HPV18/45, p < 0.001; Figure 6B), whereas there was no significant difference in binding affinity distributions for singly infected tumors (HPV18 versus HPV45, p = 0.8). The ratio of immunogenic mutations to total TMB was found to be higher in co-infected tumors than in both HPV18 and HPV45 tumors (Figure 6B, inset), although the difference was significant only compared with HPV18 (p = 0.038) and not with HPV45 (p = 0.12), whereas there was no significant difference in ratios between patients with HPV18 and HPV45 (p = 0.45). In summary, we find intriguing preliminary in silico evidence suggesting a differential immunogenicity among tumors infected across HBV genotypes and HPV co-infection.
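The binding affinity distributions are compared with a one-sided two-sample Kolmogorov-Smirnov (KS) test. The sketch below shows the statistic involved: the largest amount by which one empirical cumulative distribution lies above the other. Because lower ic50 means stronger binding, a group whose distribution lies above another has more strong binders. The sample values are hypothetical, and the p-value uses the standard large-sample approximation, which is an assumption made here and is rough for small samples like this toy input.

```python
import bisect
import math

def ks_one_sided(sample_a, sample_b):
    """One-sided two-sample KS test: is the CDF of sample_a above that of sample_b?

    Returns D+ = max_x [F_a(x) - F_b(x)] and the asymptotic p-value
    exp(-2 * D+^2 * n * m / (n + m)).
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    d_plus = 0.0
    for x in a + b:
        f_a = bisect.bisect_right(a, x) / n  # empirical CDF of a at x
        f_b = bisect.bisect_right(b, x) / m  # empirical CDF of b at x
        d_plus = max(d_plus, f_a - f_b)
    p_value = math.exp(-2.0 * d_plus ** 2 * n * m / (n + m))
    return d_plus, p_value

# Hypothetical predicted ic50 values (nM) for two tumor groups.
co_infected = [50.0, 120.0, 300.0, 450.0]
single = [400.0, 900.0, 1500.0, 3000.0]
d_plus, p_value = ks_one_sided(co_infected, single)
print(f"D+={d_plus:.2f}, approximate p={p_value:.3f}")
```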

HPV gene expression is more closely correlated with genotype than with cancer type

In order to examine the effect of cancer cell type in viral expression from the genotype-specific perspective, we compared cervical and head and neck cancer at both the total HPV expression level and across HPV genes by leveraging high-quality viral whole-transcript recovery from ViralMine (Figure 7). Although there was no statistically significant difference in total HPV expression between HPV clades across cervical and head and neck tumors (Figure 7, panel 9), we found normalized HPV gene expression for cervical tumors with HPV infections in the a7 (n = 73) clade is lower across E1, E2, E5, E7, and L1 genes compared with tumors with HPV a9 (n = 205) infections (Figure 7, panels 1, 2, 4, 6, 7; pmax < 0.05). Crucially, HPV gene expression is significantly upregulated in head and neck tumors with a9 clade infections (n = 69) over CESC a7 tumors in the E1, E2, E5, and E7 HPV genes (Figure 7, panels 1, 2, 4, 6; pmax < 0.05), indicating HPV genotype may actually have a stronger effect on the expression in these specific viral genes than cancer cell type. It is important to note that we could not include HNSC tumors with HPV a7 infections in our comparisons as no TCGA HNSC was infected with HPV genotypes in the a7 clade.

Figure 7. HPV gene expression varies across genotype but not cancer type

Box-and-violin plots of RPKM-normalized HPV gene expression (panels 1–8) and total HPV expression in the tumor (panel 9, bottom right) between HPV a7 and a9 clades, across the TCGA cervical and head and neck cancer patients. Significant difference in gene expression between cancer types and HPV clade determined by Wilcoxon rank-sum test.
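For reference, RPKM (reads per kilobase of transcript per million mapped reads) normalization can be sketched as follows. The read count, gene length, and library size are hypothetical values, and whether the denominator counts all mapped reads or only viral reads is an assumption left to the caller, since it is not specified in this caption.

```python
def rpkm(gene_reads, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # Scale by 1e3 (per kilobase) and 1e6 (per million reads) in one factor.
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)

# Hypothetical viral gene: 500 reads, 1,000 bp long, 20 million mapped reads.
value = rpkm(gene_reads=500, gene_length_bp=1000, total_mapped_reads=20_000_000)
print(value)
```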

Discussion

Using multiple publicly available independent datasets across three distinct oncovirally associated cancer types, we have unraveled key correlates of HPV and HBV genotype and co-infection in oncogenic viral and host expression signaling, patient survival, tumor mutational signatures, and immune response. Apart from demonstrating that large-scale, high-fidelity in situ oncoviral profiling is feasible, we build upon previous studies (Burd, 2003; Chan et al., 2004; Di Bisceglie, 2009; Dahlstrom and Sturgis, 2015; Hang et al., 2017) demonstrating associations with cancer risk by showing that genotype and co-infection modalities can also affect facets of tumor evolution.

Revisiting known associations of HBV genotype C with increased risk of HCC (Chan et al., 2004; Liang et al., 2010), we find that HBV genotype is also a significant predictor of overall HCC patient survival in the LIHC, with HBV genotype C driving poor survival. Remarkably, the overall effect of genotype in predicting survival is much stronger than that of tumor vascular invasion or overall tumor mutational burden, and comparable to tumor staging itself. A possible mechanistic hint for the relative lethality of genotype C lies in the significantly lowered burden of immunogenic mutations compared with genotype B. This suggests that HBV genotype B is potentially more likely to induce a host anti-tumoral immune response, whereas HBV genotype C may evade detection and control. Additionally, this may be exacerbated by higher viral load in surrounding normal liver tissue in HBV genotype C-infected patients compared with HBV genotype B-infected patients (Lin et al., 2015; An et al., 2018; Pazgan-Simon et al., 2018) due to lower rates of seroclearance, compounding the effect of the more muted immunogenic tumor peptidome in HBV genotype C with an increase in viral antigen targets in the surrounding tissue. Furthermore, although no etiologically linked mutational signatures were found differentially enriched across HBV genotypes, the unique enrichment of SBS17 in HBV genotype B-associated HCC has previously been correlated with increased immunogenic tumor neoantigen load (Rizvi et al., 2015; Van Hoeck et al., 2019). However, as we found tumor expression signatures for immune markers do not correlate with HBV genotype (Figure S4), this effect may play a relatively minor role in determining tumor immune infiltration. Taken together, a broader picture of preferential immune surveillance and activation in HBV genotype B over HBV genotype C tumors starts to emerge.

Cervical tumors infected with HPV16, and more broadly strains from the a9 clade, were found enriched in tumor growth pathways over those with HPV18 (HPV a7), strongly suggesting these genotypes may be related to more aggressive tumors (Hang et al., 2017). The likely upregulation of APOBEC3 in HPV16-infected tumors furthers the hypothesis of a more mutagenic tumor phenotype (Burns et al., 2013), although we did not find higher TMB among patients with HPV16 relative to other cervical tumors, and new evidence suggests that viral APOBEC3-specific mutations may increase rates of viral clearance and align with benign infections (Zhu et al., 2020). Relative enrichment of tumor hypermutation (SBS9) and defective DNA damage repair (SBS3) signatures in HPV a9-infected tumors provides further suggestive evidence for this association with HPV a9-induced tumors. Unlike HBV in HCC, however, HPV genotype on its own was not found to be a significant survival correlate. Instead, a combination of viral load, viral co-infection status, and HPV genotype significantly improved survival prediction over tumor clinical stage in patients with cervical cancer, illustrating the potential value of a more complex viral signature compared with simple infection presence or consensus genotype. Strikingly, although most cancers result from chronic infection by a single high-risk genotype of HPV (Schiffman et al., 2007; Castellsagué, 2008), we see that nearly a third of the HPV+ cervical cohort has at least two separate genotypes robustly expressed, producing an additive effect on putative tumor immunogenicity by enlarging the tumor-associated peptidome of patients with multiple HPV genotypes, as seen with HPV18/HPV45 co-infected patients. Tracking tumor viral co-infections thus might identify patients who would benefit from immunotherapy or similar treatment, as has been previously investigated with intra-genotype immune-response stratification of HPV16 tumors (Lu et al., 2019). 
Whether any of these novel associations with co-infection are enhanced or diminished by inter-clade viral infections or are duplicated in other HPV-associated cancers such as head and neck requires further investigation.
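As an illustration of what calling co-infection from expressed genotypes can look like, the sketch below classifies a tumor from per-genotype viral read counts. It is a simplified stand-in rather than the ViralMine implementation: the function name `call_infection` and both thresholds (`min_fraction`, `min_reads`) are assumptions chosen for the example, and the read counts are invented.

```python
def call_infection(read_counts, min_fraction=0.10, min_reads=50):
    """Classify a tumor as negative, single-, or co-infected.

    read_counts  : dict mapping genotype name -> number of assigned viral reads
    min_fraction : hypothetical minimum share of viral reads for a genotype to count
    min_reads    : hypothetical minimum absolute read support for a genotype
    """
    total = sum(read_counts.values())
    if total == 0:
        return "negative", []

    # Keep genotypes with enough absolute and relative support, strongest first
    ranked = sorted(read_counts.items(), key=lambda kv: kv[1], reverse=True)
    expressed = [
        genotype
        for genotype, n in ranked
        if n >= min_reads and n / total >= min_fraction
    ]

    if not expressed:
        status = "negative"
    elif len(expressed) == 1:
        status = "single"
    else:
        status = "co-infected"
    return status, expressed


# Invented counts for illustration only: two well-supported genotypes plus noise
status, genotypes = call_infection({"HPV18": 800, "HPV45": 200, "HPV16": 3})
```

Requiring both an absolute and a relative threshold is one simple way to avoid calling co-infection from a handful of cross-mapping reads; the appropriate values would need to be calibrated against sequencing depth and reference similarity.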

We attempted to address how these fine-grained properties of HPV vary across two very different tumor types by screening the HNSC TCGA cohort for HPV presence and genotype, but found only 14% (73 patients) to be HPV+, of which all but one belonged to the a9 clade. Mutational signatures linked to DNA damage repair and hypermutation (SBS3, SBS9) that were enriched in HPV a9 cervical cancers were likewise enriched in HPV a9 head and neck cancers, whereas SBS40, which was present in HPV a7 cervical cancers, was similarly absent. This indicates that these phenotypes may be related to HPV infection, and specifically to infections of the a9 clade, even across cell types. Furthermore, we noted that whereas overall HPV expression was not significantly different across cervical and head and neck cancers, significant differences in expression were observed between viral clades at the viral gene level, especially in the oncovirally linked genes E7 and E2. As cervical tumors typically have higher reported viral integration rates than head and neck tumors (Agoston et al., 2010; Tang et al., 2013; McBride and Warburton, 2017; Nulton et al., 2018), these differences may partially be explained by rates of HPV integration, which has been shown to decrease E2 and E5 expression. However, as gene expression profiles within the same HPV clade but across different cancer types do not significantly vary, this further suggests that HPV genotype plays an outsized role in defining virally linked molecular signatures across disparate cell types.
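A per-gene comparison of viral expression between two clades can be tested in several ways. The sketch below uses an exact two-sided permutation test on the difference in group means, which is a generic technique chosen here for illustration and is not claimed to be the test used in this study. The function name `exact_permutation_pvalue` and the expression values are hypothetical.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small sample sizes.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        # Small tolerance guards against floating-point ties
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_splits


# Invented expression values for one viral gene in two hypothetical clade groups
p_value = exact_permutation_pvalue([1.0, 2.0, 3.0, 4.0], [10.0, 11.0, 12.0, 13.0])
```

For cohort-sized groups, full enumeration becomes infeasible and a random sample of permutations, or a rank-based test with multiple-testing correction across viral genes, would be the more usual choice.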

We present a detailed view of key cancer types through the lens of their oncoviral cofactors by applying a scalable, highly accurate method for DNA viral gene expression profiling, genotyping, and co-infection typing to in situ tumor RNA-seq data. By applying our method to large publicly available datasets, we demonstrate that our initial analyses are sufficiently powered to reveal novel genotype-, co-infection-, and viral gene expression-specific signals, which we hope can be built upon to help inform the next generation of clinical guidelines for assessing and treating not only HPV- and HBV-related cancers but also tumors related to other DNA viruses (e.g., Epstein-Barr virus in Burkitt lymphoma).

Limitations of the study

A key caveat to these conclusions in HBV-related HCC is that about 90% of patients in our HCC liver cohorts are of Asian descent, meaning that HBV genotypes B and C predominate. This necessitates a broader analysis with a more diverse patient group to confirm whether the trends we observe hold across other HBV genotypes. We were additionally limited in validating our HPV and HBV co-infection calls among patient tumors, as there is no readily available RNA-seq data from a virally co-infected tumor dataset for us to compare against; we thus relied on statistical inference paired with historical demographic studies on co-infection frequency.

Resource availability

Lead contact

For questions and correspondence, please contact Dr. Bojan Losic (bojan.losic@mssm.edu).

Materials availability

This study did not generate new unique reagents or materials.

Data and code availability

RNA sequencing data are available through GDAC and GEO as referenced in the methods. ViralMine and associated viral reference databases are available for download (https://github.com/LosicLab/ViralMine). Aggregated data tables used to produce figures and analysis are included in this article under supplemental information. Additional analysis code is available upon reasonable request to the corresponding author.

Methods

All methods can be found in the accompanying Transparent Methods supplemental file.

Acknowledgments

This work was supported in part through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai.

Author contribution

A.B., A.V., and B.L. wrote the manuscript. M.P., B.M., B.T., A.V., and F.Z. consulted and edited the text. A.B. and B.L. generated the figures and performed the analysis. B.L. supervised the analysis.

Declaration of interests

The authors declare no competing interests.

Published: April 23, 2021

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2021.102368.

Supplemental information

Document S1. Transparent methods and Figures S1–S11
mmc1.pdf (3.1MB, pdf)
Document S2. Tables S1–S7
mmc2.zip (20.4MB, zip)

References

  1. Agoston E.S., Robinson S.J., Mehra K.K., Birch C., Semmel D., Mirkovic J., Haddad R.I., Posner M.R., Kindelberger D., Krane J.F. Polymerase chain reaction detection of HPV in squamous carcinoma of the oropharynx. Am. J. Clin. Pathol. 2010;134:36–41. doi: 10.1309/AJCP1AAWXE5JJCLZ.
  2. Alexandrov L.B., Nik-Zainal S., Wedge D.C., Aparicio S.A., Behjati S., Biankin A.V., Bignell G.R., Bolli N., Borg A., Børresen-Dale A.L. Signatures of mutational processes in human cancer. Nature. 2013;500:415–421. doi: 10.1038/nature12477.
  3. An P., Xu J., Yu Y., Winkler C.A. Host and viral genetic variation in HBV-related hepatocellular carcinoma. Front. Genet. 2018;9:261. doi: 10.3389/fgene.2018.00261.
  4. Di Bisceglie A.M. Hepatitis B and hepatocellular carcinoma. Hepatology. 2009:S56–S60. doi: 10.1002/hep.22962.
  5. Burd E.M. Human papillomavirus and cervical cancer. Clin. Microbiol. Rev. 2003;16:1–17. doi: 10.1128/CMR.16.1.1-17.2003.
  6. Burns M.B., Temiz N.A., Harris R.S. Evidence for APOBEC3B mutagenesis in multiple human cancers. Nat. Genet. 2013;45:977–983. doi: 10.1038/ng.2701.
  7. Cancer Genome Atlas Research Network. Integrated genomic and molecular characterization of cervical cancer. Nature. 2017;543:378–384. doi: 10.1038/nature21386.
  8. Cancer Genome Atlas Research Network. Comprehensive and integrative genomic characterization of hepatocellular carcinoma. Cell. 2017;169:1327–1341.e23. doi: 10.1016/j.cell.2017.05.046.
  9. Cao S., Wendl M.C., Wyczalkowski M.A., Wylie K., Ye K., Jayasinghe R., Xie M., Wu S., Niu B., Grubb R. Divergent viral presentation among human tumors and adjacent normal tissues. Sci. Rep. 2016;6:28294. doi: 10.1038/srep28294.
  10. Castellsagué X. Natural history and epidemiology of HPV infection and cervical cancer. Gynecol. Oncol. 2008;110:S4–S7. doi: 10.1016/j.ygyno.2008.07.045.
  11. Chan H.L., Hui A.Y., Wong M.L., Tse A.M., Hung L.C., Wong V.W., Sung J.J. Genotype C hepatitis B virus infection is associated with an increased risk of hepatocellular carcinoma. Gut. 2004;53:1494–1498. doi: 10.1136/gut.2003.033324.
  12. Chaturvedi A.K., Katki H.A., Hildesheim A., Rodríguez A.C., Quint W., Schiffman M., Van Doorn L.J., Porras C., Wacholder S., Gonzalez P. Human papillomavirus infection with multiple types: pattern of coinfection and risk of cervical disease. J. Infect. Dis. 2011;203:910–920. doi: 10.1093/infdis/jiq139.
  13. Cuzick J., Wheeler C. Need for expanded HPV genotyping for cervical screening. Papillomavirus Res. 2016:112–115.
  14. Dahlstrom K.R., Sturgis E.M. Epidemiology of oral HPV infection and HPV-associated head and neck cancer. HPV and Head and Neck Cancers. 2015:13–39. doi: 10.1007/978-81-322-2413-6_2.
  15. Dong H., Zhang L., Qian Z., Zhu X., Zhu G., Chen Y., Xie X., Ye Q., Zang J., Ren Z., Ji Q. Identification of HBV-MLL4 integration and its molecular basis in Chinese hepatocellular carcinoma. PLoS One. 2015;10:e0123175. doi: 10.1371/journal.pone.0123175.
  16. Hang D., Jia M., Ma H., Zhou J., Feng X., Lyu Z., Yin J., Cui H., Yin Y., Jin G. Independent prognostic role of human papillomavirus genotype in cervical cancer. BMC Infect. Dis. 2017;17:391. doi: 10.1186/s12879-017-2465-y.
  17. Huang J., Deng Q., Wang Q., Li K.Y., Dai J.H., Li N., Zhu Z.D., Zhou B., Liu X.Y., Liu R.F. Exome sequencing of hepatitis B virus–associated hepatocellular carcinoma. Nat. Genet. 2012;44:1117–1121. doi: 10.1038/ng.2391.
  18. Hu Z., Zhu D., Wang W., Li W., Jia W., Zeng X., Ding W., Yu L., Wang X., Wang L. Genome-wide profiling of HPV integration in cervical cancer identifies clustered genomic hot spots and a potential microhomology-mediated integration mechanism. Nat. Genet. 2015;47:158–163. doi: 10.1038/ng.3178.
  19. Liang T.J., Mok K.T., Liu S.I., Huang S.F., Chou N.H., Tsai C.C., Chen I.S., Yeh M.H., Chen Y.C., Wang B.W. Hepatitis B genotype C correlated with poor surgical outcomes for hepatocellular carcinoma. J. Am. Coll. Surg. 2010;211:580–586. doi: 10.1016/j.jamcollsurg.2010.06.020.
  20. Lin C.L., Kao J.H. Hepatitis B virus genotypes and variants. Cold Spring Harb. Perspect. Med. 2015;5:a021436. doi: 10.1101/cshperspect.a021436.
  21. Lu X., Jiang L., Zhang L., Zhu Y., Hu W., Wang J., Ruan X., Xu Z., Meng X., Gao J. Immune signature-based subtypes of cervical squamous cell carcinoma tightly associated with human papillomavirus type 16 expression, molecular features, and clinical outcome. Neoplasia. 2019;21:591–601. doi: 10.1016/j.neo.2019.04.003.
  22. McBride A.A., Warburton A. The role of integration in oncogenic progression of HPV-associated cancers. PLoS Pathog. 2017;13:e1006211. doi: 10.1371/journal.ppat.1006211.
  23. Nulton T.J., Kim N.K., DiNardo L.J., Morgan I.M., Windle B. Patients with integrated HPV16 in head and neck cancer show poor survival. Oral Oncol. 2018;80:52–55. doi: 10.1016/j.oraloncology.2018.03.015.
  24. Pazgan-Simon M., Simon K.A., Jarowicz E., Rotter K., Szymanek-Pasternak A., Zuwała-Jagiełło J. Hepatitis B virus treatment in hepatocellular carcinoma patients prolongs survival and reduces the risk of cancer recurrence. Clin. Exp. Hepatol. 2018;4:210–216. doi: 10.5114/ceh.2018.78127.
  25. Rizvi N.A., Hellmann M.D., Snyder A., Kvistborg P., Makarov V., Havel J.J., Lee W., Yuan J., Wong P., Ho T.S. Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science. 2015;348:124–128. doi: 10.1126/science.aaa1348.
  26. Schiffman M., Castle P.E., Jeronimo J., Rodriguez A.C., Wacholder S. Human papillomavirus and cervical cancer. Lancet. 2007;370:890–907. doi: 10.1016/s0140-6736(07)61416-0.
  27. Schiffman M., Clifford G., Buonaguro F.M. Classification of weakly carcinogenic human papillomavirus types: addressing the limits of epidemiology at the borderline. Infect. Agents Cancer. 2009;4:8. doi: 10.1186/1750-9378-4-8.
  28. Senapati R., Nayak B., Kar S.K., Dwibedi B. HPV genotypes co-infections associated with cervical carcinoma: special focus on phylogenetically related and non-vaccine targeted genotypes. PLoS One. 2017;12:e0187844. doi: 10.1371/journal.pone.0187844.
  29. Simard E.P., Torre L.A., Jemal A. International trends in head and neck cancer incidence rates: differences by country, sex and anatomic site. Oral Oncol. 2014;50:387–403. doi: 10.1016/j.oraloncology.2014.01.016.
  30. Tang K.W., Alaei-Mahabadi B., Samuelsson T., Lindh M., Larsson E. The landscape of viral expression and host gene fusion and adaptation in human cancer. Nat. Commun. 2013;4:2513. doi: 10.1038/ncomms3513.
  31. Trucco L.D., Mundra P.A., Hogan K., Garcia-Martinez P., Viros A., Mandal A.K., Macagno N., Gaudy-Marqueste C., Allan D., Baenke F. Ultraviolet radiation-induced DNA damage is prognostic for outcome in melanoma. Nat. Med. 2019;25:221–224. doi: 10.1038/s41591-018-0265-6.
  32. Vaccarella S., Lortet-Tieulent J., Plummer M., Franceschi S., Bray F. Worldwide trends in cervical cancer incidence: impact of screening against changes in disease risk factors. Eur. J. Cancer. 2013;49:3262–3273. doi: 10.1016/j.ejca.2013.04.024.
  33. Van Hoeck A., Tjoonk N., van Boxtel R., Cuppen E. Portrait of a cancer: mutational signature analyses for cancer diagnostics. BMC Cancer. 2019;19:457. doi: 10.1186/s12885-019-5677-2.
  34. Vartanian J.P., Henry M., Marchio A., Suspène R., Aynaud M.M., Guétard D., Cervantes-Gonzalez M., Battiston C., Mazzaferro V., Pineau P. Massive APOBEC3 editing of hepatitis B viral DNA in cirrhosis. PLoS Pathog. 2010;6:e1000928. doi: 10.1371/journal.ppat.1000928.
  35. Vermeulen C.F., Jordanova E.S., Szuhai K., Kolkman-Uljee S., Vrede M.A., Peters A.A., Schuuring E., Fleuren G.J. Physical status of multiple human papillomavirus genotypes in flow-sorted cervical cancer cells. Cancer Genet. Cytogenet. 2007;175:132–137. doi: 10.1016/j.cancergencyto.2007.02.009.
  36. Yoo S., Wang W., Wang Q., Fiel M.I., Lee E., Hiotis S.P., Zhu J. A pilot systematic genomic comparison of recurrence risks of hepatitis B virus-associated hepatocellular carcinoma with low- and high-degree liver fibrosis. BMC Med. 2017;15:214. doi: 10.1186/s12916-017-0973-7.
  37. Zapatka M., Borozan I., Brewer D.S., Iskar M., Grundhoff A., Alawi M., Desai N., Sültmann H., Moch H. The landscape of viral associations in human cancers. Nat. Genet. 2020;52:320–330. doi: 10.1038/s41588-019-0558-9.
  38. Zhu B., Xiao Y., Yeager M., Clifford G., Wentzensen N., Cullen M., Boland J.F., Bass S., Steinberg M.K., Raine-Bennett T. Mutations in the HPV16 genome induced by APOBEC3 are associated with viral clearance. Nat. Commun. 2020;11:886. doi: 10.1038/s41467-020-14730-1.
  39. Zhu R.X., Seto W.K., Lai C.L., Yuen M.F. Epidemiology of hepatocellular carcinoma in the Asia-Pacific region. Gut Liver. 2016;10:332–339. doi: 10.5009/gnl15257.

Articles from iScience are provided here courtesy of Elsevier