Dear Editor,
Human papillomavirus (HPV), hepatitis B virus (HBV) and Epstein–Barr virus (EBV) are the three most oncogenic DNA viruses, contributing to 15 different types of cancer. 1 Although these viruses differ in many aspects, one common key step is the integration of their DNA into the human genome, which could potentially promote carcinogenesis. 2 , 3 , 4 In this study, we developed and applied a novel pipeline (Figures S1–S8, Supplementary Notes 1–3 and Table S1), named viral integration pathway analysis (VIPA), to elucidate the integration mechanism shared by HPV, HBV and EBV, with the aim of gaining a deeper understanding of virus‐induced carcinogenesis and the corresponding anticancer therapies.
First, we conducted HPV capture sequencing and identified 1002 HPV integration breakpoints in 24.8% (225/910) of non‐cancer HPV infection samples, 588 breakpoints in 38.0% (125/329) of cervical precancer samples and 1597 breakpoints in 69.0% (158/227) of cancer samples (Figure 1A). Overall, 34.7% (508/1466) of samples carried integrations, with an average of 6.27 integration breakpoints per integration‐positive sample. We observed 24 recurrent integration hotspots (integration positions located within 500 kb downstream/upstream of the gene, n ≥ 5) in our dataset (Figure 1A). Among them, 10 integration hotspots had been previously reported, and 14 HPV integration hotspot genes were newly identified (Table S2).
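The hotspot definition above (breakpoints within 500 kb of a gene, n ≥ 5) can be illustrated with a minimal sketch. This is not the VIPA implementation: the function name `call_hotspots`, the gene names and all coordinates are hypothetical, and the sketch assumes that n counts breakpoints rather than samples.

```python
from collections import Counter

WINDOW = 500_000  # 500 kb flank on either side of the gene, as in the text
MIN_COUNT = 5     # n >= 5; assumed here to count breakpoints, not samples


def call_hotspots(breakpoints, genes, window=WINDOW, min_count=MIN_COUNT):
    """Return {gene: count} for genes with at least min_count nearby breakpoints.

    breakpoints: iterable of (chrom, position) tuples
    genes: dict mapping gene name -> (chrom, start, end)
    """
    counts = Counter()
    for chrom, pos in breakpoints:
        for name, (g_chrom, start, end) in genes.items():
            # A breakpoint counts if it lies in the gene body or its flanks.
            if chrom == g_chrom and start - window <= pos <= end + window:
                counts[name] += 1
    return {name: n for name, n in counts.items() if n >= min_count}


# Hypothetical toy annotation and breakpoints, for illustration only.
toy_genes = {
    "GENE_A": ("chr1", 1_000_000, 1_050_000),
    "GENE_B": ("chr2", 5_000_000, 5_100_000),
}
toy_breakpoints = [
    ("chr1", 600_000), ("chr1", 990_000), ("chr1", 1_020_000),
    ("chr1", 1_500_000), ("chr1", 1_549_999),
    ("chr1", 1_600_000),  # outside the 500-kb flank of GENE_A
    ("chr2", 5_050_000),
    ("chr2", 4_400_000),  # outside the 500-kb flank of GENE_B
]
toy_hotspots = call_hotspots(toy_breakpoints, toy_genes)
```

With these toy inputs only GENE_A reaches the threshold. A real analysis would use an interval index rather than the nested loop shown here.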
Next, we found that the distributions of HPV integration strains and integration status differed among non‐cancer HPV infection, cervical precancer and cancer samples (Figure 1B,C). Specifically, the HPV16 integration percentage was only 10% (ranked third) in non‐cancer samples but increased to 33.4% (ranked first) in precancer and 55.5% (ranked first) in cancer samples. The HPV18 integration percentage was only 3.1% in non‐cancer samples and 5.8% in precancer samples, and rose to 7.9% (ranked second) in cancer samples.
The average number of integration events was 4.4 for non‐cancer infection, 4.7 for cervical precancer and 10.1 for cancer samples, indicating that HPV integration increased with disease progression (non‐cancer vs. precancer, p = .011; precancer vs. cancer, p < .0001; Wilcoxon test, false discovery rate corrected) and may serve as an early warning biomarker of carcinogenesis (Figure 1C). When the number of integration events was used to predict clinical outcomes, it distinguished high‐grade squamous intraepithelial lesion (HSIL)± (including HSIL and cancer) with an AUC of .722. HPV16 showed the best prediction performance for HSIL±, with an AUC of .859, and HPV18 showed comparable performance, with an AUC of .819 (Figure 1D).
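As a minimal sketch of how an AUC can be obtained from a single per‐sample score such as the integration event count, the AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name `auc_from_scores` and the counts below are hypothetical and are not the study data or the authors' code.

```python
def auc_from_scores(positives, negatives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    if not positives or not negatives:
        raise ValueError("both groups must contain at least one sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical integration event counts per sample, for illustration only.
hsil_or_worse_counts = [12, 7, 3, 9]
below_hsil_counts = [1, 3, 2, 5]
example_auc = auc_from_scores(hsil_or_worse_counts, below_hsil_counts)
```

The pairwise loop is quadratic in the number of samples. For large cohorts a rank‐based computation gives the same value more efficiently.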
To find common integration features among HPV, HBV and EBV, we then collected capture sequencing data for the three viruses. In total, we detected 4390 integration breakpoints for HPV, 4010 for HBV and 174 for EBV (Tables S3–S5). Intriguingly, 21 integration genes were shared by all three viruses (Table S6), suggesting potential roles for these genomic loci in oncogenic virus‐related cancers.
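Finding genes shared by all three viruses amounts to intersecting the per‐virus sets of integration genes. The sketch below assumes each virus has already been reduced to a set of gene symbols. The helper `shared_genes` and the placeholder gene names are hypothetical and do not reproduce Table S6.

```python
def shared_genes(*gene_sets):
    """Return the genes present in every supplied collection."""
    if not gene_sets:
        return set()
    return set.intersection(*(set(genes) for genes in gene_sets))


# Placeholder gene symbols, for illustration only.
hpv_genes = {"GENE_A", "GENE_B", "GENE_C"}
hbv_genes = {"GENE_B", "GENE_C", "GENE_D"}
ebv_genes = {"GENE_B", "GENE_C", "GENE_E"}
common = shared_genes(hpv_genes, hbv_genes, ebv_genes)
```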
Next, we explored the viral integration patterns using identified human–viral junctional sequences (defined by ≥30‐bp human and viral sequences at the integration sites) from expanded integration datasets (Table S7 and Supplementary Notes 4 and 5). Previous studies have indicated that integration of the three viruses is mediated by microhomology (MH) 4 , 5 , 6 , 7 (Figure S9). However, it is not clear how lateral microhomologies (defined as microhomologies located a short distance from the junction sites) mediate the integration process (Figure 2A–C). Inspired by recent insights into alternative end‐joining, 8 , 9 we speculated that the synthesis‐dependent end‐joining (SD‐EJ) pathway may participate in the integration process and generate multiple types of breakpoints (Figure S10), including apparent blunt joins (Figure 2A), short insertions (Figure 2B) and junctional microhomologies (Figure 2C). We validated the integration structures using nanopore sequencing of Ca Ski DNA and Sanger sequencing of Ca Ski, HepG2.2.15 and Raji DNA (Figures S11 and S12).
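To make the notion of junctional microhomology concrete, the sketch below measures the number of bases at a junction that match both the human and the viral reference, so that the breakpoint could be placed anywhere within that stretch. It covers only junctional MH and does not model lateral microhomologies or the SD‐EJ simulation described in Figure S13. The function name, the coordinate convention (the junction read is `human[:h] + virus[v:]`) and the sequences are assumptions made for illustration.

```python
def junctional_microhomology(human, h, virus, v):
    """Length of the shared stretch at a human-virus junction.

    The junction read is assumed to be human[:h] + virus[v:]. Bases are
    compared outward from the breakpoint in both reference sequences.
    """
    # Extend left: human bases before the junction that also precede
    # the viral breakpoint in the viral reference.
    left = 0
    while left < h and left < v and human[h - 1 - left] == virus[v - 1 - left]:
        left += 1
    # Extend right: viral bases after the junction that also follow
    # the human breakpoint in the human reference.
    right = 0
    while (h + right < len(human) and v + right < len(virus)
           and human[h + right] == virus[v + right]):
        right += 1
    return left + right


# Toy sequences, for illustration only.
toy_human = "AAACGTTT"   # contributes "AAACG" to the junction read
toy_virus = "GGCGATCC"   # contributes "ATCC" to the junction read
toy_mh = junctional_microhomology(toy_human, 5, toy_virus, 4)
```

In this toy case the dinucleotide "CG" is present on both sides, so the junction carries a 2‐bp microhomology. A result of 0 corresponds to an apparent blunt join under this simplified definition.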
We analysed the roles of SD‐EJ using computational simulation (Figure S13) in 4341 human–HPV junctional sequences (Table S3), 4010 human–HBV junctional sequences (Table S4) and 169 human–EBV junctional sequences (Table S5). We found that SD‐EJ was significantly enriched for all three viruses (Figure 3A).
Then, the repair models and products of SD‐EJ were further analysed (Figure 3B). The loop‐out model accounted for 47.9%–61.4% of events (HPV: 61.4%; HBV: 57.7% and EBV: 47.9%), whereas snap‐backs accounted for 38.8%–52.1% (HPV: 38.8%; HBV: 42.3% and EBV: 52.1%). Among repair products, junctional MH was the major type, accounting for 89.5% of HPV, 91.3% of HBV and 88.1% of EBV SD‐EJ integration events, followed by apparent blunt joins (HPV: 8.4%; HBV: 7.9% and EBV: 10.4%) and short insertions (HPV: 2.0%; HBV: .8% and EBV: 1.5%). The occurrence of junctional MH was significantly higher in the observed group than in the expected group (Figure 3C, Supplementary Note 6). Conversely, the occurrence of apparent blunt joins was significantly lower in the observed group than in the expected group. Of note, short insertions were significantly enriched in the HPV and HBV datasets, whereas no significant difference in short insertions was found between the observed and expected groups for EBV (n = 1 vs. n = .14, p = 1, Fisher's exact test), likely owing to the relatively small dataset (Figure 3C, Supplementary Note 6).
Finally, we classified the integration pathway of each dsDNA virus breakpoint into one of three categories, assigned in order: (i) the SD‐EJ pathway, for breakpoints with SD‐EJ structures; (ii) other alt‐EJ pathways, for breakpoints with microhomology overhangs; and (iii) the NHEJ pathway, for breakpoints without either of the previous two signatures (Figure 3D). With a 10‐bp flanking length, the percentages of the SD‐EJ pathway were 59.11% for HPV, 65.04% for HBV and 48.38% for EBV, whereas those of unclassified NHEJ were 37.15% for HPV, 28.29% for HBV and 48.55% for EBV (Figure 3E). These data suggest that the SD‐EJ repair pathway may play an important role in the integration of the three viruses into the human genome.
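The ordered three‐way classification can be written as a simple decision rule. The sketch below assumes that each breakpoint has already been annotated with two boolean signatures. How those signatures are detected is not shown, and the function and variable names are hypothetical. The second helper derives the share left for other alt‐EJ pathways from the reported percentages, under the assumption that the three categories are exhaustive.

```python
def classify_breakpoint(has_sdej_structure, has_mh_overhang):
    """Assign a pathway in priority order: SD-EJ, then other alt-EJ, then NHEJ."""
    if has_sdej_structure:
        return "SD-EJ"
    if has_mh_overhang:
        return "other alt-EJ"
    return "NHEJ"


# Reported percentages at 10-bp flanking length: (SD-EJ, unclassified NHEJ).
REPORTED = {
    "HPV": (59.11, 37.15),
    "HBV": (65.04, 28.29),
    "EBV": (48.38, 48.55),
}


def implied_other_alt_ej(sdej_pct, nhej_pct):
    """Remainder attributed to other alt-EJ, assuming the categories sum to 100%."""
    return round(100.0 - sdej_pct - nhej_pct, 2)


implied = {virus: implied_other_alt_ej(*pcts) for virus, pcts in REPORTED.items()}
```

Because SD‐EJ is tested first, a breakpoint carrying both signatures is counted as SD‐EJ, which mirrors the order given in the text.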
Together, we report the largest genome‐wide landscape of HPV, HBV and EBV insertional mutagenesis. We found that HPV, HBV and EBV share a common SD‐EJ integration mechanism. Based on the integration patterns we identified and the biological features of the three viruses, we propose a new model of the integration process of HPV, HBV and EBV (Figure 4), providing insights into virus‐induced cancer.
FUNDING INFORMATION
This work was supported by the National Science and Technology Major Project of the Ministry of Science and Technology of China (Grant no. 2018ZX10301402); the National Natural Science Foundation of China (Grant nos. 32171465 and 82102392); the General Program of the Natural Science Foundation of Guangdong Province of China (Grant no. 2021A1515012438); the National Postdoctoral Program for Innovative Talent (Grant no. BX20200398); the China Postdoctoral Science Foundation (Grant no. 2020M672995); the Guangdong Basic and Applied Basic Research Foundation (Grant no. 2020A1515110170); the Major Projects of the Wuhan Municipal Health Commission (Grant no. WX19M02); and the National Ten Thousand Plan‐Young Top Talents of China.
CONFLICT OF INTEREST
The authors declare that they have no competing interests.
ACKNOWLEDGEMENTS
We thank the Tianhe Supercomputer Center for computational support and GeneRulor for probe design and part of the experimental work.
Rui Tian, Yuyan Wang, Weiping Li, Zifeng Cui and Ting Pan contributed equally to this work.
Contributor Information
Yiqin Lu, Email: k_lyq@sina.com.
Xun Tian, Email: tianxun@zxhospital.com.
Zheng Hu, Email: huzheng1998@163.com.
REFERENCES
- 1. Oh JK, Weiderpass E. Infection and cancer: global distribution and burden of diseases. Ann Glob Health. 2014;80:384‐392.
- 2. Akagi K, Li J, Broutian TR, et al. Genome‐wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability. Genome Res. 2014;24:185‐199.
- 3. Sung WK, Zheng H, Li S, et al. Genome‐wide survey of recurrent HBV integration in hepatocellular carcinoma. Nat Genet. 2012;44:765‐769.
- 4. Xu M, Zhang WL, Zhu Q, et al. Genome‐wide profiling of Epstein‐Barr virus integration by targeted sequencing in Epstein‐Barr virus associated malignancies. Theranostics. 2019;9:1115‐1124.
- 5. Hu Z, Zhu D, Wang W, et al. Genome‐wide profiling of HPV integration in cervical cancer identifies clustered genomic hot spots and a potential microhomology‐mediated integration mechanism. Nat Genet. 2015;47:158‐163.
- 6. Zhao LH, Liu X, Yan HX, et al. Genomic and oncogenic preference of HBV integration in hepatocellular carcinoma. Nat Commun. 2016;7:12992.
- 7. Leeman JE, Li Y, Bell A, et al. Human papillomavirus 16 promotes microhomology‐mediated end‐joining. Proc Natl Acad Sci USA. 2019;116:21573‐21579.
- 8. Ramsden DA, Carvajal‐Garcia J, Gupta GP. Mechanism, cellular functions and cancer roles of polymerase‐theta‐mediated DNA end joining. Nat Rev Mol Cell Biol. 2022;23:125‐140.
- 9. Yu AM, McVey M. Synthesis‐dependent microhomology‐mediated end joining accounts for multiple types of repair junctions. Nucleic Acids Res. 2010;38:5706‐5717.