Abstract
Kaposi sarcoma herpesvirus (KSHV) is the etiological agent of three malignancies: Kaposi sarcoma (KS), primary effusion lymphoma (PEL), and KSHV-associated multicentric Castleman disease. KSHV-infected patients may also have an interleukin-6-related KSHV-associated inflammatory cytokine syndrome. KSHV-associated diseases occur in only a minority of chronically KSHV-infected individuals and often in the setting of immunosuppression. Mechanisms by which KSHV genomic variations and systemic co-infections may affect the pathogenic pathways potentially leading to these diseases have not been well characterized in vivo. To date, the majority of comparative genetic analyses of KSHV have focused on a few regions scattered across the viral genome. We used next-generation sequencing techniques to investigate the taxonomic groupings of viruses from malignant effusion samples from fourteen participants with advanced KSHV-related malignancies, including twelve with PEL and two with KS and elevated KSHV viral load in effusions. The genomic diversity and evolutionary characteristics of nine isolated, near full-length KSHV genomes revealed extensive evidence of mosaic patterns across all these genomes. Further, our comprehensive NGS analysis allowed the identification of two distinct KSHV genome sequences in one individual, consistent with a dual infection. Overall, our results provide significant evidence for the contribution of KSHV phylogenomics to the origin of KSHV subtypes. This report points to a wider scope of studies to establish genome-wide patterns of sequence diversity and define the possible pathogenic role of sequence variations in KSHV-infected individuals.
Keywords: KSHV, virus taxonomy, virology, recombination, genetic diversity
1. Introduction
Kaposi sarcoma herpesvirus [KSHV; also known as human gammaherpesvirus 8 (HHV-8)] is the causative agent of a diverse group of disorders including all forms of Kaposi sarcoma (KS) (Chang et al. 1994), primary effusion lymphoma (PEL) (Cesarman et al. 1995), and multicentric Castleman's disease (MCD) (Soulier et al. 1995), as well as the recently identified KSHV inflammatory cytokine syndrome (KICS) (Polizzotto et al. 2016). Although the majority of KSHV-infected individuals will not develop KSHV-associated diseases, these diseases are more common among immunodeficient individuals. In patients infected with human immunodeficiency virus (HIV), multiple KSHV-associated malignancies can occur concurrently or develop sequentially (Lurain et al. 2019).
PEL is a rare large B-cell lymphoma that classically presents with malignant effusions in the pleura, pericardium, or peritoneum and may also involve the bone marrow or cerebrospinal fluid. Since the introduction of antiretroviral therapy, the incidence of KS in people living with HIV has declined, yet the median survival of patients with PEL remains poor compared with other HIV-associated lymphomas and no standard therapy is available to date (Boulanger et al. 2005; Goncalves et al. 2017; Lurain et al. 2019). In addition to PEL, effusions may also occur in KS, KSHV-associated MCD, and KICS.
In addition to KSHV infection, the complex pathogenesis of KSHV-associated diseases has been linked to environmental, behavioral, and host genetic factors (Pollack and DuBois 1977; Dedicoat and Newton 2003; Brown et al. 2006; Biggar et al. 2007; Powles et al. 2009; Engels et al. 2011; Stolka et al. 2014). Apart from HIV, co-infections with other Herpesviridae such as human cytomegalovirus and human herpesvirus 6 (HHV-6), as well as parasitic co-infection with malaria, have been shown to affect the risk of KSHV-related diseases in vivo and oncogenesis in vitro, but few studies have investigated the co-occurrence of these infections in patients with KSHV-related effusions (Newton et al. 2003; Park et al. 2006; Tamburro et al. 2012; Nalwoga et al. 2018). Further, viral genomic variations have been potentially linked to cancer risk and survival in human papillomavirus and EBV (Mirabello et al. 2017; Rader et al. 2019; Xu et al. 2019). Our previous observation of possible associations between miRNA sequence variations and an increased risk of MCD and KICS indicates that KSHV genomic diversity may contribute to disease development (Ray et al. 2012). Yet these potential associations have only been partially explored and remain poorly understood (Mancuso et al. 2008; Ray et al. 2012).
KSHV is a large, double-stranded DNA virus with a ∼140 kb long unique coding region including at least eighty-six open reading frames (ORFs) transcribed into a number of protein-coding genes, long non-coding RNAs, and microRNAs (Glenn et al. 1999; Rezaee 2006; Arias et al. 2014). Until recently, studies of KSHV variability relied largely on the characterization of the hypervariable K1 gene and the K15 locus, with additional gene regions used to define the genetic structure and possible recombination breakpoints (Kakoola et al. 2001; Zong et al. 2002; Marshall et al. 2010). Six major (A−F) and several minor subtypes can be distinguished based on K1 amino acid sequence similarity, whereas K15 occurs as the P (predominant), M (minor), and N subtypes (Cook et al. 1999; Poole et al. 1999; Hayward and Zong 2007).
Recent advances in next-generation sequencing (NGS) have not only increased the number of available KSHV genomes (Tamburro et al. 2012; Olp et al. 2015; Awazawa et al. 2017; Sallah et al. 2018) but also enabled the detection and characterization of a variety of known and novel pathogens directly from human clinical samples (Nakamura et al. 2009; Svraka et al. 2010). To better understand KSHV genomics and underlying evolutionary mechanisms, we carried out an in-depth analysis of KSHV genomes directly sequenced from effusion samples of patients with KSHV-related malignancies, focusing on samples with high KSHV viral load (VL). Additionally, we examined the viral metagenome of effusion samples through an unbiased shotgun sequencing approach in order to identify co-infecting pathogens that could potentially alter disease.
2. Materials and methods
2.1 Samples, DNA preparation, and qPCR assays
Effusion samples were collected at the HIV and AIDS Malignancy Branch (National Cancer Institute (NCI), National Institutes of Health (NIH)) from patients with KSHV-associated diseases undergoing therapeutic or diagnostic thoracentesis, pericardiocentesis, or paracentesis. Patients are known to have contracted HIV through sexual contact, and none had confirmed use of injection drugs. The protocols for the collection and study of these samples were approved by the NCI Institutional Review Board, and all patients gave written informed consent for correlative studies and biospecimen storage in accordance with the Declaration of Helsinki. Fourteen effusion samples with extremely high KSHV VL were selected for sequencing (Table 1). The diagnosis of the disease(s) causing the effusion was made by cytopathology, and the presence of KSHV in the tumor was established by staining for latency-associated nuclear antigen (LANA, ORF73). Effusion samples were centrifuged at 300×g and red blood cells were lysed by addition of ACK lysis buffer if needed. After inactivation of the lysis buffer with PBS, cellular material was pelleted at 8,000×g and stored at −80 °C until further use. DNA from nucleated cells was extracted using the QIAamp Blood kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions; quantification was performed by fluorometry using the Qubit dsDNA BR Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA). KSHV and EBV DNA VL were measured by real-time qPCR as described before (Mbulaiteye et al. 2006; Uldrick et al. 2011), with ERV-3 used for cell number quantification. In one patient, peripheral blood mononuclear cells (PBMCs) and saliva samples were also examined.
Table 1.
Patient and sample characteristics.
| ID | Gender | Age (years) | Ethnicity | Country of origin | Diagnosis | HIV VL (copies/ml) | CD4 (cells/µl) | KSHV (copies/10^6 cells) | EBV (copies/10^6 cells) | Material type collected |
|---|---|---|---|---|---|---|---|---|---|---|
| FNL001 | M | 26 | African American | USA | PEL and MCD | 7,246 | 91 | 144.0×10^6 | 3.0×10^6 | Pleural fluid |
| FNL002 | M | 69 | Caucasian | USA | PEL | <20 | 594 | 77.3×10^6 | 5.0×10^3 | Peritoneal fluid |
| FNL003 | M | 22 | African American | USA | PEL and KS | 43 | 32 | 165.5×10^6 | 2.6×10^6 | Pleural fluid |
| FNL004 | M | 59 | Caucasian | USA | PEL and KS | Undetectable | 187 | 31.8×10^6 | 0 | Pleural fluid |
| FNL005 | M | 55 | Caucasian | USA | PEL/KS/MCD | <20 | 363 | 70.1×10^6 | 5.3×10^6 | Pleural fluid |
| FNL006 | M | 48 | Caucasian | USA | PEL and KS | <20 | 74 | 844.8×10^6 | 7.2×10^6 | Pleural fluid |
| FNL007 | M | 38 | Caucasian | USA | PEL and KS | 10,082 | 401 | 133.3×10^6 | 203 | Ascites pellet |
| FNL008 | M | 28 | African American | USA | KS and MCD | Undetectable | 14 | 4.9×10^6 | 385 | Pleural fluid left |
| FNL009 | M | 33 | African American | USA | PEL and KS | Undetectable | 37 | 55.9×10^6 | 0 | Pleural fluid right |
| FNL010 | M | 37 | Caucasian | USA | KS and KICS | 651 | 16 | 2.0×10^6 | 7.0×10^3 | Pleural fluid |
| FNL011 | M | 55 | Caucasian | USA | PEL and KS | Undetectable | 365 | 228.6×10^6 | 4.5×10^6 | Pleural fluid right |
| FNL012 | M | 36 | Caucasian | USA | PEL/KS/MCD | 54 | 120 | 2.7×10^6 | 4.2×10^3 | Pleural fluid left |
| FNL013 | M | 39 | Caucasian | USA | PEL and KS | 959 | 764 | 1.2×10^6 | 22.8×10^3 | Pleural fluid |
| FNL014 | F | 33 | African | Cameroon | PEL | <48 | 15 | 212.7×10^6 | 115.4×10^6 | Peritoneal fluid |
F, female; M, male.
2.2 Illumina sequencing
Total DNA was fragmented using a Covaris focused-ultrasonicator (Covaris, Woburn, MA, USA) into fragments of 550 base pair (bp) average length. Barcoded NGS libraries were prepared using the Illumina TruSeq Nano DNA LT Library preparation kit (Illumina, Hayward, CA, USA). Genomic DNA libraries were sequenced as short sequence paired-end 250 bp reads on an Illumina MiSeq with version 2 chemistry.
2.3 Viral genome alignment, consensus sequence generation, and variant calling
Genomic data were processed using the CCBR Pipeliner framework (https://github.com/CCBR/Pipeliner). Short paired-read data were trimmed to remove adaptors and low-quality bases using Trimmomatic v0.36 (Bolger et al. 2014) with the following parameter settings: Leading: 10; Trailing: 10; Sliding window: 4:20; Minlen: 20. Because the vast majority of reads in these libraries are expected to be derived from the human genome, we generated a hybrid reference genome consisting of the GRCh37 human reference sequence (https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.25) and the KSHV GK18 complete genome sequence (https://www.ncbi.nlm.nih.gov/nuccore/AF148805). Reads were then mapped to this reference genome using BWA-MEM v0.7.15 with default parameter settings (Li 2013). Resulting BAM files were sorted using samtools v1.317 (Li et al. 2009; Li 2011) and PCR duplicates were marked using Picard v2.1.1 (https://broadinstitute.github.io/picard/). Re-alignment around indels and base recalibration were performed using the Genome Analysis Toolkit v.3.4 (GATK; Broad Institute, Cambridge, MA, USA), following the GATK best practices (McKenna et al. 2010; Van der Auwera et al. 2013). Ethical approval was not given for the investigation of the human genome; therefore, only KSHV-mapped and unmapped reads were extracted from BAM files using samtools v1.317. KSHV consensus sequences were generated in Geneious Prime 2019.0.4, visually inspected, and manually corrected for misalignments and polynucleotide runs. Sequence variations and the presence of minority viral genome variants were validated using end-point PCR and the primer sets listed in Supplementary Table S1. In a 50 µl reaction volume containing 25 µl Jumpstart REDTaq ReadyMix Reaction Mix (Sigma-Aldrich, St. Louis, MO, USA), 15 µl nuclease-free water, and 2.5 µl each of forward and reverse primer, 5 µl of the template DNA was amplified.
Cycling conditions were as follows: 1 min at 94 °C; 35 cycles of 30 s at 94 °C, 30 s at 60 °C, 2 min at 72 °C; followed by 5 min at 72 °C. PCR products were purified using the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany) prior to Sanger sequencing as previously described (Marshall et al. 2007). Finally, full-length FASTA sequences were generated for each sample in which coverage gaps and repeat regions (24,230 − 25,046 [816 bp], 29,997 − 30,056 [129 bp], 118,229 − 118,845 [616 bp], 124,784 − 126,457 [1,673 bp], 137,169 − 137,970 [801 bp]) were masked as missing (N) using Geneious Prime 2019.0.4 and bedtools (Quinlan and Hall 2010; Quinlan 2014). Annotations for repeat, protein and RNA coding regions were transferred from the reference genome GK18 based on sequence homology in Geneious Prime 2019.0.4. In addition, masked regions, low-coverage areas, and ambiguous bases were annotated.
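The masking step can be illustrated with a minimal Python sketch. It is not the Geneious Prime or bedtools workflow used in the study; it only shows the idea of replacing listed intervals with N. The intervals are those quoted above, and treating them as 1-based, inclusive GK18 coordinates is an assumption. The function name `mask_regions` is hypothetical.

```python
from typing import Iterable, Tuple

# Intervals quoted in the text; interpreted here as 1-based, inclusive
# positions on a GK18-aligned sequence (an assumption of this sketch).
MASKED_REGIONS = [
    (24230, 25046),
    (29997, 30056),
    (118229, 118845),
    (124784, 126457),
    (137169, 137970),
]


def mask_regions(seq: str, regions: Iterable[Tuple[int, int]]) -> str:
    """Return seq with every base inside the given intervals replaced by 'N'."""
    bases = list(seq)
    for start, end in regions:
        lo = max(start, 1) - 1      # convert to a 0-based start index
        hi = min(end, len(bases))   # clip intervals that run past the end
        for i in range(lo, hi):
            bases[i] = "N"
    return "".join(bases)


# Toy example on a 10-base sequence rather than a real genome.
toy = "ACGTACGTAC"
masked = mask_regions(toy, [(3, 5), (9, 12)])
```

Applied to a full-length consensus, `mask_regions(consensus, MASKED_REGIONS)` would blank the repeat regions before alignment, provided the consensus shares the reference coordinate system.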
For variant sites, the GATK HaplotypeCaller (Van der Auwera et al. 2013) was run in GVCF mode with ploidy set to one, and all positions in the GK18 reference contig were then called for each sample using the GenotypeGVCFs tool with the --includeNonVariantSites option. Variant sites were then hard filtered using multiple criteria (QD ≥ 2 and DP ≥ 20 for both single nucleotide variants (SNVs) and indels; FS ≥ 60 for SNVs, FS ≥ 200 for indels) to reduce false positives. The number of genome-wide SNVs was counted in 500 bp sliding windows using bedtools.
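The hard-filtering and windowed counting steps can be sketched in plain Python as follows. This is an illustration only, not the GATK or bedtools commands used in the study. Two assumptions are made and flagged in the code: the thresholds are read in the usual GATK direction (keep sites with QD ≥ 2 and DP ≥ 20, and discard sites whose FS exceeds 60 for SNVs or 200 for indels), and the 500 bp windows are taken as consecutive and non-overlapping because the step size is not stated. The `Variant` class and function names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Variant:
    pos: int    # 1-based position on the reference contig
    kind: str   # "SNV" or "INDEL"
    qd: float   # QualByDepth annotation
    dp: int     # read depth
    fs: float   # FisherStrand annotation


# ASSUMPTION: FS is treated as an upper bound (strand bias too high -> drop).
FS_MAX = {"SNV": 60.0, "INDEL": 200.0}


def passes_hard_filter(v: Variant) -> bool:
    """Keep a site only if it meets the QD, DP and FS criteria."""
    return v.qd >= 2.0 and v.dp >= 20 and v.fs <= FS_MAX[v.kind]


def count_per_window(positions, window=500):
    """Count sites per window; ASSUMPTION: non-overlapping windows."""
    return Counter((p - 1) // window for p in positions)


# Toy calls, not data from the study.
calls = [
    Variant(120, "SNV", 25.0, 90, 1.2),
    Variant(480, "SNV", 1.5, 90, 0.0),      # fails QD
    Variant(510, "INDEL", 18.0, 15, 3.0),   # fails DP
    Variant(730, "SNV", 30.0, 60, 75.0),    # fails FS for SNVs
    Variant(990, "INDEL", 22.0, 40, 150.0),
]
kept = [v for v in calls if passes_hard_filter(v)]
snv_windows = count_per_window(v.pos for v in kept if v.kind == "SNV")
```

Window index 0 covers positions 1 to 500, index 1 covers 501 to 1,000, and so on.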
2.4 Metagenomics analysis
Unmapped paired-reads were screened for additional pathogenic sequences using the Taxonomer metagenomics tool (http://taxonomer.iobio.io). To verify Taxonomer results, any relevant herpesvirus sequences detected were confirmed by mapping reads to the corresponding reference genomes (Supplementary Table S2) using BWA-MEM. Subtyping of EBV and HHV-6 was performed by sequence comparison and BLAST analysis of alignments to either the EBNA2 or U38 gene, respectively. U38 sequence was chosen based on the primers used by Reddy and Manna (2005). Only query sequences with 100 per cent identity to the sequence database were considered typed.
2.5 Phylogenetic sequence and recombination analyses
Within Geneious Prime 2019.0.4, genome sequence alignments were generated using MAFFT with the FFT-NS-2 algorithm (Katoh and Standley 2013) and 1,000 bootstraps to achieve a multiple consensus alignment comprising sixteen previously published sequences in addition to our nine isolated genomes (Supplementary Table S2). For the genomes included in this study, amino acid alignments for all eighty-six known coding regions were also generated, and amino acid changes were visualized using the Los Alamos National Laboratory Highlighter tool (Keele et al. 2008). To identify genotypic variations in KSHV, K1 and K15 subtype analyses were performed on an extended data set (Supplementary Table S2).
Splitstree (Huson and Bryant 2006) was used to visualize conflicting phylogenetic signals by generating NeighborNet split networks for the multiple consensus alignment, excluding gap sites, and using the Uncorrected_P characters transformation with 1,000 bootstrap replicates. The statistical significance of the observed recombination signals was assessed using the phi statistic (Bruen et al. 2006). The relatedness of the genomes to sixteen previously published KSHV sequences was assessed by SimPlot (Lole et al. 1999) using a Neighbor-Joining method based on a Kimura two-parameter model with 1,000 bootstrap replicates, a window size of 2,000 bp, and a step size of 200 bp. Possible breakpoints were investigated using five of the recombination detection methods included in Recombination Detection Program 4 (RDP4; Martin et al. 2015): RDP, GENECONV, MaxChi, BootScan, and SiScan. Using the default parameters, strong evidence for recombination was defined as a signal on which more than two of these methods concurred.
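The two distance measures named above can be written out in a few lines of Python. The sketch below is not the Splitstree or SimPlot implementation; it only computes, for one pair of aligned sequences, the uncorrected p-distance (the proportion of differing sites) and the Kimura two-parameter distance, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitions and transversions. Skipping any column with a gap or ambiguous base mirrors the exclusion of gap sites, and the function name `pairwise_distances` is hypothetical.

```python
import math

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
UNAMBIGUOUS = {"A", "C", "G", "T"}


def pairwise_distances(seq_a: str, seq_b: str):
    """Return (uncorrected p-distance, Kimura two-parameter distance)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in UNAMBIGUOUS or b not in UNAMBIGUOUS:
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites    # proportion of transitions (P)
    q = transversions / sites  # proportion of transversions (Q)
    k2p = -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
    return p + q, k2p


# Toy pair with one A/G transition over ten comparable sites.
p_dist, k2p_dist = pairwise_distances("ACGTACGTAC", "ACGTGCGTAC")
```

In a SimPlot-style scan, the same calculation would be repeated on each 2,000 bp window, advancing 200 bp at a time; the corrected distance is always at least as large as the uncorrected one because it accounts for multiple substitutions at a site.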
3. Results
3.1 Characteristics of patients and samples used in this study
To study the viral metagenome as well as the variation of KSHV genomes within KSHV-related effusions, samples from fourteen patients within the HIV and AIDS Malignancy Branch (NCI, NIH) were analyzed. The samples, collected between 2010 and 2015, were included in this study based on their KSHV VL (see below). Among the fourteen patients, twelve had PEL, and ten of these twelve patients had other concurrent KSHV-associated diseases (Table 1). The other two patients had KS and either MCD or KICS with elevated KSHV VL in the effusion. The average age was 41 years (interquartile range [IQR], 33 − 50) with only one cisgender female patient (FNL014). Patients had a wide range of CD4 counts (median 106 cells/µl (IQR, 33 − 365)) and HIV VL (median 51 copies/ml (IQR, 26 − 882)). Nine patients were European Americans, four were African Americans, and the one female patient was an African immigrant.
Of the fourteen effusion samples, eleven were pleural effusions (FNL001, FNL003, FNL004, FNL005, FNL006, FNL008, FNL009, FNL010, FNL011, FNL012, and FNL013) and three were peritoneal effusions (FNL002, FNL007, and FNL014). All samples were identified as KSHV positive by qPCR, and twelve were EBV positive (all but FNL004 and FNL009). The KSHV VL in these samples was extremely high, with a median value of 7.4×10^7 KSHV copies/10^6 cells (IQR 1.2–16.0×10^7) compared with a median value of 14.9×10^3 EBV copies/10^6 cells (IQR 1.3–4,140×10^3).
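As a worked check, the two medians quoted above can be recomputed from the viral loads in Table 1. The short Python sketch below assumes the table values are in copies per 10^6 cells and that the two EBV-negative samples enter the EBV median as zeros; the IQRs are not recomputed because the quartile method is not stated.

```python
from statistics import median

# Viral loads from Table 1 (copies per 10^6 cells), FNL001 to FNL014.
kshv_vl = [144.0e6, 77.3e6, 165.5e6, 31.8e6, 70.1e6, 844.8e6, 133.3e6,
           4.9e6, 55.9e6, 2.0e6, 228.6e6, 2.7e6, 1.2e6, 212.7e6]
ebv_vl = [3.0e6, 5.0e3, 2.6e6, 0, 5.3e6, 7.2e6, 203,
          385, 0, 7.0e3, 4.5e6, 4.2e3, 22.8e3, 115.4e6]

# With fourteen values the median is the mean of the 7th and 8th ranked values.
kshv_median = median(kshv_vl)   # (70.1e6 + 77.3e6) / 2
ebv_median = median(ebv_vl)     # (7.0e3 + 22.8e3) / 2
```

The results, about 7.4×10^7 and 14.9×10^3 copies/10^6 cells, agree with the values reported in the text.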
3.2 NGS and untargeted metagenomic analysis of un-mapped reads
To identify and measure the abundance of microorganisms contained within effusion samples, shotgun sequencing data were acquired for all fourteen samples, with an average number of thirty-two million raw reads per sample. The datasets were analyzed as described in Section 2. Reads were first mapped to the human GRCh37 and KSHV GK18 reference genomes (see Section 3.3) to decrease human background and isolate relevant KSHV reads. Reads which failed to map to either reference (0.5% of post-QC reads) were then used for a broader classification of microorganisms by Taxonomer using a comprehensive pan-organism sequence database (Fig. 1).
Figure 1.
Graphical representation of Taxonomer results for reads not mapped to the human GRCh37 (GCF_000001405.25) or KSHV GK18 (AF148805) genome of fourteen effusion samples. (a) High level taxonomic classification of sequencing data is presented as the relative abundance of sequences within each effusion sample. (b) Viral sequencing reads classified at the species or subtype level for twelve of the fourteen effusion samples.
Of the reads characterized, an average of 87.6 per cent were classified as human across all samples, followed by viral (7.8%), bacterial (2.6%), phage (1.2%), ambiguous (0.6%), fungal (0.3%), and other (<0.0%) (Fig. 1a; Supplementary Table S3). In twelve of the samples, viral sequences other than phages were recovered (Fig. 1b), with four samples containing evidence of the single-stranded DNA virus torque teno virus (TTV, or transfusion transmitted virus). TTV was first identified in 1997 in the serum of patients with high serum aminotransferase levels; it is now estimated that ≥90 per cent of adults are positive for TTVs, with little to no evidence of pathogenicity of these viruses (Hino and Miyata 2007). TTV DNA is frequently detected in various sample types, and its identification within effusion samples is consistent with previous reports (Gallian et al. 1999; Iriyama et al. 1999; Okamoto et al. 2001; Galmes et al. 2013).
Herpesviridae sequences were identified in nine of the twelve samples. Potential KSHV sequences not aligned to KSHV GK18 were retained in only one of the samples. These sequences were likely not captured by the alignment to the KSHV reference due to low-quality bases. Eight samples were classified as EBV positive by Taxonomer, and all eight had detectable EBV VL as determined by qPCR. However, in an additional four samples with detectable EBV VL, no EBV sequences were identified by Taxonomer (FNL002_US, FNL007_US, FNL008_US, and FNL010_US). This is likely due to the low average EBV VL within these four samples (3,137 copies/10^6 cells compared with 17.2×10^6 copies/10^6 cells in the eight samples classified as EBV positive). One sample was also identified as human herpesvirus-6 (HHV-6) positive. HHV-6 detection in FNL003 was confirmed by the HIV and AIDS Malignancy Branch as a chromosomally integrated HHV-6 genome (T. Uldrick, personal communication, 27 June 2017), in agreement with the metagenomics analysis by Taxonomer. The identification of EBV and HHV-6 by Taxonomer was further verified by reference-based alignment of unmapped reads to reference sequences of HHV-6 (NC_001664 (type 6A) or NC_000898 (type 6B)) and EBV (NC_0076053 (type 1) or DQ279927 (type 2)). Due to low coverage, no full genome of these viruses could be acquired (average genome coverage 62%). Nevertheless, HHV-6 and EBV subtypes could be determined for six of the eight samples and were verified by BLAST, with sequences matching reference sequences at ≥99 per cent nucleotide identity (Altschul et al. 1990; Johnson et al. 2008) (Table 2).
Table 2.
Sequencing results.
| Genome | Raw reads (post QC) | KSHV DNA copies* | EBV DNA copies* | KSHV mapped reads | KSHV paired-end reads mapped (%) (a) | KSHV mean coverage | KSHV genome coverage (%) | K1/K15 subtype | EBV mapped reads | EBV paired-end reads mapped (%) (b) | EBV mean coverage | EBV genome coverage (%) | EBV subtype | HHV-6 mapped reads | HHV-6 paired-end reads mapped (%) (c) | HHV-6 mean coverage | HHV-6 genome coverage (%) | HHV-6 subtype |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FNL001_US | 32,406,809 | 10.3×10^6 | 21.8×10^4 | 64,547 | 0.20 | 92 | 99.9 | A/P, B/P | 3,278 | 0.01 | 3.4 | 93.5 | Type 1 | NA | NA | NA | NA | NA |
| FNL002_US | 28,705,803 | 8.8×10^6 | 570 | 166,145 | 0.58 | 237 | 99.9 | A/P | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| FNL003_US | 28,513,621 | 50.2×10^6 | 78.3×10^4 | 133,249 | 0.47 | 187 | 99.9 | A/P | 3,352 | 0.01 | 3.4 | 92.7 | Type 2 | 613 | 0.00 | 0.8 | 53.3 | A |
| FNL004_US | 16,734,901 | 6.3×10^6 | 0 | 33,405 | 0.20 | 50 | 99.9 | C/P | NA | 0.00 | 0.0 | NA | NA | NA | NA | NA | NA | NA |
| FNL005_US | 26,796,183 | 4.1×10^6 | 31.0×10^4 | 38,698 | 0.14 | 56 | 99.9 | A/P | 4,610 | 0.02 | 4.9 | 95.3 | Type 1 | NA | NA | NA | NA | NA |
| FNL006_US | 24,646,469 | 14.5×10^6 | 12.3×10^4 | 66,247 | 0.27 | 96 | 99.9 | C/P | 1,899 | 0.01 | 2.1 | 82.8 | Type 2 | NA | NA | NA | NA | NA |
| FNL007_US | 33,021,259 | 11.8×10^6 | 18 | 131,360 | 0.40 | 190 | 99.9 | C/P | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| FNL008_US | 29,661,825 | 81.3×10^4 | QP | 2,096 | 0.01 | 3 | 91.8 | ND† | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| FNL009_US | 66,133,270 | 2.9×10^6 | 0 | 7,567 | 0.01 | 10 | 99.2 | A/P | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| FNL010_US | 20,593,589 | 45.0×10^4 | 200 | 421 | 0.00 | 1 | 43.5 | ND† | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| FNL011_US | 21,756,811 | 5.1×10^6 | 98.8×10^3 | 22,195 | 0.10 | 27 | 99.9 | C/P | 1,245 | 0.01 | 1.1 | 63.6 | Type 1 | NA | NA | NA | NA | NA |
| FNL012_US | 33,139,689 | 3.9×10^6 | 1.3×10^3 | 5,841 | 0.02 | 8 | 99.4 | ND† | 16 | 0.00 | 0.0 | 1.7 | ND† | NA | NA | NA | NA | NA |
| FNL013_US | 60,517,295 | 97.0×10^4 | 600 | 3,841 | 0.01 | 6 | 98.4 | ND† | 59 | 0.00 | 0.1 | 6.2 | ND† | NA | NA | NA | NA | NA |
| FNL014_CM | 33,740,485 | 1.1×10^6 | 60.5×10^4 | 79,547 | 0.24 | 109 | 99.5 | A5/P | 1,449 | 0.00 | 1.5 | 72.3 | Type 2 | NA | NA | NA | NA | NA |
Per cent of the total paired-end reads mapped to KSHV AF148805 (a), EBV type 1 NC_0076053 or type 2 DQ279927 (b) as well as HHV-6 A NC_001664 or HHV-6 B NC_000898 (c). EBV (EBNA2) and HHV-6 (U38) subtypes were determined based on sequence homology and Nucleotide BLAST results.
*Estimated DNA copy numbers used for library preparation; genome coverage depicted as the fraction of the reference genome covered by at least one read.
†Not sufficient coverage to determine KSHV or EBV subtypes.
NA, not applicable; ND, not determined.
In ten of the fourteen samples, Taxonomer further identified sequences similar to non-human viruses such as rhesus lymphocryptovirus (rLCV) and Coccolithovirus. The detection of rLCV and Coccolithovirus may reflect low-frequency contaminants among the unmapped reads, which cannot be excluded. However, the assignment of reads to rLCV likely occurred due to sequence homology with EBV. Because of the low abundance of these sequences in the samples (<1% of unmapped reads), they were not analyzed further. Of the samples in which Taxonomer identified viral reads, only three contained sequences for more than one human virus (EBV, HHV-6, KSHV, and TTV), indicating the possibility of co-infection with multiple virus species within the cell populations contained in these effusions.
3.3 Reference-based alignment of KSHV
To examine KSHV diversity within the sample set, reads aligned to the KSHV reference sequence GK18 were isolated and further analyzed. The range of post-QC reads mapped to GK18 was 0.002 − 0.579 per cent, with a median KSHV input per library of 4.6×10^6 copies (range = 0.45×10^6 to 50.2×10^6 copies) (Table 2). Near complete KSHV genomic data were obtained directly for nine of the fourteen effusion samples included in this study. For these nine samples, the average genome coverage was 99.79 per cent, with 98.9 per cent of bases covered by ≥5 reads and up to 86.8 per cent of bases covered by ≥20 reads. Although the average genome coverage was lower than observed for samples prepared by target enrichment protocols, all but several known repeat regions (close to or within K4.2, K7, K12, ORF73, and K15) could be correctly aligned to the reference genome. These repeat regions account for <5 per cent of the KSHV genome and were masked before further analyses. Due to a lower median KSHV input of 0.89×10^6 versus 7.55×10^6 copies, no alignment could be generated for four of the samples (FNL008_US, FNL010_US, FNL012_US, and FNL013_US).
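Coverage figures of the kind reported above (genome coverage, and the share of bases covered by ≥5 or ≥20 reads) follow directly from per-base read depths. The Python sketch below is a minimal illustration under the assumption that a depth value is available for every reference position; the function name `coverage_summary` and the toy depths are hypothetical and do not reproduce the study data.

```python
def coverage_summary(depths, thresholds=(1, 5, 20)):
    """Mean depth plus the percentage of positions at or above each threshold."""
    n = len(depths)
    if n == 0:
        raise ValueError("no positions supplied")
    summary = {"mean_depth": sum(depths) / n}
    for t in thresholds:
        covered = sum(1 for d in depths if d >= t)
        summary[f"pct_ge_{t}"] = 100.0 * covered / n
    return summary


# Toy per-base depths for a ten-position reference.
toy_depths = [0, 3, 5, 8, 20, 25, 40, 12, 6, 1]
toy_summary = coverage_summary(toy_depths)
```

Here `pct_ge_1` corresponds to genome coverage as defined in Table 2 (the fraction of the reference covered by at least one read).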
The generation of a consensus sequence for FNL001_US, from the effusion of an African-American patient with PEL and MCD, failed despite a mean coverage of 92× because two different KSHV genomes were present within the sample (Fig. 2). The generated genome sequence represented a composite of the individual genomes and as such was not included in the final genome analyses. The proportion of these genomes varied with the coverage depth at each base, but the division was most evident within the K1 region. The higher-frequency (consensus) variant was identified as a B, or African, subtype, whereas the minor-frequency variant was characterized as A, a K1 subtype more frequent in Europe and North America. Both genomes were detected at consistently higher rates than the low-frequency variants introduced by Illumina sequencing errors. To confirm that NGS correctly identified two different KSHV genomes, we Sanger-sequenced various gene regions across KSHV (Table 3). Using sequence-specific primers (Supplementary Table S1) designed against each region for the consensus and minor genome variant in FNL001_US, the presence of both genomes was verified not only in the effusion but also in saliva and PBMC samples from this patient collected across a 48-month period.
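The distinction between a genuine second genome and sequencing noise can be illustrated by tallying the bases observed at a single alignment column. The Python sketch below is not the procedure used in the study; the function names and the two cut-offs (`min_minor_fraction`, `min_minor_reads`) are hypothetical values chosen only to show the principle that a true minor variant is supported by far more reads than isolated errors.

```python
from collections import Counter

BASES = {"A", "C", "G", "T"}


def base_counts(column):
    """Count unambiguous bases observed at one alignment column."""
    return Counter(b for b in (c.upper() for c in column) if b in BASES)


def allele_fractions(column):
    """Fraction of reads supporting each base at the column."""
    counts = base_counts(column)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()} if total else {}


def looks_mixed(column, min_minor_fraction=0.05, min_minor_reads=3):
    """Flag a column whose second most common base passes both cut-offs.

    Both cut-offs are hypothetical illustration values.
    """
    counts = base_counts(column)
    ranked = counts.most_common(2)
    if len(ranked) < 2:
        return False
    minor_reads = ranked[1][1]
    return (minor_reads >= min_minor_reads
            and minor_reads / sum(counts.values()) >= min_minor_fraction)


# Toy columns: a two-genome mixture versus a single stray error.
mixed_column = "G" * 60 + "A" * 30 + "T"
error_column = "G" * 90 + "T"
```

A run of flagged columns that co-occur on the same reads, as in the K1 region shown in Fig. 2, is what would point to two genomes rather than scattered errors.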
Figure 2.
Partial representation of the alignment of FNL001_US to GK18 within the K1 coding region (355-583 bp). Gray bar across the top shows the GK18 genome sequence. Only bases in which reads disagree in reference to the GK18 sequence are shown. Reads belonging to the minor frequency genome (K1 A subtype) are highlighted in pink, whereas reads belonging to the consensus variant (K1 B subtype) are highlighted in blue.
Table 3.
Presence of FNL001_US consensus and minor genome variant.
| Sample type | K1 consensus | K1 minor | ORF25 consensus | ORF25 minor | ORF48 consensus | ORF48 minor | vIRF3 (K10.5) consensus | vIRF3 (K10.5) minor | ORF75 consensus | ORF75 minor |
|---|---|---|---|---|---|---|---|---|---|---|
| Effusion (07/2012) | + | + | + | + | + | + | + | + | + | + |
| PBMC (10/2012) | + | |||||||||
| Saliva (10/2010) | + | + | + | + | + | |||||
| Saliva (07/2012) | + | + | + | + | + | + | + | |||
| Saliva (12/2012) | + | + | + | |||||||
| Saliva (02/2013) | + | + | + | + | ||||||
| Saliva (05/2014) | + | + | + | + | + | |||||
3.4 KSHV genetic diversity
KSHV genome diversity was quantified by counting variant sites in the assembled genomes compared with the GK18 reference genome within a 500 bp window. Overall, low-level variation was detected across the whole KSHV genome at both the nucleotide and amino acid level. Across all nine genomes, a total of 954 variant sites were identified relative to GK18 (Fig. 3a). Of these variants, over 95 per cent were single-nucleotide variants (SNVs) and <4 per cent were indels. Seventy-two per cent of variants were SNVs within coding regions, classified as synonymous (42%) or non-synonymous (30%), whereas fewer than 26 per cent were non-protein coding SNVs. For each of the eighty-six annotated ORFs, divergence was measured by calculating the per cent amino acid sequence identity (Fig. 3b) in an alignment composed of the nine genomes and GK18 as the reference. ORF73 was the only coding sequence (CDS) with an internal repeat region and as such had to be excluded. K1 was the most variable ORF across all nine genomes (Fig. 3c), and its patterns of variability correlated with the K1 subtype (Table 2). In two genomes, SNVs and deletions led to significant changes within the predicted protein sequence. In FNL004_US (Fig. 3e), the ORF8 (envelope glycoprotein) sequence contained three nucleotide substitutions and a 2-bp deletion, resulting in a frameshift that introduces a stop codon 603 amino acids earlier than in the reference sequence. The KSHV genome in FNL014_CM contained the largest deletion within a coding region, a 408 bp in-frame deletion in ORF4 (which encodes a complement-binding protein) that shortened the predicted protein sequence from 550 to 414 amino acids (Fig. 3d). Further, FNL014_CM exhibits the identical cytosine insertion at g.550_551insC in K8 (the KSHV analog of EBV Zta protein) as seen in the previously published genomes from Zambia (Olp et al. 2015), which results in the introduction of a premature stop codon at amino acid 212 (Fig. 3f).
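The ORF4 figures can be checked with simple arithmetic: an in-frame deletion removes one amino acid per three nucleotides, so 408 bp corresponds to 136 residues, and 550 − 136 = 414. The Python sketch below makes the same calculation explicit and shows why a 2-bp deletion, as in ORF8, cannot be in frame; the function name `residues_removed` is hypothetical.

```python
def residues_removed(deletion_bp: int) -> int:
    """Amino acids lost to an in-frame deletion of the given length."""
    if deletion_bp % 3 != 0:
        # A length that is not a multiple of three shifts the reading frame.
        raise ValueError("deletion is not in frame")
    return deletion_bp // 3


orf4_reference_aa = 550   # reference protein length given in the text
orf4_deletion_bp = 408    # deletion length given in the text
orf4_predicted_aa = orf4_reference_aa - residues_removed(orf4_deletion_bp)
```

Calling `residues_removed(2)` raises an error, reflecting that the 2-bp deletion in ORF8 produces a frameshift rather than a shortened in-frame product.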
Figure 3.
Number and effect of KSHV SNVs detected within the nine genomes included in this study. (a) A total of 954 nucleotide variants were identified when KSHV sequences were compared with GK18 as a reference. Variants within coding regions constituted ∼72 per cent of all SNVs and were classified as synonymous and non-synonymous. In addition, non-protein coding variants, deletions and insertions were also identified. (b) Distance matrix of eighty-six coding regions of KSHV genomes compared with the GK18 reference sequence. Amino acid (aa) sequence identity (% similarity) is proportionally color-coded ranging from 100 per cent in white to ≤40 per cent in dark blue. ORF73 was excluded as the internal repeat region had to be masked due to low coverage. Highlighter tool plots displaying mismatches in the protein sequence of the most variable coding regions are shown for K1 (c), ORF4 (d), ORF8 (e), and K8 (f). Plots were generated using the Highlighter tool by the Los Alamos National Laboratory.
To visually characterize the frequency and patterns of variations at the DNA level, SNVs were plotted according to their position across the KSHV genome (Fig. 4). Variants were distinguished by their functional implications (shape) and occurrence in one or more sequences of distinct K1 subtypes (Supplementary Table S4). For sequences belonging to either the A or C subtype, only SNVs identified in at least two genomes were considered. As only one of the genomes belonged to the K1 A5 subtype, variants could not be specified as genome or subtype specific, and as such all SNVs identified in this sequence were included. With these criteria, a total of 752 SNVs were incorporated in the graph. As expected, the highest density of variants was found in or around the K1 ORF. Overall, the number of SNVs was higher within the first 68 kb of the KSHV genome (459 versus 293 SNVs), with only occasional areas of higher frequency toward the 3′ end, such as in and around the regions encoding KSHV microRNAs. Protein-coding SNVs (synonymous and non-synonymous) greatly outnumbered non-coding ones (554 versus 198 SNVs).
Figure 4.
Graphic overview of the KSHV genome depicting coding regions (CDS, dark blue), repeat regions (light blue) and RNA coding regions (orange). Bar graph indicates the number of variants of all genomes identified in comparison to the reference GK18. Variants were counted within a 500 bp sliding window with masked regions displayed as light gray boxes. Vertical black lines mark variants that have been found in at least two genomes of the K1 A and C subtypes and variants identified in the K1 A5 genome. The predicted functional relevance of the variants is indicated by different shapes (circle, synonymous; square, non-synonymous; rhombus, non-protein coding). A total of 752 variants are shown.
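The windowed variant counts behind a bar graph of this kind can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `count_variants_in_windows`, the 1-based coordinates, and the default step equal to the window size are all assumptions, since only the 500 bp window is stated above.

```python
from typing import List, Tuple


def count_variants_in_windows(
    positions: List[int],
    genome_length: int,
    window: int = 500,
    step: int = 500,  # assumption: the step size is not given in the text
) -> List[Tuple[int, int, int]]:
    """Return (start, end, count) per window; coordinates are 1-based, inclusive."""
    counts = []
    for start in range(1, genome_length + 1, step):
        end = min(start + window - 1, genome_length)
        # Count variant positions that fall inside the current window.
        n = sum(1 for p in positions if start <= p <= end)
        counts.append((start, end, n))
    return counts


# Toy positions on a hypothetical 1,500 bp sequence (not KSHV data).
toy_positions = [10, 120, 480, 510, 1499]
toy_counts = count_variants_in_windows(toy_positions, genome_length=1500)
```

Passing a `step` smaller than `window` yields overlapping windows, which gives a smoother profile at the cost of counting each variant more than once.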
3.5 Phylogenomic clustering and recombination analysis
Two different methods were used for the detection of potential recombination events. First, a splits network was constructed to identify the phylogenetic relationships between the nine genomes and sixteen representative KSHV genome sequences (Fig. 5a). A split partitions the graph into two distinct parts, in which parallel branches represent recombination events. The genome-wide network showed splits differentiating sequences based on the K15 P or M subtype (blue split) or African origin (orange split). Yet, the only groups supported by less complex relationships contained the Japanese sequences (purple split) and sequences found predominantly in Europe and North America (pink split). Within the network, statistically significant evidence for recombination was detected (Phi test P value < 0.001). In contrast, two splits networks based on the K1 (Fig. 5b) and K15 (Fig. 5c) sequences of representative subtypes showed no evidence of recombination (P = 1.0). Given that the branching patterns differ between the trees included in this study, these data suggest that the variability within the central region of the KSHV genome contributes considerably to the phylogenetic clustering and recombination observed. The K1 and K15 networks support the previously defined K1 (A–F) and K15 (P, M, N) subtypes (Biggar et al. 2000; Zong et al. 1999). As detailed in Table 2, all participants in this study had sequences belonging to either the K1 A or C subtype, with FNL014_CM identified as an African A5 subtype. The only exception was the consensus sequence of FNL001_US, which grouped with the previously defined K1 B subtypes. In contrast, the minority variant also isolated from FNL001 belonged to the K1 A subtype. All genomes belonged to the K15 P (predominant) subtype, which is found in the majority of known KSHV sequences to date.
Figure 5.
Phylogenetic NeighborNet splits network analyses of KSHV genomes, K1 and K15 sequences. All networks were generated with 1,000 bootstrap replicates. (a) Alignment comprising twenty-five genome sequences, including representative sequences for available K1 subtypes. Sequences were trimmed and repeat regions were masked prior to analysis. The network presents two main partitions dividing the network into K15 P and M subtypes (blue edges). In addition, African subtypes are separated by another set of parallel edges (orange), also called splits. The Japanese and European groups are supported by uncontradicted splits (purple and pink splits). Phi-test P value=0.0. (b) K1 network of forty-seven sequences, with thirty-nine representative sequences of all known K1 subtypes, shows that the majority of genomes from this study belong to the A and C subtypes. Two distinct subtypes were identified for FNL001_US (A and B). Phi-test P value=1.0. (c) Splits network of twenty-six K15 sequences distinguishes the three known subtypes. All genomes identified in this study belonged to the P subtype. Phi-test P value=1.0.
Second, the amount of sequence fragmentation indicating the number of recombination crossover events in each genome was visualized using BootScan and SimPlot analyses in a 2,000 bp sliding window using multiple sequence alignments (Supplementary Table S2). Each isolated genome was analyzed in comparison to sixteen previously published whole KSHV genome sequences. Potential recombination events are indicated by signals of ≥70 per cent of permuted trees. For all genomes, BootScan results indicated considerable fragmentation of the genomes (Fig. 6) despite high sequence similarity (≥92%; SimPlot results not shown), consistent with the occurrence of multiple recombination events. Indeed, the shifting relationship of genomes to the sequences included in the analysis implies a high level of recombination as seen, for example, in FNL014_CM (Fig. 6a). Further, fragmentation patterns for genomes of the same K1 subtype were nearly identical within genomes identified as K1 A (Fig. 6b) and K1 C (Fig. 6c). Across genomes with a K1 A subtype (FNL002_US, FNL003_US, FNL005_US, and FNL009_US), the previously described A subtype genomes (BC-1 and BCBL-1) clustered at higher frequency with the query sequences, whereas in genomes with a K1 C subtype (FNL004_US, FNL006_US, FNL007_US, FNL011_US) the query sequence clustered with JSC-1, a previously described C subtype. This suggests that sequences with related K1 sequences have similar yet not identical fragmentation patterns. The identification of breakpoints across genomes with similar K1 subtype was also confirmed with RDP4 (Supplementary Table S5), in which different recombination detection algorithms provided statistical evidence for each potential occurrence of recombination.
Figure 6.
KSHV recombination analysis of three representative genomes compared with fifteen published KSHV genomes as well as VG-1 using BootScan analysis (2,000 bp window, 200 bp step). (a−c) BootScan analysis shows highly fragmented genomes. Similar patterns of fragmentation were observed for sequences with K1 A and K1 C subtypes as seen for (b) FNL005_US and (c) FNL004_US, respectively. Repeat regions were masked out (gray bars).
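The idea of scanning an alignment window by window and asking which reference the query most resembles can be sketched as below. This is a simplified, SimPlot-style identity scan and is not BootScan itself, which builds bootstrapped trees in each window. The function names are hypothetical; the default 2,000 bp window and 200 bp step follow the values given above, and the input sequences are assumed to be already aligned to equal length.

```python
from typing import Dict, List, Tuple


def window_identity(query: str, ref: str, start: int, size: int) -> float:
    """Percent identity in one window, skipping columns with a gap or N."""
    matches = compared = 0
    for q, r in zip(query[start:start + size], ref[start:start + size]):
        if q in "-N" or r in "-N":
            continue
        compared += 1
        matches += q == r
    # Assumption: a window with no comparable columns scores 0.0.
    return 100.0 * matches / compared if compared else 0.0


def closest_reference_per_window(
    query: str, refs: Dict[str, str], size: int = 2000, step: int = 200
) -> List[Tuple[int, str, float]]:
    """Return (window start, best-matching reference, identity) per window."""
    results = []
    for start in range(0, max(len(query) - size, 0) + 1, step):
        scores = {name: window_identity(query, seq, start, size)
                  for name, seq in refs.items()}
        best = max(scores, key=scores.get)  # ties go to the first reference
        results.append((start, best, scores[best]))
    return results


# Toy aligned sequences (invented, not KSHV): the closest reference
# switches from ref1 to ref2 halfway along the query.
toy_refs = {"ref1": "AAAAGGGG", "ref2": "TTTTCCCC"}
toy_scan = closest_reference_per_window("AAAACCCC", toy_refs, size=4, step=4)
```

A change of the best-matching reference along the genome, as in the toy example, is the kind of signal that fragmentation plots make visible; statistical support for a breakpoint still requires dedicated tools such as those named in the text.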
4. Discussion
In this study, we took advantage of whole-genome sequencing techniques to sequence and characterize possible pathogenic sequences directly from KSHV-related effusions. Our KSHV-specific NGS data provide novel evidence for co-infection of an individual by two different KSHV strains. Further, the genome-wide sequence analysis confirmed previous observations of extensive recombination events between KSHV sequences (Olp et al. 2015; Sallah et al. 2018).
These data were generated without targeted enrichment of viral sequences from KSHV-associated effusions. In twelve cases, sequence reads were obtained from effusions in people with PEL. In the other two cases the obtained sequence reads were derived from effusions with elevated KSHV VL but no detected tumor cells on cytopathology in patients with advanced KS and MCD or KICS. The occurrence of KSHV-associated effusions in people living with HIV with a wide range of HIV VLs and CD4+ T-cell counts was consistent with previous reports for KS and PEL (Krown et al. 2008; Lurain et al. 2019). The detection of viral genera within the effusions varied greatly between samples and may either reflect variations in the viral burden within the samples or the analytical limitations of Taxonomer. Viral entities other than KSHV were detected within the nucleated cells contained in effusions. As described in previous studies (Nador et al. 1996; Cesarman and Knowles 1999), EBV and EBV-related sequences were detected more frequently within our effusion samples (eight out of fourteen samples). Yet there is little overlap across the other viral genera identified and by only examining malignant effusions, our study did not incorporate appropriate controls to determine whether these viruses contribute to the pathogenesis of effusions and PEL.
For KSHV, the presence of two distinct KSHV genomes within the effusion sample of FNL001 was confirmed by sequence-specific PCR and provides additional evidence for a dual KSHV infection. Previous reports, which found evidence for dual KSHV infection by comparing LANA and terminal repeat size polymorphisms (Gao et al. 1999; Judde 2000), could not be validated within this study as the experimental and computational approaches could not resolve the complexity of the ORF73 and terminal repeat regions. The identification of both KSHV genomes not only in the effusion sample but also in saliva and PBMCs from the same individual indicated that these sequences were present systemically and not only in isolated compartments. However, our data were not sufficient to discern whether the dual infection was the result of a simultaneous co-infection or subsequent infections by two different KSHV strains, nor whether both KSHV genomes were present in the same cell. In samples with lower mean coverage, the presence of such dual infections may not have been captured by the sequencing approach utilized within this study. As such, it remains to be determined how frequently such infections can be detected using NGS.
The KSHV whole-genome analysis showed that, while sequences clustered primarily into groups based on the K15 P or M subtype, the splits network did not support the classification into five particular clusters as proposed recently by Sallah et al. (2018). The dataset included within their study centered on the K15 variability commonly observed in African strains and differs considerably from ours, in which K15 P subtypes were exclusively observed. Thus, it is very likely that, due to bias in available KSHV genomes and the significant differences in computational and analytical approaches, the hierarchical clustering presented by Sallah et al. was not observed within our sample set. BootScan and RDP4 analyses of isolated genomes against previously published KSHV sequences showed strong statistical evidence that the genomes contain several recombinant segments indicative of multiple crossover events. Homologous recombination has been shown to be an evolutionary driving force for other herpesviruses such as herpes simplex virus type 1, in which recombination between co-infected genomes has been observed in vitro and in vivo (Szpara et al. 2014; Lee et al. 2015). Several other studies have also identified complex recombination patterns across human gammaherpesvirus genomes (Olp et al. 2015; Palser et al. 2015; Sallah et al. 2018; Zanella et al. 2019), yet the specific mechanisms and frequency of such recombination events in KSHV remain to be determined. A recent study on EBV concluded that the majority of recombination events shared by isolates were introduced by a node ancestor (Zanella et al. 2019), indicating that a great number of these events may not represent recent occurrences of recombination.
The fragmentation patterns observed across sequences with similar K1 subtypes indicated similar phylogenomic relations between these genomes. These particular recombination patterns suggested that distinct patterns in genome variability may result in phylogenetic clustering of genomes based on their geographical origin, as seen with K1, and may allow a more consequential determination of subtypes. These observations added further support to previous studies which found that the clustering of sequences into different groups or clades depends not only on the information contained within the K1 and K15 regions but also on the low-level variation identified across the central region of the genome (Olp et al. 2015; Sallah et al. 2018). Yet, as KSHV sequences included in this study could not be considered to be non-recombinant parental strains, the number of recombination events may be an overestimate due to all sequences potentially being recombinants. Additionally, the decrease in genetic information caused by the omission of repeat regions and low coverage areas may have biased the studies of genetic diversity and recombination patterns.
As described above, low to moderate sequence variability was identified across the genome when the nine sequences were compared with GK18. The highest region of variability across all samples correlated to the K1 gene as seen previously in sequences from Zambia and Uganda (Olp et al. 2015; Sallah et al. 2018). The differences in the protein sequence observed in K1 reflect the phylogenetic classification of sequences into the K1 A, B, and A5 subtype. The increased variability in K15 was not evident within the sample set included in this study, as all genomes as well as the reference GK18 belonged to K15 P subtype. In addition, several coding regions within individual genomes showed considerable changes in the predicted protein sequences, yet there was little overlap with previously reported changes (Olp et al. 2015; Awazawa et al. 2017). Of the non-synonymous mutations identified in KSHV strains from the Miyako Islands, only two SNVs in ORF24 and ORF26 were identical to changes identified within the genomes examined in this study. In sample FNL004_US, multiple single nucleotide changes and a 2-bp deletion resulted in a truncation of the predicted ORF8 protein sequence. The glycoprotein B encoded by ORF8 was found to be important for virus entry through endocytosis (Akula et al. 2003). FNL014_CM contained a 408 bp deletion in ORF4 which was confirmed by PCR. ORF4 is known to act as a regulator of complement activation and is expressed in three alternatively spliced messenger RNAs (Spiller et al. 2003). The previously described five-prime splice sites fall within the observed deletion, yet the effect of the possible splicing aberrations could not be determined in silico. Further, FNL014_CM contains nine amino acid substitutions and a single nucleotide insertion in K8 that lead to a frameshift mutation and introduction of premature stop codons. These changes were also observed in sequences from Zambia (data not shown) and imply possible functional properties of African-specific K8 sequences. K8 has been shown to be an RNA binding protein implicated in the regulation of gene expression and DNA replication of host and viral RNA targets (Lin et al. 2003; Liu et al. 2018). Overall, in silico predictions of the functional effects of these protein variants have found them all to be deleterious (PROVEAN (Choi et al. 2012); data not shown), yet their particular effects, such as functional changes or deficiencies of the protein products, as well as their possible contribution to the pathogenicity and development of the KSHV-related malignancies remain to be determined by in vitro and in vivo experiments. It should be noted that we investigated KSHV whole-genome sequences isolated from KSHV-related effusion samples only. As such, we cannot rule out that the variations observed within the sequences may have been acquired in the course of cell transformation and do not represent the circulating cell population. A comparison of these sequences to KSHV present in blood or saliva was not feasible with the methodology used in this study, but such comparisons will be presented elsewhere.
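Why an insertion or deletion whose length is not a multiple of three can truncate a predicted protein, as reported above for the 2-bp deletion in ORF8 and the single nucleotide insertion in K8, can be illustrated with a short sketch. The sequence below is an invented toy example rather than KSHV sequence, the standard genetic code is assumed, and `translate` is a hypothetical helper written for this illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy open reading frame: ATG GCT AGC AAA GGT TAA
reference = "ATGGCTAGCAAAGGTTAA"
# Remove two bases (GC) after the start codon; the downstream frame shifts
# and an out-of-frame TAG becomes an in-frame stop codon.
mutant = reference[:3] + reference[5:]

reference_protein = translate(reference)  # full-length toy peptide
mutant_protein = translate(mutant)        # truncated after the first residue
```

In this toy case the shifted frame meets a stop codon immediately; in a real gene the position of the first in-frame stop after the shift determines how much altered sequence precedes the truncation.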
For NGS data to be informative across different studies, more standardized approaches to data analysis are required. In the absence of such standardization, raw data should be made publicly available whenever possible so that future studies can analyze more comprehensive datasets.
Several of our key findings are similar to previous reports despite differences in experimental and methodological approaches: (a) KSHV genomes were directly isolated from effusions without the need for ex vivo expansion of KSHV-transformed cells; (b) low-level genetic diversity was detected across all nine KSHV genomes, with deletions and insertions disrupting the coding regions of a number of lytic genes in distinctive genomes. However, we also report several novel findings, including the first KSHV dual infection within an individual identified through NGS as well as the characterization of the virome contained within effusion samples. Further studies involving a greater number of KSHV genomes collected from symptomatic and asymptomatic individuals should enable the examination of the full extent of recombination and define the significance of possible genotype−phenotype correlations as recently identified for EBV (Feng et al. 2015; Wu et al. 2018).
Data availability
Partial KSHV genomes first described in this study are available in GenBank under the following accession numbers: MN419219 (FNL002_US), MN419220 (FNL003_US), MN419221 (FNL004_US), MN419222 (FNL005_US), MN419223 (FNL006_US), MN419224 (FNL007_US), MN419225 (FNL009_US), MN419226 (FNL011_US), MN419227 (FNL014_CM).
Supplementary Material
Acknowledgments
The authors thank the patients and nursing staff at the NIH Clinical Center, Bethesda (MD, USA), especially Karen Aleman and Kathleen Wyvill. We also thank Senior Illustrator Joseph Meyer and Senior Graphic Designer Allen Kane, Scientific Publications Graphics and Media, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick (MD, USA) for help with the figures. Special thanks to Dr Eneida Hatcher and Dr David Ott for help with the GenBank submissions and valuable comments.
Funding
This work was supported in whole or in part with federal funds from the Frederick National Laboratory for Cancer Research, under contract number HHSN261200800001E and National Cancer Institute contract 75N91019D00024, and in part by the Intramural Research Program of the National Institutes of Health, National Cancer Institute. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
Conflict of interest: None declared.
Contributor Information
Elena M Cornejo Castro, Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, P.O. Box B, Frederick, MD 21702, USA.
Vickie Marshall, Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, P.O. Box B, Frederick, MD 21702, USA.
Justin Lack, Advanced Biomedical Computing Center, Leidos Biomedical Research, Inc., Frederick, MD 21702, USA.
Kathryn Lurain, HIV and AIDS Malignancy Branch, National Cancer Institute, 10 Center Dr, Bethesda, MD 20814, USA.
Taina Immonen, Retroviral Evolution Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, P.O. Box B, Frederick, MD 21702, USA.
Nazzarena Labo, Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, P.O. Box B, Frederick, MD 21702, USA.
Nicholas C Fisher, Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, P.O. Box B, Frederick, MD 21702, USA.
Ramya Ramaswami, HIV and AIDS Malignancy Branch, National Cancer Institute, 10 Center Dr, Bethesda, MD 20814, USA.
Mark N Polizzotto, HIV and AIDS Malignancy Branch, National Cancer Institute, 10 Center Dr, Bethesda, MD 20814, USA.
Brandon F Keele, Retroviral Evolution Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, P.O. Box B, Frederick, MD 21702, USA.
Robert Yarchoan, HIV and AIDS Malignancy Branch, National Cancer Institute, 10 Center Dr, Bethesda, MD 20814, USA.
Thomas S Uldrick, HIV and AIDS Malignancy Branch, National Cancer Institute, 10 Center Dr, Bethesda, MD 20814, USA.
Denise Whitby, Viral Oncology Section, AIDS and Cancer Virus Program, Frederick National Laboratory for Cancer Research, P.O. Box B, Frederick, MD 21702, USA.
References
- Akula S. M. et al. (2003) ‘Kaposi's Sarcoma-Associated Herpesvirus (Human Herpesvirus 8) Infection of Human Fibroblast Cells Occurs through Endocytosis’, Journal of Virology, 77: 7978–90.
- Altschul S. F. et al. (1990) ‘Basic Local Alignment Search Tool’, Journal of Molecular Biology, 215: 403–10.
- Arias C. et al. (2014) ‘KSHV 2.0: A Comprehensive Annotation of the Kaposi's Sarcoma-Associated Herpesvirus Genome Using Next-Generation Sequencing Reveals Novel Genomic and Functional Features’, PLoS Pathogens, 10: e1003847.
- Awazawa R. et al. (2017) ‘High Prevalence of Distinct Human Herpesvirus 8 Contributes to the High Incidence of Non-Acquired Immune Deficiency Syndrome-Associated Kaposi's Sarcoma in Isolated Japanese Islands’, The Journal of Infectious Diseases, 216: 850–8.
- Biggar R. J. et al. (2000) ‘Human Herpesvirus 8 in Brazilian Amerindians: A Hyperendemic Population with a New Subtype’, The Journal of Infectious Diseases, 181: 1562–8.
- Biggar R. J. et al. ; For the HIV/AIDS Cancer Match Study (2007) ‘AIDS-Related Cancer and Severity of Immunosuppression in Persons with AIDS’, Journal of the National Cancer Institute, 99: 962–72.
- Bolger A. M., Lohse M., Usadel B. (2014) ‘Trimmomatic: A Flexible Trimmer for Illumina Sequence Data’, Bioinformatics, 30: 2114–20.
- Boulanger E. et al. (2005) ‘Prognostic Factors and Outcome of Human Herpesvirus 8-Associated Primary Effusion Lymphoma in Patients with AIDS’, Journal of Clinical Oncology, 23: 4372–80.
- Brown E. E. et al. (2006) ‘Virologic, Hematologic, and Immunologic Risk Factors for Classic Kaposi Sarcoma’, Cancer, 107: 2282–90.
- Bruen T. C., Philippe H., Bryant D. (2006) ‘A Simple and Robust Statistical Test for Detecting the Presence of Recombination’, Genetics, 172: 2665–81.
- Cesarman E., Knowles D. M. (1999) ‘The Role of Kaposi's Sarcoma-Associated Herpesvirus (KSHV/HHV-8) in Lymphoproliferative Diseases’, Seminars in Cancer Biology, 9: 165–74.
- Cesarman E. et al. (1995) ‘Kaposi's Sarcoma-Associated Herpesvirus-like DNA Sequences in AIDS-Related Body-Cavity-Based Lymphomas’, New England Journal of Medicine, 332: 1186–91.
- Chang Y. et al. (1994) ‘Identification of Herpesvirus-like DNA Sequences in AIDS-Associated Kaposi's Sarcoma’, Science, 266: 1865–9.
- Choi Y. et al. (2012) ‘Predicting the Functional Effect of Amino Acid Substitutions and Indels’, PLoS One, 7: e46688.
- Cook P. M. et al. (1999) ‘Variability and Evolution of Kaposi's Sarcoma-Associated Herpesvirus in Europe and Africa. International Collaborative Group’, AIDS, 13: 1165–76.
- Dedicoat M., Newton R. (2003) ‘Review of the Distribution of Kaposi's Sarcoma-Associated Herpesvirus (KSHV) in Africa in Relation to the Incidence of Kaposi's Sarcoma’, British Journal of Cancer, 88: 1–3.
- Engels E. A. et al. (2011) ‘Spectrum of Cancer Risk among US Solid Organ Transplant Recipients’, JAMA, 306: 1891–901.
- Feng F. T. et al. (2015) ‘A Single Nucleotide Polymorphism in the Epstein-Barr Virus Genome is Strongly Associated With a High Risk of Nasopharyngeal Carcinoma’, Chinese Journal of Cancer, 34: 563–72.
- Gallian P. et al. (1999) ‘TT Virus Infection in French Hemodialysis Patients: Study of Prevalence and Risk Factors’, Journal of Clinical Microbiology, 37: 2538–42.
- Galmes J. et al. (2013) ‘Potential Implication of New Torque Teno Mini Viruses in Parapneumonic Empyema in Children’, European Respiratory Journal, 42: 470–9.
- Gao S. J. et al. (1999) ‘Molecular Polymorphism of Kaposi's Sarcoma-Associated Herpesvirus (Human Herpesvirus 8) Latent Nuclear Antigen: Evidence for a Large Repertoire of Viral Genotypes and Dual Infection with Different Viral Genotypes’, The Journal of Infectious Diseases, 180: 1466–76.
- Glenn M. et al. (1999) ‘Identification of a Spliced Gene from Kaposi's Sarcoma-Associated Herpesvirus Encoding a Protein with Similarities to Latent Membrane Proteins 1 and 2A of Epstein-Barr Virus’, Journal of Virology, 73: 6953–63.
- Goncalves P. H. et al. (2017) ‘Kaposi Sarcoma Herpesvirus-Associated Cancers and Related Diseases’, Current Opinion in HIV and AIDS, 12: 47–56.
- Hayward G. S., Zong J. C. (2007) ‘Modern Evolutionary History of the Human KSHV Genome’, Current Topics in Microbiology and Immunology, 312: 1–42.
- Hino S., Miyata H. (2007) ‘Torque Teno Virus (TTV): Current Status’, Reviews in Medical Virology, 17: 45–57.
- Huson D. H., Bryant D. (2006) ‘Application of Phylogenetic Networks in Evolutionary Studies’, Molecular Biology and Evolution, 23: 254–67.
- Iriyama M. et al. (1999) ‘The Prevalence of TT Virus (TTV) Infection and Its Relationship to Hepatitis in Children’, Medical Microbiology and Immunology, 188: 83–9.
- Johnson M. et al. (2008) ‘NCBI BLAST: A Better Web Interface’, Nucleic Acids Research, 36(Web Server): W5–9.
- Judde J. G. (2000) ‘Monoclonality or Oligoclonality of Human Herpesvirus 8 Terminal Repeat Sequences in Kaposi's Sarcoma and Other Diseases’, Journal of the National Cancer Institute, 92: 729–36.
- Kakoola D. N. et al. (2001) ‘Recombination in Human Herpesvirus-8 Strains from Uganda and Evolution of the K15 Gene’, Journal of General Virology, 82: 2393–404.
- Katoh K., Standley D. M. (2013) ‘MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability’, Molecular Biology and Evolution, 30: 772–80.
- Keele B. F. et al. (2008) ‘Identification and Characterization of Transmitted and Early Founder Virus Envelopes in Primary HIV-1 Infection’, Proceedings of the National Academy of Sciences of the United States of America, 105: 7552–7.
- Krown S. E. et al. (2008) ‘More on HIV-Associated Kaposi's Sarcoma’, New England Journal of Medicine, 358: 535–6; author reply 36.
- Lee K. et al. (2015) ‘Recombination Analysis of Herpes Simplex Virus 1 Reveals a Bias toward GC Content and the Inverted Repeat Regions’, Journal of Virology, 89: 7214–23.
- Li H. (2011) ‘A Statistical Framework for SNP Calling, Mutation Discovery, Association Mapping and Population Genetical Parameter Estimation from Sequencing Data’, Bioinformatics, 27: 2987–93.
- Li H. (2013) ‘Aligning Sequence Reads, Clone Sequences and Assembly Contigs With BWA-MEM’, <https://arxiv.org/abs/1303.3997> accessed 2 Sept 2017.
- Li H. et al. ; 1000 Genome Project Data Processing Subgroup (2009) ‘The Sequence Alignment/Map Format and SAMtools’, Bioinformatics, 25: 2078–9.
- Lin C. L. et al. (2003) ‘Kaposi's Sarcoma-Associated Herpesvirus Lytic Origin (ori-Lyt)-Dependent DNA Replication: Identification of the ori-Lyt and Association of K8 bZip Protein With the Origin’, Journal of Virology, 77: 5578–88.
- Liu D., Wang Y., Yuan Y. (2018) ‘Kaposi's Sarcoma-Associated Herpesvirus K8 is an RNA Binding Protein That Regulates Viral DNA Replication in Coordination With a Noncoding RNA’, Journal of Virology, 92:
- Lole K. S. et al. (1999) ‘Full-Length Human Immunodeficiency Virus Type 1 Genomes from Subtype C-Infected Seroconverters in India, With Evidence of Intersubtype Recombination’, Journal of Virology, 73: 152–60.
- Lurain K. et al. (2019) ‘Viral, Immunologic, and Clinical Features of Primary Effusion Lymphoma’, Blood, 133: 1753–61.
- Mancuso R. et al. (2008) ‘HHV8 a Subtype is Associated with Rapidly Evolving Classic Kaposi's Sarcoma’, Journal of Medical Virology, 80: 2153–60.
- Marshall V. et al. (2010) ‘Kaposi Sarcoma (KS)-Associated Herpesvirus microRNA Sequence Analysis and KS Risk in a European AIDS-KS Case Control Study’, The Journal of Infectious Diseases, 202: 1126–35.
- Marshall V. et al. (2007) ‘Conservation of Virally Encoded microRNAs in Kaposi Sarcoma–Associated Herpesvirus in Primary Effusion Lymphoma Cell Lines and in Patients with Kaposi Sarcoma or Multicentric Castleman Disease’, The Journal of Infectious Diseases, 195: 645–59.
- Martin D. P. et al. (2015) ‘RDP4: Detection and Analysis of Recombination Patterns in Virus Genomes’, Virus Evolution, 1: vev003.
- Mbulaiteye S. M. et al. (2006) ‘High Levels of Epstein-Barr Virus DNA in Saliva and Peripheral Blood from Ugandan Mother-Child Pairs’, The Journal of Infectious Diseases, 193: 422–6.
- McKenna A. et al. (2010) ‘The Genome Analysis Toolkit: A MapReduce Framework for Analyzing Next-Generation DNA Sequencing Data’, Genome Research, 20: 1297–303.
- Mirabello L. et al. (2017) ‘HPV16 E7 Genetic Conservation is Critical to Carcinogenesis’, Cell, 170: 1164–74.e6.
- Nador R. G. et al. (1996) ‘Primary Effusion Lymphoma: A Distinct Clinicopathologic Entity Associated with the Kaposi's Sarcoma-Associated Herpes Virus’, Blood, 88: 645–56.
- Nakamura S. et al. (2009) ‘Direct Metagenomic Detection of Viral Pathogens in Nasal and Fecal Specimens Using an Unbiased High-Throughput Sequencing Approach’, PLoS One, 4: e4219.
- Nalwoga A. et al. (2018) ‘Relationship between Anemia, Malaria Coinfection, and Kaposi Sarcoma-Associated Herpesvirus Seropositivity in a Population-Based Study in Rural Uganda’, The Journal of Infectious Diseases, 218: 1061–5.
- Newton R. et al. ; The Uganda Kaposi's Sarcoma Study Group (2003) ‘Infection with Kaposi's Sarcoma-Associated Herpesvirus (KSHV) and Human Immunodeficiency Virus (HIV) in Relation to the Risk and Clinical Presentation of Kaposi's Sarcoma in Uganda’, British Journal of Cancer, 89: 502–4.
- Okamoto H. et al. (2001) ‘Heterogeneous Distribution of TT Virus of Distinct Genotypes in Multiple Tissues from Infected Humans’, Virology, 288: 358–68.
- Olp L. N. et al. (2015) ‘Whole-Genome Sequencing of Kaposi's Sarcoma-Associated Herpesvirus from Zambian Kaposi's Sarcoma Biopsy Specimens Reveals Unique Viral Diversity’, Journal of Virology, 89: 12299–308.
- Palser A. L. et al. (2015) ‘Genome Diversity of Epstein-Barr Virus from Multiple Tumor Types and Normal Infection’, Journal of Virology, 89: 5222–37.
- Park M. Y. et al. (2006) ‘Interactions Among Four Proteins Encoded by the Human Cytomegalovirus UL112-113 Region Regulate Their Intranuclear Targeting and the Recruitment of UL44 to Prereplication Foci’, Journal of Virology, 80: 2718–27.
- Polizzotto M. N. et al. (2016) ‘Clinical Features and Outcomes of Patients With Symptomatic Kaposi Sarcoma Herpesvirus (KSHV)-Associated Inflammation: Prospective Characterization of KSHV Inflammatory Cytokine Syndrome (KICS)’, Clinical Infectious Diseases, 62: 730–8.
- Pollack M. S., DuBois D. (1977) ‘Possible Effects of non-HLA Antibodies in Common Typing Sera on HLA Antigen Frequency Data in Leukemia’, Cancer, 39: 2348–54.
- Poole L. J. et al. (1999) ‘Comparison of Genetic Variability at Multiple Loci across the Genomes of the Major Subtypes of Kaposi's Sarcoma-Associated Herpesvirus Reveals Evidence for Recombination and for Two Distinct Types of Open Reading Frame K15 Alleles at the Right-Hand End’, Journal of Virology, 73: 6646–60.
- Powles T. et al. (2009) ‘The Role of Immune Suppression and HHV-8 in the Increasing Incidence of HIV-Associated Multicentric Castleman's Disease’, Annals of Oncology, 20: 775–9.
- Quinlan A. R. (2014) ‘BEDTools: The Swiss-Army Tool for Genome Feature Analysis’, Current Protocols in Bioinformatics, 47: 11.12.1–34.
- Quinlan A. R., Hall I. M. (2010) ‘BEDTools: A Flexible Suite of Utilities for Comparing Genomic Features’, Bioinformatics, 26: 841–2.
- Rader J. S. et al. (2019) ‘Genetic Variations in Human Papillomavirus and Cervical Cancer Outcomes’, International Journal of Cancer, 144: 2206–14.
- Ray A. et al. (2012) ‘Sequence Analysis of Kaposi Sarcoma-Associated Herpesvirus (KSHV) microRNAs in Patients with Multicentric Castleman Disease and KSHV-Associated Inflammatory Cytokine Syndrome’, The Journal of Infectious Diseases, 205: 1665–76.
- Reddy S., Manna P. (2005) ‘Quantitative Detection and Differentiation of Human Herpesvirus 6 Subtypes in Bone Marrow Transplant Patients by Using a Single Real-Time Polymerase Chain Reaction Assay’, Biology of Blood and Marrow Transplantation, 11: 530–41.
- Rezaee S. A. (2006) ‘Kaposi's Sarcoma-Associated Herpesvirus Immune Modulation: An Overview’, Journal of General Virology, 87(Pt 7): 1781–804.
- Sallah N. et al. (2018) ‘Genome-Wide Sequence Analysis of Kaposi Sarcoma-Associated Herpesvirus Shows Diversification Driven by Recombination’, The Journal of Infectious Diseases, 218: 1700–10.
- Soulier J. et al. (1995) ‘Kaposi's Sarcoma-Associated Herpesvirus-Like DNA Sequences in Multicentric Castleman's Disease’, Blood, 86: 1276–80.
- Spiller O. B. et al. (2003) ‘Complement Regulation by Kaposi's Sarcoma-Associated Herpesvirus ORF4 Protein’, Journal of Virology, 77: 592–9.
- Stolka K. et al. (2014) ‘Risk Factors for Kaposi's Sarcoma Among HIV-Positive Individuals in a Case Control Study in Cameroon’, Cancer Epidemiology, 38: 137–43.
- Svraka S. et al. (2010) ‘Metagenomic Sequencing for Virus Identification in a Public-Health Setting’, Journal of General Virology, 91: 2846–56.
- Szpara M. L. et al. (2014) ‘Evolution and Diversity in Human Herpes Simplex Virus Genomes’, Journal of Virology, 88: 1209–27.
- Tamburro K. M. et al. (2012) ‘Vironome of Kaposi Sarcoma Associated Herpesvirus-Inflammatory Cytokine Syndrome in an AIDS Patient Reveals co-Infection of Human Herpesvirus 8 and Human Herpesvirus 6A’, Virology, 433: 220–5.
- Uldrick T. S. et al. (2011) ‘High-Dose Zidovudine plus Valganciclovir for Kaposi Sarcoma Herpesvirus-Associated Multicentric Castleman Disease: A Pilot Study of Virus-Activated Cytotoxic Therapy’, Blood, 117: 6977–86.
- Van der Auwera G. A. et al. (2013) ‘From FastQ Data to High Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline’, Current Protocols in Bioinformatics, 43: 11.10.1–33.
- Wu S. et al. (2018) ‘Conservation and Polymorphism of EBV RPMS1 Gene in EBV-Associated Tumors and Healthy Individuals from Endemic and Non-Endemic Nasopharyngeal Carcinoma Areas in China’, Virus Research, 250: 75–80.
- Xu M. et al. (2019) ‘Genome Sequencing Analysis Identifies Epstein-Barr Virus Subtypes Associated With High Risk of Nasopharyngeal Carcinoma’, Nature Genetics, 51: 1131–6.
- Zanella L. et al. (2019) ‘A Reliable Epstein-Barr Virus Classification Based on Phylogenomic and Population Analyses’, Scientific Reports, 9: 9829.
- Zong J. et al. (2002) ‘Genotypic Analysis at Multiple Loci across Kaposi's Sarcoma Herpesvirus (KSHV) DNA Molecules: Clustering Patterns, Novel Variants and Chimerism’, Journal of Clinical Virology, 23: 119–48.
- Zong J. C. et al. (1999) ‘High-Level Variability in the ORF-K1 Membrane Protein Gene at the Left End of the Kaposi's Sarcoma-Associated Herpesvirus Genome Defines Four Major Virus Subtypes and Multiple Variants or Clades in Different Human Populations’, Journal of Virology, 73: 4156–70.
Data Availability Statement
Partial KSHV genomes first described in this study are available in GenBank under the following accession numbers: MN419219 (FNL002_US), MN419220 (FNL003_US), MN419221 (FNL004_US), MN419222 (FNL005_US), MN419223 (FNL006_US), MN419224 (FNL007_US), MN419225 (FNL009_US), MN419226 (FNL011_US), MN419227 (FNL014_CM).






