Abstract
While traditional methods for studying large DNA viruses allow the creation of individual mutants, CRISPR/Cas9 can be used to rapidly create thousands of mutant dsDNA viruses in parallel, enabling the pooled screening of entire viral genomes. Here, we applied this approach to Kaposi’s sarcoma-associated herpesvirus (KSHV) by designing a sgRNA library containing all possible ~22,000 guides targeting the 154 kilobase viral genome, corresponding to one cut site approximately every 8 base pairs. We used the library to profile viral sequences involved in transcriptional activation of late genes, whose regulation involves several well characterized features including dependence on viral DNA replication and a known set of viral transcriptional activators. Upon phenotyping all possible Cas9-targeted viruses for transcription of KSHV late genes we recovered these established regulators and identified a new required factor (ORF46), highlighting the utility of the screening pipeline. By performing targeted deep sequencing of the viral genome to distinguish between knock-out and in-frame alleles created by Cas9, we identify the DNA binding but not catalytic domain of ORF46 to be required for viral DNA replication and thus late gene expression. Our pooled Cas9 tiling screen followed by targeted deep viral sequencing represents a two-tiered screening paradigm that may be widely applicable to dsDNA viruses.
Author summary
While most cancers are caused by mutations to the cell’s DNA, others are instead induced by viruses known as oncogenic viruses or oncoviruses. These cancers provide unique challenges but also opportunities for treatment and prevention. Here, we study a human oncogenic virus called Kaposi’s sarcoma-associated herpesvirus. It, along with related oncoviruses, relies on a unique form of transcription to express essential viral genes, and we sought to further understand how this class of genes is regulated, potentially leading to therapeutic vulnerabilities. Here we use CRISPR/Cas9 to create and measure the ability of thousands of mutant viruses to express this key class of viral genes. This allowed us to identify a novel viral gene that is required, but this gene is known to have multiple functions. By combining this CRISPR/Cas9 approach with direct viral sequencing, we were able to identify the specific function that is required. While our results contribute specifically to the study of this unique step in the viral life cycle, our methods can be applied to many open questions in the study of these oncoviruses.
Introduction
Human oncogenic viruses are a major cause of cancer, with recent estimates that 15% of all cancers are associated with a viral infection [1]. Gammaherpesviruses are a class of double-stranded DNA viruses that include both the oncogenic Epstein-Barr virus (EBV)–a ubiquitous infection associated with a number of malignancies [2]–and Kaposi’s sarcoma-associated herpesvirus (KSHV), a major cause of cancer in AIDS and other immunocompromised patients [3,4]. While many of the viral factors involved in replication and transcription of the KSHV genome have been identified, numerous non-coding regions, transcripts, and ORFs have no ascribable function [5–8].
During the lytic stage of KSHV replication, the virus expresses its genes in kinetically regulated waves, culminating in the formation and release of infectious virions. These include immediate early genes that regulate reactivation of the virus, early genes that include components required for viral DNA replication, and finally late genes, which encode for factors such as capsid proteins and glycoproteins involved in virion formation. Immediate early and early genes are driven by host-like promoters, while late gene transcription in KSHV and related gamma- as well as betaherpesviruses has several unique features including truncated promoters and modified TATA boxes [9–15]. They also require at least six virally encoded transcriptional activators (vTA) [16,17] and, perhaps most notably, their transcription is dependent on viral DNA replication [16,18,19].
The genomes of herpesviruses are double-stranded DNA that replicate in the nucleus and are thus targetable by CRISPR/Cas9. Indeed, CRISPR-based targeting of herpesviruses has been explored as a way to suppress infection [20–25]. CRISPR has also been used to make functional insertions and knockouts on the viral genome [26–28] or to perform pooled functional screens [29,30] in much the same way as on the mammalian genome. The use of CRISPR to identify essential domains of proteins [31,32] has yet to be applied to dsDNA viruses, but has particular potential given that viral proteins are frequently multifunctional.
Here, we applied a CRISPR approach to probe for functional regulators of KSHV late gene expression by creating a library of sgRNAs that tiles the KSHV genome, corresponding to one sgRNA for every 8 base pairs. Using this tiling library, we systematically screened for regulators of viral late gene transcription. In addition to capturing the majority of known components of this system, we identify novel viral regulators of late gene expression along with higher resolution data highlighting essential domains of these proteins. In particular, we validate that KSHV ORF46 is required for viral DNA replication and thus also for late gene transcription. Targeted deep sequencing of the ORF46 locus provided additional mechanistic insight by identifying the mutations created by Cas9 and implicating an essential role for its DNA binding but not its catalytic domain, which we confirmed through reverse genetics experiments. Collectively, this pipeline provides a framework to create and phenotype thousands of viral mutants to gain high-resolution functional information on KSHV and other dsDNA viral genomes.
Results
BAC16 K8.1pr-mIFP2-infected iSLK cells enable fluorescent readout of late gene activity
We first developed a reporter line that could be used to measure expression from a late gene promoter. We infected the human renal carcinoma cell line iSLK [33,34] with a modified version of the BAC16 KSHV genome containing a far-red fluorescent protein (mIFP2) driven by the promoter of the viral late gene K8.1 as well as a constitutive EGFP (S1A Fig). The iSLK cell line contains a doxycycline-inducible copy of ORF50, the major lytic transactivator, allowing us to reactivate KSHV upon treatment with doxycycline and sodium butyrate. As expected, expression of the mIFP2 late gene reporter peaked at late times post reactivation and was sensitive to inhibitors of viral DNA replication (Fig 1A, 1B, and 1C). We did note some reactivation-dependent autofluorescence in this channel from wildtype BAC16 (Fig 1A, 1B, and 1C), which may limit the sensitivity of the assays.
Separately, we confirmed our ability to create functional mutants in KSHV using CRISPR. We lentivirally delivered Cas9 along with sgRNAs targeting various regions of the KSHV genome into a BAC16-infected iSLK line and used a supernatant transfer assay to evaluate their impact on virion production (S1B Fig). Although the guides targeting the essential viral loci elicited the strongest reduction in infectious virion production, sgRNAs targeting some nonessential regions of the viral genome did appear to have a functional effect above background, perhaps through DNA damage on the viral genome (Figs 1D and S1C).
Pooled tiling screen of KSHV genome for modifiers of late gene expression
Cas9 was stably introduced into the iSLK line infected with the late gene reporter virus to screen an sgRNA library containing all possible ~22,000 sgRNAs targeting the viral genome (S1 Data). After lentiviral delivery of the library, cells were chemically induced to enter the lytic cycle and sorted by FACS based on their reporter expression level (S1D Fig). The composition of the high and low late gene expression fractions was quantified by next generation sequencing of the sgRNA locus (Fig 2A and S2 Data), where sgRNAs targeting viral ORFs essential for the expression of late genes were enriched in the late gene low fraction. Enrichment of each guide was reproducible between two replicates, representing a phenotype every ~8 bp across the KSHV genome (S2A and S2B Fig).
By comparing the distribution of sgRNA phenotypes targeting each KSHV ORF to the background of negative control sgRNAs, we reproducibly identified many ORFs required for late gene expression (Figs 2B, 2C, and S2C and S3 Data and S4 Data). These include all six known components of the viral DNA replication machinery (ORF6, ORF9, ORF40/41, ORF44, ORF56, and ORF59) along with 4 of the 6 vTA components (ORF18, ORF24, ORF34, and ORF66). In addition, we identified ORF57, which is known to be required for the efficient expression of ORF6 and other DNA replication components [35]. A viral tegument protein, ORF45 [36], showed mixed results in validation and was not pursued for further characterization (Fig 2D and 2E). Guides targeting ORF50, the viral reactivation factor, were unexpectedly enriched in late gene expression, but this phenotype was not reproduced in validation experiments, which showed a strong defect in reactivation as measured by expression of the HaloTag-fused ORF68 early gene and a moderate defect in late gene expression (S2D and S2E Fig). ORF46, a DNA repair enzyme [37] not previously associated with late gene transcription, also showed severe defects in late gene expression which we subsequently validated with individual guides (Fig 2D and 2E). Together, this demonstrates the identification of both known and novel viral modifiers of late gene expression. This includes both direct activators of late gene transcription (e.g., the vTAs) as well as potentially indirect regulators (e.g., DNA replication machinery).
Deep viral sequencing identifies ORF46 requirement for viral DNA replication
To further investigate the role of ORF46 in late gene expression, we deep sequenced several viral loci to identify and phenotype mutations created by Cas9 nuclease activity (Fig 3A). This approach enables us to distinguish between Cas9-created mutations that result in a knockout versus an amino acid change and to screen for functional defects associated with a given mutation using phenotypes such as viral DNA replication and virion production (S3A Fig). We amplified and sequenced ORF46 – along with one negative control locus, ORF21, and two components of the vTA complex, ORF18 and ORF34 – from cells in three different viral states. These included (1) latently infected cells (where mutations in these genes should have little effect), (2) cells 48 hours post reactivation, at which time the viral genome is being replicated, and (3) from supernatant 72 hours post reactivation when virions are released into the media (Fig 3A and S5 Data and S1 Table).
By comparing the frequencies of out-of-frame mutations between these conditions, we can directly measure the effect of each gene knockout on both viral production and viral DNA replication. We observed a depletion in ORF46 knockout alleles in cells undergoing viral DNA replication as well as a depletion of knockout alleles from viral DNA in the supernatant (Figs 3B and S3B). In contrast, knockout alleles of vTA components ORF34 and ORF18 were only depleted in supernatant samples, consistent with their specific requirement after viral DNA replication (Figs 3B and S3B). Knockout mutations in a nonessential control, ORF21, are mildly depleted in both samples (Figs 3B and S3B); this is consistent with the previously observed effect of Cas9 activity on virion production (Fig 1D). The relative depletion of ORF46 knockouts from replicating viral DNA suggests that ORF46 acts prior to late gene expression and is required for viral DNA replication.
To independently test the requirement of ORF46 for viral DNA replication, we delivered the panel of sgRNAs from Fig 2D into an independent KSHV-infected Cas9+ iSLK line; the sgRNA panel included targets for both ORF57 and ORF9, both of which are required for viral DNA replication, as well as for ORF46 and ORF45. None of the genes targeted, including ORF46, was required for early gene expression, as measured by expression of the HaloTag-fused ORF68 early gene (Fig 3C). Guides targeting ORF46 interfered with incorporation of EdU, suggesting a viral DNA replication defect (Fig 3D), which was confirmed for the ORF46 sgRNAs by qPCR of viral DNA (Fig 3E). Guides targeting ORF45 showed inconclusive effects. Together, these data support a specific requirement of ORF46 for viral DNA replication, which is bolstered by data with the ORF46 homolog in EBV [38]. Notably, ORF46 disruption phenocopied disruption of the viral DNA polymerase ORF9, demonstrating the key role ORF46 plays in viral DNA replication.
ORF46 residues involved in DNA binding but not catalytic activity are required for viral DNA replication
Further analysis of the viral deep-sequencing data allowed us to identify in-frame mutations that do not cause knockouts but instead result in deletions or insertions at the amino acid level in each locus observed. By grouping these mutations by where they target along the gene body, we further refined what regions of these proteins are essential for their role in the viral life cycle, analogous to previous work on the human genome [31]. For ORF18 and ORF34, which are direct late gene transcriptional activators, we examined the relative depletion of in-frame alleles between cells undergoing viral DNA replication and cells releasing virions into the supernatant (Fig 4A and 4B). For ORF46, which likely indirectly activates late genes through its role in DNA replication, we examined the relative depletion of in-frame alleles between cells containing latent virus and those undergoing viral DNA replication (Fig 4C).
Indeed, depletion of in-frame alleles in ORF18 corresponded with previously identified residues required for interaction with other vTA complex members ORF30, ORF31, and ORF66 (Fig 4A and S6 Data) [39], and several regions observed in ORF34 correspond to previously phenotyped truncation mutants with defects in binding the vTAs ORF24 or ORF66 (Fig 4B) [40]. In ORF46, we noted that these in-frame mutations near the C-terminus were depleted from replicating DNA samples relative to the rest of the coding region (Fig 4C). This suggests that the DNA-binding domain encoded in the C-terminus is required for viral DNA replication but that the catalytic domain is not. We also observed signal both upstream and downstream of the coding region of ORF46, which may correspond to disruption of the adjacent ORF45 or perturbation to regulatory elements of ORF46.
To test the hypothesis that the role of ORF46 is dependent on its DNA binding activity, we individually mutated residues previously shown for EBV to be required for DNA binding (H210L) or catalysis (Q87L, D88N) in ORF46 using BAC mutagenesis [38]. Reactivation of these mutant viruses confirmed that the DNA binding domain but not the catalytic domain of ORF46 is required for replication of viral DNA and the production of virus (Figs 4D and S3C). Importantly, we also observed that the mutants still express early genes, supporting a defect specific to vDNA replication (S3D Fig). Together these data orthogonally confirm our prediction that ORF46 and its DNA binding domain are required for viral DNA replication.
Discussion
Late gene expression in oncogenic gammaherpesviruses is a mechanistically unique, essential and potentially targetable viral process. Here we applied an exhaustive tiling approach to KSHV to identify viral proteins required for expression of viral late genes. We identified most of the known late gene regulators and confirmed that one additional protein, ORF46, is required for this process. Targeted deep sequencing of the viral mutants created by Cas9 suggested that the DNA binding domain of ORF46 is required for viral DNA replication, a prediction we validated through viral mutagenesis. This two-level screening approach is broadly applicable to connecting phenotype to genotype for a variety of processes involved in the KSHV lifecycle–as well as more broadly for other dsDNA viruses.
Compact viral genomes present both unique challenges and opportunities for genomics-based approaches to study viral phenotypes. On one hand, the high functional density in a relatively small genome allows approaches such as ours to perturb every element. For example, here we tile all ~80 ORFs and their cis regions with fewer guides than previous studies have used to tile just three human genes [41]. On the other hand, viral loci often encode multifunctional proteins and can also encompass overlapping ORFs and noncoding regulatory sequences, which means that some mutations may impact multiple viral features [5]. This is true for our study as well, as the ORF46 DNA binding domain also overlaps with the promoter and TSS of the adjacent gene ORF45. Thus, although the CRISPR nuclease is less likely to perturb non-coding elements [41], some of the signal we see may be from effects on ORF45 transcription (Fig 4C). For high-throughput experiments, screening for additional phenotypes may be required to separate the multiple functions encoded by the virus.
ORF46 encodes for a uracil-DNA N-glycosylase (UNG)–a DNA repair enzyme responsible for the removal of uracil from DNA [37]–and UNG homologs in EBV, human cytomegalovirus (HCMV), and vaccinia virus are all required for viral DNA replication [38,42–44]. Notably, the DNA binding domain but not UNG activity is required for DNA replication in both EBV [38] and KSHV, suggesting that UNGs may play conserved structural rather than enzymatic roles in genome replication for multiple dsDNA viruses.
Beyond ORF46, our initial screen highlighted additional viral genes with more modest effects than the canonical late gene regulators (S4 Data). While their roles in late gene expression remain unvalidated, they may represent a broader set of factors that either play a minor role or produce a signal that is technically limited. Technical limitations such as poor sgRNA performance or a small number of sgRNAs may also explain the weak signal from smaller genes such as ORF30 and ORF31, two additional components of the vTA complex. An additional limitation is that fewer than 25% of cells detectably express the late gene reporter (Fig 1A); while this percentage is consistent with previous reports [45,46], cell models allowing for more efficient late gene expression or detection should further boost the dynamic range and sensitivity. While individual sgRNAs targeting the major transactivator ORF50 demonstrated the expected early-stage block to lytic cycle progression (Figs 1D, S2D, and S2E), it was surprising that they were positively enriched in the late gene screen (Fig 2B and 2C). This may represent an artifact of the screen design, as they target both the ORF50 on the viral genome and the dox-inducible ORF50 integrated in the iSLK genome. It is also possible that ORF50 has bifunctional roles in promoting reactivation but suppressing late gene expression prior to DNA replication, although additional work would be needed to evaluate this hypothesis. Additionally, positive signals were detected for KSHV genes required for latency, such as ORF73 (LANA) and the GFP/HygroR cassette (Fig 2B and 2C), but their relevance to late gene expression is unclear given that the screen was performed under strong selection for latency maintenance. We have previously observed that guides targeting ORF75 caused a similar defect in latency maintenance.
The tiling library is highly modular, as it can be used for example with base editors to create base pair mutations, CRISPRi to induce transcriptional repression or dCas9 to block transcription factor binding, as well as in combination with other reporters of viral activity. Overall, the tiling approach is a powerful method for functional interrogation of KSHV and other DNA viruses with the added potential to highlight essential domains and unlock a wealth of information about viral biology.
Methods
Plasmids and oligos
Sequences of primers and sgRNAs used are listed in S7 Data. pMD2.G (Addgene plasmid # 12259), pMDLg/pRRE (Addgene plasmid # 12251) and pRSV-Rev (Addgene plasmid # 12253) were gifts from Didier Trono. pMCB320 was a gift from Michael Bassik (Addgene plasmid # 89359). lentiCas9-Blast was a gift from Feng Zhang (Addgene plasmid # 52962).
Cell culture
HEK293T and iSLK cells were grown in DMEM (Gibco, + glut, + glucose, -pyruvate) with 10% FBS (Peak Serum), pen-strep (Gibco; 10,000 I.U./mL), and additional 2 mM glutamine (Gibco) or 1X Glutamax (Gibco). Cells were maintained at 37°C and 5% CO2 in a humidity-controlled incubator. iSLK cells were maintained in 1 μg/mL puromycin and 50 μg/mL G418, with infected iSLK cells additionally maintained in 125–200 μg/mL Hygromycin. Cas9+ cells were maintained in 10 μg/mL blasticidin. 0.05% Trypsin (Gibco) was used to passage cells.
Creation and characterization of late gene reporter cell line
A late gene reporter was designed using 100 bp upstream of the K8.1 ORF driving a mIFP2 fluorescent cassette. 150 bp of homology to the BAC region of BAC16 was included along with a KAN resistance cassette to allow the insertion into BAC16 genome [33]. The viral genome was then purified using the Macherey-Nagel NucleoBond BAC 100 kit and transfected into HEK293T cells using PolyJet (SignaGen). Transfected cells were then cocultured with uninfected iSLK cells and reactivated using 0.33 mM sodium butyrate and 20 ng/mL PMA. Infected iSLK cells were then selected using 1 μg/mL puromycin, 50 μg/mL G418, and 200 μg/mL hygromycin.
To test the kinetics of late-gene reporter expression, iSLK-BAC16-K8.1pr-mIFP2 cells were reactivated using 1 mM sodium butyrate and 5 μg/mL doxycycline, and timepoints were collected every 24 hours. Cells were then fixed in 4% PFA and quantified using flow cytometry (BD LSRFortessa). To test the sensitivity of the late gene reporter-infected lines, cells were reactivated as above and additionally treated with viral polymerase inhibitors 500 μM PAA or 100 μM CDV; late-gene reporter activity of fixed cells was quantified by flow cytometry 72 hours post reactivation (BD LSRFortessa). To test the production of virus in these cells, supernatant from reactivated samples was 0.45 μm filtered and added to uninfected HEK293T cells. After 24 hours, HEK293T cells were fixed, and GFP was quantified by flow cytometry (BD Accuri C6 Plus). Replicates were independent reactivations on separate days.
Creation and characterization of Cas9 positive line
To create a Cas9 positive iSLK-BAC16-K8.1pr-mIFP2 line, Cas9-blast (Addgene #52962) was inserted lentivirally: Cas9-blastR and 3rd generation lentiviral components were transfected into HEK293T cells using PEI (Polysciences). Supernatant was harvested, 0.45 μm filtered, and added to the iSLK-BAC16-K8.1pr-mIFP2 line for three days. Cells were then selected using 1 μg/mL blasticidin, 1 μg/mL puromycin, 200 μg/mL hygromycin, and 50 μg/mL G418.
To demonstrate the editing activity, three independent sgRNAs were designed targeting two noncoding regions of the BAC and four viral genes. sgRNAs were designed using IDT Custom Alt-R designer. As controls, three safe-targeting sgRNAs were used that target the host in predicted non-functional regions [47]. sgRNAs were cloned into a mU6-driven guide expression plasmid (Addgene #89359) and delivered lentivirally at high MOI. After 10 days, cells were reactivated by 5 μg/mL doxycycline and 1 mM sodium butyrate, and viral supernatant was collected and 0.45 μm filtered after 72 hours. Supernatant was transferred to uninfected HEK293T cells, and viral infectivity was measured using GFP by flow cytometry (BD Accuri C6 Plus). Replicates were independent reactivations on separate days.
Design of KSHV tiling library
A KSHV tiling library was designed using custom Python scripts. Each NGG PAM was identified in the reference genome [33] (S1 and S3 Data) along with the corresponding 19 bp guide. Only one copy of sgRNAs that targeted the KSHV genome more than once was included. The library was cloned using a modified protocol from Deans et al. 2016 [48]. sgRNAs were synthesized along with primer binding sites and BstXI/BlpI restriction sites using Twist Biosciences. The library was amplified using corresponding primers for 10 cycles. The amplified library was PCR purified (Qiagen MinElute) and digested using BstXI and BlpI. The 34 bp fragment was purified in water using a native 20% PAGE gel and Spin-X centrifuge tube filters (Costar). The fragment was isopropanol precipitated and ligated into an sgRNA expression plasmid (Addgene #89359) using T4 ligase overnight at 16°C and electroporated into Lucigen Endura cells (1800 V, 600 ohm, 10 μF, 1 mm) before plating on 4 assay plates (reduced 75% CARB). Colonies were grown at 37°C overnight, pooled, and maxi prepped (Macherey-Nagel).
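The guide-enumeration step can be illustrated with a minimal Python sketch. This is not the authors' script, which is not shown here; the function names (`find_guides`, `build_library`) are hypothetical, and the sketch assumes that both strands are scanned, that the guide is the 19 bases immediately 5' of each NGG PAM, and that guides occurring more than once are kept as a single copy and flagged.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_guides(genome, guide_len=19):
    """List (guide, strand, pam_index) for every NGG PAM on both strands.

    pam_index is the 0-based index of the PAM's first base (the N) on the
    scanned strand; the guide is the guide_len bases immediately 5' of it.
    """
    genome = genome.upper()
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for i in range(guide_len, len(seq) - 2):
            if seq[i + 1:i + 3] == "GG":
                guide = seq[i - guide_len:i]
                # Skip guides containing ambiguous bases such as N.
                if set(guide) <= set("ACGT"):
                    hits.append((guide, strand, i))
    return hits


def build_library(genome, guide_len=19):
    """Return (unique guides, set of guides with more than one target site)."""
    counts = Counter(g for g, _, _ in find_guides(genome, guide_len))
    library = sorted(counts)  # one copy per distinct guide sequence
    multi_target = {g for g, n in counts.items() if n > 1}
    return library, multi_target
```

Returning the multi-target set separately makes it straightforward to drop those guides at the analysis stage while still synthesizing one copy of each.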
Screen and analysis of late gene screen
Cas9+ iSLK-BAC16-K8.1pr-mIFP2 cells were infected lentivirally with the KSHV tiling library. After 10 days, cells were reactivated using 1 mM sodium butyrate and 5 μg/mL doxycycline. Two days post reactivation, cells were trypsinized, washed 2x with dPBS, and fixed in 4% PFA for 10 minutes before being washed 2x in dPBS again. Cells were then sorted based on mIFP2 expression using a BD Aria, sorting for top 25% expression vs bottom 50% expression. Sorted cells were then unfixed overnight in 50 μg/mL proteinase K (Promega) plus 150 mM NaCl at 65°C. Genomic DNA was extracted using the Qiagen blood mini kit. As described in Deans et al. 2016 [48], the sgRNA locus was amplified using primers and Herculase II (Agilent) with 20 cycles. Nextera-index adapters were then added by PCR using 18 cycles. The ~285 bp product was gel extracted and quantified with Qubit and an Agilent Bioanalyzer before sequencing on a HiSeq 4000.
Average enrichment of sgRNAs from two replicates was calculated comparing the late gene high and late gene low fractions using a median normalized log ratio of fraction of counts as previously described [49]. sgRNAs that targeted the viral genome at multiple loci were removed from the analysis. Each region of the viral genome was split using previous annotations (S3 Data), and the median enrichment of all sgRNAs targeting each region was compared to all negative control sgRNAs using a Mann-Whitney U test.
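As an illustration of the enrichment calculation, the sketch below computes a log ratio of count fractions per sgRNA, centers it on the negative controls, and takes the median per annotated region. Several details are assumptions rather than the published procedure [49]: the log base (2), the pseudocount of 1, the direction of the ratio (low over high, so guides against required ORFs score positive), and centering on the median of the negative control guides. All function names are hypothetical.

```python
import math
from statistics import median


def log_ratio(high, low, pseudocount=1.0):
    """log2 of (fraction of counts in low bin / fraction in high bin) per sgRNA.

    high and low map sgRNA id -> read count in the sorted fraction.
    """
    guides = sorted(set(high) | set(low))
    total_high = sum(high.get(g, 0) + pseudocount for g in guides)
    total_low = sum(low.get(g, 0) + pseudocount for g in guides)
    return {
        g: math.log2(
            ((low.get(g, 0) + pseudocount) / total_low)
            / ((high.get(g, 0) + pseudocount) / total_high)
        )
        for g in guides
    }


def median_normalize(scores, control_guides):
    """Shift all scores so the median negative control sits at zero."""
    center = median(scores[g] for g in control_guides)
    return {g: s - center for g, s in scores.items()}


def region_medians(scores, guide_to_region):
    """Median normalized score of the sgRNAs assigned to each region."""
    by_region = {}
    for guide, region in guide_to_region.items():
        by_region.setdefault(region, []).append(scores[guide])
    return {region: median(vals) for region, vals in by_region.items()}
```

For the per-region significance test, the scores for one region and for the negative controls could be passed to `scipy.stats.mannwhitneyu`; that step is left out here to keep the sketch free of third-party dependencies.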
Screen validation, early gene expression, and viral DNA replication
Three sgRNAs from the screen were selected, cloned as above, lentivirally delivered to late gene expression reporter cells, and assayed for virion production and late gene expression activity as above. sgRNAs were also delivered to Cas9+ iSLK cells infected with a modified BAC16 virus containing an N-terminal HaloTag fusion to the early viral gene ORF68. 24 hours post reactivation with 1 mM sodium butyrate and 5 μg/mL doxycycline, cells were treated with 20 nM HaloTag JF647 (Promega) and ORF68 levels were quantified by flow cytometry (Accuri C6 plus). To quantitate viral DNA replication, 20 μM EdU was added to the media 48 hours post reactivation for two hours. Cells were then trypsinized and fixed using 4% PFA. EdU was labeled with Cy5 using the Click-IT flow cytometry kit (Invitrogen) and quantified by flow cytometry (BD Accuri C6 Plus). Using unreactivated cells as a control, little host DNA replication was observed in reactivated cells. EdU levels below those observed in unreactivated cells but above background were quantified as viral DNA replication. To quantitate the amount of viral DNA, 48 hours after reactivation, DNA was extracted using DNA QuickExtract (Lucigen), and qPCR was performed using iTaq Universal SYBR Green (Bio-rad) amplifying the ORF57 promoter region of KSHV (QuantStudio 3). Viral quantities were calculated by standard curve. Replicates were independent reactivations on separate days.
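The standard-curve quantification of viral DNA mentioned above can be sketched as follows. This is a generic illustration and not the authors' analysis: it assumes a dilution series of known template quantities, a least-squares fit of Ct against log10(quantity), and hypothetical function names.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the fitted curve to estimate template quantity from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0
```

A slope near -3.32 corresponds to roughly 100% amplification efficiency, which gives a quick sanity check on the dilution series before sample quantities are read off the curve.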
Design and analysis of targeted viral sequencing
Cas9+ iSLK-BAC16-K8.1pr-mIFP2 cells containing the KSHV tiling library were reactivated using 1 mM sodium butyrate and 5 μg/mL doxycycline. Genomic DNA was recovered from latent cells prior to reactivation and 48 hours post reactivation using a QIAamp DNA blood Mini kit (Qiagen). To collect viral supernatant, media was replaced 48 hours post reactivation and collected 72 hours post reactivation. Viral supernatant was then mixed with appropriate volume of Qiagen AL buffer and Qiagen proteinase. Sample was then spun over a single QIAamp blood mini column, washed, and eluted as the manual indicates.
PCR was performed from each DNA sample for four regions (ORF21, ORF34, ORF46, and ORF18) using Herculase II (Agilent). Each region was then PCR purified (GeneJet; Thermo) and pooled in equal nanogram amounts. 500 ng of each pooled sample was then prepped using Illumina DNA prep and sequenced on a single lane of SP 150PE NovaSeq 6000.
Reads were trimmed using cutadapt [50], and aligned to the BAC16 genome using bwa mem [51]. Indel counts were then extracted from CIGAR codes, with identical CIGAR codes excluded as duplicates. Complex codes not corresponding to simple indels were excluded. Deletions and insertions at the same position and same size observed in a control sample amplified from purified BAC were excluded as artifacts of library prep (S5 Data and S1 Table). Insertions or deletions whose sizes were divisible by three were classified as in-frame mutations and all others were classified as out-of-frame mutations. Mutation counts and coverage were calculated as the sum of two replicates for each condition. Basepairs with less than 10^5 coverage were excluded, and a single read was added to each basepair to prevent division by zero. Mutation frequencies across each 100 bp were averaged and a log enrichment value was calculated between either the latent sample and the 48-hour sample, or the 48-hour sample and the supernatant sample. For each comparison, the difference in log enrichment between in-frame mutations and out-of-frame mutations was calculated and used as the relevant signal.
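The indel classification and the in-frame versus out-of-frame signal described above can be sketched in Python. This is an illustrative reading of the procedure and not the authors' code: it assumes that a "simple" indel is a CIGAR string of exactly match, one insertion or deletion, match (clipped or multi-indel reads are discarded), that the added read is applied to the mutation count, and that logs are base 2. Function names are hypothetical.

```python
import math
import re

# Exactly one insertion or deletion flanked by aligned blocks, e.g. 50M3D97M.
SIMPLE_INDEL = re.compile(r"^(\d+)M(\d+)([ID])(\d+)M$")


def parse_simple_indel(pos, cigar):
    """Return (kind, ref_pos, size) for a simple indel, else None.

    pos is the 1-based leftmost aligned position (SAM POS field); ref_pos is
    the first reference base after the left-hand match block.
    """
    m = SIMPLE_INDEL.match(cigar)
    if m is None:
        return None
    left, size, kind = int(m.group(1)), int(m.group(2)), m.group(3)
    return kind, pos + left, size


def frame_class(size):
    """Indel sizes divisible by three preserve the reading frame."""
    return "in_frame" if size % 3 == 0 else "out_of_frame"


def log_enrichment(count_before, cov_before, count_after, cov_after):
    """log2 change in mutation frequency, with one read added to each count."""
    freq_before = (count_before + 1) / cov_before
    freq_after = (count_after + 1) / cov_after
    return math.log2(freq_after / freq_before)


def in_frame_signal(in_before, in_after, out_before, out_after,
                    cov_before, cov_after):
    """Log enrichment of in-frame alleles minus that of out-of-frame alleles."""
    return (log_enrichment(in_before, cov_before, in_after, cov_after)
            - log_enrichment(out_before, cov_before, out_after, cov_after))
```

Under these assumptions, a window where in-frame alleles are depleted as strongly as out-of-frame alleles gives a signal near zero, while a window that tolerates in-frame changes gives a positive signal.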
ORF46 BAC mutagenesis
ORF46 DNA binding (H210L) or catalytic (Q87L, D88N) mutants were introduced into BAC16 using red recombination [33]. Briefly, homology arms were used to insert mutations along with a KAN resistance cassette into the endogenous ORF46 locus. The KAN cassette was removed by recombination, and the scarless edit was confirmed by Sanger sequencing. Viral genomes were then purified using the Macherey-Nagel NucleoBond BAC 100 kit and transfected into HEK293T cells using PolyJet (SignaGen). Transfected cells were then cocultured with uninfected iSLK cells and reactivated using 0.33 mM sodium butyrate and 20 ng/mL PMA. Infected iSLK cells were then selected using 1 μg/mL puromycin, 50 μg/mL G418, and 125 μg/mL hygromycin.
iSLK cells infected with the mutant BAC16 viruses were then reactivated using 5 μg/mL doxycycline and 1 mM sodium butyrate. After 72 hours, supernatant was 0.45 μm filtered and transferred to uninfected HEK293T cells. 24 hours later, infection was monitored by EGFP expression using flow cytometry (BD Accuri C6 Plus). Replicates were independent reactivations from a single experiment.
Separately, DNA was extracted from cells 48 hours post reactivation using Lucigen DNA QuickExtract. Using iTaq Universal SYBR Green (Bio-rad), vDNA replication was then measured using qPCR of the viral ORF57 promoter. Unreactivated cells were used as a control. Replicates were independent reactivations from separate days.
RNA was extracted from cells 48 hours post reactivation using Lucigen RNA QuickExtract. Turbo DNase (Invitrogen) was used to remove DNA. AMV RT (Promega) with 9 bp random primers was used for reverse transcription. qPCR was performed using iTaq Universal SYBR Green (Bio-rad) amplifying the coding region of ORF6 along with primers targeting the host 18S RNA. Replicates are from multiple reactivations performed on the same day.
Supporting information
Acknowledgments
We thank E. Hartenian, X.J. Mao, L. Coscoy, and members of the Glaunsinger and Coscoy labs for helpful discussion. We thank C.K. Tsui for their support and review of the manuscript. Flow cytometry and FACS were conducted at the CRL Flow Cytometry Facility. We thank Hector Nolla and Alma Valeros of the UC Berkeley Cancer Research Laboratory Flow Cytometry Facility for training and expertise.
Data Availability
Count files, full screen results, and indel counts for deep sequencing experiments are available as supplementary data. Raw sequencing data are deposited at SRA (BioProject PRJNA789264).
Funding Statement
A.L.D. is the Rhee Family Fellow of the Damon Runyon Cancer Research Foundation (DRG-2349-18). D.W.M. is a Howard Hughes Medical Institute Awardee of the Life Sciences Research Foundation. This work used the Vincent J. Coates Genomics Sequencing Laboratory at UC Berkeley, supported by NIH S10 OD018174 Instrumentation Grant. Work was supported by a Research for Innovation on Delivery of Editing Reagents award from the Innovative Genomics Institute and the NIAID (R01AI122528 to B.A.G.). B.A.G. is an investigator and receives a salary from the Howard Hughes Medical Institute (HHMI). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Zapatka M, Borozan I, Brewer DS, Iskar M, Grundhoff A, Alawi M, et al. The landscape of viral associations in human cancers. Nat Genet. 2020;52: 320–330. doi: 10.1038/s41588-019-0558-9
- 2. Thompson MP, Kurzrock R. Epstein-Barr Virus and Cancer. Clinical Cancer Research. American Association for Cancer Research; 2004. pp. 803–821. doi: 10.1158/1078-0432.ccr-0670-3
- 3. Ganem D. KSHV and the pathogenesis of Kaposi sarcoma: listening to human biology and medicine. J Clin Invest. 2010;120: 939–49. doi: 10.1172/JCI40567
- 4. Minhas V, Wood C. Epidemiology and transmission of Kaposi’s sarcoma-associated herpesvirus. Viruses. 2014;6: 4178–94. doi: 10.3390/v6114178
- 5. Arias C, Weisburd B, Stern-Ginossar N, Mercier A, Madrid AS, Bellare P, et al. KSHV 2.0: A Comprehensive Annotation of the Kaposi’s Sarcoma-Associated Herpesvirus Genome Using Next-Generation Sequencing Reveals Novel Genomic and Functional Features. Dittmer DP, editor. PLoS Pathog. 2014;10: e1003847. doi: 10.1371/journal.ppat.1003847
- 6. Ye X, Zhao Y, Karijolich J. The landscape of transcription initiation across latent and lytic KSHV genomes. PLoS Pathog. 2019;15: e1007852. doi: 10.1371/journal.ppat.1007852
- 7. Toth Z, Maglinte DT, Lee SH, Lee HR, Wong LY, Brulois KF, et al. Epigenetic analysis of KSHV latent and lytic genomes. PLoS Pathog. 2010;6: 1–17. doi: 10.1371/journal.ppat.1001013
- 8. Brackett K, Mungale A, Lopez-Isidro M, Proctor DA, Najarro G, Arias C. CRISPR interference efficiently silences latent and lytic viral genes in Kaposi’s sarcoma-associated herpesvirus-infected cells. Viruses. 2021;13. doi: 10.3390/v13050783
- 9. Gruffat H, Marchione R, Manet E. Herpesvirus late gene expression: A viral-specific pre-initiation complex is key. Frontiers in Microbiology. Frontiers Research Foundation; 2016. doi: 10.3389/fmicb.2016.00869
- 10. Davis ZH, Verschueren E, Jang GM, Kleffman K, Johnson JR, Park J, et al. Global mapping of herpesvirus-host protein complexes reveals a transcription strategy for late genes. Mol Cell. 2015;57: 349–60. doi: 10.1016/j.molcel.2014.11.026
- 11. Wong-Ho E, Wu T-T, Davis ZH, Zhang B, Huang J, Gong H, et al. Unconventional Sequence Requirement for Viral Late Gene Core Promoters of Murine Gammaherpesvirus 68. J Virol. 2014;88: 3411–3422. doi: 10.1128/JVI.01374-13
- 12. Serio TR, Cahill N, Prout ME, Miller G. A Functionally Distinct TATA Box Required for Late Progression through the Epstein-Barr Virus Life Cycle. J Virol. 1998;72: 8338–8343. doi: 10.1128/JVI.72.10.8338-8343.1998
- 13. Tang S, Yamanegi K, Zheng Z-M. Requirement of a 12-Base-Pair TATT-Containing Sequence and Viral Lytic DNA Replication in Activation of the Kaposi’s Sarcoma-Associated Herpesvirus K8.1 Late Promoter. J Virol. 2004;78: 2609–2614. doi: 10.1128/jvi.78.5.2609-2614.2004
- 14. Isomura H, Stinski MF, Murata T, Yamashita Y, Kanda T, Toyokuni S, et al. The human cytomegalovirus gene products essential for late viral gene expression assemble into prereplication complexes before viral DNA replication. J Virol. 2011;85: 6629–6644. doi: 10.1128/JVI.00384-11
- 15. Perng YC, Campbell JA, Lenschow DJ, Yu D. Human Cytomegalovirus pUL79 Is an Elongation Factor of RNA Polymerase II for Viral Gene Transcription. PLOS Pathog. 2014;10: e1004350. doi: 10.1371/journal.ppat.1004350
- 16. Aubry V, Mure F, Mariamé B, Deschamps T, Wyrwicz LS, Manet E, et al. Epstein-Barr Virus Late Gene Transcription Depends on the Assembly of a Virus-Specific Preinitiation Complex. J Virol. 2014;88: 12825–12838. doi: 10.1128/JVI.02139-14
- 17. Davis ZH, Hesser CR, Park J, Glaunsinger BA. Interaction between ORF24 and ORF34 in the Kaposi’s Sarcoma-Associated Herpesvirus Late Gene Transcription Factor Complex Is Essential for Viral Late Gene Expression. J Virol. 2016;90: 599–604. doi: 10.1128/JVI.02157-15
- 18. Djavadian R, Chiu Y-F, Johannsen E. An Epstein-Barr Virus-Encoded Protein Complex Requires an Origin of Lytic Replication In Cis to Mediate Late Gene Transcription. Lieberman PM, editor. PLOS Pathog. 2016;12: e1005718. doi: 10.1371/journal.ppat.1005718
- 19. Wang Y, Li H, Tang Q, Maul GG, Yuan Y. Kaposi’s sarcoma-associated herpesvirus ori-Lyt-dependent DNA replication: involvement of host cellular factors. J Virol. 2008;82: 2867–82. doi: 10.1128/JVI.01319-07
- 20. van Diemen FR, Kruse EM, Hooykaas MJG, Bruggeling CE, Schürch AC, van Ham PM, et al. CRISPR/Cas9-Mediated Genome Editing of Herpesviruses Limits Productive and Latent Infections. PLoS Pathog. 2016;12. doi: 10.1371/journal.ppat.1005701
- 21. Tso FY, West JT, Wood C. Reduction of Kaposi’s Sarcoma-Associated Herpesvirus Latency Using CRISPR-Cas9 To Edit the Latency-Associated Nuclear Antigen Gene. J Virol. 2019;93. doi: 10.1128/JVI.02183-18
- 22. Walter M, Verdin E. Viral gene drive in herpesviruses. Nat Commun. 2020;11: 1–11. doi: 10.1038/s41467-019-13993-7
- 23. Roehm PC, Shekarabi M, Wollebo HS, Bellizzi A, He L, Salkind J, et al. Inhibition of HSV-1 Replication by Gene Editing Strategy. Sci Rep. 2016;6. doi: 10.1038/srep23146
- 24. Karpov DS, Karpov VL, Klimova RR, Demidova NA, Kushch AA. A Plasmid-Expressed CRISPR/Cas9 System Suppresses Replication of HSV Type I in a Vero Cell Culture. Mol Biol (Mosk). 2019;53: 91–100. doi: 10.1134/S0026898419010051
- 25. Wang J, Quake SR. RNA-guided endonuclease provides a therapeutic strategy to cure latent herpesviridae infection. Proc Natl Acad Sci U S A. 2014;111: 13157–13162. doi: 10.1073/pnas.1410785111
- 26. Suenaga T, Kohyama M, Hirayasu K, Arase H. Engineering large viral DNA genomes using the CRISPR-Cas9 system. Microbiol Immunol. 2014;58: 513–522. doi: 10.1111/1348-0421.12180
- 27. BeltCappellino A, Majerciak V, Lobanov A, Lack J, Cam M, Zheng Z-M. CRISPR/Cas9-Mediated Knockout and In Situ Inversion of the ORF57 Gene from All Copies of the Kaposi’s Sarcoma-Associated Herpesvirus Genome in BCBL-1 Cells. J Virol. 2019;93. doi: 10.1128/JVI.00628-19
- 28. Yuen KS, Chan CP, Wong NHM, Ho CH, Ho TH, Lei T, et al. CRISPR/Cas9-mediated genome editing of Epstein–Barr virus in human cells. J Gen Virol. 2015;96: 626–636. doi: 10.1099/jgv.0.000012
- 29. Gabaev I, Williamson JC, Crozier TWM, Schulz TF, Lehner PJ. Quantitative Proteomics Analysis of Lytic KSHV Infection in Human Endothelial Cells Reveals Targets of Viral Immune Modulation. Cell Rep. 2020;33. doi: 10.1016/j.celrep.2020.108249
- 30. Hein MY, Weissman JS. Functional single-cell genomics of human cytomegalovirus infection. Nat Biotechnol. 2021; 1–11. doi: 10.1038/s41587-021-01059-3
- 31. Shi J, Wang E, Milazzo JP, Wang Z, Kinney JB, Vakoc CR. Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nat Biotechnol. 2015;33: 661–667. doi: 10.1038/nbt.3235
- 32. He W, Zhang L, Villarreal OD, Fu R, Bedford E, Dou J, et al. De novo identification of essential protein domains from CRISPR-Cas9 tiling-sgRNA knockout screens. Nat Commun. 2019;10: 1–10. doi: 10.1038/s41467-018-07882-8
- 33. Brulois KF, Chang H, Lee AS-Y, Ensser A, Wong L-Y, Toth Z, et al. Construction and Manipulation of a New Kaposi’s Sarcoma-Associated Herpesvirus Bacterial Artificial Chromosome Clone. J Virol. 2012;86: 9708–9720. doi: 10.1128/JVI.01019-12
- 34. Myoung J, Ganem D. Generation of a doxycycline-inducible KSHV producer cell line of endothelial origin: Maintenance of tight latency with efficient reactivation upon induction. J Virol Methods. 2011;174: 12–21. doi: 10.1016/j.jviromet.2011.03.012
- 35. Verma D, Li D-J, Krueger B, Renne R, Swaminathan S. Identification of the Physiological Gene Targets of the Essential Lytic Replicative Kaposi’s Sarcoma-Associated Herpesvirus ORF57 Protein. J Virol. 2015;89: 1688–1702. doi: 10.1128/JVI.02663-14
- 36. Zhu FX, Yuan Y. The ORF45 Protein of Kaposi’s Sarcoma-Associated Herpesvirus Is Associated with Purified Virions. J Virol. 2003;77: 4221–4230. doi: 10.1128/jvi.77.7.4221-4230.2003
- 37. Earl C, Bagnéris C, Zeman K, Cole A, Barrett T, Savva R. A structurally conserved motif in γ-herpesvirus uracil-DNA glycosylases elicits duplex nucleotide-flipping. Nucleic Acids Res. 2018;46: 4286–4300. doi: 10.1093/nar/gky217
- 38. Su M-T, Liu I-H, Wu C-W, Chang S-M, Tsai C-H, Yang P-W, et al. Uracil DNA Glycosylase BKRF3 Contributes to Epstein-Barr Virus DNA Replication through Physical Interactions with Proteins in Viral DNA Replication Complex. J Virol. 2014;88: 8883–8899. doi: 10.1128/JVI.00950-14
- 39. Castañeda AF, Glaunsinger BA. The Interaction between ORF18 and ORF30 Is Required for Late Gene Expression in Kaposi’s Sarcoma-Associated Herpesvirus. J Virol. 2019;93. doi: 10.1128/JVI.01488-18
- 40. Nishimura M, Watanabe T, Yagi S, Yamanaka T, Fujimuro M. Kaposi’s sarcoma-associated herpesvirus ORF34 is essential for late gene expression and virus production. Sci Rep. 2017;7: 1–12. doi: 10.1038/s41598-016-0028-x
- 41. Tycko J, Wainberg M, Marinov GK, Ursu O, Hess GT, Ego BK, et al. Mitigation of off-target toxicity in CRISPR-Cas9 screens for essential non-coding elements. Nat Commun. 2019;10. doi: 10.1038/s41467-019-11955-7
- 42. Pyles RB, Thompson RL. Evidence that the herpes simplex virus type 1 uracil DNA glycosylase is required for efficient viral replication and latency in the murine nervous system. J Virol. 1994;68: 4963–4972. doi: 10.1128/JVI.68.8.4963-4972.1994
- 43. Ranneberg-Nilsen T, Rollag H, Slettebakk R, Backe PH, Olsen Ø, Luna L, et al. The chromatin remodeling factor SMARCB1 forms a complex with human cytomegalovirus proteins UL114 and UL44. PLoS One. 2012;7. doi: 10.1371/journal.pone.0034119
- 44. Stuart DT, Upton C, Higman MA, Niles EG, McFadden G. A poxvirus-encoded uracil DNA glycosylase is essential for virus viability. J Virol. 1993;67: 2503–2512. doi: 10.1128/JVI.67.5.2503-2512.1993
- 45. Nakajima K, Guevara-Plunkett S, Chuang F, Wang K-H, Lyu Y, Kumar A, et al. Rainbow Kaposi’s Sarcoma-Associated Herpesvirus Revealed Heterogenic Replication with Dynamic Gene Expression. J Virol. 2020;94. doi: 10.1128/JVI.01565-19
- 46. Brulois K, Toth Z, Wong L-Y, Feng P, Gao S-J, Ensser A, et al. Kaposi’s Sarcoma-Associated Herpesvirus K3 and K5 Ubiquitin E3 Ligases Have Stage-Specific Immune Evasion Roles during Lytic Replication. J Virol. 2014;88: 9335–9349. doi: 10.1128/JVI.00873-14
- 47. Morgens DW, Wainberg M, Boyle EA, Ursu O, Araya CL, Tsui CK, et al. Genome-scale measurement of off-target activity using Cas9 toxicity in high-throughput screens. Nat Commun. 2017;8: 15178. doi: 10.1038/ncomms15178
- 48. Deans RM, Morgens DW, Ökesli A, Pillay S, Horlbeck MA, Kampmann M, et al. Parallel shRNA and CRISPR-Cas9 screens enable antiviral drug target identification. Nat Chem Biol. 2016;12: 361–366. doi: 10.1038/nchembio.2050
- 49. Kampmann M, Bassik MC, Weissman JS. Integrated platform for genome-wide screening and construction of high-density genetic interaction maps in mammalian cells. Proc Natl Acad Sci U S A. 2013;110: E2317–26. doi: 10.1073/pnas.1307002110
- 50. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17: 10. doi: 10.14806/ej.17.1.200
- 51. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25: 1754–60. doi: 10.1093/bioinformatics/btp324