Abstract
SARS-CoV-2 depends on host cell components for infection and replication. Identification of virus-host dependencies offers an effective way to elucidate mechanisms involved in viral infection and replication. If druggable, host factor dependencies may present an attractive strategy for antiviral therapy. In this study, we performed genome-wide CRISPR knockout screens in Vero E6 cells and four human cell lines including Calu-3, UM-UC-4, HEK-293 and HuH-7 to identify genetic regulators of SARS-CoV-2 infection. Our findings identified only ACE2, the cognate SARS-CoV-2 entry receptor, as a common host dependency factor across all cell lines, while other host genes identified were largely cell line specific, including known factors TMPRSS2 and CTSL. Several of the discovered host-dependency factors converged on pathways involved in cell signalling, immune-related pathways, and chromatin modification. Notably, the chromatin modifier gene KMT2C in Calu-3 cells had the strongest impact in preventing SARS-CoV-2 infection when perturbed.
Keywords: CRISPR screens, SARS-CoV-2, Host factors, ACE2, Genome-wide loss-of-function
1. Introduction
Since its identification in late 2019, the highly infectious severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of COVID-19, has rapidly spread across the world causing a global health crisis. To date, there have been over 650 million infections and greater than 6 million deaths (https://www.worldometers.info/coronavirus/). In addition, multiple variants of SARS-CoV-2 have emerged, causing alarming surges of infections globally and challenging plans for recovery from this pandemic. SARS-CoV-2 is closely related to SARS-CoV-1, which caused the 2002–2004 SARS epidemic, as well as MERS-CoV, the virus responsible for respiratory disease outbreaks in 27 countries since 2012 [1]. Infections can range from asymptomatic or mild cases to severe respiratory disease, which can lead to acute respiratory distress syndrome, septic shock, and multi-organ failure [2]. SARS-CoV-2, after SARS-CoV-1 and MERS-CoV, is the third coronavirus in the last two decades to rise to epidemic, and now pandemic, proportions.
Like all viruses, SARS-CoV-2 is an obligate intracellular pathogen that depends on host cell components for replication. The viral envelope is studded with several proteins, including the Spike (S) glycoprotein that mediates direct virus-host cell interactions [3,4]. The SARS-CoV-2 S-protein is a heavily glycosylated homotrimer composed of two subunits, S1 (receptor binding domain - RBD) and S2 (helical domain that mediates membrane fusion), which act in a coordinated manner to attach to the host cell surface and initiate viral entry [5]. The S-protein RBD binds to the host receptor angiotensin-converting enzyme 2 (ACE2), which also serves as the entry receptor for SARS-CoV-1 [6]. High variability in the RBD was the major determinant of cross-species transmission and evolution of SARS-CoV-2 [7]. Furthermore, mutations in the RBD have appeared among pandemic variants, increasing virus transmissibility and potentially compromising efficacy of currently approved treatments and vaccines [8]. ACE2 is highly expressed in the lung, heart, arteries, kidney and intestines, an expression pattern that may in part explain the constellation of symptoms associated with SARS infection [[9], [10], [11], [12], [13]]. In addition to ACE2, the cellular transmembrane serine protease II (TMPRSS2) has been identified as a cofactor that aids viral entry by priming S-protein through cleavage of the S1/S2 and S2’ cleavage sites [14]. Similarly, cathepsin L (CTSL) has also been shown to mediate S priming and facilitate intracellular viral release [15]. Recent studies show that SARS-CoV-2 has a unique insertion that introduces a polybasic furin cleavage site, resulting in a primed S-protein conformation that more readily binds to ACE2 [16]. This cleavage site is absent from SARS-CoV-1 and other 2b betaCoVs [17].
The SARS-CoV-2 life cycle appears to require a mixed array of host factors. A deeper understanding of the host-pathogen interactions of SARS-CoV-2 across a range of permissive models will help shed light on viral pathogenesis and host defense mechanisms. In fact, therapeutic strategies targeting host factors are still important to consider as virus-targeted therapies are more likely to drive viral escape mechanisms and the emergence of novel variants that continue to infect at-risk populations.
Functional genetic screens using genome-wide pooled CRISPR-Cas9 libraries are designed to link genotypes to phenotypes and are ideally suited to understand mechanisms of disease. Over the past five years, CRISPR-based genetic screens have improved our understanding of viral dependencies, providing a powerful means for investigating viral life cycles [18]. A number of recent studies have employed loss-of-function CRISPR-based survival screening approaches to identify host factors that regulate SARS-CoV-2 infection [[19], [20], [21], [22], [23], [24], [25], [26], [27]]. Collectively, these studies have provided valuable insights into the host factors involved in SARS-CoV-2 infection; however, there was minimal overlap in experimental designs and resulting hits between the different studies. These studies primarily employed human cell lines that required exogenous expression of ACE2 and/or TMPRSS2 [14,19,22,[25], [26], [27]], differed in sgRNA libraries used, and varied in the scoring approaches used to call hit genes, all of which could contribute to discrepancies between studies [20].
Here we set out to systematically define host factors that regulate SARS-CoV-2 infection in multiple permissive models. This includes a previously unstudied bladder cancer cell line, UM-UC-4, that endogenously expresses ACE2. Tropism studies were performed in multiple model cell lines, followed by genome-wide loss-of-function CRISPR screens in cell lines deemed permissive to SARS-CoV-2 to identify common SARS-CoV-2 host dependency factors. Due to the high cytopathic effect (CPE) of SARS-CoV-2 in vitro, these screens were predominantly positive selection screens enriching for genes that confer resistance to SARS-CoV-2 cytotoxicity from a handful of surviving cells. As a result of the paucity of SARS-CoV-2 host gene interactions, the biological signal was dominated by noise and the distribution of guide effects was strongly skewed. To account for these effects, we implemented a scoring strategy adapted from Aregger et al. [28]. Our scoring schema corrects for non-linear relationships between treatment and control arms that result from the strong CPE observed in our screens. Furthermore, to accurately identify potentially sparse SARS-CoV-2 host gene interactions, our score models guide-to-guide variability to limit false positive (type I) errors driven by outlier guide effects in the absence of abundant gene effects.
Overall, our results recapitulate that ACE2 is the most prominent pro-viral host factor that is required for SARS-CoV-2 infection and was the only gene that was common between all cell lines studied. Apart from TMPRSS2, there was minimal overlap among other genes identified between cell lines. However, pathway analysis of the combined hits from all screens, including those found in previous studies, revealed diverse cellular pathways involved in modulating signalling processes to evade proteotoxic stress, immune pathways, and epigenetic chromatin modification.
2. Results
2.1. Tropism of SARS-CoV-2 in cell culture models
The SARS lineage of coronaviruses (i.e. SARS-CoV-1 and SARS-CoV-2) are typically isolated and grown on African green monkey (Chlorocebus sp.)-derived Vero E6 kidney epithelial cells, which naturally express high levels of the ACE2 receptor [29]. To date, human cell lines reported to support SARS-CoV-2 replication include Caco-2 colorectal adenocarcinoma cells, Calu-3 lung adenocarcinoma cells, as well as HuH-7 and HuH-7.5 hepatocellular carcinoma cells, with the extent of infection varying between cell lines [[29], [30], [31], [32]]. To overcome host cell restriction to SARS-CoV-2 infection, most cell lines used for in vitro studies have been engineered to exogenously express ACE2, with or without exogenous expression of TMPRSS2. For example, A549 lung adenocarcinoma, HEK-293 embryonic kidney and HuH-7.5 liver cells are often used in conjunction with overexpression of ACE2 for studying SARS-CoV-2 cellular infection [33].
Before starting genome-wide CRISPR screens for host cell entry factors, we assessed the susceptibility of human cell lines originating from various tissues focusing on the cell lines shown to be susceptible to SARS-CoV-1 [34]. This included Caco-2 (colon cancer), Calu-3 (lung cancer), HuH-7 (liver cancer), and UM-UC-4 cells, a bladder carcinoma cell line with one of the highest ACE2 gene expression levels according to data from the Cancer Cell Line Encyclopedia (CCLE) [35]. In addition, we assessed HEK-293 cells transduced with ACE2 and TMPRSS2 (HEK293+A+T) [36], and Vero E6 cells, since both of these cell models have been used extensively to study the CPE of SARS-CoV-2 virus. SARS-CoV-2 isolated from early COVID-19 patients in Canada, SARS-CoV-2/SB3-TYAGNC, was used in these studies [37]. The SARS-CoV-2 viral stocks used in our screens were sequenced, confirming minimal variation from the original strain and most importantly an intact polybasic furin cleavage site (Supplemental Table 1). Each cell line was inoculated with SARS-CoV-2 at a multiplicity of infection (MOI) of 1 or 10. Cells were monitored for CPE, and the degree of infection was determined by quantifying SARS-CoV-2 nucleocapsid (N) protein by immunofluorescence (IF) over a period of 24–120 h post-infection (hpi) (Fig. 1a). Typical characteristics of CPE include cell rounding, detachment, degeneration and syncytium formation, and these features were observed in Vero E6, Calu-3, UM-UC-4 and HEK293+A+T cell lines, coinciding with a robust infection as indicated by the high percentage (75–94%) of N-protein positive cells (Fig. 1a and b) and high CPE (>50% cell death). HuH-7 cells showed moderate infection in comparison, with approximately 16% infection 72 hpi and minimal cytopathies. Caco-2 cells were the most refractory to SARS-CoV-2 infection, with minimal detection of N-protein (2–6%) at a high MOI of 10.
Fig. 1.
SARS-CoV-2 infection and induced cytopathic effects in human cell lines.
SARS-CoV-2 infectivity and induced cytopathic effects were determined in Vero E6, Calu-3, HEK293+A+T, UM-UC-4, HuH-7 and Caco-2 cells. Cells were infected with SARS-CoV-2 at MOI of 1 or 10 (Caco-2 cells). At 48 hpi, cells were fixed in 10% NBF and immunostained with anti-SARS N-protein (green), DAPI (blue) and phalloidin (yellow). A) Images were acquired with the Opera Phenix system. B) The % cells positive for N-protein staining (green bars, N) was determined by quantifying the number of green cells relative to DAPI-stained cells in the same well, while % cell viability was quantified by counting the number of nuclei in the infected wells relative to mock uninfected wells (blue bars, V). All data points were generated from at least n = 2 and are expressed as mean ± SEM. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
The presence of viral proteins in infected cells was measured over time by mass spectrometry. SARS-CoV-2 structural proteins N (nucleocapsid), M (membrane) and S (spike) could be detected as early as 12 hpi and increased over time in permissive cell lines such as Vero, Calu-3 and HuH-7, indicative of productive infection in these cells (Supplemental Fig. 1a-c). However, in Caco-2 cells, neither S nor M protein was detected over time, confirming the refractory nature of these cells to productive SARS-CoV-2 infection (Supplemental Figure 1d).
We next examined transcript levels for critical host entry factors ACE2, CTSL and TMPRSS2 using data from CCLE, and found these factors were expressed at moderate to high levels in the cell lines used here (Supplemental Table 2). Moreover, ACE2 protein levels were detected in all cell lines that were permissive to SARS-CoV-2, except for Caco-2 cells (Fig. 2a). Using flow cytometry, we observed a high degree of cellular heterogeneity for endogenous ACE2 cell surface expression across the different cell lines. However, ectopic expression in HEK293+A+T cells produced homogeneous ACE2 cell surface expression (Fig. 2b). TMPRSS2 protein was detected in Calu-3, UM-UC-4 and in TMPRSS2-transduced HEK293+A+T cells. In contrast, CTSL was highly expressed in Vero E6; moderately in HuH-7 and UM-UC-4 cells; and at undetectable levels in Calu-3, HEK293+A+T, and Caco-2 cells (Fig. 2a). Thus, host cell permissivity to SARS-CoV-2 observed here can be explained solely by ACE2 expression, either endogenously or exogenously.
Fig. 2.
Expression levels of SARS-CoV-2 entry factors across human cell lines.
A) Western blot for ACE2, CTSL and TMPRSS2 expression.
B) Flow cytometry for ACE2 cell surface expression in parental cells.
2.2. Genome-wide CRISPR screens to identify host factors required for SARS-CoV-2 infection across a panel of cell lines
Multiple groups have reported CRISPR-mediated interaction screens over the past year. Together, these screens made use of several different lentiviral-based pooled genome-wide CRISPR libraries (e.g. Brunello, GeCKO, GeCKOv2, Gattinara), human cell lines exogenously expressing ACE2, various infection conditions, and ultimately different scoring methods [[19], [20], [21], [22], [23], [24], [25], [26], [27]]. We sought to identify the consensus cellular host factors required for SARS-CoV-2 infection across different cell lines of varying tissue origin using the same screening conditions. Genome-wide CRISPR screens were performed in the cell lines that were deemed permissive to SARS-CoV-2 infection as described above (Fig. 1). These included Vero E6; three human cell lines with endogenous ACE2 expression (Calu-3, HuH-7 and UM-UC-4); and one human cell line, HEK293+A+T, with exogenous ACE2 and TMPRSS2 expression. As a result, four human cell lines representing different tissues affected by SARS-CoV-2 infections (lung, liver, bladder, kidney) were screened with the aim of identifying core genetic host factors, as well as context-specific genetic factors required for SARS-CoV-2 infection and replication.
Screens were performed using the sequence-optimized human all-in-one Toronto KnockOut v3 CRISPR-Cas9 library [38] as illustrated in Fig. 3a. The TKOv3 library contains 71,091 gRNAs (4 gRNAs per gene, on average) targeting 18,053 protein-coding genes in the human genome [38]. The same library was used to screen African green monkey Vero E6 cells; guides were mapped to the African green monkey reference genome to identify guides with perfect homology to their targets, resulting in 56% of the TKOv3 library mapping with perfect homology and targeting 15,935 loci in the Vero E6 genome (see Supplemental Table 3). The TKOv3 guides with no targets in the African green monkey genome also had no significant off-target matches, indicating that the non-targeting fraction of the TKOv3 library would not increase the detection of non-specific (off-target) effects. Following transduction of host cells with TKOv3 lentivirus at a low multiplicity of infection (MOI ∼0.3), pooled knockout cells were split into three replicates, and each was challenged with a low-passage SB3-TYAGNC isolate of SARS-CoV-2. Representation of the TKOv3 sgRNA library was maintained at a minimum of 200-fold at each step of the screening workflow for each replicate, and cell passaging in untreated populations continued for up to 15 population doublings. Due to the highly cytopathic nature of SARS-CoV-2 in these cells, SARS-CoV-2 treated cells were harvested when cells recovered from virus-induced CPE. Fig. 3b details the course of each screen, including the MOI used for infections, the extent of virus-induced CPE, and days until cells were harvested. High (70–99%) virus-induced CPE was observed in Vero E6, Calu-3, UM-UC-4 and HEK293+A+T cells, even with reduced MOI. Rechallenge of surviving cells with SARS-CoV-2 in Vero E6 and Calu-3 resulted in no additional cell death, indicating that a strongly resistant population of knockout cells was selected.
These cells were further subjected to N-protein immunofluorescence, with no detection of positively infected cells (data not shown). Due to the moderate level of infection observed in HuH-7 cells and minimal CPE, cells were passaged in the presence of SARS-CoV-2 for up to 15 population doublings with two rounds of SARS-CoV-2 infection. N-protein immunofluorescence was detected in these cells, confirming the presence of SARS-CoV-2 infection in the HuH-7 screen (Supplemental Figure 2). For all screens, genomic DNA was harvested from surviving cells, and guide abundance was quantified by next-generation sequencing.
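The library size, fold representation and MOI quoted above translate into concrete cell numbers. The following is a minimal sketch of that arithmetic, not the authors' protocol: it assumes that lentiviral integrations per cell follow a Poisson distribution (a common approximation that the text does not state), and all function names are illustrative.

```python
import math

LIBRARY_SIZE = 71_091  # TKOv3 gRNAs (from the text)
COVERAGE = 200         # minimum fold representation (from the text)
MOI = 0.3              # approximate lentiviral MOI (from the text)


def cells_to_maintain(library_size, coverage):
    """Cells needed so that each guide is represented `coverage` times."""
    return library_size * coverage


def fraction_transduced(moi):
    """Poisson assumption: P(at least one integration) = 1 - exp(-moi)."""
    return 1.0 - math.exp(-moi)


def fraction_single_integrant(moi):
    """Among transduced cells, the share carrying exactly one integration."""
    p_one = moi * math.exp(-moi)
    return p_one / fraction_transduced(moi)


def cells_to_transduce(library_size, coverage, moi):
    """Cells to expose to virus so the transduced pool still meets coverage."""
    return math.ceil(cells_to_maintain(library_size, coverage) / fraction_transduced(moi))


if __name__ == "__main__":
    print("cells per replicate at 200-fold:", cells_to_maintain(LIBRARY_SIZE, COVERAGE))
    print("fraction transduced at MOI 0.3:", round(fraction_transduced(MOI), 3))
    print("single-integrant share:", round(fraction_single_integrant(MOI), 3))
    print("cells to transduce:", cells_to_transduce(LIBRARY_SIZE, COVERAGE, MOI))
```

Under this approximation, a low MOI keeps most transduced cells at a single guide per cell, at the cost of transducing several times more cells than the coverage target alone would suggest.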
Fig. 3.
Genome-wide CRISPR screens identify host factors required for SARS-CoV-2 infection.
A) Schematic of pooled genome-wide CRISPR screens. Host cells were transduced with TKOv3 lentivirus at 200-fold library representation. Following puromycin selection of the transduced population, cells were infected with SARS-CoV-2 or mock treated. Surviving cells were isolated and sgRNA abundance was determined by next-generation sequencing to identify host factors in SARS-treated versus control untreated populations. Essential genes were also determined for each cell line.
B) Experimental timelines for SARS-CoV-2 treatment and cell harvest for each cell line.
C) Daisy model of gene essentiality across different cell lines and distribution of hits. Petals represent essential genes unique to each cell line. The bar graph represents distribution of hits at 5% FDR across the cell lines. Dark grey font/bars represent core essentials; light grey font/bars represent context-specific essentials.
The overall performance of each screen was robust, as evaluated by changes in cell proliferation or fitness according to gold-standard reference sets of essential and non-essential genes in untreated control samples [38]. Precision-recall (PR) curves show that at 5% FDR, >75% of essential genes were identified (mean recall = 0.84), confirming efficient Cas9 editing and disruption of genes in our screens (Supplemental figure 3a). The high performance of the screens to detect essential genes was also evaluated using the AUC from the PR curves, which was greater than 0.95 for all control screens (min = 0.955, max = 0.994). Pairwise correlation analysis also demonstrated the T0 time point of each genetic screen clustered together with high correlation coefficients, as did technical replicates, confirming that TKOv3 was equally represented across all screens at the onset of TKOv3 infection (Supplemental Figure 3b).
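The quality metric described above (recall of reference essential genes at a fixed FDR) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: genes are ranked by a per-gene essentiality score where higher means more essential, only genes in the two reference sets are evaluated, and the gene names and scores in the example are made up.

```python
def precision_recall(scores, essential, nonessential):
    """Return (recall, precision) points while walking down the ranked list.

    scores: dict mapping gene -> essentiality score (higher = more essential).
    essential / nonessential: reference gene sets (gold standards).
    """
    ranked = sorted(
        (g for g in scores if g in essential or g in nonessential),
        key=lambda g: scores[g],
        reverse=True,
    )
    n_essential = sum(1 for g in ranked if g in essential)
    tp = fp = 0
    curve = []
    for gene in ranked:
        if gene in essential:
            tp += 1
        else:
            fp += 1
        curve.append((tp / n_essential, tp / (tp + fp)))
    return curve


def recall_at_fdr(curve, fdr=0.05):
    """Highest recall reached while precision stays at or above 1 - fdr."""
    passing = [recall for recall, precision in curve if precision >= 1.0 - fdr]
    return max(passing, default=0.0)


# Hypothetical toy data: E* are reference essentials, N* are non-essentials.
toy_scores = {"E1": 10, "E2": 9, "E3": 8, "N1": 7, "E4": 6, "N2": 1, "N3": 0}
toy_curve = precision_recall(toy_scores, {"E1", "E2", "E3", "E4"}, {"N1", "N2", "N3"})
toy_recall = recall_at_fdr(toy_curve, fdr=0.05)
```

In the toy ranking, the first non-essential gene appears in fourth position, so three of the four essentials are recovered before precision falls below 95%.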
Using the Daisy model of gene essentiality [[38], [39], [40], [41]], we evaluated gene essentiality in untreated samples across different cell lines to identify core essential genes and context-specific genetic vulnerabilities. This analysis revealed 724 core essential genes that were detected in all five cell lines (BAGEL Bayes factor >5, FDR <0.05), and subsets of context-dependent genes that were specific to each cell line (Fig. 3c, Supplemental Table 4). The contextual essential genes in each cell line were subjected to gene set enrichment analysis (GSEA) to identify signatures of essential biological processes and associated components for each cell line. Interestingly, GSEA of the Vero E6 gene hits indicated enrichment for DNA damage response pathways (FDR <0.05), which is consistent with this cell line harboring wild-type TP53, whereas all other cell lines encode mutant TP53 [42] (Supplemental Figure 4a). Subsets of the Calu-3, UM-UC-4, HEK293+A+T and HuH-7 context-specific essential genes were found in amplified regions of their respective genomes, as determined by cross-referencing CNV data from CCLE (Supplemental Figure 4b). Biological process annotation of genes in the amplified gene regions were similar across the cell lines and consisted of vital cell processes such as RNA splicing, translation, cell cycle, and DNA replication. None of the identified gene hits selected by SARS-CoV-2 treatment were identified as core or context-essential fitness genes.
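The partition into core and context-specific essentials can be illustrated with plain set operations. The sketch below is a simplified reading of the Daisy model, assuming a single Bayes factor cut-off per gene (the text uses BF >5) and treating a "petal" as the genes essential in exactly one cell line; the gene names and Bayes factors are placeholders, not data from this study.

```python
def daisy_partition(bayes_factors, bf_cutoff=5.0):
    """Split essential genes into a shared core and per-line unique petals.

    bayes_factors: dict mapping cell line -> {gene: Bayes factor}.
    """
    essentials = {
        line: {gene for gene, bf in genes.items() if bf > bf_cutoff}
        for line, genes in bayes_factors.items()
    }
    # Core: essential in every cell line screened.
    core = set.intersection(*essentials.values())
    # Petals: essential in one line and in no other.
    petals = {}
    for line, genes in essentials.items():
        others = set().union(*(s for other, s in essentials.items() if other != line))
        petals[line] = genes - others
    return core, petals


# Placeholder values for illustration only.
toy_bf = {
    "line_1": {"GENE_A": 20.0, "GENE_B": 8.0, "GENE_C": -3.0},
    "line_2": {"GENE_A": 15.0, "GENE_B": 1.0, "GENE_C": 9.0},
    "line_3": {"GENE_A": 30.0, "GENE_B": 2.0, "GENE_C": -1.0},
}
toy_core, toy_petals = daisy_partition(toy_bf)
```

Here GENE_A is core, while GENE_B and GENE_C fall into the petals of line_1 and line_2 respectively.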
Due to the high CPE of SARS-CoV-2 in vitro (70%–99% cell death), our screens displayed strong positive selection, skewing the distribution of positive hits away from a normal linear model. To address the non-linearity of these screens and improve the sensitivity for identifying positive hits, we implemented a modified quantitative genetic interaction (qGI) scoring method adapted from Aregger et al. [28] and compared our results to other existing methods such as DrugZ, limma and MAGeCK (MLE). Explicitly correcting for non-linear effects in screening data and performing hypothesis testing between treatment and control guides, as opposed to deriving p-values from comparisons against reference distributions, increased the scoring accuracy. As illustrated in Supplemental Figure 5, scoring of screens using the modified qGI method reduced the number of negatively-selected genes whose knockout produces general defects in cell proliferation and fitness and is unrelated to the presence of SARS-CoV-2. The high CPE observed in Calu-3, HEK293+A+T and UM-UC-4 also makes the detection of negative gene hits unreliable. Additionally, modeling guide-to-guide variability has the added advantage of limiting false positive (type I) errors driven by outlier guide effects, which we found to be problematic with the other scoring methods. For instance, PRSS8 was called significant by limma (FDR = 0.17) and DrugZ (FDR = 0.001); however, when the variance was accurately modelled across replicates by the qGI score, PRSS8 was no longer significant (FDR = 0.41). Consistently, PRSS8 failed to validate in our follow-up experiments, as discussed below. The increased specificity of our score is further exemplified in the UM-UC-4 and HEK293+A+T screens, also shown in Supplemental Figure 5, where ACE2 was the only significant hit at FDR <0.2 using our method. Additional essential genes varied depending on the scoring method used, thus indicating these were most likely false positive calls.
Scoring with the qGI method corrected for the noise that could dominate over biological signals as a result of a paucity of hits, and more accurately identified gene hits at the desired cut-off. For example, in Calu-3 cells the top three hits (ACE2, TMPRSS2 and KMT2C), which all validated, were collectively elevated to the top of the ranked gene list by our method compared to others.
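Why modelling guide-to-guide variability suppresses outlier-driven calls can be shown with a toy calculation. The sketch below is not the modified qGI score and omits its non-linearity correction and variance moderation; it only contrasts a mean-based gene score with a one-sample t-statistic across guides. The per-guide differential log fold changes are invented for illustration.

```python
import math
import statistics


def gene_mean(diff_lfcs):
    """Mean differential log fold change (treated minus control) across guides."""
    return statistics.mean(diff_lfcs)


def gene_t_statistic(diff_lfcs):
    """One-sample t-statistic: mean divided by its standard error across guides."""
    n = len(diff_lfcs)
    if n < 2:
        raise ValueError("need at least two guides to estimate variability")
    sd = statistics.stdev(diff_lfcs)
    if sd == 0:
        raise ValueError("guide effects are identical; t-statistic is undefined")
    return statistics.mean(diff_lfcs) / (sd / math.sqrt(n))


# Hypothetical per-guide effects for two genes with four guides each.
consistent_gene = [2.0, 2.2, 1.8, 2.1]  # all guides agree
outlier_gene = [8.0, 0.1, -0.1, 0.0]    # one guide drives the signal

mean_consistent = gene_mean(consistent_gene)
mean_outlier = gene_mean(outlier_gene)
t_consistent = gene_t_statistic(consistent_gene)
t_outlier = gene_t_statistic(outlier_gene)
```

Both toy genes have a mean effect of about 2, so a score based on the mean alone ranks them together, whereas the t-statistic is large only for the gene whose guides agree.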
The SARS-CoV-2 gene interactions identified are shown in Fig. 4a (Supplemental Table 5). As expected, ACE2 was a top resistance hit at FDR <0.1 in all cell lines screened, validating the ability of our screens to select for host factors required for SARS-CoV-2 cytotoxicity. In some cases, such as UM-UC-4 and HEK293+A+T cells where high CPE was observed, ACE2 was the only gene significantly enriched, with the magnitude of the differential score increasing over time in the rechallenged populations (Fig. 4b). Attempts to reduce the CPE in UM-UC-4, Calu-3, and HEK293+A+T cells by reducing the MOI only delayed CPE onset, presumably due to the robust replication of the virus in these cell lines. ACE2 was also the top positive interaction in these samples. Screens in HuH-7 cells, which had more moderate SARS-CoV-2 infection compared to the other cell lines, also identified ACE2 within the top three ranked hits. Knockout of endogenous ACE2 in the human cell lines screened (Calu-3, HuH-7 and UM-UC-4, Supplemental Figure 7a) confirmed the ACE2 host dependency, with complete inhibition of SARS-CoV-2 infection and absence of CPE (Fig. 4c).
Fig. 4.
Enriched pro-viral gene hits identify host-SARS-CoV-2 genetic interactions.
A) Gene-level differential scores showing top genes conferring resistance (red) to SARS-CoV-2 for the following screens and timepoints: Vero E6 (T16); UM-UC-4 (T23); HEK293+A+T (T12); HuH-7 (T15) and Calu-3 (T43). Gene hits at FDR <0.1 are labeled. B) ACE2 guide enrichment over time in SARS-CoV-2 resistant populations. C) ACE2 knockout in human cell lines confers inhibition against SARS-CoV-2 infection. WT (black); ACE2 knockout (grey); % N-protein values are indicated above each bar. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
2.3. Impact of protease rewiring on virus infectivity
One limitation of screening Vero E6 cells with a library designed against the human genome is that only 56% of the library guides map perfectly to the African green monkey genome, and for stringency, we restricted our analysis to guides with perfect matches. While only one of the four CTSL library guides passed this filter, three CTSL guides appear enriched in unfiltered SARS-CoV-2-treated Vero screen. Two of these guides have single nucleotide mismatches in the PAM-distal region that presumably have limited impact on guide function. Thus, of the cells with confirmed expression of CTSL protein (Fig. 2a), only Vero E6 cells appear to show a dependency on CTSL for viral infection.
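The distinction between perfect matches and guides with tolerated PAM-distal mismatches can be expressed as a simple classifier. This is an illustrative sketch rather than the mapping procedure used in the study: it assumes 20-nt guides written 5' to 3' with the PAM following the 3' end, and it treats the 12 PAM-proximal bases as the mismatch-sensitive seed, which is a commonly used approximation and not a value taken from the text. The sequences are arbitrary placeholders.

```python
def mismatch_positions(guide, target):
    """0-based positions (5' to 3') where guide and target site differ."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return [i for i, (a, b) in enumerate(zip(guide.upper(), target.upper())) if a != b]


def classify_guide(guide, target, seed_len=12):
    """Label a guide as 'perfect', 'pam_distal_only' or 'seed_mismatch'.

    seed_len is an assumed size for the PAM-proximal seed region.
    """
    positions = mismatch_positions(guide, target)
    if not positions:
        return "perfect"
    seed_start = len(guide) - seed_len  # PAM-distal bases lie before this index
    if all(p < seed_start for p in positions):
        return "pam_distal_only"
    return "seed_mismatch"


# Placeholder sequences, not real TKOv3 guides.
toy_guide = "ACGTACGTACGTACGTACGT"
label_perfect = classify_guide(toy_guide, "ACGTACGTACGTACGTACGT")
label_distal = classify_guide(toy_guide, "ACCTACGTACGTACGTACGT")    # mismatch at position 2
label_proximal = classify_guide(toy_guide, "ACGTACGTACGTACGTACAT")  # mismatch at position 18
```

A stringent filter such as the one described above keeps only the "perfect" class; relaxing it to include "pam_distal_only" guides would be one way to recover guides like the two CTSL guides mentioned.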
Subsequent knockout of CTSL in Vero E6 cells, confirmed by Western blot, validated CTSL as a host gene required for SARS-CoV-2 infection (Supplemental Table 6; Supplemental Figure 7b). In contrast, infection of Calu-3 cells, which endogenously express the serine protease TMPRSS2, required TMPRSS2 as a host factor (FDR <0.1). Surprisingly, TMPRSS2 was not identified as a hit in HEK293+A+T cells, indicating that over-expression of ACE2 is sufficient for SARS-CoV-2 infection in this context. Evaluation of SARS-CoV-2 infection with ACE2 alone compared to ACE2 and TMPRSS2 over-expression in HEK-293 cells confirmed that the homogeneous expression of ACE2 was the key factor required for SARS-CoV-2 permissivity in HEK293 cells with or without TMPRSS2 (Supplemental figure 5). Notably, combined over-expression of ACE2 and TMPRSS2 led to more rapid onset of CPE and produced higher viral titers compared to cells over-expressing ACE2 alone, indicating that TMPRSS2 does increase viral replication and production but is not strictly required for viral entry and the manifestation of CPE in these cells due to the high expression of ACE2.
The identification of CTSL or TMPRSS2 was cell line-dependent and based on the expression status of both genes. While CTSL was previously reported as a host factor in both HuH-7.5 and A549+ACE2 screens [22,25], and was ranked in the top 50 in our HuH-7 screens, it was not a prominently scoring gene in any of these screens. Failure to identify specific proteases in these screens shows that SARS-CoV-2 depends mostly on ACE2 for entry into host cells in vitro, and less so on the specific protease required to prime the Spike receptor.
We next sought to investigate whether the specific protease utilized in different cell lines was mutually exclusive or whether it simply depends on the protease expression profile in each cell line. To address this, TMPRSS2 and CTSL genes were either disrupted using CRISPR, or stably over-expressed in Vero E6 and Calu-3 cells. Calu-3 cells, which are TMPRSS2-dependent according to our results, became resistant to SARS-CoV-2 infection when TMPRSS2 was disrupted (Fig. 5a). Over-expression of CTSL in the presence or absence of TMPRSS2 had no impact on SARS-CoV-2 infection in Calu-3 cells (Fig. 5a). Similar results were observed in Vero E6 cells, in which SARS-CoV-2 entry was CTSL-dependent; over-expression of TMPRSS2 in the presence or absence of CTSL did not result in increased infection, as measured by N-protein abundance (Fig. 5b). In UM-UC-4 cells, where both proteases are expressed, neither protease was identified as a screen hit. Simultaneous disruption of both CTSL and TMPRSS2 only inhibited infection and reduced CPE at 24 hpi. By 48 hpi, the differences in magnitude of infection compared to WT were no longer evident (Fig. 5c). It is unclear whether other host proteases facilitate the cleavage function in the absence of TMPRSS2 and CTSL, or whether cell entry in vitro was carried out independently of spike cleavage.
Fig. 5.
TMPRSS2 and CTSL rewiring for S-protein priming.
Cell line-specific dependence on TMPRSS2 or CTSL for host cell entry was evaluated by overexpression and knockout of TMPRSS2 in CTSL-dependent cell lines and CTSL in TMPRSS2-dependent cell lines. Modified cells were challenged with SARS-CoV-2 at MOI 1, and at 24–48 hpi changes to virus transmissibility were measured by immunostaining for SARS-CoV-2 N-protein and DAPI for cell viability in A) Calu-3 (48 hpi), B) Vero E6 (48 hpi) and C) UM-UC-4 cells (24 hpi and 48 hpi). (OE) indicates overexpression, (KO) indicates knockout. A minimum of two replicates were performed for each condition. p-values were determined by Dunnett's post hoc multiple comparisons test, * p <0.1, **<0.01, ***<0.001, ****<0.0001.
2.4. Comparative analyses of genetic screens confirm little overlap across models
To determine whether there were any common host factors that could be leveraged for broad-spectrum antiviral therapy, we compared the hits from each screen and included those previously reported in Vero E6 [27], A549+ACE2 [22], native HuH-7.5 [25] and Calu-3 cells [43]. To systematically identify overlapping hits, the published screens were first re-scored using the qGI scoring method. Using this scoring method, the screens were similarly corrected for the non-linearity observed in SARS-CoV-2 cytopathic screens, allowing for standardized comparison between our screens and previously-published screens, which is not possible using Z-scores alone. Gene hits were chosen using an FDR <0.2 and differential effect size >0.5 and the resulting lists limited to the top 50 genes. The published studies required a more liberal FDR filter of <0.5 due to the higher variance observed in these screens, as well as fewer replicates available. As indicated in Fig. 6a, ACE2 was the only gene that was identified in all cell lines and screens. Not surprisingly, the greatest number of overlapping genes was observed between screens done in the same cell line. For example, we observed 10 common factors when comparing our findings to the Vero E6 screen by Wei et al. [27], including SWI/SNF complex genes ARID1A and SMARCA4, which were validated to be SARS-CoV-2 pro-viral genes.
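The hit-selection and overlap steps described above can be sketched in a few lines. This is a minimal illustration of the stated thresholds (FDR <0.2 and differential effect >0.5 for this study, a relaxed FDR <0.5 for published screens, top 50 genes per screen), not the authors' code; the record layout, function names and every numeric value in the example are hypothetical.

```python
from collections import Counter


def select_hits(rows, fdr_cutoff=0.2, effect_cutoff=0.5, top_n=50):
    """Filter by FDR and effect size, then keep the top_n genes by effect."""
    passing = [r for r in rows if r["fdr"] < fdr_cutoff and r["effect"] > effect_cutoff]
    passing.sort(key=lambda r: r["effect"], reverse=True)
    return [r["gene"] for r in passing[:top_n]]


def screens_per_gene(hits_by_screen):
    """Count in how many screens each gene was called a hit."""
    counts = Counter()
    for hits in hits_by_screen.values():
        counts.update(set(hits))
    return counts


# Made-up scores for illustration; GENE_X and GENE_Y are placeholders.
this_study = [
    {"gene": "ACE2", "fdr": 0.001, "effect": 3.0},
    {"gene": "GENE_X", "fdr": 0.15, "effect": 0.8},
    {"gene": "GENE_Y", "fdr": 0.30, "effect": 2.0},  # fails the FDR filter
]
published = [
    {"gene": "ACE2", "fdr": 0.01, "effect": 2.5},
    {"gene": "GENE_X", "fdr": 0.05, "effect": 0.3},  # fails the effect-size filter
]

hits = {
    "this_study": select_hits(this_study),
    "published": select_hits(published, fdr_cutoff=0.5),  # relaxed FDR for published data
}
counts = screens_per_gene(hits)
shared_by_all = sorted(g for g, n in counts.items() if n == len(hits))
```

The per-gene counts are the quantity an upset plot summarises: genes with a count equal to the number of screens form the intersection shared by all of them.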
Fig. 6.
Comparison of overlap between different SARS-CoV-2 genome-wide CRISPR screens.
A) Upset plot showing overlap of the top 50 significant hits from this study ranked by differential effects (FDR <0.2) and previously reported screens (FDR <0.5).
B) Enriched GO biological processes for common overlapping gene hits.
Thirteen additional genes, including ACE2, overlapped in at least two different cell lines. Closer examination revealed genes that shared common functions amongst biological processes, which mainly involved regulation of transcription and chromatin organization (Fig. 6b). Although their exact role in SARS-CoV-2 infection and replication remains to be determined, the most enriched pro-viral genes reveal various scenarios by which SARS-CoV-2 could interfere with signalling networks to evade host cytoprotective processes and facilitate viral pathogenesis, some of which have also been shown to impact other viruses (Table 1). For example, HNF1B was identified in both Vero E6 and HuH-7 cells, is a transcription factor found in several tissues including liver and kidney, and is required for regulation of ACE2 expression [46,52,53]. SMAD4 and STRAP have both been shown to play a role in regulation of cell proliferation and TGF-β signalling [49,50]. Modulation of TGF-β receptors or SMAD factors is one of the main ways viruses can interfere with TGF-β signalling. Recently, SARS-CoV N-proteins have been reported to inhibit formation of SMAD3/4 complexes resulting in reduced apoptosis [49].
Table 1.
Common overlapping host genes and potential host pathway impacted.
| Gene | Biological process | Evidence for interaction with other viruses and impact on antiviral pathways | Reference |
|---|---|---|---|
| ARID1A | Component of SWI/SNF chromatin remodeling complex | Facilitates induction of antiviral type 1 interferon | [44] |
| DYRK1A | Dual-specificity kinase involved in multiple cellular functions, including regulation of cell proliferation | Regulation of HIV, HCV, HCMV, HSV-1 replication | [44] [45] |
| HNF1B | Transcription factor in liver cells | Regulates ACE2 expression | [46] |
| TADA1 | Transcriptional adaptor protein, component of STAGA complex | Epigenetic regulation of host by chromatin modification and transcription, no known viral interactions | |
| MED12 | Component of the mediator complex, involved in the regulated transcription of RNA polymerase II-dependent genes | Viruses known to target RNA pol II components (HIV, HSV, HCMV, HBV). Knockdown of different components of the mediator complex shown to inhibit HIV replication. | [47] [48] |
| SMAD4 | Regulation of TGF-β pathway | Viruses (e.g. HCV, HBV, HPV) have been shown to interact with SMAD factors to modulate TGF-β signalling | [49] |
| MED24 | Component of the mediator complex, involved in the regulated transcription of RNA polymerase II-dependent genes | Viruses known to target RNA pol II components (HIV, HSV, HCMV, HBV). Knockdown of different components of the mediator complex shown to inhibit HIV replication. | [47] [48] |
| STRAP | Serine/threonine kinase receptor associated protein that negatively regulates TGF-β pathway | Viruses shown to interfere with TGF-β pathway by modulation of transcriptional regulation | [49] [50] |
| KMT2C | Lysine methyltransferase, chromatin modulator | Chromatin modification and transcription, no known viral interactions | |
| KDM6A | Lysine demethylase, chromatin modulator | HPV shown to induce KDM6A, resulting in epigenetic reprogramming of host cells and tumorigenesis | [51] |
Although RNA viruses such as SARS-CoV-2 cannot directly alter the host genetic sequence, recent research suggests RNA viruses can antagonize the immune system through epigenetic mechanisms, thereby facilitating infection, spread and transmission [54]. DPF2, UBXN7 and JMJD6 have been shown to be involved in NF-κB or type I IFN signalling, and these genes were found as common hits in the Vero E6 screens described above [55–57]. Moreover, DPF2 is a negative regulator of the non-canonical NF-κB pathway and has been co-opted by the influenza virus to suppress IFN-β expression and signalling to evade the host immune response [56]. Lastly, RAD54L2, MED12, MED24, TADA1, KMT2C and KDM6A encode DNA binding transcription factors known to mediate gene activity in response to environmental cues. Perturbation of RAD54L2 or KDM6A resulted in resistance to SARS-CoV-2 infection in Vero E6 cells transduced with guides targeting these genes [27].
To unify the hits across all screens, GSEA analysis was performed on the full ranked list of genes based on the mean effect across all the screens (Calu-3, Vero E6, HEK-293+A+T, HuH-7 and UM-UC-4) and identified several cellular processes related to SARS-CoV-2 infection. At FDR <0.2, enriched pathways included cellular processes related to SARS, transcriptional co-activation, epigenetic chromatin modification and inflammation (Fig. 7). These enrichments point to viral mechanisms used to hijack the host translation machinery through epigenetic pathways to enable infection, spread and transmission by antagonizing the immune system [54]. Despite our focus on the positive hits in our screens, analysis of the negative hits revealed cohesive enrichments that included viral-related processes such as maturation of SARS-CoV-2 Spike protein and HIV and influenza life cycle pathways (Fig. 7).
Fig. 7.
Enrichment of common cellular processes related to SARS-CoV-2 genetic interactions.
A) Enriched pathways for positive terms. B) Enriched pathways for negative terms.
2.5. Genetic screen validation
Since we did not identify any novel genes in common across all the cell lines, apart from the known entry host factors (ACE2, TMPRSS2 or CTSL), we aimed to validate several of the top enriched and high scoring genes in Calu-3 cells, based on their potential role in viral infection, including PRSS8, IRF9, KMT2C, and MYZAP. PRSS8 encodes a serine protease, while IRF9 encodes interferon regulatory factor 9, an integral transcription factor in mediating type I interferon antiviral response. MYZAP encodes myocardial zonula adherens protein, which targets tight junction proteins to cell junctions, and was identified using the DrugZ scoring algorithm. Lastly, KMT2C encodes a chromatin modification protein that would impact epigenetic marks and cell state. Polyclonal CRISPR knockouts were generated for each gene by transducing gRNAs individually with Cas9 ribonucleoproteins (RNPs), which resulted in high indel frequencies (Supplemental Table 6). Viral infection and cell viability were assessed through detection of N-protein by immunofluorescence and nuclei counts relative to uninfected cells, respectively (Fig. 8a). PRSS8 and MYZAP knockout pools did not show resistance to SARS-CoV-2 infection. Notably, both genes were significant when scored with other methods (FDR <0.2), but after correction for guide variance using our methods outlined above, the FDR increased beyond our selection threshold. Only the guides targeting the chromatin modifying gene KMT2C significantly reduced SARS-CoV-2 infection and CPE, confirming that this gene participates in SARS-CoV-2 infection. Notably, KMT2C is a component of the MLL2/3 complex (also known as the ASCOM complex) and can promote chromatin remodeling by activation of the SWI/SNF complex [58]. SWI/SNF subunits were previously validated by Wei et al. in Vero E6 cells.
We evaluated the impact of KMT2C on ACE2 expression levels and surface localization, as some identified host gene functions have been reported to converge on the regulation of ACE2 rather than direct viral interactions. No changes were found in the surface expression of ACE2 in KMT2C knockouts, confirming that loss of this gene did not impact ACE2 expression or localization (Supplemental Figure 8).
Fig. 8.
Validation of resistance gene hits.
sgRNAs targeting individual genes were introduced into Calu-3 cells, which were then challenged with SARS-CoV-2 at MOI 1. A) SARS-CoV-2 N-protein and B) cell viability were measured at 48 hpi and compared to WT; all data points generated from at least n = 2 and expressed as mean ± SEM. C) ACE2 surface expression measured by flow cytometry. p-values determined by Dunnett's post hoc multiple comparisons test, ** <0.01, ****<0.0001.
2.6. Global transcriptional changes to host cells in response to SARS-CoV-2
The global transcriptional changes in host cells in response to SARS-CoV-2 infection were also investigated using Calu-3 and HuH-7 cells, which represented cell lines with high versus moderate infection levels, respectively. These cell lines were infected with SARS-CoV-2 at an MOI ≈1, poly A+ RNA was extracted from infected cells at 0, 12 and 24 hpi, and alterations in mRNA levels, patterns of mRNA splicing and mRNA 3′-end formation (alternative polyadenylation) were assessed (Supplemental Figure 9, Supplemental Table 7). In HuH-7 cells, 274 genes were differentially expressed relative to mock samples at 12 hpi. Enriched among genes with reduced expression were genes encoding transcription factors, of which six function in immune response pathways or are known to be differentially expressed upon viral infection (PPARA, FOXO1, FOXO3, JUN, JUNB and BCL6). An unrelated set of 92 genes was differentially expressed at 12 hpi and 58 genes at 24 hpi. Genes induced at 24 hpi were preferentially involved with type I interferon production and signalling, consistent with a recent report [59]. Only minor changes to alternative splicing or alternative polyadenylation during virus infection were observed (Supplemental Figure 9).
3. Discussion
Here we report genome-wide loss of function CRISPR screens in multiple host cell lines of different tissue origin including human lung cells (Calu-3), liver cells (HuH-7), embryonic kidney cells (HEK-293+A+T), a novel bladder cell line (UM-UC-4) and African green monkey kidney cells (Vero E6). The goal was to identify common host genes critical for SARS-CoV-2 infection, that could be leveraged for antiviral therapeutic development. All cell lines were permissive to SARS-CoV-2 infection with high cytopathic effects observed in Calu-3, UM-UC-4 and HEK-293+A+T. UM-UC-4 is introduced here as a new cell line that is permissive to SARS-CoV-2 infection and was found to have high endogenous ACE2 expression. The cytopathic effect in HuH-7 cells was much lower, primarily due to the decreased infection achieved in these cells. The permissiveness of cell lines could be attributed to expression levels of ACE2, as the Caco-2 cells used in this study had undetectable expression of ACE2 by Western blot and minimal SARS-CoV-2 infection compared to the other cell lines. Due to the cytopathic effects of SARS-CoV-2, all screens were biased towards positive selections, primarily identifying pro-viral genes required for SARS-CoV-2 cytotoxicity.
ACE2 was the only gene found to be common across all cell lines used in this study and in some reported screens [22,27,43]. Additionally, enrichment of sgRNAs targeting ACE2 increased over time and in rechallenged SARS-CoV-2-resistant cell pools. We concluded that the identification of ACE2 serves as a positive indicator for effective SARS-CoV-2 genetic screens. Although not shown, screens with Caco-2 cells, which had low infectivity (∼2–6%), showed little variation in sgRNA abundance between treated and control samples and ACE2 was not identified, suggesting this screen was not effective for identifying SARS-CoV-2 host factors.
Apart from ACE2, the known proteases required for Spike processing, TMPRSS2 and CTSL, were also identified. The TMPRSS2 pathway is required for early entry of SARS-CoV-2 into the cells, where adjacent S-proteins are processed at the cell membrane, while late entry occurs when S-proteins traffic to late endosomes/lysosomes for processing by endosomal cathepsins [60,61]. The identification of each protease was cell line-specific and, when detected, each was found in cells that expressed one protease and not the other. Accordingly, TMPRSS2 was only identified in Calu-3 screens, which expressed TMPRSS2 but not CTSL, while CTSL was identified in Vero E6 cells, which expressed CTSL but not TMPRSS2 [62]. CTSL protein is detected in HuH-7 cells but was not a significant gene hit, a finding similar to that of screens in HuH-7.5 liver and A549+ACE2 lung cells [22,25,26]. Surprisingly, TMPRSS2 was not identified in HEK-293 cells that over-expressed TMPRSS2. Similar results were found by Wang et al., in which TMPRSS2 was not identified in their HuH-7.5 cells over-expressing TMPRSS2. Comparison of 10-fold serially diluted SARS-CoV-2 infection in ACE2- versus ACE2+TMPRSS2-overexpressing HEK-293 cells showed ACE2 alone was sufficient for virus infection due to the highly engineered expression levels. However, the addition of TMPRSS2 was beneficial, enhancing viral production 63-fold (Supplemental Figure 6). In UM-UC-4 cells, in which both TMPRSS2 and CTSL were detected, simultaneous disruption of both proteases did not impact the overall infection but delayed its onset. Overall, these results indicate TMPRSS2 and CTSL are not utilized in a mutually exclusive manner. It has been shown that SARS-CoV-2 can use both CatB/L as well as TMPRSS2 for priming S-protein [6].
Notably, a recent report modeling the TMPRSS2 and CTSB/L pathway usage across cell lines suggests that the usage of the two pathways varied across cell lines, spanning the range of exclusive usage of one pathway to nearly even usage of both, depending on the corresponding protease expression levels [63], in agreement with our findings. In addition, other studies have reported TMPRSS2-related proteases and cathepsins may contribute to S-protein cleavage, such as TMPRSS11s, TMPRSS4, and CTSB [64,65].
One key difference between SARS-CoV-2 and SARS-CoV-1 is the more efficient use of the TMPRSS2 early entry pathway, resulting in its higher infectivity and transmissibility. This has been attributed to an insertion of the multibasic RRAR furin cleavage site at the S1/S2 junction that is absent from SARS-CoV-1 and other related coronaviruses [66]. Recent studies revealed propagation of SARS-CoV-2 in TMPRSS2 deficient Vero cells introduced mutations in the S-gene and specifically could result in deletion of the polybasic cleavage site [67]. Sequencing of the SARS-CoV-2 viral stock used in these screens confirmed the furin cleavage site was uninterrupted. Although a missense mutation was detected in the furin proximal region (Supplementary Table 1), this change was detected in only a small portion of the sequence (∼30%). Therefore, the majority of the virus stock maintained the cleavage site, as evidenced by the strong pathogenicity of the virus in our screens. Furin cleavage at the multibasic site primes the S-protein to enter via the faster TMPRSS2 pathway. Furin is mainly expressed in the Golgi network, therefore, it is likely this step occurs during viral production [68]. A recent study demonstrated that furin knockout reduced but did not prevent S-protein-mediated cell to cell fusion [68]. Therefore, while furin promotes SARS-CoV-2 infectivity and spread, it is not essential. These findings may explain why furin has not been identified in any of the CRISPR screens reported so far, despite its clear role in increasing virulence of SARS-CoV-2. This points to the restrictions of screening highly cytopathic SARS-CoV-2 isolates, that elicit survival-based phenotypes and consequently bias towards enrichment of early-stage host factors, like ACE2 and TMPRSS2 and CTSB/L proteases.
Multiple screens have been reported early this year assessing SARS-CoV-2 host factors by CRISPR-based knockout screens. Beyond ACE2, other gene hits were context-specific, and any overlap of host genes was usually restricted to screens performed in common cell lines [19,43]. The Vero E6 screen performed here identified similar hits to the Vero E6 screen by Wei et al. [27], both enriching for genes that regulate transcription and chromatin modification. A screen in A549 cells enriched for endosomal processes, such as endosomal acidification, sorting and recycling, linking A549 cells to the endosomal entry pathway. Consistent with these observations, CTSL, which regulates the late endosomal entry pathway, ranked much higher than TMPRSS2. Perturbation of the genes enriched in these pathways, like Rab7A, also perturbed ACE2 cell surface expression [14,22]. While HuH-7 cells enriched for a set of transcription regulators like BMPR1A, RGMB, and SMAD4, which are involved in TGF-β signalling, a HuH-7 derivative cell line, HuH-7.5, identified different host factors that were also linked to receptor usage. These enrichments included GAG biosynthesis, cholesterol homeostasis and RAB GTPases, which were shown to influence ACE2 trafficking to the cell surface [22]. This motivated investigation of ACE2 levels in our validation studies. Here, ACE2 levels were not changed in the presence of any gene perturbation in Calu-3 cells, indicating that the hits selected in our screens, such as chromatin modifier KMT2C, did not converge at regulation of ACE2 (Supplemental Figure 8).
In conclusion, our study provides an independent set of positive selection CRISPR screens and an integrative comparison of published screens across multiple cell lines. Our data reveals a particularly idiosyncratic set of host factors identified across cell lines, in agreement with previously reported screens. It was recently suggested that technical or biological differences between screens could explain the lack of overlap [69]. Although genetic variation can exist between the same cell line used by different groups, most overlaps were found in common cell lines despite being performed by different groups using different libraries. Thus, we argue it is more likely that biological differences underpin the lack of overlap across different cell lines. We further suggest that correcting for non-linear trends in the datasets and guide-to-guide variability observed in biological phenotypes with sparse enrichments is required to reduce type I errors and improve the accuracy of the gene hits. Across these efforts the consensus enrichment of ACE2, TMPRSS2 or CTSL suggests these screens enrich for early-stage host factors, which is typical of pooled cell survival screens using highly cytopathic viruses [18]. Follow-up screens with more indolent versions of SARS-CoV-2 may identify dependency factors that act at later stages of replication. Nevertheless, the network of host factors that have been identified will be broadly applicable to understanding the impact of SARS-CoV-2 on human cells and facilitate the development of host-directed therapies. As it stands, the consensus view is that the virus that causes COVID-19 will become endemic but will pose less danger over time [70]. The future will depend on a combination of annual vaccines, acquired immunity and new therapeutics.
3.1. Limitations of this study
This study was limited to finding host entry factors required for SARS-CoV-2 infection due to the high cytopathic effect of SARS-CoV-2 in vitro. These types of positive selection screens are strongly biased for pro-viral host factors (i.e., gene products that are required for viral entry) due to the small population of cells surviving infection. As a result, the number of host factors that could be discovered was limited, and identifying anti-viral host factors (i.e., genes that restrict replication) was not feasible. Despite cohesive enrichment of gene hits for biological processes commonly involved in viral pathogenesis, the number of hits is small, and further analysis of mechanisms underlying how these genes regulate SARS-CoV-2 infection is required.
4. Materials and methods
4.1. Cell lines
Vero E6 (ATCC), Caco-2 (ATCC), HuH-7 (ATCC), HEK-293+A+T (ACE2, TMPRSS2 stable over-expression), and UM-UC-4 (ECACC) cells were cultured in DMEM (Wisent) with 10% FBS (Gibco) and 1% penicillin-streptomycin (Gibco). Calu-3 and UM-UC-4 cells were cultured in EMEM (Wisent) with 10% FBS and 1% penicillin-streptomycin. All cell lines were dissociated with trypsin-EDTA (Gibco) and maintained at 37 °C and 5% CO2. Cells were regularly monitored for mycoplasma contamination.
4.2. Virus stocks
The SARS-CoV-2/SB3-TYAGNC isolate [37] used here was propagated in Vero E6 cells at 37 °C to generate P5 viral stock. Cells were inoculated for 2–3 days, at which point cytopathic effects were apparent. Supernatants were harvested, centrifuged, filtered and stored at −80 °C. Viral titers of collected supernatants (50% tissue culture infectious dose, TCID50/mL) were determined according to the Spearman and Karber method [71] as outlined previously [37]. Collected virus stocks were evaluated by extracting RNA from supernatant using the QIAamp viral RNA kit (Qiagen) and sequenced on an Illumina MiniSeq using the ARTICv4.1 amplicon scheme and the SIGNAL v1.5.0 workflow [72]. All experiments with SARS-CoV-2/SB3-TYAGNC were performed at the University of Toronto Combined Containment Level 3 (C-CL3) facility.
4.3. SARS-CoV-2 virus infections
Host target cells were seeded overnight to achieve 90% confluence at the time of infection. Monolayers were infected with SARS-CoV-2/SB3-TYAGNC at indicated MOIs for 1 h at 37 °C in serum-free DMEM or EMEM. After 1 h, the infection inoculum was removed, and cells were washed and replenished with regular cell culture media depending on the cell line. At 24–120 h post-infection, supernatants were harvested and stored at −80 °C, and cells were fixed with 4% neutral buffered formalin (NBF) for 1 h at room temperature for further analysis.
4.4. Immunofluorescence and confocal microscopy
To detect SARS-CoV-2 N-protein expression, cell cultures were seeded and infected at the desired MOI in black 96-well PerkinElmer CellCarrier ultra imaging plates (PerkinElmer). Infected cells were fixed with 10% neutral buffered formalin (NBF). After fixation, cells were permeabilized with 0.1% Triton X-100 for 10 min at room temperature, washed with PBS and blocked for 1 h in solution containing PBS, 2% BSA and 2% FBS. After blocking, cells were stained overnight at 4 °C with anti-SARS-CoV-2 N-protein antibody (Genscript HC2003) at a dilution of 1:500 in PBS + 1% BSA + 1% FBS. The next day, cells were washed with PBS and incubated with secondary antibody, anti-human Alexa488-IgG (Fc specific) (Jackson Immunoresearch 109-545-098), at a dilution of 1:400 for 45 min at room temperature. F-actin was visualized in the same samples by incubation with phalloidin-568 (ThermoFisher Scientific A12380). Finally, cells were washed with PBS and counterstained with Hoechst 33258 to visualize nuclei. Cells were imaged on the Opera Phenix QHES (PerkinElmer) automated high-content screening system. The percentage of infected cells was determined using Harmony™ high content imaging and analysis software by comparing cells stained positive with anti-N antibody versus Hoechst- and actin-stained cells.
4.5. Western blot analysis
Cells were lysed in RIPA buffer with protease inhibitor cocktail (Pierce) and centrifuged at 14,000 rpm for 15 min at 4 °C. Cell lysates were mixed with NuPage LDS sample buffer supplemented with a reducing agent (Life Technologies) and boiled for 10 min. Next, 7–8 μg of protein was resolved on 4–12% Bis-Tris gels (Life Technologies) and transferred to a PVDF membrane using an X-cell surelock mini transfer apparatus (Life Technologies) at 22 V for 1 h. Membranes were blocked with 5% skimmed milk (BioShop) in 1X TBST prior to antibody incubation. Subsequently, proteins were detected using anti-ACE2 (CST 92485), anti-TMPRSS2 (AbCam ab109131), anti-CTSL (R&D AF952), anti-α-tubulin (MilliporeSigma sc-53646) and anti-HSP90 (SantaCruz, SC-13119) antibodies with an appropriate HRP-conjugated secondary antibody: anti-mouse IgG (Cell Signalling Technologies 7076, RRID:AB_330924), anti-rabbit IgG (Cell Signalling Technologies 7074, RRID:AB_2099233) or anti-goat IgG (R&D HAF017). Proteins were visualized on iBright Imaging Systems (ThermoFisher Scientific) using Pierce SuperSignal West Pico ECL reagents (ThermoFisher Scientific).
4.6. ACE2 flow cytometry analysis
For ACE2 labeling, cells were harvested with TrypLE Express, washed once with FACS buffer (1% BSA in PBS and 5 mM EDTA) and incubated for 40 min in FACS buffer containing 1–2 μg of human ACE2-488 conjugated antibody (R&D Systems FAB9332) and 1:1000 7-AAD (ThermoFisher Scientific). Mouse IgG2A-488 (R&D Systems, IC003G) was used as an isotype control. Flow cytometry was performed using the iQue® Advanced Flow Cytometry platform (Sartorius) and analyzed with FlowJo v10 software.
4.7. Genome-wide CRISPR screens
The human Toronto knockout v3 (TKOv3) genome-scale CRISPR library (Addgene #90294) was used to perform pooled CRISPR knockout screens in Vero E6, Caco-2, HuH-7, Calu-3 and HEK-293+A+T cells [38]. Host cells were transduced with TKOv3 lentivirus at an MOI of 0.3, such that each sgRNA was represented in about 200 cells. Twenty-four hours after, TKOv3 transduced cells were selected with puromycin (3-12 μg/ml) for 48 h. After 48 h, cells were harvested, pooled, and split into three replicates of at least 1.5 × 10^7 cells (minimum 200-fold coverage TKOv3 library) and passaged every 3–6 days depending on the doubling time of the host cell line, maintaining coverage at 200-fold. 3 × 10^7 cells were collected for genomic DNA extraction at T0, and at every passage thereafter. After two cell passages, transduced cells were infected with SARS-CoV-2 at the determined MOI for each host cell. For cytopathic screens (Vero E6, Calu-3, HEK-293+A+T), 2 × 10^7 cells per replicate were infected with SARS-CoV-2 at the indicated MOI. Virus-induced CPE was apparent 2–3 days after infection, after which surviving cells were maintained in standard growth media until confluence. SARS-CoV-2-treated cells were collected for genomic DNA extraction and passaged for repeat infection with SARS-CoV-2 until no CPE was observed. For non-cytopathic drop-out screens (HuH-7, Caco-2), SARS-CoV-2-infected cells were passaged every 3–4 days, at which point cells were re-infected with SARS-CoV-2 for 2–3 rounds. For all screens, mock-infected cells were passaged, and cell pellets collected for genomic DNA every 3–6 days until completion to serve as a reference for sgRNA analysis.
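The coverage and MOI figures above can be checked with simple arithmetic. The sketch below is illustrative only: the TKOv3 library size of 71,090 guides is an assumption taken from the public library description and is not stated in this section, and the Poisson model of lentiviral integration is a common simplification rather than something measured here.

```python
import math

# Assumed library size (not stated in the text above).
TKOV3_GUIDES = 71_090


def cells_for_coverage(n_guides: int, fold: int) -> int:
    """Number of transduced cells needed so each guide is in `fold` cells on average."""
    return n_guides * fold


def transduced_fraction(moi: float) -> float:
    """Poisson estimate of the fraction of cells with at least one integration."""
    return 1.0 - math.exp(-moi)


def single_integration_fraction(moi: float) -> float:
    """Among transduced cells, Poisson estimate of those with exactly one integration."""
    return moi * math.exp(-moi) / (1.0 - math.exp(-moi))


if __name__ == "__main__":
    print(cells_for_coverage(TKOV3_GUIDES, 200))        # cells per replicate at 200-fold
    print(round(transduced_fraction(0.3), 3))           # fraction of cells transduced
    print(round(single_integration_fraction(0.3), 3))   # fraction carrying one guide
```

Under these assumptions, 200-fold coverage corresponds to roughly 1.4 × 10^7 cells, in line with the minimum of 1.5 × 10^7 cells per replicate, and at an MOI of 0.3 most transduced cells would be expected to carry a single sgRNA.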
Genomic DNA from all screens was extracted from cell pellets using the Wizard Genomic DNA Purification kit (Promega). Sequencing libraries were prepared from 100 μg of gDNA for control sample and non-cytopathic screen samples or 10 μg of gDNA for cytopathic screen samples via a 2-step nested PCR using primers that include Illumina TruSeq adapters with i5 and i7 indices. Barcoded libraries were gel purified using PureLink Quick Gel Extraction kit (ThermoFisher), sequenced on an Illumina HiSeq2500 using single-read sequencing, and were completed with standard primers for dual-indexing with HiSeq SBS Kit v4 reagents as described in Ref. [28].
4.8. Screen analysis
Demultiplexed FASTQ files were first trimmed by locating constant sequence anchors and extracting 20-bp guide sequences preceding the anchor sequence. Pre-processed single-end reads were aligned to TKOv3 reference library sequences using Bowtie (v0.12.8) allowing up to two mismatches and one exact alignment (specific parameters: -v2 -m1 -p4 --sam-nohead). Successfully aligned reads were counted and combined into a matrix with guide annotations. Differential effects between SARS-CoV-2 treated and control screens were estimated using the method described in Aregger et al. [28], which is briefly summarized here. Read counts for all samples were normalized to ‘counts per million’ (cpm) by dividing each read count by the sum of all read counts in the sample then multiplying by one million and adding a pseudo count of one. Log fold-changes (LFCs) between normalized guide abundance at each sequenced timepoint and the starting timepoint (T0) of each screen were computed. Guides with fewer than 30 or greater than 10,000 absolute read counts at T0 were removed. For the Vero E6 screen, TKOv3 guides that did not map to the Vero E6 genome with perfect homology were filtered out. LFC values for each guide were averaged across the technical replicates, and guide-level residual LFCs between each screen and the corresponding control screen were computed. To correct for potential non-linear relationships between the treatment and control data, a second-degree LOESS curve with a span equal to 40% of the residual LFCs was fit before subtracting its predicted values from the residual LFCs. Gene-level p-values reflecting the difference between SARS-CoV-2 screens and their corresponding control screen were calculated from LOESS-normalized residual LFCs with moderated t-testing as implemented in the limma R package [73]. Multiple hypothesis testing was corrected for using the Benjamini-Hochberg method. 
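The normalization and fold-change steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline from Aregger et al. [28]: the function names are hypothetical, a base-2 logarithm is assumed because the text does not name the base, and the LOESS correction and limma testing are not reproduced.

```python
import math
from typing import Dict


def cpm(counts: Dict[str, int], pseudocount: float = 1.0) -> Dict[str, float]:
    """Counts per million: divide by the sample total, scale by 1e6, add a pseudocount."""
    total = sum(counts.values())
    return {guide: c / total * 1e6 + pseudocount for guide, c in counts.items()}


def log_fold_changes(t0_counts: Dict[str, int],
                     tx_counts: Dict[str, int],
                     min_t0: int = 30,
                     max_t0: int = 10_000) -> Dict[str, float]:
    """Guide-level log2 fold change versus T0, dropping guides outside the T0 count range."""
    t0_norm = cpm(t0_counts)
    tx_norm = cpm(tx_counts)
    return {
        guide: math.log2(tx_norm[guide] / t0_norm[guide])
        for guide, raw in t0_counts.items()
        if min_t0 <= raw <= max_t0 and guide in tx_norm
    }
```

Normalization uses every guide in the sample, and the T0 read-count filter is applied afterwards to the raw counts, following the order in which the steps are described.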
Gene-level differential effects were computed by taking the mean residual LFC for all genes with two or more remaining guides post-filtering based on T0 read counts and, for the Vero E6 screen, guides which mapped to the Vero E6 genome with perfect homology. Intersection of gene hits between different screens was visualized using the Intervene tool [74].
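As an illustration of the gene-level summary described above, the sketch below averages guide-level residual LFCs per gene and skips genes with fewer than two remaining guides. The function name and the guide-to-gene mapping are hypothetical, and the inputs are assumed to be the already filtered, LOESS-normalized residuals.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict


def gene_level_effects(guide_residuals: Dict[str, float],
                       guide_to_gene: Dict[str, str],
                       min_guides: int = 2) -> Dict[str, float]:
    """Mean residual LFC per gene, keeping genes with at least `min_guides` guides."""
    by_gene = defaultdict(list)
    for guide, value in guide_residuals.items():
        by_gene[guide_to_gene[guide]].append(value)
    return {gene: mean(values)
            for gene, values in by_gene.items()
            if len(values) >= min_guides}
```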
4.9. RNA sequencing
Total cellular RNA was extracted using the RNeasy Kit (Qiagen) according to the manufacturer's instructions. Samples were submitted for mRNA-Seq at the Donnelly Sequencing Centre at the University of Toronto (http://ccbr.utoronto.ca/donnelly-sequencing-centre). RNA-seq libraries were sequenced on an Illumina NovaSeq6000 platform using an S2 flowcell at 2 × 151-bp read lengths. Gene expression changes were analyzed by pseudo-aligning pre-trimmed reads to GENCODE v29 transcripts using Salmon v0.14.1 [75] and aggregated per gene using the R package tximport [76]. Differential expression was then assessed using the classic mode (exactTest) of edgeR [77], and genes with an FDR <0.05 and a greater than 2-fold change were considered differential. For each comparison, only genes expressed at a minimum of 5 RPKM in one or both conditions were considered. Alternative splicing was analyzed using Vast-tools v2.2.2 [78] in combination with the VastDB Hs2 library released on Dec. 20, 2019 (https://github.com/vastgroup/vast-tools). Changes were considered significant if they were greater than 10 dPSI/dPIR and the expected minimum change was different from zero at p > 0.95 according to vast-tools’ diff module. Events were filtered requiring a minimum of 10 reads per event and a balance score (quality score 4) of ‘OK’, ‘B1’, ‘B2’, ‘Bl’ or ‘Bn’ for alternative exons or > 0.05 for intron retention events in at least two of three replicates. Alternative cleavage and polyadenylation analysis was performed using QAPA with default settings, essentially as previously described [79]. QAPA builds the reference library of 3′ UTRs from all annotated protein-coding genes using GENCODE basic gene annotation (v19) for humans (hg19), supplemented by experimentally defined polyA sites archived in the PolyAsite database [80]. To quantify the poly(A) site usage (PAU), the percentage of expression of a single 3′ UTR over the total expression level of all 3′ UTRs for a given gene was calculated.
Lengthening events were defined as those with proximal PAU group difference between two conditions (PAU_Group_diff) < −20% and shortening events with proximal PAU_Group_diff >20%.
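The PAU calculation and the event thresholds can be illustrated with a short sketch. This is not QAPA code: the function names are hypothetical, and the sign convention for the group difference (second condition minus first) is an assumption, since the direction of the comparison is not spelled out above.

```python
from typing import Dict


def poly_a_usage(utr_expression: Dict[str, float]) -> Dict[str, float]:
    """PAU (%) of each 3' UTR isoform relative to all 3' UTR isoforms of one gene."""
    total = sum(utr_expression.values())
    if total == 0:
        return {utr: 0.0 for utr in utr_expression}
    return {utr: 100.0 * value / total for utr, value in utr_expression.items()}


def classify_apa(proximal_pau_first: float,
                 proximal_pau_second: float,
                 threshold: float = 20.0) -> str:
    """Classify an event from the change in proximal PAU (assumed: second minus first)."""
    diff = proximal_pau_second - proximal_pau_first
    if diff < -threshold:
        return "lengthening"   # proximal site used less, longer 3' UTR favoured
    if diff > threshold:
        return "shortening"    # proximal site used more, shorter 3' UTR favoured
    return "unchanged"
```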
4.10. Gene set enrichment and network analysis
Analysis of enriched gene functions was performed as ordered queries with g:Profiler [81] using the TKOv3 library as a custom background. The ‘biological process’ domain of Gene Ontology as well as CORUM were selected as annotation sources, and only terms with at least five and at most 1000 members as well as an intersection size with the query of at least 3 genes were considered. For plotting purposes, only terms with at least 3-fold enrichment with respect to the background were used, and when terms mutually overlapped by at least 70% of genes present in the input, only the most enriched term was retained. For screens, enrichment analysis was performed for overlapping hits with differential score >0.5 and FDR <0.2 (or FDR <0.5 for published screens).
GO enrichment analysis for differentially expressed genes and genes containing alternative splicing changes was performed with FuncAssociate 3.0 [82], requiring at minimum a 5-fold enrichment and excluding categories of more than 1000 genes. Terms with a mutual gene overlap of greater than 50% were merged. Gene set analysis for APA changes was performed using the gProfiler R package (v0.7.0) using the GO-Bioprocess, GO-Molecular Function and Reactome pathway standards [81].
Gene set enrichment analysis for all screens was performed on differential LFCs for each screen using the fgsea R package [83] on Molecular Signatures Database Homo sapiens curated gene sets [84]. Gene sets were constrained to those containing between 15 and 500 genes, and only significantly enriched terms with Benjamini-Hochberg-corrected p-values less than 0.5 were reported.
4.11. Mass spectrometry
Protein isolation: Proteins were isolated from Trizol following RNA isolation. Briefly, DNA was precipitated by the addition of 0.3 ml of 100% EtOH and discarded. Proteins were precipitated from the remaining mixture by adding 1.5 ml of isopropanol. After a 30 min incubation at room temperature (RT), the samples were centrifuged for 10 min at 14000 rpm at 4 °C. The pellets were washed twice with 95% EtOH, dried briefly to evaporate the ethanol, and resuspended in 8 M urea, 50 mM Tris, pH 7.9. The pellets were left overnight at 4 °C to ensure efficient solubilisation. The next day, the 8 M urea was diluted in half with 50 mM ammonium bicarbonate. The proteins were then precipitated using the Proteoextract protein precipitation kit (Calbiochem; cat # 539180) according to the manufacturer's instructions. The precipitated proteins were resuspended in 50 mM ammonium bicarbonate and subjected to trypsin digestion.
Trypsin digestion: Each sample was resuspended in 88 μL of 50 mM NH4HCO3, reduced with 8 mM DTT for 1 h at room temperature, alkylated with 500 mM iodoacetamide for 45 min in the dark, and digested with 1 μg of trypsin overnight at 37 °C. Samples were desalted using ZipTip pipette tips (EMD Millipore) according to the manufacturer's protocol and dried.
Data Acquisition: Peptides were reconstituted in 20 μl of 1% formic acid, and 5 μl was loaded onto the column. Peptides were separated on a reverse phase Acclaim PepMap trap column and EASY-Spray PepMap analytical column using the EASY-nLC 1200 system (Proxeon). The organic gradient was driven by the EASY-nLC 1200 system using buffers A and B. Buffer A contained 0.1% formic acid (in water), and buffer B contained 80% acetonitrile with 0.1% formic acid. A 90 min gradient was used at a flow rate of 250 nL/min: 0%–6% buffer B in 1 min, followed by 6%–30% buffer B in 75 min, 30%–100% buffer B in 4 min, and 100% buffer B for 10 min. Eluted peptides were directly sprayed into a Q Exactive HF mass spectrometer (ThermoFisher Scientific) with collision-induced dissociation (CID) using a nanospray ion source (Proxeon). The full MS scan ranged from 300 to 1650 m/z and was followed by data-dependent MS/MS scans of the 20 most intense ions. The resolutions of the full MS and MS/MS spectra were 60,000 and 15,000, respectively. Data-dependent mode was used for MS data acquisition with target values of 3E+06 and 1E+05 for MS and MS/MS scans, respectively. All data were recorded with Xcalibur software (ThermoFisher Scientific).
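The gradient program can be written down as a small table and checked against the stated 90 min run time. The Python sketch below is illustrative only; it assumes linear ramps within each segment, which the text does not state explicitly, and the names are hypothetical.

```python
# Each step: (start % buffer B, end % buffer B, duration in minutes),
# taken from the gradient described in the text.
GRADIENT_STEPS = [
    (0, 6, 1),
    (6, 30, 75),
    (30, 100, 4),
    (100, 100, 10),
]


def total_runtime(steps):
    """Sum of segment durations in minutes."""
    return sum(duration for _, _, duration in steps)


def percent_b_at(t, steps):
    """Percent buffer B at time t (min), assuming linear ramps per segment."""
    if t < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0
    for start, end, duration in steps:
        if t <= elapsed + duration:
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    raise ValueError("time exceeds the length of the gradient")


runtime = total_runtime(GRADIENT_STEPS)
```

The four segments add up to 1 + 75 + 4 + 10 = 90 min, which agrees with the gradient length given in the text.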
Data Analysis: AP-MS datasets were searched with MaxQuant (v.1.6.6.0) (Tyanova et al., 2016). Human protein reference sequences from the UniProt Swiss-Prot database were downloaded on 18-06-2020. SARS-CoV-2 protein reference sequences were downloaded from NCBI (GenBank accession NC_045512.2, isolate = Wuhan-Hu-1) and supplemented by isolate 2019-nCoV/USA-WA1/2020 (GenBank accession MN985325) for ORF3b, ORF9b and ORF9c. Spectral counts as well as MS intensities for each identified protein were extracted from the MaxQuant proteinGroups file. MS intensities in each sample were normalized to allow comparisons across cell lines: each intensity value in a sample was multiplied by 1e+12 and then divided by the total intensity of the entire sample to obtain the normalized intensity.
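The normalization described above amounts to rescaling each sample so that its intensities sum to the same constant. A minimal Python sketch is given below; the function name and the dictionary input format are assumptions made for illustration and do not reflect the layout of the MaxQuant output.

```python
SCALE = 1e12  # scaling constant stated in the text


def normalize_intensities(intensities, scale=SCALE):
    """Scale one sample's protein intensities so that they sum to `scale`.

    `intensities` maps a protein identifier to its raw MS intensity
    in a single sample (hypothetical input format).
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return {protein: value * scale / total
            for protein, value in intensities.items()}


# Toy sample with made-up identifiers and intensities.
sample = {"protein_A": 3.0, "protein_B": 1.0}
normalized = normalize_intensities(sample)
```

After normalization every sample has the same total intensity, so a protein's value reflects its share of the signal in that sample rather than the absolute amount of material loaded.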
4.12. Generation of cell lines
Gene-KO cell lines were generated by electroporation (Neon® Transfection system) of Cas9-RNPs. CRISPR guide RNA sequences (gRNA) from TKOv3 library hits were synthesized by IDT. gRNAs were complexed at a 1:1 molar ratio with ATTO550-labeled tracrRNA (IDT) in TE buffer by heating at 95 °C for 5 min followed by cooling to room temperature to form crRNA:tracrRNA duplexes. 180 pmol of Alt-R Cas9 enzyme (IDT) was combined with 220 pmol of crRNA:tracrRNA duplex (IDT) at room temperature for 20 min to form ribonucleoprotein (RNP) complexes in 10 μl total volume with Resuspension Buffer R (Neon® Transfection system). Target cells were resuspended in Resuspension Buffer R at a density of 5 × 10^3 cells/μl. 90 μl (4.5 × 10^5 cells) of the cell suspension was gently mixed with the CRISPR RNP complexes, immediately electroporated according to the manufacturer's protocols (Neon® Transfection system, Thermo Fisher Scientific) and transferred to pre-warmed 6-well plates for incubation under standard conditions. To confirm gene knockout, genomic DNA from surviving cells was extracted using Extracta DNA Prep (Quanta Bio), and Sanger sequencing was performed across the gRNA target sites following PCR amplification. Gene knockouts were confirmed using TIDE (https://tide.nki.nl/) to identify out-of-frame insertion-and-deletion mutations (Supplemental Table 6).
To generate protease overexpression lines, TMPRSS2-FLAG, FLAG-TMPRSS2 or Cathepsin L-FLAG constructs were cloned into the pLenti6.2 plasmid, which carries a blasticidin resistance marker, using the Gateway cloning system (Invitrogen). Lentiviruses were produced by transfecting HEK-293T cells with pPAX2, pVSVG and the pLenti6.2 constructs generated above using Lipofectamine 2000 (Invitrogen). The conditioned media was collected from transfected cultures, filtered through 0.22 μm filters and applied to Vero E6, Calu-3 and HuH-7 cells at a 1:10 ratio of conditioned media to fresh media, in the presence of 8 μg/ml polybrene (Sigma). Transduced cells were washed 24 h post-transduction and selected in blasticidin (Gibco)-containing media (5 μg/ml for Vero and Calu-3, 2 μg/ml for HuH-7) for two weeks. The expression of the FLAG-tagged constructs was verified by western blotting.
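The quantities in the electroporation setup can be cross-checked with simple arithmetic. The short Python sketch below uses only the values quoted in the text; the variable names are hypothetical.

```python
# Values quoted in the text.
CELL_DENSITY_PER_UL = 5e3   # cells per microlitre in Resuspension Buffer R
CELL_VOLUME_UL = 90         # volume of cell suspension per reaction
RNP_VOLUME_UL = 10          # volume of the assembled RNP complexes
CAS9_PMOL = 180             # Alt-R Cas9 enzyme
DUPLEX_PMOL = 220           # crRNA:tracrRNA duplex

# Derived quantities.
cells_per_reaction = CELL_DENSITY_PER_UL * CELL_VOLUME_UL   # cells electroporated
total_volume_ul = CELL_VOLUME_UL + RNP_VOLUME_UL             # reaction volume
duplex_to_cas9_ratio = DUPLEX_PMOL / CAS9_PMOL               # molar excess of guide
```

These give 4.5 × 10^5 cells in a 100 μl reaction, matching the cell number stated in the text, with the guide duplex in roughly 1.2-fold molar excess over Cas9.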
Author contributions
Conceptualization: K.C., A.G.F., H.L., F.G., P.M., U.B., K.R.B., E.M., J.G., B.J.B., M.T., J.M.; Methodology: K.C., A.G.F., H.L., F.G., P.M., S.H., E.M., A.H.Y.T., N.C.H.; Validation: K.C., A.G.F., H.L., F.G., P.M.; Formal Analysis: K.C., K.R.B., H.N.W., M.B., U.B., S.P., F.M.; Investigation: K.C., A.G.F., H.L., F.G., P.M., E.M., S.H., K.A., A.A.; Resources: S.M.; Writing-Original draft: K.C., P.M., A.G.F., U.B., K.R.B., M.B., H.N.W., H.L., J.M.; Writing-Review and Editing: K.C., P.M., A.G.F., U.B., K.R.B., M.B., H.N.W., H.L., C.M., J.M.; Supervision: K.C., N.C.H., P.M., S.M., K.M., S.G.O., J.G., B.R., B.J.B., M.T., C.M., J.M.; Funding acquisition: S.M., K.M., S.G.O., J.G., B.J.B., M.T., J.M.
Acknowledgements
We would like to acknowledge Anne-Claude Gingras and Payman Samavarchi-Tehrani for sharing the HEK-293+A+T cell line and Betty Poon for assistance in the C-CL3 facility. This work was supported by the University of Toronto COVID-19 Action Initiative Fund to J.M., B.J.B., S.G.O., J.G., K.M., and S.M. Indirect support was also received from the University of Toronto and the Temerty Foundation to support enhanced capacity and operations of the Toronto Combined Containment Level 3 Facility during the COVID-19 pandemic. This work was also partially supported by a Canadian Institutes for Health Research Project Grant to J.M. (MOP142375) and a Disruptive Innovations Phase 2 Grant from Genome Canada to J.M.
Footnotes
Supplementary data related to this article can be found at https://doi.org/10.1016/j.heliyon.2022.e12744.
Contributor Information
Katherine Chan, Email: katiesk.chan@utoronto.ca.
Jason Moffat, Email: j.moffat@utoronto.ca.
References
- 1. da Costa V.G., Moreli M.L., Saivish M.V. The emergence of SARS, MERS and novel SARS-2 coronaviruses in the 21st century. Arch. Virol. 2020;165:1517–1526. doi: 10.1007/s00705-020-04628-0.
- 2. Vos L.M., Bruyndonckx R., Zuithoff N.P.A., Little P., Oosterheert J.J., Broekhuizen B.D.L., Lammens C., Loens K., Viveen M., Butler C.C., et al. Lower respiratory tract infection in the community: associations between viral aetiology and illness course. Clin. Microbiol. Infect. 2021;27:96–104. doi: 10.1016/j.cmi.2020.03.023.
- 3. Hartenian E., Nandakumar D., Lari A., Ly M., Tucker J.M., Glaunsinger B.A. The molecular virology of coronaviruses. J. Biol. Chem. 2020;295:12910–12934. doi: 10.1074/jbc.REV120.013930.
- 4. V'Kovski P., Kratzel A., Steiner S., Stalder H., Thiel V. Coronavirus biology and replication: implications for SARS-CoV-2. Nat. Rev. Microbiol. 2021;19:155–170. doi: 10.1038/s41579-020-00468-6.
- 5. Letko M., Marzi A., Munster V. Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses. Nat Microbiol. 2020;5:562–569. doi: 10.1038/s41564-020-0688-y.
- 6. Hoffmann M., Kleine-Weber H., Schroeder S., Kruger N., Herrler T., Erichsen S., Schiergens T.S., Herrler G., Wu N.H., Nitsche A., et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell. 2020 doi: 10.1016/j.cell.2020.02.052.
- 7. Starr T.N., Greaney A.J., Hilton S.K., Ellis D., Crawford K.H.D., Dingens A.S., Navarro M.J., Bowen J.E., Tortorici M.A., Walls A.C., et al. Deep mutational scanning of SARS-CoV-2 receptor binding domain reveals constraints on folding and ACE2 binding. Cell. 2020;182:1295–1310. doi: 10.1016/j.cell.2020.08.012. e1220.
- 8. Linsky T.W., Vergara R., Codina N., Nelson J.W., Walker M.J., Su W., Barnes C.O., Hsiang T.Y., Esser-Nobis K., Yu K., et al. De novo design of potent and resilient hACE2 decoys to neutralize SARS-CoV-2. Science. 2020;370:1208–1214. doi: 10.1126/science.abe0075.
- 9. Crackower M.A., Sarao R., Oudit G.Y., Yagil C., Kozieradzki I., Scanga S.E., Oliveira-dos-Santos A.J., da Costa J., Zhang L., Pei Y., et al. Angiotensin-converting enzyme 2 is an essential regulator of heart function. Nature. 2002;417:822–828. doi: 10.1038/nature00786.
- 10. Danilczyk U., Sarao R., Remy C., Benabbas C., Stange G., Richter A., Arya S., Pospisilik J.A., Singer D., Camargo S.M., et al. Essential role for collectrin in renal amino acid transport. Nature. 2006;444:1088–1091. doi: 10.1038/nature05475.
- 11. Hashimoto T., Perlot T., Rehman A., Trichereau J., Ishiguro H., Paolino M., Sigl V., Hanada T., Hanada R., Lipinski S., et al. ACE2 links amino acid malnutrition to microbial ecology and intestinal inflammation. Nature. 2012;487:477–481. doi: 10.1038/nature11228.
- 12. Imai Y., Kuba K., Rao S., Huan Y., Guo F., Guan B., Yang P., Sarao R., Wada T., Leong-Poi H., et al. Angiotensin-converting enzyme 2 protects from severe acute lung failure. Nature. 2005;436:112–116. doi: 10.1038/nature03712.
- 13. Zhang H., Penninger J.M., Li Y., Zhong N., Slutsky A.S. Angiotensin-converting enzyme 2 (ACE2) as a SARS-CoV-2 receptor: molecular mechanisms and potential therapeutic target. Intensive Care Med. 2020;46:586–590. doi: 10.1007/s00134-020-05985-9.
- 14. Zhu Y., Feng F., Hu G., Wang Y., Yu Y., Zhu Y., Xu W., Cai X., Sun Z., Han W., et al. A genome-wide CRISPR screen identifies host factors that regulate SARS-CoV-2 entry. Nat. Commun. 2021;12:961. doi: 10.1038/s41467-021-21213-4.
- 15. Zhao M.-M., Yang W.-L., Yang F.-Y., Zhang L., Huang W.-J., Hou W., Fan C.-F., Jin R.-H., Feng Y.-M., Wang Y.-C., Yang J.-K. Cathepsin L plays a key role in SARS-CoV-2 infection in humans and humanized mice and is a promising target for new drug development. Signal Transduct. Targeted Ther. 2021;6:134. doi: 10.1038/s41392-021-00558-8.
- 16. Shang J., Wan Y., Luo C., Ye G., Geng Q., Auerbach A., Li F. Cell entry mechanisms of SARS-CoV-2. Proc. Natl. Acad. Sci. U. S. A. 2020;117:11727–11734. doi: 10.1073/pnas.2003138117.
- 17. Rabaan A.A., Al-Ahmed S.H., Haque S., Sah R., Tiwari R., Malik Y.S., Dhama K., Yatoo M.I., Bonilla-Aldana D.K., Rodriguez-Morales A.J. SARS-CoV-2, SARS-CoV, and MERS-COV: a comparative overview. Inf. Med. 2020;28:174–184.
- 18. McDougall W.M., Perreira J.M., Reynolds E.C., Brass A.L. CRISPR genetic screens to discover host-virus interactions. Curr Opin Virol. 2018;29:87–100. doi: 10.1016/j.coviro.2018.03.007.
- 19. Baggen J., Persoons L., Vanstreels E., Jansen S., Van Looveren D., Boeckx B., Geudens V., De Man J., Jochmans D., Wauters J., et al. Genome-wide CRISPR screening identifies TMEM106B as a proviral host factor for SARS-CoV-2. Nat. Genet. 2021;53:435–444. doi: 10.1038/s41588-021-00805-2.
- 20. Bailey A.L., Diamond M.S. A crisp(r) new perspective on SARS-CoV-2 biology. Cell. 2021;184:15–17. doi: 10.1016/j.cell.2020.12.003.
- 21. Baumann K. Cellular basis for SARS-CoV-2 infection. Nat. Rev. Mol. Cell Biol. 2021;22:2. doi: 10.1038/s41580-020-00319-5.
- 22. Daniloski Z., Jordan T.X., Wessels H.H., Hoagland D.A., Kasela S., Legut M., Maniatis S., Mimitou E.P., Lu L., Geller E., et al. Identification of required host factors for SARS-CoV-2 infection in human cells. Cell. 2021;184:92–105. doi: 10.1016/j.cell.2020.10.030. e116.
- 23. Heaton B.E., Trimarco J.D., Hamele C.E., Harding A.T., Tata A., Zhu X., Tata P.R., Smith C.M., Heaton N.S. SRSF protein kinases 1 and 2 are essential host factors for human coronaviruses including SARS-CoV-2. bioRxiv. 2020 doi: 10.1101/2020.08.14.251207.
- 24. Hoffmann H.H., Sanchez-Rivera F.J., Schneider W.M., Luna J.M., Soto-Feliciano Y.M., Ashbrook A.W., Le Pen J., Leal A.A., Ricardo-Lax I., Michailidis E., et al. Functional interrogation of a SARS-CoV-2 host protein interactome identifies unique and shared coronavirus host factors. Cell Host Microbe. 2021;29:267–280. doi: 10.1016/j.chom.2020.12.009. e265.
- 25. Schneider W.M., Luna J.M., Hoffmann H.H., Sanchez-Rivera F.J., Leal A.A., Ashbrook A.W., Le Pen J., Ricardo-Lax I., Michailidis E., Peace A., et al. Genome-scale identification of SARS-CoV-2 and pan-coronavirus host factor networks. Cell. 2021;184:120–132. doi: 10.1016/j.cell.2020.12.006. e114.
- 26. Wang R., Simoneau C.R., Kulsuptrakul J., Bouhaddou M., Travisano K.A., Hayashi J.M., Carlson-Stevermer J., Zengel J.R., Richards C.M., Fozouni P., et al. Genetic screens identify host factors for SARS-CoV-2 and common cold coronaviruses. Cell. 2021;184:106–119. doi: 10.1016/j.cell.2020.12.004. e114.
- 27. Wei J., Alfajaro M.M., DeWeirdt P.C., Hanna R.E., Lu-Culligan W.J., Cai W.L., Strine M.S., Zhang S.M., Graziano V.R., Schmitz C.O., et al. Genome-wide CRISPR screens reveal host factors critical for SARS-CoV-2 infection. Cell. 2021;184:76–91. doi: 10.1016/j.cell.2020.10.028. e13.
- 28. Aregger M., Lawson K.A., Billmann M., Costanzo M., Tong A.H.Y., Chan K., Rahman M., Brown K.R., Ross C., Usaj M., et al. Systematic mapping of genetic interactions for de novo fatty acid synthesis identifies C12orf49 as a regulator of lipid metabolism. Nat Metab. 2020;2:499–513. doi: 10.1038/s42255-020-0211-z.
- 29. Ogando N.S., Dalebout T.J., Zevenhoven-Dobbe J.C., Limpens R., van der Meer Y., Caly L., Druce J., de Vries J.J.C., Kikkert M., Barcena M., et al. SARS-coronavirus-2 replication in Vero E6 cells: replication kinetics, rapid adaptation and cytopathology. J. Gen. Virol. 2020;101:925–940. doi: 10.1099/jgv.0.001453.
- 30. Chu H., Chan J.F., Yuen T.T., Shuai H., Yuan S., Wang Y., Hu B., Yip C.C., Tsang J.O., Huang X., et al. Comparative tropism, replication kinetics, and cell damage profiling of SARS-CoV-2 and SARS-CoV with implications for clinical manifestations, transmissibility, and laboratory studies of COVID-19: an observational study. Lancet Microbe. 2020;1:e14–e23. doi: 10.1016/S2666-5247(20)30004-5.
- 31. Harcourt J., Tamin A., Lu X., Kamili S., Sakthivel S.K., Murray J., Queen K., Tao Y., Paden C.R., Zhang J., et al. 2020. Isolation and Characterization of SARS-CoV-2 from the First US COVID-19 Patient. bioRxiv.
- 32. Harcourt J., Tamin A., Lu X., Kamili S., Sakthivel S.K., Murray J., Queen K., Tao Y., Paden C.R., Zhang J., et al. Severe acute respiratory syndrome coronavirus 2 from patient with 2019 novel coronavirus disease, United States. Emerg. Infect. Dis. 2020;26 doi: 10.3201/eid2606.200516.
- 33. Gordon D.E., Hiatt J., Bouhaddou M., Rezelj V.V., Ulferts S., Braberg H., Jureka A.S., Obernier K., Guo J.Z., Batra J., et al. Comparative host-coronavirus protein interaction networks reveal pan-viral disease mechanisms. Science. 2020 doi: 10.1126/science.abe9403.
- 34. Kaye M. SARS-associated coronavirus replication in cell lines. Emerg. Infect. Dis. 2006;12:128–133. doi: 10.3201/eid1201.050496.
- 35. Barretina J., Caponigro G., Stransky N., Venkatesan K., Margolin A.A., Kim S., Wilson C.J., Lehar J., Kryukov G.V., Sonkin D., et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607. doi: 10.1038/nature11003.
- 36. Abe K.T., Li Z., Samson R., Samavarchi-Tehrani P., Valcourt E.J., Wood H., Budylowski P., Dupuis A.P., 2nd, Girardin R.C., Rathod B., et al. A simple protein-based surrogate neutralization assay for SARS-CoV-2. JCI Insight. 2020;5 doi: 10.1172/jci.insight.142362.
- 37. Banerjee A., Nasir J.A., Budylowski P., Yip L., Aftanas P., Christie N., Ghalami A., Baid K., Raphenya A.R., Hirota J.A., et al. Isolation, sequence, infectivity, and replication kinetics of severe acute respiratory syndrome coronavirus 2. Emerg. Infect. Dis. 2020;26:2054–2063. doi: 10.3201/eid2609.201495.
- 38. Hart T., Tong A.H.Y., Chan K., Van Leeuwen J., Seetharaman A., Aregger M., Chandrashekhar M., Hustedt N., Seth S., Noonan A., et al. Evaluation and design of genome-wide CRISPR/SpCas9 knockout screens. G3 (Bethesda) 2017;7:2719–2727. doi: 10.1534/g3.117.041277.
- 39. Hart T., Brown K.R., Sircoulomb F., Rottapel R., Moffat J. Measuring error rates in genomic perturbation screens: gold standards for human functional genomics. Mol. Syst. Biol. 2014;10:733. doi: 10.15252/msb.20145216.
- 40. Hart T., Chandrashekhar M., Aregger M., Steinhart Z., Brown K.R., MacLeod G., Mis M., Zimmermann M., Fradet-Turcotte A., Sun S., et al. High-resolution CRISPR screens reveal fitness genes and genotype-specific cancer liabilities. Cell. 2015 doi: 10.1016/j.cell.2015.11.015.
- 41. Hart T., Moffat J. BAGEL: a computational framework for identifying essential genes from pooled library screens. BMC Bioinf. 2016;17:164. doi: 10.1186/s12859-016-1015-8.
- 42. Brown K.R., Mair B., Soste M., Moffat J. CRISPR screens are feasible in TP53 wild-type cells. Mol. Syst. Biol. 2019;15:e8679. doi: 10.15252/msb.20188679.
- 43. Goujon C., Rebendenne A., Roy P., Bonaventure B., Valadao A.C., Desmarets L., Rouille Y., Tauziet M., Arnaud-Arnould M., Giovannini D., et al. Bidirectional genome-wide CRISPR screens reveal host factors regulating SARS-CoV-2, MERS-CoV and seasonal HCoVs. Res Sq. 2021 doi: 10.21203/rs.3.rs-555275/v1.
- 44. Hu Y., Wang X., Song J., Wu J., Xu J., Chai Y., Ding Y., Wang B., Wang C., Zhao Y., et al. Chromatin remodeler ARID1A binds IRF3 to selectively induce antiviral interferon production in macrophages. Cell Death Dis. 2021;12:743. doi: 10.1038/s41419-021-04032-9.
- 45. Lindberg M.F., Meijer L. Dual-specificity, tyrosine phosphorylation-regulated kinases (DYRKs) and cdc2-like kinases (CLKs) in human disease, an overview. Int. J. Mol. Sci. 2021;22 doi: 10.3390/ijms22116047.
- 46. Khodadoost M., Niknam Z., Farahani M., Razzaghi M., Norouzinia M. Investigating the human protein-host protein interactome of SARS-CoV-2 infection in the small intestine. Gastroenterol Hepatol Bed Bench. 2020;13:374–387.
- 47. O'Brien M.J., Ansari A. Critical involvement of TFIIB in viral pathogenesis. Front. Mol. Biosci. 2021;8 doi: 10.3389/fmolb.2021.669044.
- 48. Ruiz A., Pauls E., Badia R., Riveira-Munoz E., Clotet B., Ballana E., Este J.A. Characterization of the influence of mediator complex in HIV-1 transcription. J. Biol. Chem. 2014;289:27665–27676. doi: 10.1074/jbc.M114.570341.
- 49. Mirzaei H., Faghihloo E. Viruses as key modulators of the TGF-beta pathway; a double-edged sword involved in cancer. Rev. Med. Virol. 2018;28 doi: 10.1002/rmv.1967.
- 50. Jung H., Seong H.A., Manoharan R., Ha H. Serine-threonine kinase receptor-associated protein inhibits apoptosis signal-regulating kinase 1 function through direct interaction. J. Biol. Chem. 2010;285:54–70. doi: 10.1074/jbc.M109.045229.
- 51. Poreba E., Broniarczyk J.K., Gozdzicka-Jozefiak A. Epigenetic mechanisms in virus-induced tumorigenesis. Clin. Epigenet. 2011;2:233–247. doi: 10.1007/s13148-011-0026-6.
- 52. Pedersen K.B., Chhabra K.H., Nguyen V.K., Xia H., Lazartigues E. The transcription factor HNF1alpha induces expression of angiotensin-converting enzyme 2 (ACE2) in pancreatic islets from evolutionarily conserved promoter motifs. Biochim. Biophys. Acta. 2013;1829:1225–1235. doi: 10.1016/j.bbagrm.2013.09.007.
- 53. Senkel S., Lucas B., Klein-Hitpass L., Ryffel G.U. Identification of target genes of the transcription factor HNF1beta and HNF1alpha in a human embryonic kidney cell line. Biochim. Biophys. Acta. 2005;1731:179–190. doi: 10.1016/j.bbaexp.2005.10.003.
- 54. Atlante S., Mongelli A., Barbi V., Martelli F., Farsetti A., Gaetano C. The epigenetic implication in coronavirus infection and therapy. Clin. Epigenet. 2020;12:156. doi: 10.1186/s13148-020-00946-x.
- 55. Hu Y., O'Boyle K., Auer J., Raju S., You F., Wang P., Fikrig E., Sutton R.E. Multiple UBXN family members inhibit retrovirus and lentivirus production and canonical NFkappaBeta signaling by stabilizing IkappaBalpha. PLoS Pathog. 2017;13 doi: 10.1371/journal.ppat.1006187.
- 56. Shin D., Lee J., Park J.H., Min J.Y. Double plant homeodomain fingers 2 (DPF2) promotes the immune escape of influenza virus by suppressing beta interferon production. J. Virol. 2017;91 doi: 10.1128/JVI.02260-16.
- 57. Zhang W., Wang Q., Yang F., Zhu Z., Duan Y., Yang Y., Cao W., Zhang K., Ma J., Liu X., Zheng H. JMJD6 negatively regulates cytosolic RNA induced antiviral signaling by recruiting RNF5 to promote activated IRF3 K48 ubiquitination. PLoS Pathog. 2021;17 doi: 10.1371/journal.ppat.1009366.
- 58. Schulz W.A., Lang A., Koch J., Greife A. The histone demethylase UTX/KDM6A in cancer: progress and puzzles. Int. J. Cancer. 2019;145:614–620. doi: 10.1002/ijc.32116.
- 59. Banerjee A., El-Sayes N., Budylowski P., Jacob R.A., Richard D., Maan H., Aguiar J.A., Demian W.L., Baid K., D'Agostino M.R., et al. Experimental and natural evidence of SARS-CoV-2 infection-induced activation of type I interferon responses. iScience. 2021 doi: 10.1016/j.isci.2021.102477.
- 60. Hoffmann M., Hofmann-Winkler H., Pöhlmann S. 2018. Priming Time: How Cellular Proteases Arm Coronavirus Spike Proteins. Activation of Viruses by Host Proteases; pp. 71–98.
- 61. Park J.E., Li K., Barlan A., Fehr A.R., Perlman S., McCray P.B., Jr., Gallagher T. Proteolytic processing of Middle East respiratory syndrome coronavirus spikes expands virus tropism. Proc. Natl. Acad. Sci. U. S. A. 2016;113:12262–12267. doi: 10.1073/pnas.1608147113.
- 62. Matsuyama S., Nao N., Shirato K., Kawase M., Saito S., Takayama I., Nagata N., Sekizuka T., Katoh H., Kato F., et al. Enhanced isolation of SARS-CoV-2 by TMPRSS2-expressing cells. Proc. Natl. Acad. Sci. U. S. A. 2020;117:7001–7003. doi: 10.1073/pnas.2002589117.
- 63. Padmanabhan P., Desikan R., Dixit N.M. Targeting TMPRSS2 and Cathepsin B/L together may be synergistic against SARS-CoV-2 infection. PLoS Comput. Biol. 2020;16 doi: 10.1371/journal.pcbi.1008461.
- 64. Hashimoto R., Sakamoto A., Deguchi S., Yi R., Sano E., Hotta A., Takahashi K., Yamanaka S., Takayama K. Dual inhibition of TMPRSS2 and Cathepsin B prevents SARS-CoV-2 infection in iPS cells. Mol. Ther. Nucleic Acids. 2021;26:1107–1114. doi: 10.1016/j.omtn.2021.10.016.
- 65. Hoffmann M., Hofmann-Winkler H., Smith J.C., Kruger N., Arora P., Sorensen L.K., Sogaard O.S., Hasselstrom J.B., Winkler M., Hempel T., et al. Camostat mesylate inhibits SARS-CoV-2 activation by TMPRSS2-related proteases and its metabolite GBPA exerts antiviral activity. EBioMedicine. 2021;65 doi: 10.1016/j.ebiom.2021.103255.
- 66. Xia S., Lan Q., Su S., Wang X., Xu W., Liu Z., Zhu Y., Wang Q., Lu L., Jiang S. The role of furin cleavage site in SARS-CoV-2 spike protein-mediated membrane fusion in the presence or absence of trypsin. Signal Transduct. Targeted Ther. 2020;5:92. doi: 10.1038/s41392-020-0184-0.
- 67. Sasaki M., Uemura K., Sato A., Toba S., Sanaki T., Maenaka K., Hall W.W., Orba Y., Sawa H. SARS-CoV-2 variants with mutations at the S1/S2 cleavage site are generated in vitro during propagation in TMPRSS2-deficient cells. PLoS Pathog. 2021;17 doi: 10.1371/journal.ppat.1009233.
- 68. Papa G., Mallery D.L., Albecka A., Welch L.G., Cattin-Ortolá J., Luptak J., Paul D., McMahon H.T., Goodfellow I.G., Carter A., et al. Furin cleavage of SARS-CoV-2 Spike promotes but is not essential for infection and cell-cell fusion. PLoS Pathog. 2021;17 doi: 10.1371/journal.ppat.1009246.
- 69. Brest P., Mograbi B., Hofman P., Milano G. Using genetics to dissect SARS-CoV-2 infection. Trends Genet. 2021;37:203–204. doi: 10.1016/j.tig.2020.11.007.
- 70. Phillips N. The coronavirus is here to stay - here's what that means. Nature. 2021;590:382–384. doi: 10.1038/d41586-021-00396-2.
- 71. Hamilton M.A., Russo R.C., Thurston R.V. Trimmed Spearman-Karber method for estimating median lethal concentrations in toxicity bioassays. Environ. Sci. Technol. 1977;11:714–719. doi: 10.1021/es60130a004.
- 72. Nasir J.A., Kozak R.A., Aftanas P., Raphenya A.R., Smith K.M., Maguire F., Maan H., Alruwaili M., Banerjee A., Mbareche H., et al. 2020. A Comparison of Whole Genome Sequencing of SARS-CoV-2 Using Amplicon-Based Sequencing, Random Hexamers, and Bait Capture. Viruses 12.
- 73. Ritchie M.E., Phipson B., Wu D., Hu Y., Law C.W., Shi W., Smyth G.K. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43 doi: 10.1093/nar/gkv007. e47-e47.
- 74. Khan A., Mathelier A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinf. 2017;18:287. doi: 10.1186/s12859-017-1708-7.
- 75. Patro R., Duggal G., Love M.I., Irizarry R.A., Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods. 2017;14:417–419. doi: 10.1038/nmeth.4197.
- 76. Soneson C., Love M.I., Robinson M.D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res. 2015;4:1521. doi: 10.12688/f1000research.7563.2.
- 77. McCarthy D.J., Chen Y., Smyth G.K. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40:4288–4297. doi: 10.1093/nar/gks042.
- 78. Tapial J., Ha K.C.H., Sterne-Weiler T., Gohr A., Braunschweig U., Hermoso-Pulido A., Quesnel-Vallieres M., Permanyer J., Sodaei R., Marquez Y., et al. An atlas of alternative splicing profiles and functional associations reveals new regulatory programs and genes that simultaneously express multiple major isoforms. Genome Res. 2017;27:1759–1768. doi: 10.1101/gr.220962.117.
- 79. Ha K.C.H., Blencowe B.J., Morris Q. QAPA: a new method for the systematic analysis of alternative polyadenylation from RNA-seq data. Genome Biol. 2018;19:45. doi: 10.1186/s13059-018-1414-4.
- 80. Gruber A.J., Schmidt R., Gruber A.R., Martin G., Ghosh S., Belmadani M., Keller W., Zavolan M. A comprehensive analysis of 3' end sequencing data sets reveals novel polyadenylation signals and the repressive role of heterogeneous ribonucleoprotein C on cleavage and polyadenylation. Genome Res. 2016;26:1145–1159. doi: 10.1101/gr.202432.115.
- 81. Raudvere U., Kolberg L., Kuzmin I., Arak T., Adler P., Peterson H., Vilo J. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update) Nucleic Acids Res. 2019;47:W191–W198. doi: 10.1093/nar/gkz369.
- 82. Berriz G.F., Beaver J.E., Cenik C., Tasan M., Roth F.P. Next generation software for functional trend analysis. Bioinformatics. 2009;25:3043–3044. doi: 10.1093/bioinformatics/btp498.
- 83. Korotkevich G., Sukhov V., Budin N., Shpak B., Artyomov M.N., Sergushichev A. bioRxiv; 2021. Fast Gene Set Enrichment Analysis.
- 84. Liberzon A., Subramanian A., Pinchback R., Thorvaldsdottir H., Tamayo P., Mesirov J.P. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011;27:1739–1740. doi: 10.1093/bioinformatics/btr260.