Author manuscript; available in PMC: 2021 Nov 18.
Published in final edited form as: Cell Syst. 2020 Oct 19;11(5):523–535.e9. doi: 10.1016/j.cels.2020.09.009

Mismatch-CRISPRi reveals the co-varying expression-fitness relationships of essential genes in Escherichia coli and Bacillus subtilis

John S Hawkins 1,8, Melanie R Silvis 1,8, Byoung-Mo Koo 1, Jason M Peters 1,7, Hendrik Osadnik 1, Marco Jost 1,4,5,6, Cameron C Hearne 1, Jonathan S Weissman 4,5,6, Horia Todor 1,*, Carol A Gross 1,2,3,9,*
PMCID: PMC7704046  NIHMSID: NIHMS1642568  PMID: 33080209

SUMMARY

Essential genes are the hubs of cellular networks, but lack of high-throughput methods for titrating gene expression has limited our understanding of the fitness landscapes against which their expression levels are optimized. We developed a modified CRISPRi system leveraging the predictable reduction in efficacy of imperfectly matched sgRNAs to generate defined levels of CRISPRi activity and demonstrated its broad applicability. Using libraries of mismatched sgRNAs predicted to span the full range of knockdown levels, we characterized the expression-fitness relationships of most essential genes in Escherichia coli and Bacillus subtilis. We find that these relationships vary widely from linear to bimodal, but are similar within pathways. Notably, despite ~2 billion years of evolutionary separation between E. coli and B. subtilis, most essential homologs have similar expression-fitness relationships, with rare but informative differences. Thus, the expression levels of essential genes may reflect homeostatic or evolutionary constraints shared between the two organisms.

Keywords: CRISPRi, functional genomics, systems biology, essential genes, evolution, gene regulation, E. coli, B. subtilis

eTOC Blurb:

Hawkins and Silvis et al. develop a system for predictably titrating gene expression in bacteria by introducing specific mismatches into CRISPRi sgRNAs. Mismatched sgRNAs enable multiple knockdown levels across many genes in a single experiment. They use this technique to determine the expression-fitness curves of all essential genes in Escherichia coli and Bacillus subtilis, finding that they are shared within pathways and between homologs diverged by ~2 billion years.

INTRODUCTION

Bacteria must optimize protein production to maximize survival and growth in constantly changing environments. Given the high energetic cost of protein synthesis, optimizing expression is particularly important for essential genes: although only ~5-10% of the genome, they constitute a disproportionate fraction (~50%) of the proteome (Lalanne et al., 2018) and insufficient expression is, by definition, fatal. Previous work using CRISPR interference (CRISPRi), hypomorphs, and promoter replacement revealed gene-, environment-, and antibiotic-specific fitness effects of altering essential gene expression (Bauer et al., 2015; Dekel and Alon, 2005; Eames and Kortemme, 2012; Johnson et al., 2019; Keren et al., 2016; Nichols et al., 2011; Peters et al., 2016), but the lack of a facile method for systematically perturbing bacterial gene expression has thus far prevented a comprehensive understanding of how bacteria optimize expression of their essential protein complement. CRISPRi, which represses bacterial transcription by targeting a catalytically dead Cas9 (dCas9) to an open reading frame using a complementary sgRNA, is an inducible and inherently barcoded system that has been used to perturb essential gene expression in its native context. However, tuning transcriptional repression by adjusting dCas9 or sgRNA abundance (Liu et al., 2017; Peters et al., 2016) is noisy and precludes the interrogation of multiple knockdown levels in a single experiment (Vigouroux et al., 2018). Furthermore, despite recent advances in our understanding of the determinants of sgRNA efficacy (Calvo-Villamañán et al., 2020), tuning transcriptional repression through sgRNA design remains difficult due to low accuracy and intrinsic sequence constraints.

Here we establish a species-independent approach for predictably titrating CRISPRi activity in bacteria using single mismatches in the base-pairing region of sgRNAs. Mismatched sgRNAs enable massively parallel interrogation of the fitness effects of many intermediate levels of CRISPRi efficacy across genes in a single pooled growth experiment. Building on previous studies of off-target and mismatched sgRNA activity (Gilbert et al., 2014; Jost et al., 2020; Vigouroux et al., 2018), we screened a comprehensive library of mismatched gfp-targeting sgRNAs in Escherichia coli and Bacillus subtilis and used the data to build a species-independent model of mismatched sgRNA activity for bacterial CRISPRi. Using this model, we designed two large (>30,000 elements) libraries consisting of 90 mismatched sgRNAs spanning knockdown space targeting each essential gene in E. coli and B. subtilis. We used this data to explore gene-specific expression-fitness relationships by comparing the fitness effects of all sgRNAs targeting a gene to their predicted levels of CRISPRi activity. Our analysis of per gene expression-fitness relationships suggests that CRISPRi targeting of different essential genes differentially affects cellular fitness, but that these effects are largely consistent both within pathways, and between E. coli and B. subtilis homologs. The similarity of expression-fitness relationships between E. coli and B. subtilis homologs despite ~2 billion years evolutionary distance raises the possibility that shared homeostatic or evolutionary constraints underlie the optimization of essential gene expression and highlights processes with potentially different evolutionary pressures. Our findings not only provide insights into bacterial physiology but also introduce an important tool for exploring reduced-expression phenotypes in many bacterial species.

RESULTS AND DISCUSSION

CRISPRi efficacy is similarly titrated by sgRNA mismatches in E. coli and B. subtilis

Mismatched sgRNA efficacy has been sparsely tested in E. coli (Qi et al., 2013) using rfp targeted by variants of a single sgRNA. However, no comprehensive, multi-species measurements of repression by mismatched sgRNAs have been reported for bacterial CRISPRi. To directly quantify the impact of mismatches on sgRNA-mediated repression in bacterial systems, we generated a comprehensive library of sgRNA spacers targeting gfp (3,201 total), consisting of all spacers fully complementary to the non-template strand (33), a majority of their possible single mismatch variants (47/60 per spacer), and a subset of their possible double mismatch variants (49/1,710 per spacer) (Figure S1A). Using FACS-seq (Figure 1A, STAR Methods), we quantified the ability of these sgRNAs to repress transcription of a highly expressed chromosomal copy of gfp in both E. coli and B. subtilis (Figure S2A-C, Table S1). We found that sgRNAs with either single (Figure 1B) or double (Figure S3A) mismatches in their base-pairing regions generated the full range of repression (from no efficacy to the same efficacy as fully complementary sgRNAs, Table S1) in both species. sgRNA activity was unimodal (Figure S2D-G) and highly correlated between E. coli and B. subtilis (R2: singly mismatched sgRNAs = 0.65, doubly mismatched sgRNAs = 0.61, all sgRNAs = 0.71; Figure 1B, Figure S3A, and Table S1), despite an evolutionary distance of ~2 billion years and differences in experimental setup (E. coli: plasmid-encoded sgRNAs, B. subtilis: chromosomally integrated sgRNAs).
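The library composition quoted above can be checked with a few lines of arithmetic. The sketch below assumes a 20-nt base-pairing region (consistent with the 7 + 5 + 8 base regions described in the Figure 1 legend) and that the 47 single and 49 double mismatch variants are counted per fully complementary spacer; the variable names are illustrative only.

```python
# Arithmetic check of the gfp-targeting library composition (illustrative names).
SPACER_LENGTH = 20          # assumed length of the base-pairing region
ALTERNATIVE_BASES = 3       # each position can be changed to 3 other bases

n_spacers = 33              # fully complementary spacers
singles_per_spacer = 47     # single mismatch variants included per spacer
doubles_per_spacer = 49     # double mismatch variants included per spacer

# All possible variants per spacer.
possible_singles = SPACER_LENGTH * ALTERNATIVE_BASES
position_pairs = SPACER_LENGTH * (SPACER_LENGTH - 1) // 2
possible_doubles = position_pairs * ALTERNATIVE_BASES ** 2

# Variants actually included in the library.
n_single = n_spacers * singles_per_spacer
n_double = n_spacers * doubles_per_spacer
library_size = n_spacers + n_single + n_double

print(possible_singles, possible_doubles, n_single, n_double, library_size)
```

Under these assumptions the totals agree with the 3,201 spacers reported for the library and with the 1,551 singly and 1,617 doubly mismatched sgRNAs used for model training and validation.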

Figure 1. Singly mismatched sgRNAs reproducibly generate a range of knockdown efficacies in B. subtilis and E. coli but perform differently from a dCas9-KRAB system in mammalian cells.

(A) Workflow of a FACS-seq experiment. (B) FACS-seq scores (average of 2 biological replicates) for each singly mismatched sgRNA targeting gfp in B. subtilis and E. coli. Additional noise in E. coli likely represents changes in plasmid copy number during outgrowth or other E. coli specific effects. (C) Mean relative activity of sgRNAs with every possible single base substitution at every possible position in E. coli and B. subtilis targeting gfp and in a mammalian CRISPRi system targeting essential genes (Jost et al., 2020; total of 26,248 mismatched sgRNAs). The seed region constitutes the 7 most PAM-proximal bases, the middle region the next 5 bases, and the distal region the final 8 bases.

Mismatched sgRNA efficacy has been explored in mammalian CRISPRi systems (Gilbert et al., 2014; Jost et al., 2020). However, substantial differences exist between CRISPRi modalities in bacteria (blocking RNA-polymerase elongation) and mammalian systems (recruiting chromatin modifiers to promoters). To compare mismatched sgRNA efficacy between bacterial and mammalian systems, we calculated the mean relative activity of bacterial gfp-targeting and mammalian essential gene-targeting singly mismatched sgRNAs (Jost et al., 2020) for all combinations of mismatch position and base substitution. We found that although sgRNA activity is correlated between the two systems (R2 = 0.61, Figure 1C and Figure S4), the activity of the mammalian system is more strongly impacted by mismatches in general, particularly in the PAM-proximal seed region (Figure 1C and Figure S4). Whereas almost all mismatches in the seed region completely abolish sgRNA activity in the mammalian system, sgRNAs with equivalent mismatches still retain measurable activity in the bacterial system, consistent with previous reports (Qi et al., 2013) and our measurements of individual strains (Figure S2D-G). These differences may be due to differences in how CRISPRi functions. In bacteria, dCas9 efficiently blocks transcriptional elongation when targeted within the ORF, while in mammalian systems, efficient repression requires targeting a dCas9-KRAB fusion to occlude the promoter region and recruit chromatin-modifying proteins (Gilbert et al., 2013). Previous work compared the activity of mismatched sgRNAs in mammalian cells using either dCas9 or dCas9-KRAB (Gilbert et al., 2014) and found that mismatches in the seed region were better tolerated by a dCas9 repression system than by a dCas9-KRAB system. This indicates that KRAB function may be responsible for the sensitivity of the mammalian system to mismatches.
Taken together, these data suggest that although the primary determinant of mismatched sgRNA efficacy in both bacteria and mammalian systems is shared, differences in how CRISPRi functions in these two systems manifest as quantitative differences in mismatched sgRNA efficacy. Mismatched sgRNAs function similarly in E. coli and B. subtilis, potentially allowing facile design of sgRNAs with defined activity across diverse bacterial systems.

A species-independent linear model robustly predicts mismatched-sgRNA activity

Given the species-independent performance of gfp-targeting mismatched sgRNAs, we next asked whether we could accurately predict the effects of single mismatches on sgRNA activity. Previous work on CRISPRi off-target effects (Gilbert et al., 2014; Qi et al., 2013) and concurrent work on mismatched sgRNAs in a mammalian context (Jost et al., 2020), identified mismatch position, base substitution, and the GC% of the fully complementary spacer as the strongest determinants of mismatched sgRNA efficacy. We therefore constructed a simple linear model that used one-hot encoded mismatch position (20 parameters), one-hot encoded base substitution (12 parameters), and spacer GC% (1 parameter) to predict the relative efficacy of mismatched gfp-targeting sgRNAs (Figure 2A).
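The 33-parameter feature layout described above (20 position indicators, 12 substitution indicators, 1 GC term) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the 0-based position index, the ordering of substitutions, and the choice to express GC content as a fraction are all assumptions made for the example.

```python
import numpy as np

BASES = "ACGT"
# The 12 ordered substitutions (original base, mismatched base); order is arbitrary.
SUBSTITUTIONS = [(a, b) for a in BASES for b in BASES if a != b]
SPACER_LENGTH = 20


def encode_mismatch(spacer: str, position: int, new_base: str) -> np.ndarray:
    """Build a 33-element feature vector for one singly mismatched sgRNA.

    Layout (an assumption for this sketch):
      [0:20]  one-hot mismatch position (0-based index into the spacer)
      [20:32] one-hot base substitution
      [32]    GC fraction of the fully complementary spacer
    """
    if len(spacer) != SPACER_LENGTH:
        raise ValueError("expected a 20-nt spacer")
    old_base = spacer[position]
    if new_base == old_base:
        raise ValueError("new_base must differ from the original base")

    features = np.zeros(SPACER_LENGTH + len(SUBSTITUTIONS) + 1)
    features[position] = 1.0
    features[SPACER_LENGTH + SUBSTITUTIONS.index((old_base, new_base))] = 1.0
    features[-1] = sum(base in "GC" for base in spacer) / SPACER_LENGTH
    return features


def predict_relative_activity(features: np.ndarray, weights: np.ndarray) -> float:
    """Linear prediction: dot product of features and fitted weights."""
    return float(features @ weights)


example = encode_mismatch("ACGT" * 5, position=3, new_base="A")
print(example.shape, example[-1])
```

Given a matrix of such vectors and measured relative activities, the weights could be fit with any ordinary least-squares routine; fitted values are not reproduced here because they are reported in Table S2.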

Figure 2. Mismatched sgRNA activity is accurately predicted by a simple linear model.

(A) Schematic representation of a simple linear model for predicting the relative activity of mismatched sgRNAs. (B) Distributions of singly mismatched sgRNA relative activities by mismatch position. Each distribution represents 36-93 sgRNAs. (C) Comparison of model parameters for base substitution and the average ΔΔG of the mismatch calculated using a nearest neighbor approximation and the values from (Alkan et al., 2018). (D) The predictions of a linear model trained on GC%, mismatch position, and mismatch identity compared to the measured relative gfp knockdown efficacies of each sgRNA averaged over both species. Inset is a histogram of the differences between predicted and measured knockdown, reflecting both prediction and measurement error: 56% of sgRNAs measured within 0.15 of their predicted activity (red bars). 95% confidence intervals for the predictions were in the range (0.100 to 0.156), indicating consistent and low prediction error. (E) The predictions of the linear model compared to the measured singly mismatched sgRNA association rates (kON) in vitro (Boyle et al., 2017). Grey lines indicate the average (solid) and average +/− 1 SD (dashed) association rate of sgRNAs with mutated PAMs. Since such sgRNAs have no measurable association rate, this represents the detection limit of the assay in (Boyle et al., 2017).

We separately trained this linear 33-parameter model on the E. coli, B. subtilis, or species-averaged relative efficacy of our 1,551 singly mismatched gfp-targeting sgRNAs. Regardless of which data were used for training, model weights for mismatch location, base substitution, and spacer GC% were similar (Figure S5, Table S2), could be used for cross-prediction (Figure S6), and reflected known dCas9 behavior. Model weights for mismatch location indicated decreasing sgRNA efficacy for mismatches closer to the PAM (Figure 2B), as expected based on the mechanism of dCas9 binding and R-loop formation (Gong et al., 2018). Model weights for base substitution were correlated (R2 = 0.60, p < 0.005) to the changes in the free energy of sgRNA-DNA pairing (ΔΔG) caused by the base substitution (Figure 2C). Finally, the negative coefficient assigned to GC% suggests that sgRNAs with high GC% are more tolerant of mismatches, consistent with what has been found in a mammalian system (Jost et al., 2020). Despite the simplicity of this model, the effects of single mismatches were robustly predicted (Figure 2D, Figure S6, species-averaged R2 = 0.56, 11-fold cross-validation mean squared error = 0.10 +/− 0.08). Additionally, when applied to our 1,617 doubly mismatched sgRNAs by assuming that mismatches independently affect sgRNA efficacy (as suggested in Qi et al., 2013), our model accurately predicted the relative efficacy of these sgRNAs (species-averaged R2 = 0.53, Figure S3B).
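One way to read the independence assumption for doubly mismatched sgRNAs is that the two single-mismatch relative activities combine multiplicatively. The sketch below illustrates that reading only; the text does not state the exact combination rule, so the multiplication, the clamping to the range 0 to 1, and the function names are assumptions.

```python
def clamp_unit(value: float) -> float:
    """Restrict a predicted relative activity to the interval [0, 1]."""
    return min(max(value, 0.0), 1.0)


def predict_double_mismatch(activity_first: float, activity_second: float) -> float:
    """Combine two single-mismatch predictions, assuming independent effects.

    Each input is the predicted activity of the corresponding singly
    mismatched sgRNA relative to the fully complementary sgRNA.
    """
    return clamp_unit(activity_first) * clamp_unit(activity_second)


# Two mismatches that each retain part of the full activity.
print(predict_double_mismatch(0.8, 0.5))
```

Under this reading, a double mismatch can never be predicted to be more active than either of its constituent single mismatches.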

To validate a biophysical interpretation of our model, we took advantage of a previously published data set containing measured association rates (kon) of a dCas9-sgRNA complex to 60 singly mismatched and 1130 doubly mismatched DNA sequences (Boyle et al., 2017). Reinterpreting this data as sgRNA mismatches, we compared the measured association rates to the relative sgRNA activity predicted by our model for these orthogonal sgRNAs (Figure 2E, Figure S3C). Our predicted sgRNA activity was highly correlated (R2: single mismatches = 0.71, double mismatches = 0.45) to the kon measured in this in vitro system, supporting the hypothesis that mismatched-CRISPRi functions by reducing the association rate of the dCas9-sgRNA complex for the target DNA, likely by slowing R-loop formation (Gong et al., 2018). Taken together, these data strongly suggest that a simple linear model trained on the relative efficacy of our gfp-targeting singly mismatched sgRNA library can be used to design mismatched sgRNAs with a defined relative activity level targeting any gene.

Measuring the fitness of libraries of mismatched sgRNAs in E. coli and B. subtilis

Using our model of mismatched sgRNA activity, we designed a set of sgRNAs targeted to the essential gene complement of E. coli and B. subtilis (~300 genes in each species, Table S3) and predicted to have a range of activities. We generated large pooled libraries of strains in which each essential gene is targeted by 100 sgRNAs (10 fully matched guides, each with 9 singly mismatched variants, STAR Methods, Figure S1C). Because the large size of these libraries (>30,000 elements) complicates handling and limits multiplexing, we also designed compact libraries in which each essential gene is targeted by 11 sgRNAs (library size ~4,000 elements) to facilitate follow-up experiments. Additionally, for two well-characterized essential genes encoding UDP-GlcNAc 1-carboxyvinyltransferase (E. coli: murA, B. subtilis: murAA) and dihydrofolate reductase (E. coli: folA, B. subtilis: dfrA), we generated comprehensive libraries (at least 47/60 single mismatch variants for each sgRNA within the gene, STAR Methods, Figure S1B). The libraries were grown for 10 doublings, maintaining exponential phase through back-dilution (Figure 3A). We calculated the relative fitness (Kampmann et al., 2013; Rest et al., 2013) of each strain by comparing its relative abundance (quantified by next-generation sequencing of the sgRNA spacers) to the relative abundance of 1,000 non-targeting sgRNAs at the start and end of each experiment (STAR Methods, Table S3). Relative fitness is defined as the number of doublings of any strain relative to the number of wild-type doublings over the time course of the experiment. Strains with a relative fitness of 1 grow as well as wild-type; lower values imply slower growth. Relative fitness was highly reproducible in both species (R2 > 0.9, Figure S7A-B) and was validated by orthogonal measurements of individual strain fitness (Figure S7C).
Our relative fitness values for fully complementary guides were correlated with previously reported measurements (Rousset et al., 2018; Wang et al., 2018) but had greatly expanded dynamic range (Figure S7D-E) due to differences in experimental design. Whereas previous studies were optimized for determining essentiality by quantifying fitness over >15 generations, we optimized our experiments for an expanded dynamic range by quantifying fitness over 10 generations and sequencing to greater depth. This expanded dynamic range enabled the quantification of strong fitness defects, including measurement of negative relative fitness, which indicates active depletion from the pool. CRISPRi targeting of 23 E. coli genes and 24 B. subtilis genes reproducibly (>5 sgRNAs/gene) caused negative relative fitness (Table S4). Consistent with an interpretation of negative relative fitness as lysis, a majority (15/24) of these B. subtilis genes caused lysis (as assayed by microscopy) when targeted with a fully complementary sgRNA (Peters et al., 2016) (Table S4, STAR Methods).
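The definition of relative fitness given above can be made concrete with a short calculation. This is a sketch of the general idea, not the pipeline described in STAR Methods: the function name and arguments are hypothetical, and details such as pseudocounts, read-depth normalization, and replicate averaging are omitted.

```python
import math


def relative_fitness(
    strain_start: float,
    strain_end: float,
    control_start: float,
    control_end: float,
    wildtype_doublings: float = 10.0,
) -> float:
    """Doublings of a strain divided by wild-type doublings.

    The inputs are abundances (e.g. read counts) of the strain and of the
    pooled non-targeting controls at the start and end of the experiment.
    The log2 change in the strain/control ratio equals the difference in
    doublings between the strain and wild-type.
    """
    ratio_change = (strain_end / control_end) / (strain_start / control_start)
    doubling_difference = math.log2(ratio_change)
    return 1.0 + doubling_difference / wildtype_doublings


# A strain whose ratio to controls falls 32-fold completed 5 of 10 doublings.
print(relative_fitness(32, 1, 1, 1))
# A 4096-fold drop exceeds what zero growth would give (1024-fold).
print(relative_fitness(4096, 1, 1, 1))
```

In this formulation a strain that stops dividing entirely has a relative fitness of 0, so a negative value means it was lost from the pool faster than dilution alone would explain, which matches the interpretation of negative relative fitness as active depletion.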

Figure 3. The expression-fitness curves of essential genes in E. coli and B. subtilis can be studied using singly mismatched sgRNAs.

(A) Schematic of the fitness experiment design. Each library was measured in 4-6 replicates and averaged. (B) Distribution of per sgRNA locus (solid lines) and per gene (dashed lines) correlations (Pearson r) for sgRNAs targeting genes in E. coli (orange) and B. subtilis (blue). (C-D) The fitness effects of all fully complementary sgRNA targeting essential genes in E. coli (C) and B. subtilis (D) showing that the identity of the targeted gene is the driving factor in determining the fitness effect of an sgRNA. Genes are arranged in order of median fitness defect. Values represent the mean of 4-6 biological replicates. Individual values for each sgRNA and their standard deviations can be found in Table S3. The standard deviation of relative fitness of our 1,000 non-targeting control sgRNAs was 0.0825 in E. coli and 0.0444 in B. subtilis, indicating that even subtle growth defects of 5-10% can be accurately resolved.

Per gene expression-fitness relationships are robustly quantified using mismatched-sgRNAs

We next assessed whether comparing the activity of sgRNAs predicted from our model to their measured relative fitness would allow us to infer the expression-fitness relationships of essential genes. This requires both that our gfp-trained model accurately predicts relative sgRNA efficacy and that fully complementary sgRNAs have similar efficacy at all loci within a gene.

We first tested the applicability of our gfp-trained model to sgRNAs targeting endogenous genes. Since repression of essential gene expression monotonically decreases cellular fitness, we reasoned that if our model is accurate, predicted sgRNA efficacy should be negatively correlated to relative fitness within a series of sgRNAs targeted to a specific locus. Consistent with this hypothesis, we found that predicted sgRNA activity within series was negatively correlated to the relative fitness of those strains in both E. coli (median r = −0.74, Figure 3B) and B. subtilis (median r = −0.86, Figure 3B), suggesting that relative sgRNA activity was correctly predicted. Weaker correlations in E. coli likely reflect variation in sgRNA plasmid copy number and/or E. coli specific effects (Cui et al., 2018). To further probe the generality of our model, we trained it on the relative fitness effects of our comprehensive mismatched sgRNA libraries (Figure S1B) targeting the essential endogenous dihydrofolate reductase genes: E. coli folA (1,525 sgRNAs) and B. subtilis dfrA (1,281 sgRNAs). Because dihydrofolate reductase abundance is linearly related to fitness above an initial threshold of activity (Bhattacharyya et al., 2016), we interpreted the fitness defects of strains containing mismatched sgRNAs targeting these genes as readouts of knockdown efficacy. We found that when trained on the folA or dfrA data, model weights (Figure S5) and performance (Figure S6) were similar to the gfp-trained model, suggesting that both our model of mismatched sgRNA efficacy and the parameters fit from the gfp data are broadly applicable.
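The within-series check described above amounts to computing a Pearson correlation between predicted activity and measured relative fitness for each sgRNA series, then summarizing across series. A minimal sketch, with hypothetical function names and made-up input values used purely for illustration:

```python
import numpy as np


def series_correlation(predicted_activity, relative_fitness) -> float:
    """Pearson r between predicted activity and fitness within one series."""
    return float(np.corrcoef(predicted_activity, relative_fitness)[0, 1])


def median_series_correlation(series) -> float:
    """Median Pearson r over an iterable of (activity, fitness) pairs."""
    return float(np.median([series_correlation(a, f) for a, f in series]))


# Illustrative (not measured) values: fitness falls as predicted activity rises.
toy_series = [
    ([0.0, 0.3, 0.6, 1.0], [1.00, 0.95, 0.60, 0.20]),
    ([0.1, 0.4, 0.8, 1.0], [0.98, 0.90, 0.70, 0.65]),
]
print(median_series_correlation(toy_series))
```

A strongly negative median r is what an accurate activity model would be expected to produce for essential genes, since stronger knockdown should not improve fitness.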

We next tested whether knockdown efficacy was consistent across targeted loci. Because our model predicts knockdown efficacy with respect to the fully complementary sgRNA (“relative knockdown efficacy”), it can be applied across sgRNA families to determine expression-fitness curves only if fully complementary sgRNAs achieve similar knockdown efficacy at all loci within a gene. Fully complementary sgRNAs targeting gfp and folA/dfrA generated similar levels of knockdown and variability did not correlate with the location of the sgRNA within the gene (Figure S8). To determine if this pattern held true for other endogenous essential genes, we reasoned that differences in knockdown at different loci within a gene would manifest as differences in the fitness effect of fully complementary sgRNAs targeting the same gene (Figure 3C-D). Comparing the variability of the fitness effect of fully complementary sgRNAs targeting the same gene to the overall variability in the fitness effect of fully complementary sgRNAs using sum of squares, we found that between gene variability accounted for 73.3% of total variability in E. coli and 81.4% of total variability in B. subtilis. This suggests that fully complementary sgRNAs targeting the same gene are substantially more similar with regards to their fitness outcomes than fully complementary sgRNAs as a whole, and supports the assumption that fully complementary sgRNAs targeting the same gene have similar levels of activity.
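The sum-of-squares comparison above can be illustrated as a one-way decomposition of variance, in which the between-gene sum of squares is divided by the total sum of squares. The sketch below assumes this standard decomposition; the function name and the toy values are hypothetical and are not data from the study.

```python
import numpy as np


def between_gene_fraction(fitness_by_gene: dict) -> float:
    """Fraction of total fitness variability explained by gene identity.

    fitness_by_gene maps a gene name to the relative fitness values of the
    fully complementary sgRNAs targeting that gene.
    """
    groups = [np.asarray(values, dtype=float) for values in fitness_by_gene.values()]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()

    ss_total = float(((all_values - grand_mean) ** 2).sum())
    ss_between = float(sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups))
    return ss_between / ss_total


# Toy values: sgRNAs agree closely within each gene and differ between genes.
toy = {"geneA": [0.0, 0.2], "geneB": [0.8, 1.0]}
print(between_gene_fraction(toy))
```

A fraction near 1 indicates that sgRNAs targeting the same gene behave alike, which is the property needed to pool sgRNA families when building per gene expression-fitness curves.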

Consistent with these outcomes, predicted sgRNA activity was negatively correlated with cellular fitness for all sgRNAs targeting the same gene in both E. coli (median r = −0.65, Figure 3B) and B. subtilis (median r = −0.75, Figure 3B). Taken together, these analyses strongly suggest that although noise exists in both predicted sgRNA activity and measured relative fitness, we can nonetheless accurately and sensitively probe the expression-fitness relationships of almost all essential genes in E. coli and B. subtilis by comparing the predicted activity of 90 mismatched sgRNAs to their measured fitness using a pooled screening approach.

Expression-fitness relationships are similar within biological processes and between essential homologs

Examining the expression-fitness relationships of the E. coli and B. subtilis essentialomes, we were struck by their diverse and gene-specific nature (Data S1, Data S2, and Table S4). To quantitatively characterize these differences and mitigate the effects of prediction and measurement noise, we first binned the sgRNAs targeting each gene according to their predicted sgRNA activity and calculated the median fitness within each bin (STAR Methods, and Figure S9A-B) using the average fitness of 4–6 replicates for each sgRNA. We then used these simplified representations of per gene expression-fitness relationships (listed for all genes in Table S5) to calculate pairwise distances between all 270 E. coli and all 240 B. subtilis essential genes.
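The binning step and the pairwise comparison described above can be sketched as follows. The exact procedure is given in STAR Methods and is not reproduced here: this example uses fixed, equal-width bins and a root-mean-square distance purely as assumptions, and the function names are hypothetical.

```python
import numpy as np


def binned_medians(predicted_activity, fitness, n_bins: int = 5) -> np.ndarray:
    """Median fitness within equal-width bins of predicted activity (0 to 1).

    Bins that contain no sgRNAs are returned as NaN.
    """
    activity = np.asarray(predicted_activity, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Using interior edges only gives bin indices 0 .. n_bins - 1.
    bin_index = np.digitize(activity, edges[1:-1])
    return np.array([
        np.median(fitness[bin_index == b]) if np.any(bin_index == b) else np.nan
        for b in range(n_bins)
    ])


def curve_distance(curve_a: np.ndarray, curve_b: np.ndarray) -> float:
    """Root-mean-square difference over bins populated in both curves."""
    shared = ~np.isnan(curve_a) & ~np.isnan(curve_b)
    return float(np.sqrt(np.mean((curve_a[shared] - curve_b[shared]) ** 2)))


curve = binned_medians([0.1, 0.15, 0.5, 0.9, 0.95], [1.0, 0.9, 0.7, 0.1, 0.3])
print(curve)
```

Taking the median within each bin reduces the influence of individual sgRNAs whose activity was mispredicted or whose fitness was measured noisily, which is the stated motivation for simplifying the curves before comparing genes.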

Within each organism, we found that the expression-fitness relationships of genes involved in the same biological process (whether defined by KEGG, GO biological process, or COG, all functional annotations in Table S5) were significantly more similar to each other than to those of genes involved in different biological processes, even when excluding gene pairs in the same operon to account for CRISPRi polarity (all p < 10^−16, STAR Methods). Conversely, clustering genes by the shape of their expression-fitness curves produced functional enrichments (Table S6) in both E. coli and B. subtilis. To eliminate the possibility that these similarities were the result of systemic biases in the prediction of mismatched sgRNA activity, we performed these same analyses using only the average per gene fitness effect of fully complementary sgRNAs, which do not depend on our model of mismatched sgRNA activity. We found that the per gene fitness effects of fully complementary sgRNAs were significantly more similar within biological processes (all p < 10^−16, STAR Methods) than between biological processes, although clustering solely by the per gene fitness effects of fully complementary sgRNAs produced fewer functional enrichments. The similarity between expression-fitness relationships within pathways was noted in a previous study that used synthetic promoters to vary the expression of 81 S. cerevisiae genes (Keren et al., 2016). Our results extend this paradigm to the majority of essential functions in bacteria, and provide insights into which pathways are robust and which are sensitive, which we discuss in subsequent sections.

We next compared the expression-fitness relationships of essential gene homologs between E. coli and B. subtilis. In this cross-species comparison, we found that the expression-fitness curves of essential genes were, as a group, more similar to those of their homologs (p < 10^−10) than to those of other genes in the opposing species. This observation was recapitulated using only the fully complementary sgRNAs, although with less significant differences (p < 10^−6). The broad similarities between the expression-fitness relationships of many E. coli and B. subtilis homologs suggest that major homeostatic constraints on essential gene expression are shared between these two species, despite their evolutionary distance.

Expression-fitness relationships of biological processes

To explore the shared optimizations of bacterial essential gene expression more deeply, we examined the expression-fitness relationships of all E. coli and B. subtilis essential genes within three functional categories: cofactor biosynthesis, translation, and cytoplasmic peptidoglycan precursor synthesis (Table S7). Cofactor biosynthesis and cytoplasmic peptidoglycan precursor synthesis were chosen because many genes in these functional categories exhibited shared expression-fitness relationships (Table S6), and translation was chosen because it contained the largest number of essential genes in both E. coli and B. subtilis.

CRISPRi targeting of most essential cofactor biosynthesis genes (KEGG pathways under “Metabolism of cofactors and vitamins”; B. subtilis: 25 genes, E. coli: 36 genes) did not strongly affect fitness in either species after 10 generations (Figure 4A-B). This observation is consistent with the small-colony but non-culturable phenotype of essential cofactor biosynthesis gene deletions (Koo et al., 2017) and suggests that these cofactors, which include riboflavin, menaquinone, NAD, CoA, and others, and/or the enzymes producing them are present in excess of what is required for exponential growth. This buffer may be required to enable rapid shifts in metabolism in response to changing environmental conditions, as has been proposed for enzymes in the pentose-phosphate pathway (Christodoulou et al., 2018). Transcriptional buffering may further fortify cells against CRISPRi targeting of these genes. CRISPRi targeting of the gene encoding dihydrofolate reductase resulted in a relatively strong fitness defect in both organisms (Data S1, Data S2), likely due to the role of this enzyme in recycling dihydrofolate back into tetrahydrofolate following its oxidation during methyl group donation. Additionally, in both E. coli and B. subtilis, the small number of cofactor genes with even stronger phenotypes than dihydrofolate reductase were involved in additional processes, such as fatty acid or nucleotide metabolism, likely accounting for their strong phenotypes (Table S7).

Figure 4. Expression-fitness relationships of essential genes are consistent within biological process and between B. subtilis and E. coli.

Each panel compares the relative fitness to predicted sgRNA activity for all essential genes involved in: cofactor biosynthesis (KEGG pathways under “Metabolism of cofactors and vitamins”) in B. subtilis (A) or E. coli (B); translation (KEGG pathways under “Translation”) in B. subtilis (C) or E. coli (D); peptidoglycan biosynthesis (KEGG pathway ko00550) in B. subtilis (E) or E. coli (F). Each thin line represents the expression-fitness relationship of a single essential gene. The thick line represents the median of all essential genes with the specified functional annotation in the specified organism. A list of all genes in each panel and their relative fitness is available in Table S7. Median absolute distances (MADs) associated with the sliding bin medians for every gene can be found in Table S5. Per gene expression-fitness relationships and the MADs associated with the binned values are illustrated in Data S1 and Data S2.

The robustness of both bacteria to CRISPRi targeting of essential cofactor synthesis genes contrasts with the strong, approximately linear effect of targeting genes involved in translation (KEGG pathways under “Translation”, Figure 4C-D; B. subtilis: 59 genes, E. coli: 62 genes). Although low levels of CRISPRi targeting of these genes do not strongly affect fitness, potentially due to a limited ability to autoregulate through transcriptional derepression, once this buffer is overcome, fitness decreases linearly with increasing sgRNA activity. Previous work has established a linear relationship between growth rate and the number of ribosomes per cell during exponential growth in E. coli, B. subtilis, and other bacteria (Borkowski et al., 2016; Schaechter et al., 1958; Scott et al., 2010). By linearly inhibiting the expression of genes required for large (rpl genes: 18 in B. subtilis, 19 in E. coli) and small (rps genes: 17 in B. subtilis, 16 in E. coli) subunit assembly and function, we likely decrease the number of functional ribosomes, leading to a corresponding linear decrease in growth rate. Moreover, feedback to restore ribosomal protein expression is unlikely because most ribosomal proteins are negatively regulated by their excess relative to rRNA (Nomura et al., 1980; Scott et al., 2014). Depletion of translation factors such as tRNA synthetases (B. subtilis: 22 genes, E. coli: 22 genes) has a similarly linear effect on growth rate (Data S1, Data S2), likely due to slowed elongation rate (Dai et al., 2016), as has been shown for some antibiotics that inhibit translation elongation (Scott et al., 2010). The linear relationship between the expression of proteins involved in translation and growth rate in both species reinforces the universal importance of translational capacity for determining growth rate.

CRISPRi targeting of genes involved in cytoplasmic peptidoglycan (PG) precursor synthesis (KEGG ko00550; B. subtilis: 10 genes, E. coli: 10 genes) also generated strong phenotypes in both species. However, in contrast to the linear expression-fitness relationship of genes involved in translation, PG synthesis genes exhibited bimodal fitness outcomes that depended on predicted sgRNA activity (Figure 4E-F). This bimodality is highlighted by the fitness outcomes of the comprehensive murA- and murAA-targeting libraries (Figure S9A-B), even when considered independently of predicted sgRNA activity (Figure S9C-D). Cells tolerated partial repression of these genes without exhibiting a fitness defect. If expression was sufficiently repressed, these strains lysed (Table S4), as has been described for murA, murG, and mraY inhibition in E. coli (Fransen et al., 2017; Mengin-Lecreulx et al., 1991; Zheng et al., 2008) and for murC, murD, and murG depletion in B. subtilis (Peters et al., 2016). To determine whether the bimodality observed for these genes was due to bimodal CRISPRi activity, we measured the ability of 18 mismatched sgRNAs to repress a murAA-gfp transcriptional fusion in B. subtilis. These measurements were conducted in a B. subtilis strain complemented with non-targeted murAA (STAR Methods) to enable quantification of lethal levels of knockdown and to avoid the potential for transcriptional feedback. Measured knockdown closely tracked the predicted activity of sgRNAs targeting murAA (Figure S9E, Table S11), suggesting that the nonlinear expression-fitness relationship of uncomplemented murAA reflects non-linearly decreasing growth due to MurAA depletion, transcriptional feedback, cell lysis, or other host-specific effects. Previous work in E. coli (Mengin-Lecreulx and van Heijenoort, 1985) and Salmonella typhimurium (Kahan et al., 1974) demonstrated that the levels of enzymes involved in cytoplasmic peptidoglycan precursor synthesis are not affected by growth rate or by fosfomycin, an antibiotic that covalently inactivates MurA, which catalyzes the first committed step in PG synthesis. Additionally, recent work in E. coli, Caulobacter crescentus, and Listeria monocytogenes (Harris and Theriot, 2016) found that low doses of fosfomycin impact cell morphology, as expected if peptidoglycan synthesis is inhibited, but did not impact growth rate. This suggests that the absence of homeostatic regulation in peptidoglycan precursor synthesis is widespread in bacteria. Given this lack of regulation, the dearth of intermediate fitness outcomes in either species upon repression of PG precursor synthesis is notable. It suggests that neither species is able to slow growth rate or up-regulate cytoplasmic PG precursor synthesis in response to reduced flux through this pathway to prevent lysis. It has been proposed that bacteria use peptidoglycan precursor concentration to sense and balance cellular metabolism and growth (Harris and Theriot, 2016). This would be incompatible with direct feedback regulation of cytoplasmic PG precursor synthesis and may explain the sharp transition between growth and lysis.

The expression-fitness relationships of some cell wall synthesis genes differ between E. coli and B. subtilis

Given the similarity between the expression-fitness curves of most essential genes in E. coli and B. subtilis, we reasoned that homologs with substantially different expression-fitness curves may reflect biologically meaningful differences between the two organisms. We identified 9 homologs as significantly different between the two organisms (Table S8, FDR < 0.2).
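To make the FDR threshold concrete, the following is a minimal sketch of false discovery rate control. This section does not name the procedure used, so the Benjamini-Hochberg adjustment shown here is an assumption, and the gene names and p-values are hypothetical placeholders rather than values from Table S8.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down, enforcing monotone q-values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# Hypothetical per-homolog p-values (placeholders, not data from the paper)
pvals = {"geneA": 0.001, "geneB": 0.04, "geneC": 0.03, "geneD": 0.6}
qvals = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))

# Homologs called different at the FDR < 0.2 threshold quoted in the text
different = sorted(g for g, q in qvals.items() if q < 0.2)
```

A permissive threshold such as 0.2 trades a higher expected fraction of false positives for greater sensitivity, which suits a screen intended to nominate candidates for follow-up.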

A majority (7/9) of these genes encoded enzymes involved in peptidoglycan (PG) synthesis and maturation, highlighting that although these pathways are conserved between gram-positive and -negative species, major distinctions exist in how they contribute to the construction of viable cells. Whereas CRISPRi targeting of genes involved in cytoplasmic PG precursor synthesis (Figure 5, group 3) generated bimodal expression-fitness relationships in both E. coli and B. subtilis, CRISPRi targeting of genes encoding enzymes involved in UDP-GlcNAc synthesis, meso-DAP synthesis, and longitudinal cell wall synthesis differentially affected the two species (Figure 5).

Figure 5. The expression-fitness relationships of genes encoding enzymes involved in various steps of cell wall biosynthesis in B. subtilis and E. coli show similarities and differences.

(A) Enzymatic steps in pathway of peptidoglycan synthesis and incorporation, color coded by portion of the pathway. The enzymes involved in the synthesis of UDP-GlcNAc from fructose-6-phosphate are shown in dark blue, the enzymes involved in the synthesis of meso-DAP from aspartate are shown in red, the enzymes involved in the synthesis of Lipid II from UDP-GlcNAc are shown in orange, and the enzymes involved in the Rod PG incorporation system are shown in green. (B) Predicted knockdown vs. relative fitness for all of the essential genes involved in the pathway sections indicated in (A), for B. subtilis (left) and E. coli (right). The relative fitness and associated MADs of individual genes can be found in Table S5.

E. coli was significantly more tolerant of CRISPRi targeting of the mreBCD operon than B. subtilis (Figure 5, group 4). In contrast to B. subtilis, which lysed at intermediate levels of mreBCD knockdown, E. coli exhibited a minimal fitness defect after 10 generations and lysed only after 15 generations (Table S3). This observation is consistent with the small effect of CRISPRi targeting of mrdA (the PBP2 associated with MreBCD) on fitness in E. coli (Data S1), and with previous work that found that the fitness of Enterobacter cloacae is also relatively unaffected by mreBCD CRISPRi targeting (Peters et al., 2019). It is unclear why E. coli and other gram-negative bacteria are less affected by mreBCD CRISPRi targeting than B. subtilis; however, the lack of transcriptional knockdown when targeting E. coli mreC (Reis et al., 2019) suggests that transcriptional buffering through feedback may play a role.

E. coli was also more robust than B. subtilis to CRISPRi targeting of genes required for producing either UDP-GlcNAc (Figure 5, group 1) or meso-DAP (Figure 5, group 2). Whereas B. subtilis lysed when these genes were targeted with high activity sgRNAs, E. coli exhibited a minimal fitness effect after 10 generations (Figure 5, Table S4, and Table S8). Because these genes are known to cause cell lysis when deleted (Kim et al., 2013; McLennan and Masters, 1998), we determined whether transcriptional feedback mediated by divergent regulatory mechanisms (Barreteau et al., 2008; Rodionov et al., 2003) is responsible for the lack of observed phenotype in E. coli, by measuring the ability of 2 fully complementary and 2 mismatched sgRNAs to reduce the expression of E. coli genes encoding enzymes involved in meso-DAP (asd, dapD, dapE), and UDP-GlcNAc (glmS) synthesis using RT-qPCR. We found that both fully complementary and mismatched sgRNAs were effective in significantly reducing the expression of these genes, with mismatched sgRNAs affecting gene expression less than fully matched sgRNAs (Figure 6A). All fully complementary sgRNAs targeting dapD and dapE generated similar knockdown (~20 fold) to a sgRNA targeting rfp, suggesting that if compensatory mechanisms exist, they must be post-transcriptional. In contrast, fully complementary sgRNAs targeting asd and glmS were less efficacious, generating ~5 fold knockdown, suggesting that the robustness of E. coli to CRISPRi targeting of asd and glmS may be due, at least in part, to transcriptional feedback. Consistent with this hypothesis, asd, which encodes an important branchpoint enzyme in amino acid synthesis is multivalently repressed by lysine, threonine, and methionine (Boy and Patte, 1972) and negatively regulated by the small RNA SgrS (Bobrovskyy and Vanderpool, 2016). By de-repressing asd in response to low amino acid levels, these mechanisms may be responsible for the increased robustness of E. coli to asd knockdown. 
Similarly, the robustness of E. coli to CRISPRi targeting of glmS may also be due to feedback regulation by a positively acting sRNA, glmZ, which stabilizes glmS mRNA in response to low intracellular GlcN-6-P levels (Urban and Vogel, 2008), such as those caused by CRISPRi targeting of glmS. Taken together, these results suggest that feedback, mediated by divergent regulatory mechanisms (Barreteau et al., 2008; Rodionov et al., 2003), may contribute to the reduced sensitivity of E. coli to CRISPRi targeting of asd and glmS.

Figure 6. Screening mismatched sgRNA libraries in combination with chemical and genetic perturbations reveals modulators of essential gene requirements.

(A) RT-qPCR measurements of repression by 2 fully complementary and 2 mismatched sgRNAs targeting 4 essential genes and rfp in E. coli show that both mismatched and fully complementary sgRNAs are able to repress transcription. Mismatched sgRNAs (labeled *.mm) repress transcription less efficiently than their fully complementary counterparts. asd and glmS are less efficiently repressed by all 4 sgRNAs (both fully complementary and mismatched), suggesting that transcriptional or post-transcriptional feedback is important in regulating their level. (B) Schematic of the peptidoglycan recycling and synthesis pathway in E. coli. (C-D) Volcano plots comparing the median change in relative fitness for all sgRNAs targeting a gene to the statistical significance of those changes as quantified by a Wilcoxon test in the ΔampG (C) and Δmpl (D) genetic backgrounds. The dashed line represents a Bonferroni-corrected p-value < 0.01.
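The two axes of the volcano plots in panels C-D, and the Bonferroni cutoff drawn as a dashed line, can be sketched as follows. The function name, data layout, and example numbers are hypothetical; the statistical test itself is not implemented here, so p-values are taken as inputs computed elsewhere.

```python
from math import log10
from statistics import median

def volcano_points(fitness_wt, fitness_mut, pvalues, alpha=0.01):
    """Compute volcano-plot coordinates and a Bonferroni cutoff.

    fitness_wt / fitness_mut: {gene: [relative fitness per sgRNA]}, with
    sgRNAs listed in the same order in both genetic backgrounds.
    pvalues: {gene: p-value from a test computed elsewhere}.
    Returns (cutoff, {gene: (median_change, -log10(p), significant)}).
    """
    cutoff = alpha / len(pvalues)  # Bonferroni: alpha divided by number of tests
    points = {}
    for gene, p in pvalues.items():
        deltas = [m - w for w, m in zip(fitness_wt[gene], fitness_mut[gene])]
        points[gene] = (median(deltas), -log10(p), p < cutoff)
    return cutoff, points

# Hypothetical example with two genes and three sgRNAs each
wt = {"g1": [1.0, 0.9, 0.8], "g2": [1.0, 1.0, 0.9]}
mut = {"g1": [0.7, 0.5, 0.3], "g2": [1.0, 0.95, 0.9]}
cutoff, pts = volcano_points(wt, mut, {"g1": 0.001, "g2": 0.5})
```

Using the median change across all sgRNAs for a gene, rather than only the strongest guides, lets intermediate knockdown levels contribute to the estimate.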

Cell wall synthesis in E. coli and B. subtilis also differs in PG recycling (Figure 6B). Whereas E. coli recycles cleaved PG in both exponential and stationary phase, B. subtilis does so only in stationary phase (Johnson et al., 2013). We reasoned that increased robustness of E. coli to knockdown of genes in meso-DAP and UDP-GlcNAc synthesis might be a consequence of the ability of E. coli to supplement de novo synthesis of these compounds with recycled PG. We therefore repeated our small library fitness experiments in E. coli strains deleted for the genes encoding either of the two key enzymes involved in recycling, and compared the relative fitness of essential gene knockdowns in the different backgrounds (Methods, Figure S1C, Table S12). AmpG is the sole permease involved in PG recycling, and its deletion abolishes PG recycling (Johnson et al., 2013). Deletion of ampG did not sensitize E. coli to knockdown of PG synthesis genes (Figure 6C), suggesting that recycled PG does not contribute robustness to knockdown. Deletion of mpl, which acts downstream of AmpG to ligate salvaged tripeptides to UDP-GlcNAc, sensitized E. coli to knockdown of murI and dapA (Figure 6D). These two genes encode enzymes responsible for isomerizing L-Glu to D-Glu (murI) and catalyzing the rate-limiting step in meso-DAP synthesis (dapA, Reverend et al., 1982). D-Glu and meso-DAP are two of the three amino acids in the tripeptide ligated by mpl (the other being L-Ala). The sensitizing and specific effect of an mpl deletion suggests that the flux through PG recycling is important for fitness, while the lack of sensitizing effects in the ampG deletion suggests that the absence of this flux can be overcome. Compensatory regulation in response to a cytoplasmic PG intermediate upstream of mpl ligation (e.g. UDP-GlcNAc) would be activated in response to ampG deletion, but not in response to mpl deletion, potentially explaining our observations. However, the mechanism of such regulation remains to be elucidated.

These experiments also identified a novel phenotype, highlighting the ability of these approaches to generate new biology. Deleting either ampG or mpl de-sensitized E. coli to the depletion of FtsH (Figure 6C-D), an essential protease responsible for balancing flux through lipopolysaccharide biosynthesis and phospholipid synthesis by regulating the stability of LpxC, an enzyme catalyzing the first committed step in lipopolysaccharide synthesis (Ogura et al., 1999). It remains to be determined whether PG recycling affects the balance between lipopolysaccharide biosynthesis and phospholipid synthesis (perhaps by depleting the pool of UDP-GlcNAc), or an ancillary FtsH function. Underscoring the importance of screening multiple levels of essential gene knockdown, no significant epistatic interactions were identified in either strain background when only the two fully complementary sgRNAs targeting each gene were considered. These data highlight the utility of combining mismatched CRISPRi libraries with other genetic backgrounds to identify modulators of essential gene requirements.

Expression-fitness curves are modulated by external perturbations

Bacteria produce some enzymes at higher levels than needed for immediate survival, likely to buffer against future environmental perturbations (Figure 4A-B, Christodoulou et al., 2018). To explore the ability of mismatched sgRNAs to capture shifts in the expression-fitness relationship of individual genes driven by external perturbations, we measured the relative fitness of the comprehensive sgRNA library targeting dfrA in B. subtilis treated with two sub-MIC doses of the antibiotic trimethoprim. Trimethoprim directly inhibits DfrA (Figure 7A) and has been shown to act synergistically with partial knockdown of dfrA (Peters et al., 2016). However, the degree of synergy as a function of dfrA knockdown has not been investigated. We found that a low dose of trimethoprim abolished the initial buffering observed in the untreated strain (Figure 7B). A higher (but still sub-MIC) dose of trimethoprim further depressed the expression-fitness relationship, and caused a phenotype even at the lowest levels of knockdown. These data suggest that the DfrA concentration is buffered against external perturbations and highlight the ability of mismatched sgRNAs to enable exploration of these subtle shifts.

Figure 7. Environmental changes can modulate essential gene requirements.

(A) Schematic of dihydrofolate reductase (dfrABsu) inhibition by trimethoprim. (B) Expression-fitness curve of B. subtilis dfrA in LB (black), LB + 15 ng/ml trimethoprim (red), and LB + 30 ng/ml trimethoprim (purple) showing the expression-dependent synergy between DfrA depletion and trimethoprim.

PERSPECTIVE

Bacterial essentialomes typically consist of several hundred genes encoding the core enzymes central to viability. Lack of precise, high-throughput methods for titrating gene expression in bacteria has precluded an understanding of how they respond to a continuum of essential gene repression. Here we report a modified CRISPRi system that generates graduated, species-independent levels of repression by programming dCas9 with singly mismatched sgRNAs. Leveraging this system, we assessed the fitness effects of titrating essential gene expression for almost all essential genes in E. coli and B. subtilis. These data revealed striking differences between the expression-fitness landscapes of genes and pathways that highlighted shared and unique biological constraints driving the optimization of essential gene expression. Because CRISPRi tools have now been established for many bacteria (e.g. Peters et al., 2019) our approach can be applied to pathogenic and microbiome (e.g. Liu et al., 2017, Rock et al., 2017) strains to inform target selection for drug design and illuminate unique constraints for bacterial growth.

Our comprehensive characterization of mismatched sgRNAs targeting gfp identifies the organism-independent rules that determine mismatched sgRNA activity in bacteria (Figure 1 and Figure 2). By applying these rules, either independently or with the software developed here (Methods), sgRNAs can be readily designed that generate a defined level of knockdown for any bacterial gene. These mismatched sgRNAs enable a range of high-throughput techniques to accelerate biological discovery. First, mismatched sgRNAs allow the rapid generation of reduced expression mutants. Such mutants have been used for drug target discovery and lead optimization in Mycobacterium tuberculosis (Johnson et al., 2019). However, that effort required extensive up-front strain optimization to identify gene-specific expression levels that facilitated identification of chemical-genetic interactions. In contrast, our system enables facile generation and testing of a broad range of repression levels, potentially accelerating drug discovery and target identification. Second, our high-throughput pooled screening methodology to determine relative fitness is amenable to testing the effects of essential gene titration in varying environmental and genetic backgrounds. To simplify the exploration of essential gene requirements in diverse conditions, we constructed smaller (11 sgRNA/gene) libraries for E. coli and B. subtilis essential genes that can be easily screened in varying conditions or transferred into different genetic backgrounds. The reduced complexity of these libraries aids multiplexing while retaining a broad range of phenotypes for most genes in both species (Figure S10). Finally, our approach also allows the use of CRISPRi to measure epistatic interactions between essential and non-essential genes, which is critical for network discovery. 
This approach had previously been hampered by the need to fully repress the non-essential gene (to maximize the chance of a phenotype) while only partially repressing the essential gene (to enable cell survival). This hurdle can be overcome by targeting the essential gene with a mismatched sgRNA, and the non-essential gene with a fully complementary sgRNA.
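As an illustration of the starting point for a mismatched sgRNA library, the sketch below enumerates every singly mismatched variant of a spacer. It is not the design software referenced in the Methods and does not predict the activity of each variant; the spacer sequence and function name are hypothetical.

```python
BASES = "ACGT"

def single_mismatch_variants(spacer):
    """Yield (position, original_base, new_base, variant) for every
    single-nucleotide substitution of an sgRNA spacer."""
    spacer = spacer.upper()
    for pos, original in enumerate(spacer):
        for base in BASES:
            if base != original:
                yield pos, original, base, spacer[:pos] + base + spacer[pos + 1:]

# Hypothetical 20-nt spacer used only for illustration
spacer = "GATTACAGATTACAGATTAC"
variants = list(single_mismatch_variants(spacer))
print(len(variants))  # 3 substitutions at each of 20 positions
```

A 20-nt spacer has 60 possible single mismatches, so a subset spanning the desired range of predicted activities would normally be chosen from this pool.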

Although a mounting body of evidence supports the idea that gene essentiality is a quantitative trait (Rancati et al., 2018), systematically exploring this hypothesis has been challenging due to the lack of universally applicable methods of essential gene titration. Here we firmly establish quantitatively different fitness effects of essential gene depletion by targeting each gene at 10 separate loci using directly comparable CRISPRi methodologies and fitness measurements in two species. This dataset allows meaningful comparisons of expression-fitness relationships across species, and we use it to compare homologous essential genes in E. coli and B. subtilis. The similarity of most expression-fitness relationships between these diverged species underscores potentially shared evolutionary constraints and also highlights the significant differences in UDP-GlcNAc synthesis, meso-DAP synthesis, and the Rod PG synthetic apparatus as targets for further study. In contrast to synthetic promoter-based methods of studying expression-fitness relationships (Keren et al., 2016), which inherently eliminate native transcriptional feedback loops, our method is sensitive to the effects of essential gene regulation, allowing the identification of transcriptional feedback loops responsible for cellular homeostasis. Whether a cell buffers the effects of gene repression by regulatory feedback or by producing an excess of gene product is secondary to the importance of having buffering capacity, which is readily observed for some genes and not others. These studies inform target selection for drug design, illuminate aspects of bacterial growth, and provide a starting point for investigating how bacteria program robustness into their essential gene network.

STAR Methods

Lead contact:

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Carol Gross (cgrossucsf@gmail.com).

Material availability:

  • Plasmids generated in this study are available upon request.

  • Strains generated in this study are available upon request.

Data and code availability:

  • Source data:
    • All raw sequencing data is deposited in the Short Read Archive under accession PRJNA574461.
    • All analyzed fitness data are available in the paper’s Supplemental Information as indicated.
    • All analyzed FACS-seq data are available in the paper’s Supplemental Information as indicated.
  • Code:
    • All custom analysis scripts referenced here and in the Key Resources Table are publicly available.
  • Scripts:
    • The scripts used to generate the figures reported in this paper involved using the “plot” and “image” functions available in the R software package, version 3.5+, available at https://www.r-project.org/, to plot data reported in the Supplemental Information, as described in the figure legends and STAR Methods.
    • Scripts were not used to generate the diagrams reported in this paper, which were manually constructed using Adobe Illustrator.
  • Any additional information required to reproduce this work is available from the Lead Contact.

KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Chemicals, Peptides, and Recombinant Proteins
Lysogeny broth (LB), Lennox | Fisher scientific | Cat# BP1427-2
Bacillus subtilis MC medium | Koo et al., 2017 | N/A
Bacillus subtilis competence medium | Koo et al., 2017 | N/A
IPTG | Denville scientific | Cat# C18280-13
Xylose | |
Ampicillin sodium salt | Sigma-Aldrich | Cat# A9518
Kanamycin sulfate | Sigma-Aldrich | Cat# K1377
Erythromycin | Sigma-Aldrich | Cat# E5389
Spectinomycin dihydrochloride pentahydrate | Sigma-Aldrich | Cat# S9007
Chloramphenicol | Sigma-Aldrich | Cat# C0378
Carbenicillin | Millipore-Sigma | Cat# 205805
Gentamicin sodium salt | Fisher Scientific | Cat# AAJ1605103
Trimethoprim | Sigma-Aldrich | Cat# T7883-5G
Q5 High-Fidelity DNA polymerase | New England Biolabs | Cat# M0493S
HiFi Assembly | New England Biolabs | Cat# E2621L
BsaI-HFv2 | New England Biolabs | Cat# R3733
T4 DNA Ligase | New England Biolabs | Cat# M0202L
Critical Commercial Assays
DNeasy Blood & Tissue Kit | Qiagen | Cat# 69506
Midiprep Kit | Qiagen | Cat# 12143
QIAprep Spin miniprep kit | Qiagen | Cat# 27106
Deposited Data
Raw sequencing data (FASTQs) for relative fitness experiments and FACS-seq experiments | This study | SRA: PRJNA574461
Experimental Models: Organisms/Strains
Bacillus subtilis 168 | BGSC | 1A1
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm) | Peters et al., 2016 | CAG74209
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), amyE∷Pveg-sgRNA(cat) (CRISPRi libraries: sgRNA spacers listed in Table S3) | This study | N/A
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-gfp(Spc) | This study | CAG78920
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-gfp(Spc), pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-rfp(Spc) | This study | CAG78921
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-rfp(Spc), pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), sacA∷Pveg-rfp | This study | CAG78922
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), sacA∷Pveg-rfp, murAA-gfp(Kan) | This study | CAG78923
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), sacA∷Pveg-rfp, murAA-gfp(Kan), thrC∷Pveg-murAA*(Spc) | This study | CAG78924
Escherichia coli BW25113 | Baba et al., 2006 | N/A
Escherichia coli BW25113 Tn7att∷PlLac-O1-dcas9(Gent) | This study | CAG78830
Escherichia coli BW25113 Tn7att∷PlLac-O1-dcas9(Gent), pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S3) | This study | N/A
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-gfp(Cat):yjaB | This study | CAG78108
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-gfp(Cat):yjaB, pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-rfp(Cat):yjaB | This study | CAG78107
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-rfp(Cat):yjaB, pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
10-beta Electrocompetent Escherichia coli | New England Biolabs | Cat# C3020K
Oligonucleotides
Primers used in this study are listed in Table S9 | This study | N/A
Recombinant DNA
pDG1731 | Radeck et al., 2013 | pBS4S (Addgene# 55170)
pDG1731-gfp | This study | N/A
pDG1731-rfp | This study | N/A
pDG1622 | BGSC | ECE119
pJSHA77 | This study | N/A
pJSHA77-rfp | This study | N/A
pJSHA77-gfp | This study | N/A
Software and Algorithms
Bowtie2 | Langmead and Salzberg, 2012 | http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
sgRNA design (fully matched sgRNAs) | This study | https://github.com/traeki/sgrna_design
Linear model training (train_linear_model.py) | This study | https://github.com/traeki/mismatch_crispri
Design a subset of mismatch sgRNA (choose_guides.py) | This study | https://github.com/traeki/mismatch_crispri
FASTQ analysis to calculate sgRNA abundance and relative fitness (count_guides.py, compute_gammas.py, gamma_to_relfit.py) | This study | https://github.com/traeki/mismatch_crispri
FlowJo v10 | FlowJo, LLC |
Other

Experimental model and subject details:

Microbes

Escherichia coli strains were cultured in LB medium at 37°C. Bacillus subtilis strains were cultured in LB medium at 37°C.

Methods details:

General strain manipulations and procedures

Bacillus subtilis strain construction and growth conditions

All B. subtilis strains were constructed in the wildtype 168 background using natural competence as previously described (Koo et al., 2017). For all individual CRISPRi strains and libraries, a recipient strain encoding dcas9 under control of the Pxyl promoter at the lacA locus (strain CAG74209) (Peters et al., 2016), was transformed with an sgRNA plasmid (see “sgRNA plasmid construction”) which recombines in single copy at the amyE locus, selecting for chloramphenicol resistance. In select cases, single- vs. double-crossover events from plasmid integration were distinguished by streaking on starch plates to assay disruption of amyE.

For the GFP knockdown FACS-seq experiments, two modified recipient strains expressing dcas9 were constructed: one encoding gfp (strain CAG78920) and the other encoding rfp (strain CAG78921). To construct these, the dcas9 strain (strain CAG74209) was transformed with pDG1731-gfp or pDG1731-rfp to integrate Pveg-gfp-spc or Pveg-rfp-spc, respectively, at the thrC locus, selecting for spectinomycin resistance. All subsequent transformations of the gfp and rfp-marked strains required threonine supplementation in the competence media (40μg/ml), as thrC is disrupted.

For flow cytometry-based competition experiments (see “Relative fitness validation”), the dcas9 recipient strain was transformed with a modified sgRNA plasmid that also encodes either Pveg-gfp or Pveg-rfp (see “sgRNA plasmid construction”).

A murAA-gfp transcriptional fusion knockdown reporter strain was constructed by sequentially transforming the dcas9 strain (above) with three DNA fragments: constitutively expressed rfp with a removable kanR cassette, integrated at the sacA locus; a murAA-gfp transcriptional fusion with a removable kanR cassette, integrated at the murAA locus; and constitutively expressed non-targeted murAA with a spectinomycin resistance gene, integrated at the thrC locus. The B. subtilis Pveg promoter was used for constitutive expression of rfp and non-targeted murAA. The DNA fragment containing constitutively expressed rfp with a removable kanR cassette was constructed by joining PCR of three fragments: a Pveg-rfp-kanR fragment amplified from pACYC-rfp-kanR, and 1 kb each of the 5′ and 3′ flanking sequences of sacA. The DNA fragment containing the murAA-gfp transcriptional fusion with a removable kanR cassette was constructed by joining gfp-kanR amplified from pACYC-gfp-kanR, 1 kb of the 3′ end of the murAA open reading frame, and 1 kb downstream of the murAA open reading frame. Before the gfp-kanR fragment was integrated downstream of murAA to generate strain CAG78923, the kanR cassette was removed from the rfp strain as described previously (Koo et al., 2017) to generate strain CAG78922. Non-targeted murAA was designed to remove the PAM sequence or alter the sgRNA targeting sequence without changing the amino acid sequence of MurAA. Non-targeted murAA DNA was generated by overlapping PCR with mutagenic primers (Table S9), and its BsiWI/NruI-digested fragment was cloned into pJMP3 (Addgene #79875) digested with BsrGI/PmeI. The cloned plasmid was transformed into the dcas9, rfp, murAA-gfp strain, selecting for spectinomycin resistance, to generate strain CAG78924. Finally, this strain was transformed with sgRNA plasmids as described above.
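The design constraint on non-targeted murAA (disrupt the PAM or target sequence while leaving the protein unchanged) can be checked computationally. The sketch below is a minimal illustration using the standard genetic code; the sequences are short hypothetical examples rather than murAA sequence, and only the top strand is inspected for the GG of an NGG PAM, which is a simplifying assumption.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

def translate(dna):
    """Translate an in-frame DNA sequence whose length is a multiple of 3."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def is_synonymous(original, recoded):
    """True if both in-frame sequences encode the same protein."""
    return len(original) == len(recoded) and translate(original) == translate(recoded)

# Hypothetical in-frame fragment: the CTG|GAA junction contains a TGG (NGG) PAM
original = "ATGCTGGAA"   # Met-Leu-Glu
recoded = "ATGCTCGAA"    # CTG -> CTC keeps Leu but removes the GG dinucleotide

print(is_synonymous(original, recoded), "GG" in recoded)
```

A check of this kind confirms that the complementing copy escapes sgRNA targeting without altering the encoded enzyme.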

Unless otherwise noted, all strain construction and growth assays for B. subtilis were done in LB medium and using antibiotics at the specified concentrations: erythromycin (1μg/ml), spectinomycin (100μg/ml), chloramphenicol (7.5μg/ml), kanamycin (7.5μg/ml).

Escherichia coli strain construction and growth conditions

All CRISPRi library strains were constructed in the wildtype BW25113 background by electroporating an sgRNA plasmid or plasmid pool (see “sgRNA plasmid construction”) into a recipient strain encoding dcas9 (for essential gene knockdown libraries), or dcas9 and gfp or rfp (for GFP knockdown libraries), selecting for ampicillin resistance. Ampicillin resistance is through the expression of a beta-lactamase, which is found in the periplasmic space and is unlikely to affect essential gene knockdowns since any ampicillin that entered the periplasmic space would be inactivated.

For the essential gene knockdown library recipient strain (strain CAG78830), Tn7 transposition was used to integrate a dcas9 expression cassette into the Tn7att site using triparental mating of DAP(diaminopimelic acid)-dependent donors and selecting for gentamicin resistance in the absence of DAP, as previously described (Peters et al., 2019). The dcas9 expression cassette is modified from previously described versions (Peters et al., 2019), contains dcas9 from S. pyogenes (Qi et al., 2013) with a 3X Myc C-terminal tag, and is expressed from the IPTG-inducible promoter PlLac-O1 (Lutz and Bujard, 1997) and regulated by lacIq.

For the GFP knockdown FACS-seq experiments, two recipient strains expressing dcas9 were constructed: one encoding gfp (strain CAG78108) and the other encoding rfp (strain CAG78107). Each was generated by first cloning the constitutive gfp and rfp expression cassettes from pDG1731-gfp and pDG1731-rfp upstream of frt-cat-frt from pKD3 (Datsenko and Wanner, 2000), integrating them into the chromosome between yjaA and yjaB using recombineering (Thomason et al., 2014), and selecting for chloramphenicol resistance. P1 phage transduction (Thomason et al., 2007) was then used to move the gfp-frt-cat-frt or rfp-frt-cat-frt cassettes into BW25113, selecting for chloramphenicol resistance. Chromosomal dcas9 was then introduced to these strains by conjugation using a pseudo-Hfr dcas9 donor, as described previously (Rauch et al., 2017), where dcas9 is expressed by the minimal synthetic promoter PBBa_J23105 (https://parts.igem.org), and transconjugants were selected using gentamicin and chloramphenicol.

For the RT-qPCR experiments, CRISPRi strains expressing sgRNAs targeting peptidoglycan biosynthesis genes were individually reconstructed by electroporating the dcas9 strain (strain CAG78830) with sgRNA plasmids constructed as described below (see “sgRNA plasmid construction”).

For the experiments combining the compact sgRNA libraries with deletions of peptidoglycan recycling pathway genes, the desired deletion alleles (ampG∷kan or mpl∷kan) were isolated from the Keio collection. Briefly, the deletion mutants were isolated and confirmed by PCR of kanamycin cassette junctions, and P1 phage was made from verified strains. Transduction of the dcas9 strain (strain CAG78830) was performed (Thomason et al., 2007) with each phage, selecting for kanamycin resistance, and the resulting strains were transformed with sgRNA plasmid libraries as detailed below (see “Escherichia coli CRISPRi library construction”).

Unless otherwise noted, all strain construction and growth assays for E. coli were done in LB medium and using antibiotic selection at the specified concentrations: ampicillin (100μg/ml), carbenicillin (50μg/ml), gentamicin (10μg/ml), chloramphenicol (25μg/ml), kanamycin (30μg/ml).

Bacillus subtilis CRISPRi library construction

As in the individual CRISPRi strain construction (above), CRISPRi libraries were constructed by transforming sgRNA plasmids into the dcas9 strain. The protocol was modified in one of two ways in order to increase the scale; we found both methods were sufficient to maintain coverage of the pooled plasmids. In one method, cells were grown in B. subtilis competence medium to OD600=1.5, and then incubated with plasmid DNA (300μl cells + 300ng plasmid DNA) in 96-well deep-well plates. Incubations were performed for 2hr at 37C with shaking (900RPM), after which point plates were spun down at 5000g for 10 minutes and resuspended in 2mL LB medium before plating on plates (Falcon #351058) with chloramphenicol at a density ~0.4M CFU/plate and growth overnight at 37C. A second method incubated competent cells (grown in B. subtilis competence medium to OD600=1.5) with plasmid DNA in culture flasks, for 2hr at 37C with shaking (900RPM), after which point cells were spun down in 50ml tubes and resuspended in 2-6ml LB before plating on chloramphenicol plates as before.

To store the transformed CRISPRi library, plates were scraped, pelleted and resuspended in S7 salts (Koo et al., 2017) with 15% glycerol, and stored in 500μL aliquots at −80C.

Escherichia coli CRISPRi library construction

Strain library construction from plasmid libraries was achieved by electroporating plasmid DNA into the recipient strains, and plating on plates (Falcon #351058) with carbenicillin and 0.2% glucose (to repress uptake of residual lactose in LB that can induce the IPTG-controlled dcas9 in the essential gene knockdown strains) at a density ~0.4M CFU/plate and growth overnight at 37C. To store the libraries, plates were scraped, pelleted, and resuspended in 15% glycerol to be stored at −80C.

sgRNA plasmid construction

The sgRNA plasmid pJSHA77 was modified from pDG1662 to increase transformation and double-crossover efficiency. 1.5kb of DNA upstream of amyE was PCR amplified from B. subtilis 168 genomic DNA and inserted into pDG1662 by HiFi Assembly (New England Biolabs #E2621L), replacing the shorter upstream fragment of amyE in pDG1662. Synthetic DNA containing a transcription terminator, an sgRNA driven by Pveg with BsaI cut sites for spacer cloning, and downstream tandem transcription terminators was purchased from IDT and cloned into the previously described pDG1662 derivative by HiFi Assembly (New England Biolabs #E2621L), generating pJSHA77.

Oligonucleotide pools containing the desired elements with flanking restriction sites and library-specific PCR adapters were obtained from Agilent Technologies (Table S9). The oligonucleotide pools were amplified by 15 cycles of PCR using Q5 polymerase (New England Biolabs #M0493S) and custom primers (Table S9). The PCR product was digested with BsaI-HFv2 (New England Biolabs #R3733) and gel purified from 10% TBE gels (Invitrogen #EC6275BOX) to remove adapter ends. pJSHA77 vector was midi-prepped (Qiagen #12143), digested with BsaI-HFv2 for 1hr, and treated with Antarctic phosphatase (New England Biolabs # M0289S). Each ligation was carried out using 100ng of digested vector at a 1:2 (vector:insert) molar ratio for 12 hrs at 16C using T4 DNA Ligase (New England Biolabs #M0202L). Ligations were transformed into electrocompetent cells (New England Biolabs #C3020K), recovered for 1hr at 37C in LB, and then inoculated into 100ml with carbenicillin and grown overnight. Plasmid libraries were collected by midiprep (Qiagen #12143) and analyzed by deep sequencing (Illumina MiSeq #MS-103-1002) to assess cloning efficiency and library diversity.

For individual sgRNA strains, inserts were prepared by annealing two single-stranded DNA oligos together to create the 4-base overhangs, and then annealed inserts were ligated using T4 DNA Ligase (New England Biolabs #M0202L) individually into pJSHA77 digested with BsaI-HFv2 and treated with Antarctic phosphatase (New England Biolabs # M0289S).

For the single-strain competition validation strains, pJSHA77 was first modified to incorporate a constitutively expressed Pveg-gfp or Pveg-rfp using HiFi Assembly (New England Biolabs #E2621L). Strains were then constructed as described above, ligating annealed-pair inserts into the modified vector after digesting with BsaI-HFv2.

sgRNA plasmid library design

Code for designing (fully matched) sgRNA spacers targeting a list of genomic loci can be found at https://github.com/traeki/sgrna_design.

Non-targeting sgRNA controls were designed by creating random 20nt sequences with a distribution of GC content similar to B. subtilis (~45%), and then using bowtie (Langmead et al., 2009) to identify (and subsequently filter out) sgRNAs which aligned (allowing 3 or fewer mismatches) to other intragenic targets in the combined genomes of E. coli and B. subtilis, or any targets in gfp or rfp.

For the libraries targeting all essential genes in B. subtilis, multiple iterations of sgRNA library design (i.e. spacer design), construction, and analysis were used. For B. subtilis libraries, all presented data is from V2 library measurements, with the exception of the trimethoprim experiments which used measurements of the V1 libraries (all data in Table S3).

For the V1 libraries targeting B. subtilis genes we chose target genes to be all those previously identified as essential, putative essential, or low-fitness (Koo et al., 2017; Peters et al., 2016) (Table S3). For every gene in the V1 set, two non-overlapping fully complementary spacers were chosen, each targeting the non-template strand as close to the start of the ORF as possible. For each fully complementary spacer, a set of 25 spacer variants were designed and ordered: 2x the fully complementary spacer, 5x randomly chosen single-mismatches within 7 bases of the PAM, 5x randomly chosen single-mismatches 8-12 bases from the PAM, 3x randomly chosen single-mismatches 13-19 bases from the PAM (to exclude the outermost base), 10x randomly chosen double mismatches 1-19 bases from the PAM. In addition, for every gene the first three non-overlapping template-strand spacers were included.

The V2 B. subtilis libraries included all essential B. subtilis genes as well as a subset of non-essential but fitness-impacting genes (Table S3) from V1 of the library. The V2 E. coli libraries included a majority of genes with evidence for essentiality (Table S3) (Koo et al., 2017). For every gene in this set, ten non-overlapping fully complementary spacers were chosen on the non-template strand, as close to the start of the ORF as possible. For each fully complementary spacer, a set of 10 spacer variants was designed and ordered (for a total of 100 sgRNAs per gene): 1x the original fully complementary spacer, 9x single-mismatches (Figure S1). Single-mismatches were chosen using the following criteria: all possible single-mismatch variants were evaluated by the trained linear model for a predicted sgRNA activity {https://github.com/traeki/mismatch_crispri, train_linear_model.py and choose_guides.py}. These predicted sgRNA activities were categorized into five bins: <10%, >90%, and three equally sized bins between 10% and 90% predicted sgRNA activity. Three sgRNAs were chosen from each of the middle three bins. For the design of all libraries using this strategy, a preliminary version of the linear model was used.
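The bin-based selection of single-mismatch variants can be illustrated with a minimal sketch. It assumes that "equally sized" means equal-width bins and that variants are picked at random within each bin; neither detail is specified above. The function name `choose_by_bins` and its parameters are hypothetical and are not taken from choose_guides.py.

```python
import random

def choose_by_bins(predicted, per_bin=3, lo=0.10, hi=0.90, n_mid=3, seed=0):
    """Pick up to `per_bin` variants from each middle bin of predicted activity.

    predicted: dict mapping variant name -> predicted sgRNA activity (0..1).
    Variants predicted below `lo` or above `hi` fall in the two outer bins
    and are never selected.
    """
    rng = random.Random(seed)
    width = (hi - lo) / n_mid  # assumption: "equally sized" means equal width
    bins = [[] for _ in range(n_mid)]
    for name in sorted(predicted):
        act = predicted[name]
        if lo <= act <= hi:
            idx = min(int((act - lo) / width), n_mid - 1)
            bins[idx].append(name)
    chosen = []
    for members in bins:
        rng.shuffle(members)  # assumption: random choice within a bin
        chosen.extend(members[:per_bin])
    return chosen

# Toy input: 60 hypothetical variants with evenly spread predicted activities.
toy = {f"variant_{i:02d}": i / 59 for i in range(60)}
selected = choose_by_bins(toy)
```

With three variants from each of the three middle bins, this yields the nine single-mismatch variants per fully complementary spacer described above.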

The compact libraries with 11 sgRNAs per gene were selected as above, with the following modifications for both species: for each gene 2x fully complementary sgRNAs were chosen, and 9x single-mismatch variants were selected from among all possible single-mismatch variants of each, using a binning strategy as described above (Figure S1). For E. coli, also as described above, the bins were generated using predicted sgRNA activity. For B. subtilis the bins were instead generated using the measured relative fitness values from the V1 experiment, and the selected sgRNAs were therefore a subset of those used in the V1 library.

The dfrA, gfp, and rfp V1 comprehensive libraries (used in the trimethoprim experiment and all FACS-seq experiments) were designed analogous to the V1 essential gene libraries, with 100 sgRNAs per target: 4x the original fully complementary spacer, 20x randomly chosen single-mismatches within 7 bases of the PAM, 15x randomly chosen single-mismatches 8-12 bases from the PAM, 12x randomly chosen single-mismatches 13-20 bases from the PAM, and 49x randomly chosen double mismatches 1-20 bases from the PAM (Figure S1).

For the V2 comprehensive libraries targeting dfrA, murAA, folA, or murA, we designed all possible non-template spacers, each with all possible single-mismatches, for a total of 60x mismatch variants per fully complementary sgRNA.
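The count of 60 variants follows from 20 spacer positions times 3 alternative bases per position. A minimal sketch of the enumeration is shown below; the function name and the example spacer are hypothetical and are not taken from the library design code.

```python
def single_mismatch_variants(spacer):
    """Return every single-base substitution of `spacer`.

    Each entry is (position, original_base, new_base, variant_sequence).
    """
    variants = []
    for pos, ref in enumerate(spacer):
        for alt in "ACGT":
            if alt != ref:
                variants.append((pos, ref, alt, spacer[:pos] + alt + spacer[pos + 1:]))
    return variants

# Hypothetical 20-nt spacer, used only to illustrate the count.
example_spacer = "ACGTACGTACGTACGTACGT"
example_variants = single_mismatch_variants(example_spacer)
```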

FACS-seq experiments

FACS-seq experimental details

Three separate strain libraries were constructed and mixed together for use in the sorting experiments: a gfp+ strain with the gfp-targeting sgRNA library (mismatch-GFP), a gfp+ strain with the non-targeting sgRNA control library (“high-GFP” or “control sgRNA” in figure), and a gfp− strain with the rfp-targeting sgRNA library (“no-GFP” or “dark control”) (Figure 1A). Glycerol stocks of each library were fully thawed, inoculated into replicate 12.5ml cultures of LB (B. subtilis) or LB with ampicillin (E. coli) at 0.01 OD600, and allowed to grow for 2.5-3hr. Then cultures were back-diluted to 0.01 OD600 in LB with 1% xylose (B. subtilis) or LB with ampicillin (E. coli) and grown for 2.5hr. Immediately before sorting the cultures were mixed at a ratio reflecting the overall diversities of their libraries (40% mismatch-GFP, 40% no-GFP, 20% high-GFP), and then the mixture was diluted 1:10 in PBS at room temperature (B. subtilis) or on ice (E. coli).

Sorting was done on the mixed cultures using a BD FACSAria II (Laboratory for Cell Analysis in Helen Diller Family Comprehensive Cancer Center at UCSF), using the blue laser (488 nm) and the FITC detector (530/30 nm), and at a flow rate of 5 and collecting for 20min total. Post-sorting the collected bins were filtered using either cellulose nitrate membranes with 2um pore (Thermo Scientific #145-0020) or mixed cellulose esters 0.22um pore disc filters (MF-Millipore #GSWP02500) on a glass filtration apparatus. Filters were resuspended in 9ml LB (B. subtilis) or LB with ampicillin (E. coli) by vortexing at max speed for 30s, then split into two outgrowth cultures and grown overnight in 4ml LB (B. subtilis) or LB with ampicillin (E. coli). A portion of the input mixed sample (i.e. pre-sorting) was treated similarly and grown overnight. DNA was extracted from each outgrowth culture separately and analyzed by deep sequencing as described above.

FACS-seq analysis

For each species, two biological replicates (i.e. cultures starting from unique glycerol stocks) were sorted by FACS, and from each biological replicate’s 4 bins (plus unsorted mixture) two technical replicates (i.e. two overnight outgrowth cultures from which DNA was extracted) were sequenced. Library spacers were counted in each sequenced sample, normalized to the sample’s total number of spacers counted, and technical replicate normalized counts were added together. To correct for differences in sequencing depth and cell number differences between bins, for each biological replicate we used a linear model to determine a coefficient for each bin such that the sum of counts from all bins for each sgRNA was as similar as possible to the unsorted (mixed) sample. Briefly, we used the sklearn package (sklearn.linear_model) in Python and applied it to the mixed sample after removing from it the top and bottom 5th percentiles.
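The per-bin correction can be sketched as an ordinary least-squares problem: find one coefficient per bin so that the weighted sum of bin counts approximates the unsorted counts. The sketch below uses `numpy.linalg.lstsq` in place of `sklearn.linear_model`, fits no intercept, and interprets the percentile trimming as dropping sgRNAs whose unsorted count lies outside the 5th to 95th percentile. All three choices are assumptions, and the function name is hypothetical.

```python
import numpy as np

def fit_bin_coefficients(bin_counts, unsorted_counts, trim_pct=5.0):
    """Fit one scaling coefficient per sorted bin.

    bin_counts: array of shape (n_sgRNAs, n_bins) with normalized counts.
    unsorted_counts: array of shape (n_sgRNAs,) from the unsorted mixture.
    """
    bin_counts = np.asarray(bin_counts, dtype=float)
    unsorted_counts = np.asarray(unsorted_counts, dtype=float)
    # Assumption: trim sgRNAs with extreme abundance in the unsorted sample.
    lo, hi = np.percentile(unsorted_counts, [trim_pct, 100.0 - trim_pct])
    keep = (unsorted_counts >= lo) & (unsorted_counts <= hi)
    # Solve bin_counts @ coef ~= unsorted_counts in the least-squares sense.
    coef, *_ = np.linalg.lstsq(bin_counts[keep], unsorted_counts[keep], rcond=None)
    return coef
```

Multiplying each bin's normalized counts by its fitted coefficient then puts the four bins on a common scale before enrichment is computed.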

We sought to define a metric for enrichment in the GFP-high bins vs. the GFP-low bins that would be similar in scale to relative fitness. We define an enrichment ratio (ER) for each sgRNA as:

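$$\mathrm{ER} = \tfrac{3}{3}\,\mathrm{n.normBin}_4 + \tfrac{2}{3}\,\mathrm{n.normBin}_3 + \tfrac{1}{3}\,\mathrm{n.normBin}_2 + \tfrac{0}{3}\,\mathrm{n.normBin}_1$$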

where n.normBin_i is the normalized count in Bin i, and Bin1 has the lowest GFP fluorescence while Bin4 has the highest. By this metric, values close to 1 have the highest GFP fluorescence (or weakest sgRNA activity) and values <1 have lower GFP fluorescence (or stronger sgRNA activity). Enrichment scores were normalized on a per experiment basis by subtracting the mean enrichment score of the “dark controls” and dividing by the mean enrichment score of the “high-GFP” strains. The resulting scores for each sgRNA (called the “FACS-seq score” in the main text) are available in Table S1.

FACS-seq validation

To validate our sorting procedure and the relationship between the calculated FACS-seq score and the fluorescence of a single strain, we randomly isolated 9 strains from the E. coli GFP knockdown library and analyzed them by flow cytometry to quantify knockdown relative to a non-targeting sgRNA (Figure S2E-H). Strains were grown in deep 96-well plates in 300ul LB overnight, diluted back and grown to ~0.4 OD600 before measurement. Briefly, data was collected on a LSRII flow cytometer (BD Biosciences) using the blue laser (488 nm) and the FITC detector (530/30 nm). Data for at least 20,000 cells were collected, and median fluorescence values were extracted using FlowJo (FlowJo, LLC). Data from representative samples were plotted as histograms using FlowJo to confirm that single-cell fluorescence was unimodal within the population (Figure S2E-H). sgRNA plasmids were miniprepped (Qiagen #27106) from each library isolate and Sanger sequenced to ascertain their identity in the library experiment. To assay the behavior of the same sgRNAs in B. subtilis, the miniprepped plasmid was transformed into B. subtilis as described above, double-crossover events were verified by streaking on starch plates, and the strains were analyzed by flow cytometry as described above. All relative fluorescence measurements are provided in Table S1 and plotted in Figure S2A.

Linear model of singly mismatched sgRNA efficacy

Having measured the ability of ~1,600 singly mismatched sgRNAs to knock down GFP expression, we sought to build a model to predict the effect of mismatches on sgRNA efficacy. Since an enrichment score of 1 represents maximal GFP fluorescence, and a score of 0 represents no GFP fluorescence, we define knockdown for each sgRNA as:

$$\mathrm{knockdown}_{\mathrm{sgRNA}} = 1 - \text{FACS-seq score}$$

We then normalized the ability of each mismatched sgRNA to knock down GFP to that of its equivalent fully complementary sgRNA using the equation below:

$$\mathrm{sgRNA\ activity}_{\text{singly mismatched sgRNA}} = \frac{\mathrm{knockdown}_{\text{singly mismatched sgRNA}}}{\mathrm{knockdown}_{\text{fully complementary sgRNA}}}$$

We next built a model that fit the activity of each sgRNA using the position of the mismatch (from 0 to 19, with 19 being PAM proximal, one hot encoded), the transition of the mismatch (from X to Y, one hot encoded), and the GC% of the fully complementary sgRNA. Mismatched sgRNAs were excluded from the analysis if they were variants of fully complementary sgRNAs with less than 0.5 knockdown (as described above). The parameters from this model trained on E. coli, B. subtilis, or species-averaged per sgRNA activity are presented in Table S2 and Figure S5, the raw data in Table S1.
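The feature encoding and fit can be sketched as follows. This is a minimal illustration, not the code in train_linear_model.py: it assumes spacers are written 5' to 3' so that string index 19 is PAM-proximal, encodes the substitution as parent base followed by variant base, adds an intercept, and uses `numpy.linalg.lstsq` as the solver. All function names are hypothetical.

```python
import numpy as np

BASES = "ACGT"
# The 12 possible X->Y substitutions, e.g. "AC" means A replaced by C.
SUBSTITUTIONS = [a + b for a in BASES for b in BASES if a != b]

def sgrna_activity(score_mismatch, score_parent):
    """Knockdown of the mismatched sgRNA relative to its parent (knockdown = 1 - score)."""
    return (1.0 - score_mismatch) / (1.0 - score_parent)

def featurize(parent, variant):
    """One-hot mismatch position (20) + one-hot substitution (12) + parent GC fraction (1)."""
    diffs = [i for i, (p, v) in enumerate(zip(parent, variant)) if p != v]
    if len(parent) != 20 or len(diffs) != 1:
        raise ValueError("expected a 20-nt spacer with exactly one mismatch")
    pos = diffs[0]
    x = np.zeros(20 + 12 + 1)
    x[pos] = 1.0
    x[20 + SUBSTITUTIONS.index(parent[pos] + variant[pos])] = 1.0
    x[-1] = (parent.count("G") + parent.count("C")) / len(parent)
    return x

def fit_linear(X, y):
    """Least-squares fit with an intercept column appended; returns the weights."""
    X1 = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef
```

Because the one-hot blocks and the intercept are collinear, the least-squares solution is not unique; `lstsq` returns the minimum-norm weights, whereas a regularized fit would make a different choice.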

Predicted sgRNA activity validation

To validate the linear model’s ability to predict sgRNA activity based on sgRNA sequence, we measured the knockdown of a murAA-gfp transcriptional fusion in a B. subtilis strain that was complemented by a non-targeted copy of murAA. These strains also expressed a chromosomal rfp that allowed for calculation of the GFP/RFP ratio on a per cell basis. Strains were grown as described above (FACS-seq validation), with the exception that dcas9 was induced using 1% xylose after dilution. Data was collected on a LSRII flow cytometer (BD Biosciences) using the blue laser (488 nm) and the FITC detector (530/30 nm) for GFP detection, and the yellow/green laser (561 nm) and the PE-Texas Red detector (610/20 nm) for RFP detection. Data for at least 20,000 cells were collected, and the per-cell GFP/RFP ratios as well as the population median GFP/RFP ratios were extracted using FlowJo (FlowJo, LLC). Relative knockdown was normalized to a murAA-gfp strain lacking an sgRNA, after first subtracting the background GFP fluorescence from a non-fluorescent B. subtilis strain. Relative GFP fluorescence measurements are provided in Table S11.

Relative fitness experiments

Relative fitness experimental details

Glycerol stocks of the B. subtilis essential-gene library (V1 or V2), the dfrA and murAA libraries (V1 or V2), and the library of non-targeting control sgRNAs were fully thawed, mixed, and inoculated into 150 mL cultures of LB at a combined OD600 of 0.01 (5% control, 75% essential-gene library, 10% dfrA library, 10% murAA library). This culture was allowed to grow to OD600 0.1, at which point the culture was back-diluted to OD600 0.01 in fresh 150 mL culture of LB + 1% xylose. This culture was then grown to OD600 0.3 (~5 doublings), back-diluted to OD600 0.01 in LB + 1% xylose, and grown to OD600 0.3 (total ~10 doublings). Samples were collected a) immediately before back dilution into xylose and b) after the final growth phase, ~10 doublings apart (Figure 2A). The trimethoprim experiments were carried out in an identical manner, except that both 1% xylose and trimethoprim (Sigma-Aldrich #T7883-5G) (0 ng/mL, 15ng/mL, or 30ng/mL) were added from the first back-dilution and maintained throughout growth. Concentrations of trimethoprim were chosen such that wildtype growth rate was unaffected.

Fitness experiments for the E. coli V2 libraries were carried out in an identical manner to the B. subtilis fitness experiments with the following exceptions: all growth occurred in the presence of ampicillin, and induction was achieved with 1mM IPTG instead of 1% xylose.

For both B. subtilis and E. coli, compact library experiments were carried out in an identical manner as the larger scale fitness experiments above, save that the volume of cultures was 15mL, and only compact libraries and non-targeting control libraries were mixed together (90% compact library, 10% controls).

At the desired time points, B. subtilis cultures were collected (1ml) by pelleting (9000xg 2min) and genomic DNA was extracted using the DNeasy Blood & Tissue kit (Qiagen #69506) with the recommended gram-positive pre-treatment and RNAse A treatment. For the E. coli fitness experiments, E. coli cultures were collected (4ml) by pelleting (20000xg 2min) and plasmid DNA was extracted using the QIAprep Spin miniprep kit (Qiagen #27106). sgRNA spacer sequences were amplified from gDNA (200ng, B. subtilis) or plasmid DNA (400ng, E. coli) using Q5 polymerase (New England Biolabs #M0493S) for 14x cycles using custom primers containing TruSeq adapters and indices (Table S9), followed by gel-purification from 8% TBE gels (Invitrogen #EC62152BOX), and sequencing on HiSeq 4000 with single-end 50bp reads at the UCSF Center for Advanced Technology using a custom sequencing primer (Table S9).

Relative fitness analysis

Raw FASTQ files were aligned to the library oligos and counted using {https://github.com/traeki/mismatch_crispri, count_guides.py}, and relative fitness was calculated using {https://github.com/traeki/mismatch_crispri, compute_gammas.py and gamma_to_relfit.py}. For each strain (x) with at least 100 counts at t0 we calculate the relative fitness F(x) according to:

$$F(x) = \frac{\log_2\!\left(\dfrac{r_{wt}(t_0)\,r_x(t_{10})}{r_{wt}(t_{10})\,r_x(t_0)}\right)}{g_{wt}} + 1$$

where $r_x(t_i)$ is the fraction of strain $x$ in the population at time $t_i$ and $g_{wt}$ is the number of generations of wildtype growth in the experiment. A derivation of this equation can be found in (Keren et al., 2016) and (Rest et al., 2013). In our experiments, $g_{wt}$ is calculated from the OD measurements of the culture, and $r_{wt}(t_i)$ is calculated as the median of 1000 non-targeting control sgRNAs from that sample. For strains with at least 100 counts at $t_0$ and 0 counts at $t_{10}$, we set:

$$\log_2\!\left(\frac{r_x(t_{10})}{r_x(t_0)}\right) = \log_2\!\left(\frac{1}{r_x(t_0)}\right)$$

Finally, the relative fitness measurements of each sgRNA were averaged across samples (B. subtilis experiments: 6 replicates, E. coli experiments: 4 replicates) to calculate the final relative fitness value and standard deviation (Table S3).
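The relative fitness calculation for a single sample can be sketched as below. This is an illustration under stated assumptions, not the code in compute_gammas.py or gamma_to_relfit.py: strain fractions are computed from raw counts, the wildtype fraction is the median over the control sgRNAs, the number of generations is summed over growth phases from OD600 ratios, and strains with zero counts at the final time point are skipped rather than given the special-case value. All names are hypothetical.

```python
import math
from statistics import median

def generations_from_od(phases):
    """Sum of doublings over growth phases, each given as (OD_start, OD_end)."""
    return sum(math.log2(end / start) for start, end in phases)

def relative_fitness(counts_t0, counts_t10, control_ids, g_wt, min_t0=100):
    """Return {strain: F(x)} for strains with at least `min_t0` counts at t0."""
    total0 = sum(counts_t0.values())
    total10 = sum(counts_t10.values())
    frac0 = {k: c / total0 for k, c in counts_t0.items()}
    frac10 = {k: counts_t10.get(k, 0) / total10 for k in counts_t0}
    # Wildtype fraction: median over non-targeting control sgRNAs.
    rwt0 = median(frac0[c] for c in control_ids)
    rwt10 = median(frac10[c] for c in control_ids)
    fitness = {}
    for strain, count in counts_t0.items():
        if count < min_t0 or frac10[strain] == 0:
            continue  # zero-count strains need the special case; not handled here
        ratio = (rwt0 * frac10[strain]) / (rwt10 * frac0[strain])
        fitness[strain] = math.log2(ratio) / g_wt + 1.0
    return fitness
```

Under this definition a strain that tracks the controls has a relative fitness of 1, and a strain whose abundance relative to the controls falls 2^10-fold over 10 wildtype generations has a relative fitness of 0.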

Detection limits of relative fitness measurements

Our relative fitness experiments seek to quantify the number of doublings each strain experiences during the course of the experiment, relative to the number of doublings a wild-type (or a non-targeting sgRNA control) strain experiences during this time. To do so, we measure the bulk growth of the population, and quantify the relative abundance of each strain at the start and end of each experiment via next-generation sequencing. Changes in the relative abundance of a strain are determined by the growth rate of the individual strain relative to the population as a whole. For example, there is a 2^10 ≈ 1,000-fold increase in the number of cells during a 10-doubling experiment. Therefore, cells that do not divide (but remain intact) will experience a 1,000-fold decrease in relative abundance.

Our ability to measure the relative abundance of strains is constrained by sequencing depth. Assuming an equal number of reads at the start and end of the experiment, measurement of a 1,000-fold decrease in relative abundance requires that a strain have at least 1,000 reads at the start of the experiment. A poorly represented strain (e.g. 50 read counts at the start of the experiment) cannot decrease 1,000-fold and be meaningfully measured.
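The dependence of the detection limit on starting read count can be made concrete with a short calculation. The sketch assumes equal sequencing depth at both time points, an unchanged control fraction, and a floor of one read at the end of the experiment; the function name is hypothetical.

```python
import math

def fitness_floor(reads_t0, generations, min_reads_end=1):
    """Lowest relative fitness distinguishable for a strain with `reads_t0` starting reads."""
    return 1.0 + math.log2(min_reads_end / reads_t0) / generations

floor_deep = fitness_floor(1024, 10)   # well-represented strain
floor_shallow = fitness_floor(50, 10)  # poorly represented strain
```

Under these assumptions a strain with 1,024 starting reads can be followed down to a relative fitness of 0 in a 10-generation experiment, whereas a strain with 50 starting reads bottoms out near 0.44, so stronger defects cannot be resolved.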

Previously reported pooled fitness experiments of CRISPRi libraries in E. coli prioritized sensitivity to slight growth defects over quantifying the extent of a strong fitness defect (Figure S6D-E). To do so, these experiments were run for many generations (15+) and were sequenced with relatively less depth (median counts ~100). This limited their ability to quantify strong fitness effects. In contrast, this study prioritized quantification of the full range of possible fitness outcomes. As a result, our experiments were run for 10 generations and deeply sequenced (median counts > 1,000), allowing us to quantify a broad range of fitness defects.

Many strains were abundant enough at the start of the experiment to allow accurate quantification of decreases greater than 2^10 ≈ 1,000-fold. These events (relative fitness < 0) represent active depletion from the pool.

Relative fitness validation

To validate the practice of using pooled growth measurements as an approximation of relative fitness, we also measured the relative fitness of individual dfrA knockdown strains grown in the presence of a wildtype strain. For each dfrA sgRNA, the spacer was cloned separately into pJSHA77-gfp and pJSHA77-rfp, each transformed into the dcas9 strain, and then competed against a wildtype constitutively expressing the opposite fluorophore (i.e. strains with a dfrA sgRNA and expressing gfp were competed against a wildtype expressing rfp). Strains were mixed at a starting OD600 of 0.01 in 300μL of LB in four replicate wells of a 96-well deep-well plate, covered with a breathable film, and grown shaking at 900 RPM at 37C. Cells were diluted to OD600 0.01 in fresh LB with 1% xylose and grown again (900 RPM, 37C) to OD600 0.3. Immediately after each back-dilution (and at end of experiment) the previous plate was fixed with 50μL of 37% formaldehyde per well, incubated for 10min at room temperature, and quenched with 50μL of 2.5 M glycine. The quenched reaction was diluted 1:20 into 1X PBS before measurement by flow cytometry (LSRII, BD Biosciences) using the blue laser (488 nm) and the FITC detector (530/30 nm) for GFP detection, and the yellow/green laser (561 nm) and the PE-Texas Red detector (610/20 nm) for RFP detection. Data for at least 20,000 cells were collected, and thresholds based on control wells were used to define the GFP+ and RFP+ populations to determine the ratio of each population in each sample using FlowJo (FlowJo, LLC). All calculated relative fitness measurements from this validation experiment are provided in Table S3.

Relative expression measurements

Growth and RT-qPCR

Reconstructed E. coli CRISPRi strains targeting genes involved in peptidoglycan precursor biosynthesis were grown in triplicate from single colonies in pre-warmed 4ml LB + ampicillin for 2.5hrs before back-dilution (1:80) in pre-warmed 4ml LB + ampicillin + 1mM IPTG and growth for 3hr prior to collection (OD600 ~ 0.2). The control strains express rfp and harbor sgRNA plasmids expressing sgRNAs targeting either rfp or gfp (“non-targeting”) and were treated identically. Samples were collected (300ul) in 900ul TRIzol-LS (Thermo Fisher #10296010) and stored at −20C overnight. The following day RNA was extracted according to the TRIzol protocol. RNA was quantified using a NanoDrop 2000c Spectrophotometer (Thermo Scientific) to normalize input (500ng input / 20ul reaction). For each RT-qPCR probe set and each sample replicate, reactions were performed in triplicate.

All RT-qPCR assays were done using the Luna Universal One-Step RT-qPCR kit (New England Biolabs #E3005S) according to its RT and cycling protocols, in 96 well PCR plates (Neptune #3732.X) and measured on a CFX Connect Real-Time System (Bio-Rad).

RT-qPCR analysis

Standard curves for each primer pair were first assessed on serially diluted RNA (extracted from the CRISPRi control strain) to confirm single melting peaks, strong correlations of technical replicates, and to calculate their efficiencies (in accordance with (Bustin et al., 2009)). The relative expression (or Normalized Relative Quantity (NRQ)) of each gene of interest in each experimental sample was calculated according to (Hellemans et al., 2007), which uses the geometric mean of two reference genes (here: atpB and recA) to normalize the probe of interest within each sample, and further calculates the fold-change in relative expression compared to a wildtype strain. The “non-targeting” rfp+ strain was considered the wildtype for the normalization of all other strains.
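The normalized relative quantity can be sketched as follows, using the general form of the Hellemans et al. (2007) model: the efficiency-corrected quantity of the gene of interest divided by the geometric mean of the efficiency-corrected quantities of the reference genes. Efficiencies are expressed as the amplification base (2 corresponds to 100% efficiency) and ΔCt is taken as Ct(calibrator) − Ct(sample). The function name and the example values are hypothetical.

```python
import math

def nrq(e_goi, dct_goi, references):
    """Normalized relative quantity.

    e_goi: amplification base of the gene of interest (2.0 = 100% efficiency).
    dct_goi: Ct(calibrator) - Ct(sample) for the gene of interest.
    references: list of (amplification_base, delta_ct) for the reference genes.
    """
    ref_quantities = [e ** dct for e, dct in references]
    ref_geomean = math.prod(ref_quantities) ** (1.0 / len(ref_quantities))
    return (e_goi ** dct_goi) / ref_geomean

# Hypothetical example: target Ct rises by 2 cycles, both references unchanged.
example_nrq = nrq(2.0, -2.0, [(2.0, 0.0), (2.0, 0.0)])
```

In this example the target is four-fold lower than in the calibrator, so the NRQ is 0.25.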

Expression-fitness relationship analysis

Quantifying similarity between fully complementary guides targeting the same gene

Our gfp-based model predicts the activity of singly mismatched sgRNAs relative to the activity of the fully complementary sgRNA from which they are derived. To use this relative sgRNA activity as a proxy for absolute activity, fully complementary sgRNAs targeting the same gene should have the same activity. Since we cannot easily measure the sgRNA activity directly when targeting endogenous essential genes, we reasoned that we could validate this assumption by comparing the fitness effects of fully complementary sgRNAs targeting the same gene (plotted in Figure S8).

To determine whether fully complementary sgRNAs targeting the same essential gene had similar effects, we compared the total sum of squares (totalSS) to the within gene sum of squares (withinSS) for fully complementary sgRNAs targeting essential genes. In E. coli, the withinSS accounted for 26.7% of the totalSS and in B. subtilis the withinSS accounted for 18.6% of the totalSS. This suggests that fully complementary sgRNAs targeting the same gene are substantially more similar with regards to their fitness outcomes than fully complementary sgRNAs as a whole, and supports the assumption that fully complementary sgRNAs targeting the same gene have similar levels of activity.
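The comparison of within-gene and total sums of squares can be sketched as below. The input is assumed to map each gene to the relative fitness values of its fully complementary sgRNAs; the function name is hypothetical.

```python
def within_ss_fraction(fitness_by_gene):
    """Return withinSS / totalSS for fitness values grouped by gene."""
    all_values = [v for values in fitness_by_gene.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    # Total sum of squares: deviation of every sgRNA from the grand mean.
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    # Within-gene sum of squares: deviation of every sgRNA from its gene mean.
    within_ss = 0.0
    for values in fitness_by_gene.values():
        gene_mean = sum(values) / len(values)
        within_ss += sum((v - gene_mean) ** 2 for v in values)
    return within_ss / total_ss
```

A fraction near 0 means sgRNAs targeting the same gene agree closely, while a fraction near 1 means gene identity explains little of the variation.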

Expression-fitness relationship analysis details

In order to quantitatively assess the expression-fitness relationship of genes targeted by the V2 E. coli and B. subtilis libraries, we developed a per gene pipeline, described below.

  1. In general, fully complementary sgRNAs targeting the same gene had similar fitness effects (Figure S8), suggesting that all fully complementary sgRNAs induce a similar level of knockdown. We identified outlier sgRNAs that were significantly less effective at inducing a fitness defect (and therefore were likely to be ineffective at knocking down their target) by comparing the distribution of fitness values for each series (series = the fully complementary sgRNA and its 9 singly mismatched variants) to the fitness distribution of the remaining series targeting the same gene. Using a two-sided t-test, we assessed whether the distribution of their relative fitness values was significantly different (p < 0.05) from the relative fitness distribution of the remaining sgRNAs targeting the gene. If their distribution was significantly different and their mean relative fitness was higher than the other sgRNAs targeting the same gene, we surmised that the fully matched sgRNA was likely not functional and excluded its series from further analysis.

  2. We next predicted the sgRNA activity of all sgRNAs using the model of sgRNA efficacy described above trained on the two species averaged GFP data also described above. Consistent with the definition of sgRNA activity above, fully complementary sgRNAs were assigned an sgRNA activity of 1.

  3. We binned sgRNAs that passed our filter (Step 1) based on their predicted sgRNA activity (bin width = 0.2, bin spacing = 0.05, for a total of 17 bins), and within each bin we calculated the median relative fitness. A fully healthy (relative fitness = 1, predicted sgRNA activity = 0) pseudocount was included for each gene. Per sgRNA and per gene patterns are shown in Data S1 (E. coli) and Data S2 (B. subtilis). Per gene bin medians for essential genes can be found in Table S5.

Per gene bin medians were used in all analyses of gene expression-fitness relationship similarity.
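Step 3 of the pipeline can be sketched as overlapping windows over predicted sgRNA activity. The sketch assumes that the Step 1 filter has already been applied, that window edges are inclusive, and that empty windows are reported as `None`; the function name is hypothetical.

```python
from statistics import median

def bin_medians(points, width=0.2, spacing=0.05, n_bins=17):
    """Median relative fitness in overlapping bins of predicted sgRNA activity.

    points: iterable of (predicted_activity, relative_fitness) for one gene.
    """
    # Pseudocount: a fully healthy point at zero predicted activity.
    pts = [(0.0, 1.0)] + list(points)
    eps = 1e-9  # tolerance so that floating-point bin edges stay inclusive
    medians = []
    for k in range(n_bins):
        lo = k * spacing
        hi = lo + width
        values = [fit for act, fit in pts if lo - eps <= act <= hi + eps]
        medians.append(median(values) if values else None)
    return medians

# Hypothetical gene with two sgRNAs.
example_curve = bin_medians([(1.0, 0.2), (0.5, 0.8)])
```

With a width of 0.2 and a spacing of 0.05, window starts run from 0 to 0.8, which gives the 17 bins stated above.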

Per gene and per sgRNA family correlation

For both E. coli and B. subtilis, per sgRNA family correlations were calculated for sgRNA families that passed the filter described above, had at least one relative fitness value less than 0.7, and had measurements for at least 6/10 possible sgRNAs. Similarly, per gene correlations were calculated for genes with a last-bin fitness less than 0.7, using sgRNA families that passed the filter described above.

Gene expression-fitness relationship clustering and enrichment analysis

To determine whether per gene expression-fitness curves were biologically meaningful, we clustered the bin medians (described above) for all essential genes in E. coli and B. subtilis into 9 clusters using k-means with 10,000 random restarts. Functional enrichment within clusters was calculated for COG categories, GO biological process terms, and KEGG terms using the hypergeometric test. Only enrichments with Bonferroni-corrected p-values < 0.05 are shown in Table S6.
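The enrichment test applied to each cluster can be sketched with the standard library alone. The sketch covers only the one-sided hypergeometric p-value and the Bonferroni correction, not the k-means clustering itself, and assumes that the correction multiplies each p-value by the number of tests performed. Function names are hypothetical.

```python
from math import comb

def hypergeom_pvalue(k, N, K, n):
    """P(X >= k): chance of at least k annotated genes in a cluster of size n,
    drawn from N genes of which K carry the annotation."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

For example, `hypergeom_pvalue(2, 4, 2, 2)` is 1/6, the chance that both members of a two-gene cluster carry an annotation held by two of four genes.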

Gene similarity comparisons

To determine whether the expression-fitness relationships of genes within COG categories, GO biological process, or KEGG categories were more similar to each other than to those of other genes we first calculated pairwise Euclidean distances between the expression-fitness relationships of all essential genes within each species. We then used a two-sided t-test to compare the distances between genes within each category to the distances between those genes and genes in different categories. We accounted for CRISPRi polarity due to operon structure by excluding any distances between genes within the same operon (defined as two genes in the same direction <50bp apart) from both the “inside category” and the “outside category” set.
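Assembling the two distance sets that enter the t-test can be sketched as follows. The sketch only collects the "inside category" and "outside category" Euclidean distances while skipping same-operon pairs; the t-test itself would be run on the two returned lists with a statistics library. The operon assignment is taken as a precomputed mapping, and all names are hypothetical.

```python
from itertools import combinations
from math import dist

def category_distances(curves, category, operon_of):
    """Split pairwise distances into within-category and category-vs-other sets.

    curves: dict gene -> list of bin medians (equal length, no missing values).
    category: set of genes belonging to the category being tested.
    operon_of: dict gene -> operon identifier (genes absent from it are unassigned).
    """
    inside, outside = [], []
    for a, b in combinations(sorted(curves), 2):
        op_a, op_b = operon_of.get(a), operon_of.get(b)
        if op_a is not None and op_a == op_b:
            continue  # exclude same-operon pairs to account for CRISPRi polarity
        in_a, in_b = a in category, b in category
        if in_a and in_b:
            inside.append(dist(curves[a], curves[b]))
        elif in_a != in_b:
            outside.append(dist(curves[a], curves[b]))
    return inside, outside
```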

To determine whether the expression-fitness relationships of homologous genes were more similar to each other than to those of other genes in the opposing organism, we calculated the pairwise Euclidean distance between the expression-fitness relationships of all essential genes that have essential homologs in both E. coli and B. subtilis (n = 150, as defined in Koo et al., 2017). We next used a two-sided t-test to determine if the distance between homologs was, on average, different from the overall distribution of distances between these 150 genes (i.e. when one gene from one species is compared to the 149 genes in the opposing species). To determine which pairs of homologs were significantly dissimilar, for each gene pair (including homologs), we calculated how many cross-species comparisons involving either gene were more similar than the comparison in question. We compared this number in homologs and nonhomologs to calculate an FDR.

Quantification and statistical analysis:

Statistical parameters, including r, R², and SD, are reported in the Figures, Figure Legends, or Supplementary Tables, as indicated and described in Method Details (above).

Supplementary Material

Table S1. Related to Figure 1.

FACS-seq data (enrichment ratios, training features for model, and relative gfp knockdown).

Table S2. Related to Figure 2.

Linear model parameters and weights from trained models.

Table S3. Related to Figure 3.

Fitness data for essential gene sgRNA guides from competitive pooled-growth experiments.

Table S4. Related to Figure 3.

Lysis phenotypes for B. subtilis and E. coli.

Table S5. Related to Figure 4.

Ontological annotations and phenotype clustering data for essential genes.

Table S6. Related to Figure 4.

Statistically significant functional enrichments from k = 9 k-means clustering in E. coli and B. subtilis.

Table S7. Related to Figure 4.

Table of expression-fitness relationships of essential genes in E. coli and B. subtilis displayed in Figure 4.

Table S8. Related to Figure 5.

Similarity of the expression-fitness relationship between essential homologous genes in E. coli and B. subtilis.

Table S9. Related to STAR Methods.

List of primers used in this study.

Table S10. Related to STAR Methods.

List of strains used in the study.

Table S11. Related to Figure 3, Figure 4, and Figure 5.

Table of values for the murAA-gfp transcriptional fusion validation experiments.

Table S12. Related to Figure 6.

Table containing the median fitness differences between essential gene depletion in the wildtype and the Δmpl and ΔampG backgrounds.

Highlights:

  • A universal bacterial system for titrating CRISPRi using partially mismatched sgRNAs

  • Determined expression-fitness relationships of E. coli and B. subtilis essentialome

  • Expression-fitness relationships are shared within pathways and between homologs

  • Shared homeostatic constraints underlie the optimization of essential gene expression

ACKNOWLEDGMENTS

We thank M. Kampmann, D. Santos, M. Horlbeck, A. Banta, G-W Li, and members of the J.S.W. and C.A.G. Labs for extensive helpful discussions, C. Lu, C. Liem, M. DeVera, and R. Pak for assistance with library cloning, J. Garabino for help with flow cytometry, and E. Chow, D. Bogdanoff, and K. Chaung from the UCSF Center for Advanced Technology for help with sequencing.

Funding: MRS was supported by the National Institutes of Health T32 GM007810a training grant and a National Science Foundation Graduate Research Fellowship. JSW is a Howard Hughes Medical Institute Investigator. This work was supported in part by the National Institutes of Health (F32 GM108222 and K22AI137122 to JMP; F32 GM116331 and K99 GM130964 to MJ; P50 GM102706, U01 CA168370, R01 DA036858, and RM1 HG009490 to JSW; R35 GM118061 to CAG) and the Innovative Genomics Institute, UC Berkeley (CAG).

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

DECLARATION OF INTERESTS

JSW and MJ have filed patent applications related to CRISPRi/a screening and mismatched sgRNAs in eukaryotic systems. JSW consults for and holds equity in KSQ Therapeutics, Maze Therapeutics, and Tenaya Therapeutics. JSW is a venture partner at 5AM Ventures. MJ consults for Maze Therapeutics.

REFERENCES:

  1. Alkan F, Wenzel A, Anthon C, Havgaard JH, and Gorodkin J (2018). CRISPR-Cas9 off-targeting assessment with nucleic acid duplex energy parameters. Genome Biol. 19, 177.
  2. Barreteau H, Kovac A, Boniface A, Sova M, Gobec S, and Blanot D (2008). Cytoplasmic steps of peptidoglycan biosynthesis. FEMS Microbiol. Rev. 32, 168–207.
  3. Bauer CR, Li S, and Siegal ML (2015). Essential gene disruptions reveal complex relationships between phenotypic robustness, pleiotropy, and fitness. Mol. Syst. Biol. 11, 773.
  4. Bhattacharyya S, Bershtein S, Yan J, Argun T, Gilson AI, Trauger SA, and Shakhnovich EI (2016). Transient protein-protein interactions perturb E. coli metabolome and cause gene dosage toxicity. Elife 5.
  5. Bobrovskyy M, and Vanderpool CK (2016). Diverse mechanisms of post-transcriptional repression by the small RNA regulator of glucose-phosphate stress. Mol. Microbiol 99, 254–273.
  6. Borkowski O, Goelzer A, Schaffer M, Calabre M, Mäder U, Aymerich S, Jules M, and Fromion V (2016). Translation elicits a growth rate-dependent, genome-wide, differential protein production in Bacillus subtilis. Mol. Syst. Biol 12, 870.
  7. Boy E, and Patte JC (1972). Multivalent repression of aspartic semialdehyde dehydrogenase in Escherichia coli K-12. J. Bacteriol 112, 84–92.
  8. Boyle EA, Andreasson JOL, Chircus LM, Sternberg SH, Wu MJ, Guegler CK, Doudna JA, and Greenleaf WJ (2017). High-throughput biochemical profiling reveals sequence determinants of dCas9 off-target binding and unbinding. Proc. Natl. Acad. Sci. U. S. A. 114, 5461–5466.
  9. Bustin SA, Benes V, Garson JA, Hellemans J, Huggett J, Kubista M, Mueller R, Nolan T, Pfaffl MW, Shipley GL, et al. (2009). The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin. Chem 55, 611–622.
  10. Calvo-Villamañán A, Ng JW, Planel R, Ménager H, Chen A, Cui L, and Bikard D (2020). On-target activity predictions enable improved CRISPR-dCas9 screens in bacteria. Nucleic Acids Res. 48, e64.
  11. Christodoulou D, Link H, Fuhrer T, Kochanowski K, Gerosa L, and Sauer U (2018). Reserve Flux Capacity in the Pentose Phosphate Pathway Enables Escherichia coli’s Rapid Response to Oxidative Stress. Cell Syst 6, 569–578.e7.
  12. Cui L, Vigouroux A, Rousset F, Varet H, Khanna V, and Bikard D (2018). A CRISPRi screen in E. coli reveals sequence-specific toxicity of dCas9. Nat. Commun. 9, 1912.
  13. Dai X, Zhu M, Warren M, Balakrishnan R, Patsalo V, Okano H, Williamson JR, Fredrick K, Wang Y-P, and Hwa T (2016). Reduction of translating ribosomes enables Escherichia coli to maintain elongation rates during slow growth. Nat Microbiol 2, 16231.
  14. Datsenko KA, and Wanner BL (2000). One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. U. S. A 97, 6640–6645.
  15. Dekel E, and Alon U (2005). Optimality and evolutionary tuning of the expression level of a protein. Nature 436, 588–592.
  16. Eames M, and Kortemme T (2012). Cost-benefit tradeoffs in engineered lac operons. Science 336, 911–915.
  17. Fransen F, Hermans K, Melchers MJB, Lagarde CCM, Meletiadis J, and Mouton JW (2017). Pharmacodynamics of fosfomycin against ESBL- and/or carbapenemase-producing Enterobacteriaceae. J. Antimicrob. Chemother 72, 3374–3381.
  18. Gilbert LA, Larson MH, Morsut L, Liu Z, Brar GA, Torres SE, Stern-Ginossar N, Brandman O, Whitehead EH, Doudna JA, et al. (2013). CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451.
  19. Gilbert LA, Horlbeck MA, Adamson B, Villalta JE, Chen Y, Whitehead EH, Guimaraes C, Panning B, Ploegh HL, Bassik MC, et al. (2014). Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation. Cell 159, 647–661.
  20. Gong S, Yu HH, Johnson KA, and Taylor DW (2018). DNA Unwinding Is the Primary Determinant of CRISPR-Cas9 Activity. Cell Rep. 22, 359–371.
  21. Harris LK, and Theriot JA (2016). Relative Rates of Surface and Volume Synthesis Set Bacterial Cell Size. Cell 165, 1479–1492.
  22. Hellemans J, Mortier G, De Paepe A, Speleman F, and Vandesompele J (2007). qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Genome Biol. 8, R19.
  23. Johnson EO, LaVerriere E, Office E, Stanley M, Meyer E, Kawate T, Gomez JE, Audette RE, Bandyopadhyay N, Betancourt N, et al. (2019). Large-scale chemical-genetics yields new M. tuberculosis inhibitor classes. Nature 571, 72–78.
  24. Johnson JW, Fisher JF, and Mobashery S (2013). Bacterial cell-wall recycling. Ann. N. Y. Acad. Sci 1277, 54–75.
  25. Jost M, Santos DA, Saunders RA, Horlbeck MA, Hawkins JS, Scaria SM, Norman TM, Hussmann JA, Liem CR, Gross CA, et al. (2020). Titrating gene expression using libraries of systematically attenuated CRISPR guide RNAs. Nat. Biotechnol
  26. Kahan FM, Kahan JS, Cassidy PJ, and Kropp H (1974). The mechanism of action of fosfomycin (phosphonomycin). Ann. N. Y. Acad. Sci 235, 364–386.
  27. Kampmann M, Bassik MC, and Weissman JS (2013). Integrated platform for genome-wide screening and construction of high-density genetic interaction maps in mammalian cells. Proc. Natl. Acad. Sci. U. S. A 110, E2317–E2326.
  28. Keren L, Hausser J, Lotan-Pompan M, Vainberg Slutskin I, Alisar H, Kaminski S, Weinberger A, Alon U, Milo R, and Segal E (2016). Massively Parallel Interrogation of the Effects of Gene Expression Levels on Fitness. Cell 166, 1282–1294.e18.
  29. Kim K, Jeong JH, Lim D, Hong Y, Yun M, Min J-J, Kwak S-J, and Choy HE (2013). A novel balanced-lethal host-vector system based on glmS. PLoS One 8, e60511.
  30. Koo B-M, Kritikos G, Farelli JD, Todor H, Tong K, Kimsey H, Wapinski I, Galardini M, Cabal A, Peters JM, et al. (2017). Construction and Analysis of Two Genome-Scale Deletion Libraries for Bacillus subtilis. Cell Syst 4, 291–305.e7.
  31. Lalanne J-B, Taggart JC, Guo MS, Herzel L, Schieler A, and Li G-W (2018). Evolutionary Convergence of Pathway-Specific Enzyme Expression Stoichiometry. Cell 173, 749–761.e38.
  32. Langmead B, Trapnell C, Pop M, and Salzberg SL (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25.
  33. Liu X, Gallay C, Kjos M, Domenech A, Slager J, van Kessel SP, Knoops K, Sorg RA, Zhang J-R, and Veening J-W (2017). High-throughput CRISPRi phenotyping identifies new essential genes in Streptococcus pneumoniae. Mol. Syst. Biol 13, 931.
  34. Lutz R, and Bujard H (1997). Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements. Nucleic Acids Res. 25, 1203–1210.
  35. McLennan N, and Masters M (1998). GroE is vital for cell-wall synthesis. Nature 392, 139.
  36. Mengin-Lecreulx D, and van Heijenoort J (1985). Effect of growth conditions on peptidoglycan content and cytoplasmic steps of its biosynthesis in Escherichia coli. J. Bacteriol 163, 208–212.
  37. Mengin-Lecreulx D, Texier L, Rousseau M, and van Heijenoort J (1991). The murG gene of Escherichia coli codes for the UDP-N-acetylglucosamine: N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase involved in the membrane steps of peptidoglycan synthesis. J. Bacteriol 173, 4625–4636.
  38. Nichols RJ, Sen S, Choo YJ, Beltrao P, Zietek M, Chaba R, Lee S, Kazmierczak KM, Lee KJ, Wong A, et al. (2011). Phenotypic landscape of a bacterial cell. Cell 144, 143–156.
  39. Nomura M, Yates JL, Dean D, and Post LE (1980). Feedback regulation of ribosomal protein gene expression in Escherichia coli: structural homology of ribosomal RNA and ribosomal protein mRNA. Proc. Natl. Acad. Sci. U. S. A 77, 7084–7088.
  40. Ogura T, Inoue K, Tatsuta T, Suzaki T, Karata K, Young K, Su LH, Fierke CA, Jackman JE, Raetz CR, et al. (1999). Balanced biosynthesis of major membrane components through regulated degradation of the committed enzyme of lipid A biosynthesis by the AAA protease FtsH (HflB) in Escherichia coli. Mol. Microbiol 31, 833–844.
  41. Peters JM, Colavin A, Shi H, Czarny TL, Larson MH, Wong S, Hawkins JS, Lu CHS, Koo B-M, Marta E, et al. (2016). A Comprehensive, CRISPR-based Functional Analysis of Essential Genes in Bacteria. Cell 165, 1493–1506.
  42. Peters JM, Koo B-M, Patino R, Heussler GE, Hearne CC, Qu J, Inclan YF, Hawkins JS, Lu CHS, Silvis MR, et al. (2019). Enabling genetic analysis of diverse bacteria with Mobile-CRISPRi. Nat Microbiol 4, 244–250.
  43. Qi LS, Larson MH, Gilbert LA, Doudna JA, Weissman JS, Arkin AP, and Lim WA (2013). Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183.
  44. Rancati G, Moffat J, Typas A, and Pavelka N (2018). Emerging and evolving concepts in gene essentiality. Nat. Rev. Genet 19, 34–49.
  45. Rauch BJ, Silvis MR, Hultquist JF, Waters CS, McGregor MJ, Krogan NJ, and Bondy-Denomy J (2017). Inhibition of CRISPR-Cas9 with Bacteriophage Proteins. Cell 168, 150–158.e10.
  46. Reis AC, Halper SM, Vezeau GE, Cetnar DP, Hossain A, Clauer PR, and Salis HM (2019). Simultaneous repression of multiple bacterial genes using nonrepetitive extra-long sgRNA arrays. Nat. Biotechnol
  47. Rest JS, Morales CM, Waldron JB, Opulente DA, Fisher J, Moon S, Bullaughey K, Carey LB, and Dedousis D (2013). Nonlinear fitness consequences of variation in expression level of a eukaryotic gene. Mol. Biol. Evol 30, 448–456.
  48. Reverend BD-L, Boitel M, Deschamps AM, Lebeault J-M, Sano K, Takinami K, and Patte J-C (1982). Improvement of Escherichia coli strains overproducing lysine using recombinant DNA techniques. European Journal of Applied Microbiology and Biotechnology 15, 227–231.
  49. Rock JM, Hopkins FF, Chavez A, Diallo M, Chase MR, Gerrick ER, Pritchard JR, Church GM, Rubin EJ, Sassetti CM, et al. (2017). Programmable transcriptional repression in mycobacteria using an orthogonal CRISPR interference platform. Nat Microbiol 2, 16274.
  50. Rodionov DA, Vitreschak AG, Mironov AA, and Gelfand MS (2003). Regulation of lysine biosynthesis and transport genes in bacteria: yet another RNA riboswitch? Nucleic Acids Res. 31, 6748–6757.
  51. Rousset F, Cui L, Siouve E, Becavin C, Depardieu F, and Bikard D (2018). Genome-wide CRISPR-dCas9 screens in E. coli identify essential genes and phage host factors. PLoS Genet. 14, e1007749.
  52. Schaechter M, Maaloe O, and Kjeldgaard NO (1958). Dependency on medium and temperature of cell size and chemical composition during balanced growth of Salmonella typhimurium. J. Gen. Microbiol 19, 592–606.
  53. Scott M, Gunderson CW, Mateescu EM, Zhang Z, and Hwa T (2010). Interdependence of cell growth and gene expression: origins and consequences. Science 330, 1099–1102.
  54. Scott M, Klumpp S, Mateescu EM, and Hwa T (2014). Emergence of robust growth laws from optimal regulation of ribosome synthesis. Mol. Syst. Biol 10, 747.
  55. Thomason LC, Costantino N, and Court DL (2007). E. coli Genome Manipulation by P1 Transduction. Curr. Protoc. Mol. Biol 2, 1.17.1–1.17.8.
  56. Thomason LC, Sawitzke JA, Li X, Costantino N, and Court DL (2014). Recombineering: Genetic Engineering in Bacteria Using Homologous Recombination. Curr. Protoc. Mol. Biol 47, 1.16.1–1.16.39.
  57. Urban JH, and Vogel J (2008). Two seemingly homologous noncoding RNAs act hierarchically to activate glmS mRNA translation. PLoS Biol. 6, e64.
  58. Vigouroux A, Oldewurtel E, Cui L, Bikard D, and van Teeffelen S (2018). Tuning dCas9’s ability to block transcription enables robust, noiseless knockdown of bacterial genes. Mol. Syst. Biol 14, e7899.
  59. Wang T, Guan C, Guo J, Liu B, Wu Y, Xie Z, Zhang C, and Xing X-H (2018). Pooled CRISPR interference screening enables genome-scale functional genomics study in bacteria with superior performance. Nat. Commun 9, 2475.
  60. Zheng Y, Struck DK, Bernhardt TG, and Young R (2008). Genetic analysis of MraY inhibition by the phiX174 protein E. Genetics 180, 1459–1466.

Data Availability Statement

  • Source data:
    • All raw sequencing data is deposited in the Short Read Archive under accession PRJNA574461.
    • All analyzed fitness data are available in the paper’s Supplemental Information as indicated.
    • All analyzed FACS-seq data are available in the paper’s Supplemental Information as indicated.
  • Code:
    • All custom analysis scripts referenced here and in the Key Resources Table are publicly available.
  • Scripts:
    • The scripts used to generate the figures reported in this paper involved using the “plot” and “image” functions available in the R software package, version 3.5+, available at https://www.r-project.org/, to plot data reported in the Supplemental Information, as described in the figure legends and STAR Methods.
    • Scripts were not used to generate the diagrams reported in this paper, which were manually constructed using Adobe Illustrator.
  • Any additional information required to reproduce this work is available from the Lead Contact.

KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Chemicals, Peptides, and Recombinant Proteins
Lysogeny broth (LB), Lennox | Fisher Scientific | Cat# BP1427-2
Bacillus subtilis MC medium | Koo et al., 2017 | N/A
Bacillus subtilis competence medium | Koo et al., 2017 | N/A
IPTG | Denville Scientific | Cat# C18280-13
Xylose | |
Ampicillin sodium salt | Sigma-Aldrich | Cat# A9518
Kanamycin sulfate | Sigma-Aldrich | Cat# K1377
Erythromycin | Sigma-Aldrich | Cat# E5389
Spectinomycin dihydrochloride pentahydrate | Sigma-Aldrich | Cat# S9007
Chloramphenicol | Sigma-Aldrich | Cat# C0378
Carbenicillin | Millipore-Sigma | Cat# 205805
Gentamicin sodium salt | Fisher Scientific | Cat# AAJ1605103
Trimethoprim | Sigma-Aldrich | Cat# T7883-5G
Q5 High-Fidelity DNA polymerase | New England Biolabs | Cat# M0493S
HiFi Assembly | New England Biolabs | Cat# E2621L
BsaI-HFv2 | New England Biolabs | Cat# R3733
T4 DNA Ligase | New England Biolabs | Cat# M0202L

Critical Commercial Assays
DNeasy Blood & Tissue Kit | Qiagen | Cat# 69506
Midiprep Kit | Qiagen | Cat# 12143
QIAprep Spin miniprep kit | Qiagen | Cat# 27106

Deposited Data
Raw sequencing data (FASTQs) for relative fitness experiments and FACS-seq experiments | This study | SRA: PRJNA574461

Experimental Models: Organisms/Strains
Bacillus subtilis 168 | BGSC | 1A1
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm) | Peters et al., 2016 | CAG74209
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), amyE∷Pveg-sgRNA(cat) (CRISPRi libraries: sgRNA spacers listed in Table S3) | This study | N/A
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-gfp(Spc) | This study | CAG78920
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-gfp(Spc), pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-rfp(Spc) | This study | CAG78921
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), thrC∷Pveg-rfp(Spc), pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), sacA∷Pveg-rfp | This study | CAG78922
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), sacA∷Pveg-rfp, murAA-gfp(Kan) | This study | CAG78923
Bacillus subtilis 168 lacA∷Pxyl-dcas9(Erm), sacA∷Pveg-rfp, murAA-gfp(Kan), thrC∷Pveg-murAA*(Spc) | This study | CAG78924
Escherichia coli BW25113 | Baba et al., 2006 | N/A
Escherichia coli BW25113 Tn7att∷PlLac-O1-dcas9(Gent) | This study | CAG78830
Escherichia coli BW25113 Tn7att∷PlLac-O1-dcas9(Gent), pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S3) | This study | N/A
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-gfp(Cat):yjaB | This study | CAG78108
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-gfp(Cat):yjaB, pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-rfp(Cat):yjaB | This study | CAG78107
Escherichia coli BW25113 Tn7att∷PBBa_J23105-dcas9(Gent), yjaA:Pveg-rfp(Cat):yjaB, pJSHA77 (CRISPRi libraries: sgRNA spacers listed in Table S1) | This study | N/A
10-beta Electrocompetent Escherichia coli | New England Biolabs | Cat# C3020K

Oligonucleotides
Primers used in this study are listed in Table S9 | This study | N/A

Recombinant DNA
pDG1731 | Radeck et al., 2013 | pBS4S (Addgene# 55170)
pDG1731-gfp | This study | N/A
pDG1731-rfp | This study | N/A
pDG1622 | BGSC | ECE119
pJSHA77 | This study | N/A
pJSHA77-rfp | This study | N/A
pJSHA77-gfp | This study | N/A

Software and Algorithms
Bowtie2 | Langmead and Salzberg, 2012 | http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
sgRNA design (fully matched sgRNAs) | This study | https://github.com/traeki/sgrna_design
Linear model training (train_linear_model.py) | This study | https://github.com/traeki/mismatch_crispri
Design a subset of mismatch sgRNA (choose_guides.py) | This study | https://github.com/traeki/mismatch_crispri
FASTQ analysis to calculate sgRNA abundance and relative fitness (count_guides.py, compute_gammas.py, gamma_to_relfit.py) | This study | https://github.com/traeki/mismatch_crispri
FlowJo v10 | FlowJo, LLC |
