Nucleic Acids Research. 2022 Oct 22;50(19):10964–10980. doi: 10.1093/nar/gkac914

Molecular basis for lethal cross-talk between two unrelated bacterial transcription factors - the regulatory protein of a restriction-modification system and the repressor of a defective prophage

Aleksandra Wisniewska 1, Ewa Wons 2, Katarzyna Potrykus 3, Rebecca Hinrichs 4,5, Katarzyna Gucwa 6, Peter L Graumann 7,8, Iwona Mruk 9
PMCID: PMC9638921  PMID: 36271797

Abstract

Bacterial gene expression depends on the efficient functioning of global transcriptional networks; however, their interconnectivity and orchestration rely mainly on the action of individual DNA binding proteins called transcription factors (TFs). TFs interact not only with their specific target sites, but also with secondary (off-target) sites, and vary in their promiscuity. It is not yet clear what mechanisms govern the interactions with secondary sites, and how such rewiring affects the overall regulatory network, but this could clearly constrain horizontal gene transfer. Here, we show the molecular mechanism of one such off-target interaction between two unrelated TFs in Escherichia coli: the C regulatory protein of a Type II restriction-modification system, and the RacR repressor of a defective prophage. We reveal that the C protein interferes with RacR repressor expression, resulting in derepression of the toxic YdaT protein. These results also provide novel insights into regulation of the racR-ydaST operon. We mapped the C regulator interaction to a specific off-target site, and also visualized C protein dynamics, revealing intriguing differences in single molecule dynamics in different genetic contexts. Our results demonstrate an apparent example of horizontal gene transfer leading to adventitious TF cross-talk with negative effects on the recipient's viability. More broadly, this study represents an experimentally accessible model of a regulatory constraint on horizontal gene transfer.

INTRODUCTION

Control of bacterial gene expression is one of the most intriguing examples of biological plasticity in response to changing environmental conditions (1,2). It is mediated by multiple combinatorial entities, including DNA sequence and topology, proteins, modifications of those proteins via co-regulatory molecules or post-translational modification, and non-coding RNAs. Complex regulatory networks coordinate communication between individual genes or their groups to exert certain cascades of signals (3). The interconnectivity of these networks relies mainly on small DNA binding proteins called transcription factors (TFs), which interact with their DNA target sites (TSs) to modulate gene expression over a wide range, from a complete shut-down to full gene induction (4). The action of most TFs is directed at the stage of transcription initiation, where repressors prevent binding and progression of RNA polymerase by steric hindrance, while activators stimulate gene expression, often by stabilizing association of RNA polymerase with promoter sites or facilitating promoter DNA opening (5,6). The characteristic mode of action of each TF is adjusted to operate either less specifically and affect large groups of genes (regulons) like global TFs, or to be highly specific and regulate a single gene or operon (dedicated gene target, local regulator) (7). TF action is determined in part by diffusion on DNA to reach their specific sequence recognition site (TS). However, only a small fraction of TFs have their TSs comprehensively characterized (8). In addition, most TFs also recognize secondary TSs (off-target sites), making the cellular regulatory landscape even more complex, and sometimes noisy (9). It is not yet clear to what extent regulatory interactions between the regulons of certain TFs overlap with one another within the cell's regulatory network.
For example, it is challenging to predict whether two TFs within distinct regulatory networks are neutral with respect to one another, or if they exhibit positive or negative synergies (10). Some multilayered regulatory networks are clearly subject to regulatory cross-talk, a situation where one TF could affect the regulatory loop of another TF (11–13).

The TFs comprise a fairly high fraction of the entire gene set in Escherichia coli K-12 (about 8%), which may facilitate better adjustment in regulation of genetic networks (14). In addition, TFs are themselves the main drivers of gene network flexibility, which allows gene expression circuits to evolve much more rapidly than the cell's overall genetic content (15). In contrast to E. coli and other free-living bacteria, endosymbiotic microorganisms tend to have relatively low TF gene content, perhaps reflecting reduced need to adjust their gene expression to a changing environment. For example, Rickettsia prowazekii has only nine TFs (16), while E. coli K-12 has nearly 300 (17).

In a recent report, we studied the TF-controlled Csp231I restriction-modification (R-M) system, during transfer to a new host, in real time (18,19). The C regulatory protein (TF) precisely controls expression of the two enzymes comprising R-M systems: restriction endonuclease (REase) and DNA methyltransferase (MTase) (20–27). This is especially crucial during the operon transfer to a new host, when C protein acts to delay REase expression so the MTase has time to completely modify the genome to prevent its damage due to REase cutting (28–30). Even though the Csp231I R-M system is located on the chromosome, when placed on a plasmid it disseminates easily among members of the Enterobacteriaceae, which naturally exchange their genetic material via conjugation/mobilization and transduction (28,31). Using a combination of genetics and transcriptomics we observed that, due to this TF’s action, the host's genetic network expression was changed, affecting cell fitness, as manifested by cell filamentation (Figure 1). Our results indicated that the primary source of cell defects was interference by the introduced TF (C regulator) with the host's network controlled by RacR, the essential regulator of the cryptic Rac prophage (32–34). We also established that the C protein exerted transcriptional cross-talk via off-target DNA binding, rather than by action at the native site (target) within its own promoter. By cross-talk, we mean here the adventitious cross-reaction of a TF with an unrelated (off-target, non-cognate) DNA site, sometimes called TF promiscuity (13). Thus, the main objective of the present report was to further determine the molecular basis of the C protein interaction with its off-target DNA site.
We also aimed to test our hypothesis that the C protein silences the RacR repressor, which in turn triggers the toxicity of the YdaS-YdaT operon, whose function is as yet unknown (Figure 1) but which closely resembles the immunity regions of lambdoid phages in terms of gene organization, function and regulatory properties (35,36). We also dissected differences in single molecule dynamics for C protein in the presence or absence of target and off-target binding sites. We observed detectable, but surprisingly small differences in vivo, for C protein mobility in the presence of both, only one, or no binding sites. These findings confirm specificity of both interactions, but also show that mobility of this transcription factor is largely governed by non-specific DNA binding.

Figure 1.

Model for a horizontal gene transfer (HGT) event of the TF-linked R-M operon into a new host. If an entering foreign TF (red) interferes with one or more host TFs (blue), there may be negative consequences for the recipient cell, up to and including lethality. In our model, the C regulatory protein from an R-M system participates in cross-talk with the RacR repressor, from the defective E. coli Rac prophage. When the C protein is not present (upper panel), RacR repressor binds within the intergenic region between the racR and ydaS genes in the prophage. Expression of the ydaS and ydaT genes is blocked and no toxic cell effects are observed (regular cell morphology, upper right box). However, when C is expressed, it binds not only to its own promoter, but also to its off-target site within the racR gene (bottom panel). This leads to inhibition of racR expression, which in turn de-represses ydaST expression. YdaS and YdaT synthesis triggers cell division or other defects and abnormal cell filamentation occurs (bottom right box), which overall leads to loss of cell viability and/or fitness.

MATERIALS AND METHODS

Strains, plasmids and oligonucleotides

E. coli K-12 strains used in this study are described in Supplementary Table S1 of Supplementary Materials. E. coli Δrac strain was used to carry out in vivo assays with plasmids harboring the ydaS-racR operon (37). The source of genes of Csp231I R-M system was Citrobacter sp. RFL231 (19,38). The plasmids used are listed in Supplementary Table S2, the oligonucleotides/DNA fragments used are shown in Supplementary Table S3, Supplementary Data.

Effect of TFs delivered in trans on racR expression

The genes for selected TFs were cloned downstream of the arabinose-inducible ParaBAD promoter in vector pBAD33 (39) (Supplementary Table S2). A second compatible plasmid carried the racR gene with its natural promoter/operator sequence fused to a reporter gene (lacZ). Arabinose induction experiments were performed in M9-minimal medium with 0.2% glycerol as the carbon source (40). Briefly, single colonies were used to inoculate overnight cultures in M9 media supplemented with appropriate antibiotics. The cells were then resuspended and divided among flasks containing M9-minimal media with varied concentrations of L-arabinose. After about 5 h of subsequent cultivation, ONPG hydrolysis was measured and Miller units were calculated as previously described (23,41). The ONPG assay was done in the E. coli K-12 Δrac strain (37), so that genomic presence of the Rac repressor did not affect expression quantitation. Details on plasmid cloning and features are outlined in the Supplementary Data (Supplementary Table S2).
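The Miller-unit calculation referred to above can be illustrated with a minimal sketch. It assumes the classic Miller formula, 1000 × (OD420 − 1.75 × OD550) / (t × v × OD600); the exact variant used in refs (23,41) is not restated in this section. The function names and the readings below are hypothetical and are not data from this study.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Classic Miller formula for beta-galactosidase specific activity.

    od420: absorbance of the reaction (o-nitrophenol plus light scattering)
    od550: absorbance used to correct for scattering by cell debris
    od600: culture density at the time of sampling
    minutes: ONPG reaction time; volume_ml: culture volume assayed
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml and od600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def fold_repression(unrepressed, repressed):
    """Ratio of reporter activity without vs. with a functional repressor."""
    if repressed <= 0:
        raise ValueError("repressed activity must be positive")
    return unrepressed / repressed


# Hypothetical readings, for illustration only.
example = miller_units(od420=0.90, od550=0.02, od600=0.50,
                       minutes=10, volume_ml=0.1)
print(round(example, 1))
```

Comparing the values obtained with a functional and a non-functional TF variant at the same arabinose concentration then gives a fold repression of the kind quoted in the Results.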

Overproduction and purification of the C protein and RacR repressor

For the C-terminally-His-tagged C protein (WT, or Cmut with A33G R34E Q37A, unable to bind its target site), as well as the RacR repressor, overproduction and purification were performed as before (19), except that the purified proteins were resuspended in Tris buffer: 10 mM Tris–HCl (pH 8.0) and 1 mM EDTA (Supplementary Table S2, Supplementary Figure S1).

Electrophoretic mobility shift assays (EMSA)

DNA substrates (271 bp) were PCR-amplified using a Cy5-labeled primer at one end (Supplementary Table S3 of Supplementary Data). They covered the entire racR promoter/operator/intergenic region (IGR) and part of its coding sequence (racR), as well as the ydaS gene; i.e. region −186 to +85 bp, where +1 refers to the racR translational start site. Reactions containing 25 nM DNA and the indicated purified C protein/RacR repressor concentrations were prepared in a binding buffer [10 mM Tris–HCl (pH 8.0), 50 mM NaCl, 10 mM MgCl2, 1 μg of poly(dIdC) as a non-specific competitor] in a final volume of 20 μl, and incubated for 20 min at 22°C. Reactions for competitive EMSA experiments were prepared in the same way, with the first protein incubated for 20 min at identical concentrations, followed by addition of increasing concentrations of the second protein and a further 20 min incubation. DNA fragments used as substrates run as two specific bands due to the presence of many inverted repeats. Samples were electrophoresed on 5% native polyacrylamide gels in 0.5×TBE buffer at 22°C or on 1% agarose gels. Detection of the Cy5-labeled DNA was performed using the Typhoon 9200 variable mode imager (Molecular Dynamics, USA).

DNaseI footprinting assays

Protein-DNA binding reactions were performed with 20 nM DNA in 20 μl of the following buffer: 10 mM Tris–HCl (pH 8.0), 10 mM MgCl2, 50 mM NaCl. DNA substrates were PCR-amplified using a Cy5-labeled primer at one end (Supplementary Table S3 of Supplementary Data). For C protein footprints, two DNA fragments were used. One was 271 bp, covering the region from −186 to +85 bp, and the second was 175 bp (−90 to +85 bp, where +1 refers to the racR translational start site). For RacR footprints, the 271-bp DNA fragments were used (−186 to +85 bp) (Supplementary Table S3 in Supplementary Data). For reactions, the following protein concentrations were used: CWT or Cmut at 0, 264, 378, 422, 528 nM, and RacR at 0, 33, 231, 264 and 300 nM. The samples were incubated at 22°C for 20 min, followed by addition of DNaseI (0.075 U; Eurx, Gdansk, Poland) and further incubation at 22°C for 4 min. Reactions were terminated by addition of EDTA to 25 mM, pH 8.0, and concentrated by vacuum evaporation. Next, the samples were resuspended in 40 μl of loading solution (30% formamide, 6 M urea, 10 mM NaOH), denatured at 100°C for 2 min, and loaded (7 μl) on 6 M urea, 8% acrylamide gels along with sequencing reactions (5′-Cy5-labeled primers) according to manufacturer's recommendations (ddNTPs from Jena Bioscience, Jena, Germany and TERMIPol DNA Polymerase for efficient incorporation of ddNTPs from Solis BioDyne). K lanes indicate control reactions (DNA substrate with buffer and DNaseI) that were incubated for a shorter time than the other samples, in which additional time was allowed for protein binding to DNA. The Typhoon imaging system (GE Healthcare) was used for gel scanning and documentation.
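The fragment coordinates above are given relative to the racR translational start site (+1). The stated lengths (271 bp for −186 to +85, and 175 bp for −90 to +85) are consistent with a numbering convention that has no position 0. As a sketch under that assumption, with hypothetical helper names, the coordinate bookkeeping can be checked as follows:

```python
def fragment_length(start, end):
    """Length of a fragment in 'no zero' coordinates (..., -2, -1, +1, +2, ...)."""
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this convention")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the span crosses the -1/+1 boundary; skip the missing 0
    return length


def to_index(position, fragment_start):
    """0-based offset of a coordinate within a fragment starting at fragment_start."""
    return fragment_length(fragment_start, position) - 1


print(fragment_length(-186, 85))  # long footprinting/EMSA substrate
print(fragment_length(-90, 85))   # short footprinting substrate
print(to_index(1, -186))          # offset of the first base of the start codon
```

The same helper converts any position quoted relative to the start codon into an offset within the amplified fragment, which is convenient when aligning footprints with the sequencing ladder.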

RNA isolation, determination of transcription start points and promoter activity

E. coli MC1061 cells carrying the plex3bracRlong (for racR) or plex3BracRydaSinaktyracR (for ydaS) plasmid were cultivated to OD600 = 0.4, and pellets from 15 ml of each bacterial culture were collected. These pellets were suspended in 100 μl of TE with lysozyme (0.1 mg/ml) and subjected to two cycles of immediate freezing at −80°C and thawing, after which 1 ml of TRI reagent (Merck) was added to ensure complete bacterial lysis. Next, 0.2 ml of chloroform was added, the tubes were vortexed, incubated for 5 min at RT and centrifuged (12 000 g, 10 min, 4°C). The top aqueous phase (0.6 ml) was collected and total RNA was precipitated with 0.5 ml isopropanol (10 min RT), centrifuged (12 000 g, 10 min, 4°C), washed with 70% ethanol and dried. RNA was suspended in 30 μl of DEPC-water and its concentration was assessed spectrophotometrically.

Transcription start points of the racR and ydaS genes were determined by primer extension (40). The mixture (12.5 μl) containing 5 pmol of the appropriate Cy5-labeled primer and 10 μg total RNA was incubated for 10 min at 75°C, and then 20 min at 58°C. In the next step, preheated buffer (final concentration: 50 mM Tris–HCl (pH 8.3), 50 mM KCl, 4 mM MgCl2, 10 mM DTT), 1 mM (final concentration) of each dNTP, 10 U of RiboLock RNase Inhibitor and 200 U of RevertAid H Minus Reverse Transcriptase (Fermentas) were added to a final volume of 20 μl, and the samples were incubated at 42°C for 1 h, followed by RNA digestion with 0.25 U RNase H (EurX) (15 min, 37°C). Finally, the cDNA was precipitated overnight at −20°C with 300 μl of absolute ethanol and 40 μl of 3 M sodium acetate (pH 5.5), centrifuged (12 000 g, 1 h, 4°C), dried and suspended in 10 μl of water and 2 μl of loading buffer (80% formamide, 6 M urea, 10 mM NaOH). Sequencing reactions were performed in parallel on DNA templates using the same labeled primers. The samples were resolved on 8% acrylamide (acrylamide:bis ratio 19:1) – 7 M urea 1×TBE gels (Sambrook1986) and visualized with Typhoon 9200 variable mode imager (Molecular Dynamics, USA). Promoter activity, or its loss upon substitution, was confirmed by ONPG assay (40).

In vitro transcription

Single round in vitro transcription was performed on linear templates with the PydaS–PracR region spanning −184 to +113 bp, relative to the RacR translation start site. The expected transcript lengths were 81 nt, 111 nt, 145 nt and 154 nt for the PydaS, PracR3, PracR2 and PracR1 promoters, respectively. The following conditions were used: 10 nM template, 30 nM RNAP holoenzyme (New England Biolabs), 75 mM potassium glutamate, 50 mM Tris-acetate (pH 8.0), 10 mM magnesium acetate, 10 mM β-ME, 0.01 mg/ml BSA, 100 μM ATP, CTP and GTP, 10 μM UTP (10 μCi/reaction [α-32P]UTP). The DNA template was pre-incubated with either RNAP or CWT/Cmut/RacR for 10 min at 37°C; the other component (CWT/Cmut/RacR or RNAP, respectively) was then added and incubation continued for a further 10 min. Transcription was initiated by addition of NTPs and heparin (100 μg/ml, final). The reactions (final volume: 20 μl) were carried out for 10 min at 37°C, terminated by addition of an equal volume of stop solution (95% formamide, 20 mM EDTA, 0.05% bromphenol blue, and 0.05% xylene cyanol), run on 7 M urea 8% polyacrylamide sequencing gels in 1×TBE, and quantified by phosphorimaging with a Typhoon imaging system.

Single-molecule tracking (SMT) and the C protein dynamics

The C protein gene was fused to the mVenus fluorescent protein gene, and tested for use in SMT technology (Supplementary Figures S4 and S5 of Supplementary Materials). The SMT setup we used is explained in (28,42). Briefly, the central part of a 514 nm laser beam was used for stream acquisition (20 ms integration time) of mVenus fusions, and movies were captured by a Hamamatsu ImageEM EMCCD camera (128×128 pixel area of chip used). About 160 W cm−2 was applied onto the image plane. Protein fusions were expressed at very low levels, such that the single molecule level was reached after a few frames of bleaching of most molecules present in the cell, followed by single step bleaching of the fluorophores. Expressing very few molecules also avoids localization artefacts due to overproduction of the respective protein. The images were taken for an average of 8 time intervals, with about 10% of tracks being longer than 10 steps. Only tracks of 5 or more steps were included in the analyses. Data analyses were done using the SMTracker 2.0 program (43).
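To make the track selection and the notion of mobility concrete, here is a minimal sketch of how tracks of at least 5 steps could be selected and converted into an apparent diffusion coefficient. This is not the SMTracker 2.0 implementation. It assumes free two-dimensional diffusion (D = MSD / 4Δt), a frame interval equal to the 20 ms integration time, positions already converted to micrometres, and that a track of n steps contains n + 1 localizations. All function names and coordinates are hypothetical.

```python
def filter_tracks(tracks, min_steps=5):
    """Keep tracks with at least min_steps displacements (min_steps + 1 points)."""
    return [t for t in tracks if len(t) - 1 >= min_steps]


def mean_squared_step(tracks):
    """Mean squared displacement over one frame interval, pooled over all tracks."""
    squared = [
        (x1 - x0) ** 2 + (y1 - y0) ** 2
        for track in tracks
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]
    if not squared:
        raise ValueError("no steps to average")
    return sum(squared) / len(squared)


def apparent_diffusion(msd_um2, dt_s=0.020):
    """Apparent D (um^2/s) for free 2D diffusion; localization error is ignored."""
    return msd_um2 / (4.0 * dt_s)


# Synthetic tracks (x, y in micrometres), for illustration only.
long_track = [(0.1 * i, 0.0) for i in range(6)]      # 5 steps of 0.1 um each
short_track = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]   # 2 steps, filtered out
kept = filter_tracks([long_track, short_track])
d_app = apparent_diffusion(mean_squared_step(kept))
print(len(kept), round(d_app, 3))
```

A single pooled value like this hides heterogeneity; dedicated software typically fits step-size distributions to distinguish populations with different mobility, which this sketch does not attempt.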

RESULTS

Transcription factors and their effect on YdaST-RacR operon expression

We showed previously that operon transfer of the Csp231I R-M system, carrying the C regulator as TF, heavily affected the expression of several genes within the rac cryptic prophage region. Dissection of detailed effects indicated that the C protein directly affected expression solely of the RacR repressor - the essential master regulator of the Rac prophage (18). To pursue the molecular basis of such cross-talk, here we first tested selected TFs related to the YdaST-RacR operon, as well as the C protein for comparison, to establish their potential activation/repression effects on this operon's expression and their relationships in in vivo assays. Our experimental genetic system was based on two compatible plasmids, one of which produced a linear gradient of the selected TF in a controlled fashion, using the arabinose PBAD promoter, while the second plasmid carried the lacZ reporter gene fused to the racR and ydaS gene fragments preceded by the intergenic region (IGR) encompassing their promoters and putative TF target sites on both DNA strands (Figure 2). Based on genetic and structural analysis (our unpublished data and (36)), we predicted that YdaS is also a TF. Its autoregulatory properties and effect on racR expression could also be tested quantitatively in the above described assay. The results reveal, as expected, that WT RacR can inhibit its own transcription from the racR promoter, yielding about a 3-fold decrease at higher TF concentration, as compared to the promoter activity in the absence of a functional RacR repressor (RacRmut) (Figure 2A). Interestingly, the profile of racR gene expression for WT RacR repressor and the unrelated C protein regulator are comparable, as the C protein is able to repress racR expression efficiently, to the same extent as the natural RacR repressor. Inactive C protein (Cmut – A33G R34E Q37A unable to bind its natural target site (19)) is also deficient in repressing the racR promoter. 
In addition, it seems that YdaS is not involved in RacR expression control, as a similar overall effect was observed on promoter activity over the gradient of TF concentrations for YdaS WT, as well as its inactive variant YdaSmut (K37A R40A) (Figure 2A).

Figure 2.

C protein cross-talk in the context of TF effects on the racR-ydaST operon. The in vivo effect of the C protein and other TFs on expression of racR (A) and ydaS (B). In this experimental system, the E. coli K-12 Δrac strain cells contained two compatible plasmids: one with the racR or ydaS promoter fused in the same open reading frame with the lacZ reporter gene; the second plasmid delivered a wild-type or mutated selected TF under the inducible ParaBAD promoter. TF transcriptional profiles were measured as LacZ specific activity (Miller units), at increasing TF concentrations from nearly 0 (in glucose) to the highest concentration (in 0.1% arabinose). The TFs are represented as follows: C regulator (circles – WT dark orange, Cmut light orange); YdaS (squares – WT dark green, YdaSmut light green); RacR (diamonds – WT dark blue, RacRmut light blue). Values are averages (± SD) of at least three independent measurements. The trend lines are also shown to facilitate visualization of TF effects.

The same set of TFs was used in an identical approach to test the ydaST expression from the IGR region oriented in reverse direction to the racR expression. Strikingly, the RacR TF was the most efficient repressor, able to almost completely inhibit ydaST expression at its highest concentration, whereas YdaS only reduced its own expression to 80% of its activity at the same concentration (0.001% arabinose) (Figure 2B). In contrast, increasing C protein concentrations had no effect on ydaST expression, indicating its indirect role in this operon's regulation via the RacR repressor.

C protein and RacR repressors bind to the racR region

We next tested if the observed in vivo effects on transcription were due to direct C protein and RacR binding to the racR region, by conducting EMSAs. As a substrate, we used DNA fragments corresponding to −186 to +85 bp with respect to the annotated racR translation start codon (A of ATG). The EMSA reactions were performed with the same amount of Cy5 end-labeled dsDNA (25 nM) and increasing concentrations of purified repressors. The natural target site for C protein is a C-box, which was established experimentally and by sequence comparison of C.Csp231I-like proteins (19). The C-box is formed by the two inverted repeats of the CTAAG-n5-CTTAG sequence separated by an extended AT-rich spacer located in the C protein gene promoter (19,44,45). For the C protein assays with off-target sites, such as here in the rac region, we expected weak binding, and thus as a control we also used its defective variant (Cmut: A33G, R34E, Q37A) with impaired helix 3 involved in specific DNA recognition, as tested previously (19). The shift in DNA-protein migration was observed only for the wild type C protein (CWT), but not Cmut. The retarded complexes for CWT appeared rather smeared, and at higher C protein concentration the DNA substrate band was invisible (Figure 3A). In contrast, the RacR repressor, which specifically interacted with its target site, showed distinct retardation complexes at low (≤50 nM) protein concentrations, indicating strong and specific binding (Figure 3B).

Figure 3.

The C protein and RacR bind to the rac region. The 271 bp substrate DNA covered the entire racR promoter/operator/ydaS intergenic region (IGR) and part of the racR coding sequence (positions −186 to +85 bp, in reference to the + 1 translation start site of racR). DNA was amplified by PCR with one primer introducing the Cy5-label at the 5′ end. Each binding reaction was carried out with the same amount of DNA (25 nM) and increasing concentrations of the proteins: the C protein and its defective variant Cmut (0; 500; 650; 800; 1000; 1150; 1300; 1500; 1750 nM) (A) and RacR (0; 27; 54; 81; 108; 135; 162; 189; 243; 300 nM) (B). For competitive EMSA, the same substrate DNA was pre-bound with one protein, as indicated, followed by addition of the second protein under the same reaction conditions (C). The reactions were assembled and processed as outlined in Materials and Methods, resolved on 5% native polyacrylamide gels, and visualized by Typhoon scanner (GE Healthcare). Open and filled arrows denote positions of unbound DNA and shifted DNA–protein complexes, respectively.
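The contrast between the two titration series is easier to appreciate as molar excess over the DNA substrate. The short sketch below uses only numbers given in the text (25 nM DNA, 20 μl reactions, and the protein concentrations of panels A and B); the helper names are hypothetical, and the ratios refer to nominal protein concentrations, without any assumption about the oligomeric state that binds DNA.

```python
def pmol_in_reaction(conc_nM, volume_ul):
    """Amount of a component in pmol (nM x ul gives fmol; divide by 1000)."""
    return conc_nM * volume_ul / 1000.0


def molar_ratio(protein_nM, dna_nM=25.0):
    """Molar excess of protein over DNA at the stated concentrations."""
    return protein_nM / dna_nM


# Non-zero concentrations from the Figure 3 legend (nM).
c_series = [500, 650, 800, 1000, 1150, 1300, 1500, 1750]
racr_series = [27, 54, 81, 108, 135, 162, 189, 243, 300]

print(pmol_in_reaction(25, 20))  # DNA per 20 ul reaction, in pmol
print([molar_ratio(c) for c in (c_series[0], c_series[-1])])
print([round(molar_ratio(c), 2) for c in (racr_series[0], racr_series[-1])])
```

On these numbers the C protein series starts at a 20-fold molar excess over DNA, whereas the RacR series starts close to equimolar, in line with the weak off-target versus strong cognate binding described in the text.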

In addition, we also performed competitive EMSA reactions with one repressor pre-bound to DNA, in order to detect a possible interference in binding between these two repressors (Figure 3C). First, the C protein was added to DNA fragment to form a stable complex. Subsequent addition of increasing concentrations of the RacR repressor induced obvious higher-mobility complexes, indicating that C protein might not disturb RacR binding to its target site within the DNA substrate, nor does RacR repressor possibly displace the C protein (Figure 3C, left). In a similar way, RacR was first pre-bound to the same DNA substrate and later the C protein was added. Again, we observed that these two repressors formed visible, multiple low-mobility complexes, poorly resolved in the gel. In fact, the two lanes with 108 nM RacR and 1500 nM C-WT were indistinguishable, despite different binding orders. Thus, we can neither rule out that these two repressors compete for the same binding site, nor that their different stoichiometry impacts their DNA binding, at least under in vitro conditions tested here.

Mapping the C protein off-target binding site by DNaseI footprinting

To reinforce the binding assay results, and to confirm that the reduction in RacR expression was due to direct C protein binding to the racR region, we performed binding site mapping assays using DNaseI footprinting. We used the same DNA fragment as a substrate that we used in EMSA studies. First, we searched the sequence to find a DNA site resembling the C protein consensus site CTAAG-n5-CTTAG with the first box being more conserved (shown as the LOGO in Figure 4A). Indeed, we found two exact CTAAG half-sites on the complementary DNA strand, which correspond to the coding sequence of the racR gene, close to the start codon (named box1 and box2) (Figure 4A, top right). Analysis of the extended sequence of these boxes on the complementary strand indicated that box1 (5′-CTAAG-n5-CTTAA-3′; nearly inverted repeats) is much closer to the C protein consensus sequence than box2 (5′-CTAAG-n5-ACCAC-3′) (Figure 4A). Thus, based on our in silico search and some data obtained previously (18), we decided to focus on boxes 1 and 2 close to the racR gene start codon. We then performed DNaseI footprinting assays using WT C protein and its Cmut variant. A distinct region of protection was observed from −10 to +15 bp in reference to the start codon of the racR gene, which covered the inverted repeats of box1 (Figure 4BC, left, red bar). Such an area of protection was not detected when Cmut was used. To further verify that the first CTTAG region within the racR coding sequence (complementary 5′-CTAAG-n5-CTTAA-3′) is indeed bound by the C protein, we prepared DNA substrates with altered box1 or box2 sequences (changing CTTAG into CGCAT in either box1 or box2) and repeated DNaseI footprinting in the presence of the C protein (Figure 4C). We noticed sharp bands corresponding to DNaseI digestion products for the DNA fragment with CTTAGbox1 mutated, an area which had been protected in previous assays.
However, this was not observed for the mutated CTTAGbox2; instead, a similar region seems to be protected as in the case of the wild type template, although the degree of protection is not as strong (Figure 4C).
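The in silico comparison of box1 and box2 with the C-box consensus can be illustrated with a simple mismatch count. This is a sketch, not the search procedure used by the authors: it scores only the two 5-nt half-sites of CTAAG-n5-CTTAG, ignores the spacer, and gives both half-sites equal weight even though the first box is described as more conserved. The spacers are written as placeholders because their sequence is not given in the text, and the toy sequence passed to the scanner is hypothetical.

```python
CONSENSUS_LEFT = "CTAAG"
CONSENSUS_RIGHT = "CTTAG"
SITE_LENGTH = 15  # 5-nt half-site + 5-nt spacer + 5-nt half-site

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Reverse complement of a DNA string (N is kept as N)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def score_site(site):
    """Mismatches of a 15-nt window against CTAAG-n5-CTTAG (spacer ignored)."""
    site = site.upper()
    if len(site) != SITE_LENGTH:
        raise ValueError("site must be 15 nt long")
    return (mismatches(site[:5], CONSENSUS_LEFT)
            + mismatches(site[10:], CONSENSUS_RIGHT))


def scan(seq, max_mismatches=1):
    """Return (strand, start, window, mismatches) for hits on both strands.

    For '-' hits, start is the offset within the reverse-complemented sequence.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", revcomp(seq))):
        for i in range(len(s) - SITE_LENGTH + 1):
            window = s[i:i + SITE_LENGTH]
            m = score_site(window)
            if m <= max_mismatches:
                hits.append((strand, i, window, m))
    return hits


# Half-sites as quoted in the text; "NNNNN" stands for the unspecified spacer.
box1 = "CTAAG" + "NNNNN" + "CTTAA"
box2 = "CTAAG" + "NNNNN" + "ACCAC"
print(score_site(box1), score_site(box2))

# Hypothetical toy sequence containing a box1-like site with an arbitrary spacer.
toy_hits = scan("AT" + "CTAAGAAAAACTTAA" + "GC")
```

Because the consensus is itself an inverted repeat, a near-consensus site is reported on both strands, so hits should be merged by genomic position before they are counted.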

Figure 4.

Mapping of the C protein off-target (Rac) site by DNaseI footprinting. (A) The genetic map of the ydaSracR operon with two CTAAG/CTTAG boxes in the racR coding sequence (in blue font) is shown, as well as the LOGO of the C protein target site (C-box consists of two inverted repeats CTAAG-n5-CTTAG). 20 nM Cy5-labeled DNA fragments were incubated without or with increasing amounts of WT C protein or its inactive variant Cmut (0; 264; 378; 422; 528 nM) for 20 min at 22°C. The resulting complexes were briefly digested with DNaseI, and the products were run on 8% denaturing acrylamide gels along with sequencing reactions (GATC), as outlined in Materials and Methods. Panel (B) shows comparison of the protected area between the WT C protein and its defective Cmut variant for the same WT DNA fragments. Panel (C) focuses on the C protein interaction with WT and two mutated DNA substrates, where either CTTAG of box1 or of box2 is changed into CGCAT (box1mut or box2mut). Protected areas are indicated by red bars. K lanes represent control – DNA substrate subjected to DNaseI only.

In addition, we also performed DNaseI footprinting of C protein within the racR upstream region (IGR) to locate possible additional off-target sites there (Supplementary Figure S2). We did not see any clear area of protection comparable to the spot found for CTTAGbox1; however, there was a characteristic pattern of hypersensitivity in the DNaseI footprinting, indicating that C protein might bend the DNA strongly, unlike in the experiment with Cmut. This region corresponds to an A-rich spot within the leader of the ydaS gene (5′-AAAAACACACA-3′) (Supplementary Figure S2).

The YdaS–RacR operon intergenic region contains multiple promoter sites with potential to be inhibited by TFs

To verify and further investigate the process of induction of cell toxicity via C protein interaction with the rac region, we decided to establish the molecular mechanism of gene expression regulation at the intergenic region between ydaS and racR genes, and possible impact of the C protein as an external adventitious transcription factor. First, we located promoters within this region, identifying transcription start sites by primer extension on total RNA prepared from E. coli cells carrying the ydaS-racR operon. Putative positions of promoter sequences were identified based on comparison with E. coli consensus sequences: TATAAT (−10box) and TTGACA (−35box), with 17nt spacer (46). For the racR gene, we obtained several products, but we analyzed in detail only three of the most intense bands, in agreement with unpublished observations of others (34). We assigned probable −10 and −35 boxes for the racR promoters (PracR1; PracR2; PracR3), with the PracR3 promoter for the leaderless transcript being the strongest (as it seems from the primer extension experiment; Figure 5AC).

Figure 5.

Mapping of the transcription start sites within the racR and ydaS intergenic region, and mapping of the RacR binding sites. For each reaction, total RNA isolated from E. coli carrying racR-ydaS genes on a plasmid (in Δrac background) was used as a template for the primer extension method using Cy5-labeled primers and was performed as indicated in Materials and Methods. The primer extension products (marked as +1) were resolved on a denaturing 6% polyacrylamide gel along with sequencing reactions (G,A,T,C) performed with the same labeled primer and appropriate DNA template. Ct(+) represents racR mRNA from the rac+ genome context, and Ct(–), from the Δrac genome, served as a negative control. (A) For the racR gene, three transcription start sites of different intensity are shown by arrows: PracR1, PracR2 and PracR3. (B) For the ydaS gene a single transcription start site was found, as indicated by an arrow. (C) The transcript start sites were analyzed in the context of upstream sequences and the appropriate σ70 promoter sequences are indicated. The −10 and −35 promoter motifs are boxed (racR promoters) or in bold (ydaS promoter), respectively. The putative RacR repressor binding site motifs are marked by blue bars on both DNA strands. (A, B) The activity of the identified promoters was tested in vivo in a Δrac genetic background. WT promoters and their mutants (−10 box substitutions) are as follows: PracR2 (WT AGTAAG→CCCGGG); PracR3 (WT GAGAAT→CCCGGG); PydaS (WT TAATAT→CCCGGG). Vec abbreviates the promoterless vector. Transcription activity was measured as LacZ specific activity (Miller units) and values are averages (± SD) of at least three measurements. (D) Mapping of the RacR repressor binding sites using DNaseI footprinting with IGR DNA.
Increasing amounts of RacR repressor (0, 33, 231, 264, 300 nM) were added to the DNA substrates and incubated for 20 min at 22°C, briefly digested by DNaseI, and the products were resolved on 8% denaturing acrylamide gel along with sequencing reactions (G,A,T,C), as outlined in Materials and Methods. Areas protected from digestion are indicated by red bars. The IGR sequence (between the ydaS and racR genes, with start codons in bold) is shown with the three protected areas marked in red font. Nine putative RacR repressor binding sites were found and are boxed in blue; their sequence was used to prepare a LOGO as a possible RacR monomer binding site.

The positions of the identified promoter sequences were confirmed by directed mutagenesis of the −10 boxes followed by assessment of racR-lacZ gene expression in the Δrac genetic context (Figure 5A). The leaderless PracR3 promoter seems to be essential for racR promoter strength, but all promoters contribute to overall racR gene expression.

In contrast, the same approach identified only a single primer extension product for the ydaS transcript with exceptional nucleotide characteristics: a nearly perfect consensus sequence for the −10 box (TAATAT) and a perfect one for the −35 box (TTGACA) (Figure 5BC). Substitution of the −10 box with the CCCGGG sequence completely abolished this promoter's activity. We also determined that the ydaT gene does not have a separate promoter, as the cloned intergenic region between ydaS and ydaT does not yield any promoter activity in a reporter assay (unpublished data). Thus, it seems that PydaS, with its close-to-perfect promoter consensus sequence, drives transcription of the bicistronic mRNA for the ydaS and ydaT genes.

In addition, we also asked how RacR repressor exerts its TF inhibiting property by binding to its promoters under in vitro conditions. Thus, we again conducted DNaseI footprinting using purified RacR repressor and the IGR between ydaS and racR as a substrate. Our data show large protection areas divided by short patches of DNaseI digested fragments, which may indicate several binding sites (Figure 5D, red bars on gel). Notably, gel analysis revealed that for the lowest RacR concentration, the distinct protection of sequence corresponded to the region of the ydaST promoter on the DNA complementary strand. At higher concentrations, a large region just upstream of the initiation codon of RacR was protected, where RacR promoters are located, mainly the PracR3 promoter. A summary of regulatory elements in the context of RacR repressor and C protein binding to the intergenic region is presented as Supplementary Figure S3.

We tried to determine a distinct sequence motif within the protected sequence (Figure 5D, red font in IGR sequence), which could be assigned as the RacR binding site. We found several repeat sequences in this region (Figure 5D, nine blue boxes), which coincide with RacR protection areas. Five of them are located on the coding strand (in the 5′-TARGC orientation), whereas the other four are on the complementary strand. Interestingly, most of them exactly cover the positions of all mapped promoters for the ydaS and racR genes, and the other two repeats are located within the racR coding sequence (Figure 5C, blue bars). We prepared a positional alignment of these seven sequences to obtain a LOGO (47) and found a 5′-ATTAGGC-3′ motif, which is presumably the RacR monomer binding site (Figure 5D, LOGO). We also searched for the same motifs in extended sequences upstream of the racR gene present in several species of the Enterobacteriaceae family, and we found three highly conserved inverted repeats 5′-T/ATAGGC-n3-GCCTAT/A-3′ to be present in that region, which we assigned as putative RacR operators: O1, O2 and O3 (Figure 6A). The O1 position coincides with the ydaST promoter located on the complementary strand of DNA (Figure 5C), whereas the O3 position (its n3 spacer) overlaps with the initiating ATG and two more codons of the racR gene. In addition, there is another single 5′-TTAGGC-3′ site further downstream in the racR coding sequence. Sequence comparison of these three inverted repeats by the LOGO software revealed a perfect core sequence in O1, but there is a 1-nt difference in the other two operators, which may affect RacR binding affinity at these sites. The linker regions between these sites are A/T rich, which indicates their propensity towards DNA bending (Figure 6A, LOGO). In line with these observations, comparison of the RacR amino-acid sequences between the Enterobacteriaceae species revealed a high degree of identity. 
However, we noticed that the RacR orthologs found in the Escherichia, Morganella and Shigella genera consist of a much longer protein with 8 helices organized in a two-domain architecture (Figure 6BC), where helix-turn-helix motifs are connected by an 18-aa linker to the core protein fragment found among all species evaluated here, as predicted by AlphaFold (UniProt entry P76062; Figure 6C). Notably, other RacR orthologs from Proteus, Salmonella, Serratia and Klebsiella are shorter, one-domain variants without an unstructured linker. In addition, we did not find any RacR ortholog in Citrobacter freundii/koseri, where the C regulatory gene was originally identified. This reinforces the notion that co-existence of the C protein and RacR repressor in one host may lead to possible viability defects (genetic conflict), which we proposed in our previous work (18).
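The operator search described above, for the inverted repeat 5′-T/ATAGGC-n3-GCCTAT/A-3′, can be sketched as a regular-expression scan. This is a minimal illustration under the assumption of exact matching to that pattern. The function names and the toy sequence are hypothetical, and no claim is made that the authors searched this way.

```python
import re

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# 5'-(T/A)TAGGC-n3-GCCTA(T/A)-3'; the lookahead also reports overlapping hits.
OPERATOR = re.compile(r"(?=([AT]TAGGC[ACGT]{3}GCCTA[AT]))")

def find_operators(seq):
    """Return (offset, matched sequence) for each candidate operator."""
    return [(m.start(), m.group(1)) for m in OPERATOR.finditer(seq.upper())]

# The right half-site is the reverse complement of the left half-site, so one
# scan of the top strand covers both strands for this palindromic pattern.
print(reverse_complement("TTAGGC"))

# Toy sequence (not from the paper) with one operator-like repeat at offset 3.
toy = "CCG" + "TTAGGC" + "AAA" + "GCCTAA" + "GGTT"
print(find_operators(toy))
```

Because two of the three operators differ from the core by one nucleotide, a real search would probably need to tolerate a mismatch; exact matching is used here only to keep the sketch short.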

Figure 6.

RacR repressor homologs and their putative operators within the racR and ydaS intergenic region. (A) Upstream regions of the racR gene, within several species of the Enterobacteriaceae family, revealed high identity sites, which coincide with inverted repeats, possibly RacR binding sites (operators: O1, O2 and O3), indicated by arrows. The positional nucleotide alignment is presented as a LOGO for the three operators separately, showing slight differences from the established consensus sequence, defined as inverted repeats separated by 3 nt: 5′–T/ATAGGC-n3-GCCTAT/A–3′. The translation initiation ATG codon for racR is present within O3, as marked. (B) Alignment of amino-acid sequences for RacR homologs using MultAlin software. The positions of the 8 helices are indicated, as well as the linker between the two HTH domains. Species codes: E.col (Escherichia coli), E.alb (Escherichia albertii), E.sp (Enterobacter sp.), E.mar (Escherichia marmotae), M.sp (Morganella sp.), C.ama (Citrobacter amalonaticus), S.flex (Shigella flexneri), P.mir (Proteus mirabilis), K.qua (Klebsiella quasipneumoniae), K.pne (Klebsiella pneumoniae), S.mar (Serratia marcescens), S.ent (Salmonella enterica). (C) RacR repressor monomer structure from E. coli as predicted by AlphaFold (UniProt entry P76062). The 8 helices marked with light and dark blue are predicted with highest model confidence. The 18-aa unstructured linker (low confidence) connects two protein domains: the N-terminal α1–5 domain present in all RacR homologs and the C-terminal α6–8 domain present only in some Enterobacteriaceae species, as indicated in the protein alignment.

In vitro transcription assays confirm direct RacR and C protein involvement in transcriptional regulation of the ydaST and racR region

Next, we performed in vitro transcription assays to corroborate our in vivo and in vitro findings obtained so far. Single round transcription assays were performed on linear templates with E. coli RNA polymerase in the presence and absence of RacR and the C protein. Two different conditions were tested – when RacR or CWT/Cmut were allowed to bind to DNA before or after RNA polymerase (RNAP).

We found that PydaS is indeed a very strong promoter, whereas under the in vitro conditions used the PracR1-3 promoters seem to be much weaker (Figure 7AB). Also, in accordance with our previous observations, RacR is a strong inhibitor of transcription initiating at the PydaS, as well as the PracR1 promoter (Figure 7BD). However, this effect is only observed when RacR is allowed to bind before (and not after) RNAP addition, indicating that this repressor probably acts by preventing RNAP binding to the PydaS and PracR1 promoter regions, thus inhibiting transcription initiation. Interestingly, RacR seems to activate transcription initiating from the PracR3 promoter when added after RNAP (with its maximum level reached at the lowest concentration tested, i.e. 50 nM), which may reflect a regulatory loop ensuring that appropriate RacR levels are always present in the cell.

Figure 7.

The effect of C protein and RacR on transcription initiating from the PydaS-PracR promoter region depends on whether they are allowed to bind DNA before or after RNAP addition. (A, B) Sample gels; (C, D) quantified intensities of transcripts initiating from the corresponding promoters. Data points were normalized to no C/Cmut/RacR samples and reflect average values ± S.D. (n ≥ 3). Single round in vitro transcription was performed on linear templates, as described in the Materials and Methods section. (A, C) CWT or Cmut protein titration (0–4 μM) when RNAP was pre-bound to DNA before CWT/Cmut (RNAP→CWT/Cmut) or after CWT/Cmut addition (CWT/Cmut→RNAP). (B, D) The same as in A and C, but RacR was titrated (0–300 nM).

In contrast, the C protein inhibits all PracR promoters when added before RNAP (with full repression of PracR2 and PracR3; PracR1 is repressed down to ∼10% of its full activity), and to a certain degree also the PydaS promoter (also down to ∼10% of its full activity). These effects are specific, as they were not observed for the Cmut variant (Figure 7AC). When RNAP was allowed to bind before the C protein, inhibition was not observed, which indicates that (similarly to RacR) the C protein-mediated repression is probably due to physical blocking of RNAP binding to DNA, possibly by inducing bending or distortions in the whole promoter region. Interestingly, under these conditions transcriptional activation is observed for the PracR1 and PracR2 promoters, which reflects complexity of this region and its regulation. We note that the effect of C protein requires higher protein concentrations than does the effect of RacR. This agrees with the EMSA experiments in Figure 3.

Single molecule tracking of C protein reveals considerable changes in mobility associated with the presence or absence of high affinity binding sites

It has recently been shown that mobility of DNA binding proteins is largely determined by non-specific DNA interactions, and that the size of the proteins has very little influence on their diffusion properties (48). The C protein binding investigated in this study provided a perfect system for determining how specific versus non-specific DNA binding affect a DNA-binding protein's dynamics.

A typical TF’s action relies on a mixture of free diffusion and scanning of the cellular DNA via non-specific DNA binding; non-DNA-bound fractions of TFs can vary between about 1 and 40% in vivo (48,49). To reach a specific binding site, TFs diffuse in 3D and perform local motions such as 1D sliding, hopping or intersegmental transfer. We wished to gain further insights into the mode of action of C protein in vivo, in different genetic contexts, so we resolved protein dynamics at the single molecule level. We employed single molecule tracking (SMT), using YFP fusion proteins (42). Specifically, we expressed C protein-mVenus fusions (mVenus is a brighter variant of YFP; see Supplementary Table S2, and Supplementary Figure S4 for the corresponding western blot analysis) from a moderate-copy plasmid, using extremely low induction levels. Single fluorophores were obtained almost instantaneously, after a few initial frames in which molecules bleached, and single molecule tracks could be captured with 20 ms stream acquisition. The C protein was fused to mVenus by a flexible linker –GSAGSAAGSGEF– to allow for proper folding and achievement of the optimal biological activity of the fusion protein (50).

Different genetic contexts were designed to test the dynamics of C protein mobility: (a) the natural context of a single C-box target site in the ∼5 Mbp Citrobacter genome (which naturally lacks the Rac prophage, based on analysis of related sequenced Citrobacter genomes), (b) plasmid-borne C-box target sites in E. coli cells deleted for rac promoter binding sites (‘off-target site’), (c) the same as (b) but containing a genome-located off-target Rac site, (d) off-target sites within the genome of E. coli cells (but lacking C-box ‘target’ sites) and (e) E. coli cells devoid of both C-box and Rac sites, as presented in the schemes (Figure 8A). Whenever the target site is present on plasmids carrying the R-M system, the C regulatory protein produced from the plasmid is a variant unable to bind DNA (Cmut), so that only fluorescently labeled WT C protein is able to bind DNA (Supplementary Figure S5). Interestingly, one or two sites of strong confined motion (molecules not leaving a radius of 120 nm for at least 7 steps, red tracks, Figure 8B) were found in Citrobacter cells, while E. coli cells devoid of any binding sites (‘control’) largely contained freely diffusing molecules (blue tracks). Strong chromosomal binding can also be seen in the heat maps for Citrobacter cells, where the probability of finding a track is projected into an average-sized cell of 3×1 μm, while there is almost completely homogeneous localization in the control strain.
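The confinement criterion quoted above (a molecule not leaving a radius of 120 nm for at least 7 steps) can be sketched as follows. This is a minimal illustration, not the tracking software's implementation. Measuring the radius from the first localization of each window, and counting a step as one frame-to-frame jump, are both assumptions, and the function name is hypothetical.

```python
import math

def has_confined_segment(track, radius=0.120, min_steps=7):
    """track: list of (x, y) localizations in micrometres, one per frame.

    Returns True if some run of min_steps consecutive jumps stays within
    `radius` of the first localization of that run (assumed reference point).
    """
    window = min_steps + 1  # min_steps jumps span min_steps + 1 localizations
    for start in range(len(track) - window + 1):
        x0, y0 = track[start]
        segment = track[start:start + window]
        if all(math.hypot(x - x0, y - y0) <= radius for x, y in segment):
            return True
    return False

# Toy tracks (not experimental data).
static_track = [(0.50 + 0.01 * (i % 2), 0.50) for i in range(10)]  # jitters by 10 nm
mobile_track = [(0.10 * i, 0.0) for i in range(10)]                # moves 100 nm per frame
print(has_confined_segment(static_track), has_confined_segment(mobile_track))
```

Tracks shorter than the window are reported as not confined, which is one possible convention; a track that switches between the two states would need segment-wise labelling instead of a single flag.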

Figure 8.

Single molecule tracking of C protein::mVenus reveals considerable changes in mobility in the context of the target and off-target sites. (A) The fusion protein of C::mVenus (scheme) was introduced on a plasmid in the various cell genetic contexts, as presented on schemes: with target, natural sites upstream of C gene within Csp231I R-M system operon (green circles); with the off-target sites located within racR gene on genome (magenta circles). i) the natural context of a single target site in the Citrobacter genome coding for Csp231I R-M system; ii) plasmid-borne R-M systems carrying mutated C gene, but with WT target sites in E. coli cells, iii) genomic off-target sites as well as target plasmid-borne sites in E. coli cells; iv) only genomic off-target sites in E. coli cells, v) E. coli cells devoid of any binding sites. The red crossing on C gene indicates the presence of Cmut variant unable to bind its specific DNA site, so that only the C::mVenus protein can bind DNA. (B) Examples of trajectories in cells. Red tracks show confined motion, not leaving a radius of 120 nm (three times the localization error) for at least 6 steps, blue tracks are freely-diffusing molecules, green tracks molecules changing between confined and free diffusion. (C) Heat maps showing the probability of motion of the C protein. All tracks obtained were projected into a medium-size cell of 1×3 μm. To avoid asymmetry artefacts, heat maps were mirrored twice along the two central axes. Darker to yellow shading indicates higher to lower presence of molecules. (D) Bubble plot showing size and diffusion coefficients of the three populations, determined from SQD analyses. (E) Summary of data obtained from SQD analyses, diffusion constants D1-3 correspond to populations: static, slow-mobile and fast-mobile.

These experiments suggest that C protein is strongly bound to the RM gene C-box locus, which is duplicated into two loci following DNA replication. Of note, any freely diffusing molecule can stay without motion for some time, so confinement will be somewhat convoluted with non-bound molecules, or molecules staying bound in a non-specific manner for some period of time. In Citrobacter cells, about half of molecules showed confined motion, and half free diffusion, while only a quarter to a third were confined in control cells (Figure 8B). Thus, apparently one or two high-specificity sites on the genome can lead to strong confinement on the DNA in the native cell system (assuming that there are no fundamental differences between E. coli and Citrobacter cells; see Discussion).

An off-target site in E. coli cells marginally increased confined motion to 32%, while the presence of target and off-target sites strongly increased confined motion, led to cell elongation as expected, and increased localization of C protein to membrane-proximal sites (Figure 8B and C), which can be explained by the preferential localization of medium- to high-copy plasmids to sites surrounding the nucleoids (51). Of note, the plasmid used for the RM system has a low to medium copy number (15–20 copies), yielding more target sites compared with chromosomal sites. Curiously, the presence of target sites and lack of the off-target region further increased confined motion to 45% (almost Citrobacter conditions), possibly because of a lack of titration of C protein away from plasmid-borne binding sites.

As a further means to characterize target DNA binding versus off-target binding, we used squared displacement (SQD) analysis, in which the probability of a certain jump distance (JD) is plotted, and data are fit with Rayleigh distributions, which describe a probability distribution for nonnegative-valued random variables. JDs observed for C protein could not be explained by a single Rayleigh fit, but quite well by two distributions (Supplementary Figure S7). Assuming three distributions (and thus three distinct populations) increased the quality of fitting only marginally: quantile/quantile plots in Supplementary Figure S7 show the difference between experimental data and modelled data based on free Brownian diffusion, with very little deviation already for the two-population fit, and still less deviation for the three-population fit. In order to be able to distinguish between free protein diffusion, constrained motion due to non-specific DNA binding, and static motion based on specific DNA binding, we used three Rayleigh fits; results for two fits are shown in Supplementary Figure S8, which are qualitatively similar, but less discriminatory. The SMTracker program searched for the best fit for average diffusion constants of populations for all 5 conditions, to better compare changes, because these are then reflected in population size only, rather than in a combination of diffusion constants and population sizes.
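The Rayleigh-mixture model behind this kind of jump-distance analysis can be written down in a few lines. The sketch below assumes two-dimensional Brownian motion, for which a jump of length r over a time step Δt has density r/(2DΔt)·exp(−r²/(4DΔt)). It only evaluates a three-population mixture and does not reproduce the SMTracker fitting procedure. The fractions and diffusion constants are the approximate values quoted in this section for Citrobacter cells, and the 20 ms step is assumed from the stated acquisition time.

```python
import math

def jump_pdf(r, d_coef, dt):
    """Rayleigh density of jump distance r (um) for 2D Brownian motion."""
    s2 = 2.0 * d_coef * dt  # per-axis displacement variance (um^2)
    return (r / s2) * math.exp(-r * r / (2.0 * s2))

def mixture_pdf(r, fractions, d_coefs, dt):
    """Weighted sum of Rayleigh densities, one per diffusive population."""
    return sum(f * jump_pdf(r, d, dt) for f, d in zip(fractions, d_coefs))

def integrate(func, upper, steps=20000):
    """Trapezoidal integral of func from 0 to upper."""
    h = upper / steps
    total = 0.5 * (func(0.0) + func(upper))
    total += sum(func(i * h) for i in range(1, steps))
    return total * h

DT = 0.02                       # s, assumed frame interval
FRACTIONS = (0.22, 0.48, 0.30)  # static, slow-mobile, fast-mobile
D_COEFS = (0.02, 0.13, 1.0)     # um^2/s

# Sanity check: a properly weighted mixture integrates to ~1.
area = integrate(lambda r: mixture_pdf(r, FRACTIONS, D_COEFS, DT), upper=3.0)
print(round(area, 3))
```

Fitting would then amount to choosing the fractions and diffusion constants that best reproduce the observed jump-distance histogram (or its cumulative form); fixing the diffusion constants across conditions, as described above, leaves only the fractions free.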

SQD analyses revealed that in the Citrobacter native system, about 30% of C protein molecules showed a high average diffusion constant of 1 μm² s⁻¹, most compatible with free diffusion, and about 22% a >50-fold lower value of 0.02 μm² s⁻¹ (Figure 8D and E). The low value can only be explained by relatively tight binding to DNA target sites. 48% of molecules were found to diffuse with 0.13 μm² s⁻¹, which would fit constrained motion through the nucleoids, based on fast on- and off-binding to low-affinity sites. According to this analysis, more than 22% of C protein is bound to its genomic sites in Citrobacter, and close to half are moving through the genome in search of specific binding sites. The population of tightly DNA-bound molecules decreased to 20% in E. coli cells carrying target sites on plasmids, and similarly in cells with target and off-target sites (Figure 8D). Between these two systems, marginally more molecules showed medium-fast mobility, 43 versus 36%, in cells containing target sites but lacking off-target sites, reflecting a higher degree of confined motion (Figure 8B). It was surprising to find that mobility of C protein was higher in cells with target and off-target sites than in cells with target sites only. Cells carrying off-target sites only had decreased slow and medium-fast mobility in favor of free diffusion, and still, 15% showed low mobility in the control cells, and 26% medium mobility. Possibly, there are additional, unknown high-affinity binding sites for C protein other than target and off-target sites in E. coli. Alternatively, even non-specific DNA binding may result in a significant number of TF molecules that remain static for an extended period of time.
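To give a feel for what these diffusion constants mean at the level of individual frames, the root-mean-square jump per frame can be computed from the standard two-dimensional relation ⟨r²⟩ = 4DΔt. This is a back-of-the-envelope sketch. It assumes free two-dimensional Brownian motion and a 20 ms interval between frames (taken from the stated acquisition time), and the function name is hypothetical.

```python
import math

def rms_jump(d_coef, dt):
    """Root-mean-square 2D displacement (um) over one time step dt (s)."""
    return math.sqrt(4.0 * d_coef * dt)

DT = 0.02  # s, assumed frame interval
populations = (("static", 0.02), ("slow-mobile", 0.13), ("fast-mobile", 1.0))
for label, d_coef in populations:
    print(f"{label}: {1000 * rms_jump(d_coef, DT):.0f} nm")
```

Under these assumptions the three populations correspond to typical jumps of roughly 40, 102 and 283 nm per frame, respectively.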

In any event, the presence of a genomic off-target site leads to a decrease in freely diffusing molecules, and an increase in medium- and slow-diffusing C protein. Of note, experiments were performed in rich medium, such that there are likely two to four off-target sites on average in the growing culture (overlapping rounds of replication). Constrained motion (as derived from the medium mobile population) further increased by the addition of target sites on plasmids, and likewise slow/static motion. Thus, changes in binding-site availability lead to considerable, but not drastic changes in C protein mobility. Our data also suggest that specific binding sites on the chromosome (in this case the ‘off-target’ sites) increase low protein mobility (specific DNA binding) to a similar extent as specific binding sites on several plasmids (C protein ‘target’ sites), or even more so in the Citrobacter cell. We propose that constrained motion through nucleoids based on non-specific DNA binding leads to efficient target search, while target sites on plasmids are less efficiently found, as most plasmids are excluded from the nucleoids (51).

DISCUSSION

Transcriptional cross-talk

Full knowledge of a complete bacterial transcriptional genetic network does not exist. While E. coli K-12 MG1655 has one of the most widely-studied genomes (52), even there about a third of its genes have not yet been characterized, including 50–80 TFs out of nearly 300 predicted TFs (17,53–55). E. coli K-12 TFs were classified into four groups based on the number of regulatory targets, which form a hierarchical structure resembling a ball of cobwebs with layers of different density. Usually the highest density nodes in the web are represented by global TFs with multiple target sites, in contrast to local and single-target TFs, which are a minority group (56,57). The other group consists of nucleoid-associated TFs engaged in organization and maintenance of nucleoid structure, which heavily affects gene expression, and some TFs play intermediate or dual roles in this respect (58). Altogether in concert, they control over 1000 genes, however it is hard to dissect a single TF’s in vivo effect due to the whole repertoire of co-existing regulators (14,59,60). In vitro studies are also biased, as mapping the TF’s targets usually requires TF purification and tagging, and target determination is more likely to be established for strong binders, though new technologies have been developed to tackle this problem, as reported recently (61). Thus, it is still difficult to study the TF’s weak interactions and to discriminate between direct and indirect targets. Global systematic analyses are usually focused on direct targets and secondary interactions are ignored, and thus implication of the off-target actions seems to be underestimated. Moreover, TF/DNA interactions depend on multiple cellular parameters, such as the nucleoid environment, and in particular, the cooperativity with other DNA-binding proteins within the regulatory networks (62,63).

In the course of our previous work, we noticed that upon introduction of a Citrobacter-derived TF (as an R-M system operon) into a new species (E. coli), the host genetic network was changed, which affected cellular fitness (18). Here, we aimed to characterize the molecular basis of this interesting phenomenon, as well as to map the cross-talk. We have established earlier via transcriptomics that the primary, direct causative effect is due to silencing of the RacR TF by the action of the C regulatory protein, which is an unrelated TF linked to the RM system. The C proteins are small, cis-acting local TFs working as autoactivators and/or autorepressors dedicated to control the level and timing of the restriction endonuclease action so that the DNA nucleolytic activity is not unleashed until genomic DNA is completely modified by a cognate DNA methyltransferase (28–30,64). The consensus binding site of the studied C regulator is relatively simple (CTAAG-n5-CTTAG) and reflects its dimeric nature (44). It naturally occurs within the C gene promoter (19). We have confirmed that this exact recognition site does not exist in the E. coli MG1655 genome. [This is not unexpected, as the 10-bp sequence should occur once in > 10⁶ bp in a random sequence.]
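The frequency estimate in the bracketed remark follows from simple counting: the site CTAAG-n5-CTTAG specifies 10 positions, and in a random sequence with four equally likely, independent bases each position matches with probability 1/4. The short sketch below makes that arithmetic explicit. The equal-frequency assumption is an idealization (real genomes have biased composition), and the function name is hypothetical.

```python
def expected_spacing(n_specified, alphabet_size=4):
    """Average number of positions per occurrence of a fully specified
    n-position site in a random sequence of equiprobable, independent bases."""
    return alphabet_size ** n_specified

# CTAAG-n5-CTTAG: 5 + 5 specified positions; the 5-nt spacer is unconstrained.
print(expected_spacing(10))
```

This gives 4¹⁰ = 1,048,576, i.e. slightly more than 10⁶ bp per expected occurrence.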

It has been suggested that transcriptional cross-talk is more likely to occur for TFs recognizing shorter or partially degenerate DNA sequences, where there is a high chance that many less specific sites are present in the genome (65). In addition, target sites also face constant genetic pressure to adjust TF binding via point mutations, which drives regulatory network evolvability (66). Nevertheless, we expected to find C protein binding to a similar sequence, or to its half consensus sequence in the proper DNA context, as the C proteins also recognize phosphates of the non-contacted nucleotides in a process of indirect read-out, in a manner similar to some phage repressors (45,67,68).

Guided by the RNA-seq data we obtained earlier (18), as well as the sequence analysis, we determined that the C protein induces transcriptional cross-talk within the Rac region, where the master regulator RacR was adventitiously prevented from binding to its target site, leading to perturbation in RacR regulation. Here, by in vitro and in vivo assays we mapped the C regulator interaction region within the racR gene. We determined the C protein binding site as box1 (5′-CTAAG-n5-CTTAA-3′; differing from the consensus by only 1 nt), located within the discriminator of the racR promoter PracR3 and extending downstream into the translation start codon of the racR gene (Figure 9, Supplementary Figure S3). The location of the cross-talk spot within the racR transcription initiation site, and the tendency of C proteins to strongly bend DNA (69–71), may both affect racR repression, which we confirmed by an in vivo reporter assay (Figure 2). In addition, the C protein interaction did not directly affect ydaS expression in vivo, and no binding site was found near the ydaS promoter, within the upper IGR sequence. In contrast, our in vitro transcription experiments indicate that although PydaS is fully inhibited by RacR, it is also inhibited by the C protein, albeit not fully (transcript level is down to ∼10%) (Figure 7). On the other hand, in vitro transcription experiments did confirm C protein inhibition of transcription initiating from the racR promoters. Perhaps discrepancies between in vivo and in vitro data result from involvement of an additional factor (missing in vitro) or presence of a DNA fragment (missing in the DNA template used in vitro) influencing interactions between the two factors and DNA.

Figure 9.

The cross-talk spot in the context of the rac prophage region of the E. coli genome. The off-target C binding site is located on the complementary strand of the racR gene coding sequence (in red font). The racR promoter sequence (−10 box) is boxed. The C binding site is boxed in red, whereas the RacR binding site of operator 3 is boxed in blue.

We predict that the C protein interacts with its off-target site as a dimer rather than as a monomer, which is in accordance with the DNaseI protection region (Figure 4BC). Interestingly, it seems that the C protein and RacR binding sites in operator 3 overlap, 5′-ACTAAGCATTGCTTAATATTCTC-3′ (C protein consensus site underlined, RacR consensus site in bold) (Figure 9). Still, we do not know how the other upstream RacR binding sites within the IGR (operators 1 and 2, Figure 6) contribute to the autoinhibitory effect. This could be so because our in vitro binding tests show that pre-binding of one TF does not prevent binding of the other TF, at least using the WT DNA substrates where all recognition sites are present (Figure 3). Also, our multiround in vitro transcription data indicate that when both proteins (C and RacR) are present together, transcription initiation at PydaS is no longer fully repressed by RacR, allowing for a low level (5–8%) of transcripts to initiate (Supplementary Figure S6). These results may reflect multi-layered interactions between the C and RacR proteins and the region surrounding their cross-talk spot, awaiting further verification.
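The relationship between the C-box consensus (CTAAG-n5-CTTAG) and the off-target site can be illustrated with a small mismatch-tolerant scan of the 23-nt sequence quoted above. This is a sketch under simple assumptions: spacer positions are unconstrained, mismatches are counted position by position, and the one-mismatch threshold and function names are hypothetical. It says nothing about binding affinity.

```python
C_BOX = "CTAAGNNNNNCTTAG"  # consensus; N marks the unconstrained 5-nt spacer

def mismatches(window, pattern):
    """Count positions where window differs from a non-N pattern position."""
    return sum(p != "N" and p != w for w, p in zip(window, pattern))

def scan_c_box(seq, pattern=C_BOX, max_mismatches=1):
    """Return (offset, window, mismatch count) for near-consensus windows."""
    seq = seq.upper()
    n = len(pattern)
    hits = []
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        mm = mismatches(window, pattern)
        if mm <= max_mismatches:
            hits.append((i, window, mm))
    return hits

# 23-nt sequence around operator 3, as quoted in the text.
o3_region = "ACTAAGCATTGCTTAATATTCTC"
print(scan_c_box(o3_region))
```

The scan reports a single window, CTAAGCATTGCTTAA at offset 1, with one mismatch (the final A in place of the consensus G). Because the consensus is its own reverse complement, scanning one strand is sufficient for this pattern.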

Theoretical models have tried to outline possible scenarios for cross-talk interactions and to test whether gene misregulation can affect adaptive gene expression (72,73). In fact, the dynamics of TF interactions create a balance between maintaining binding specificity and allowing some relaxation in DNA interaction to enable adaptation of networks. If newly re-wired TF links are beneficial, they are likely to facilitate network evolution (13). Sometimes, as in our case, the cross-talk triggers toxic gene activity, which is silenced under normal conditions (18). In a similar example, another C regulatory protein, linked to the Type I R-M system of Mycoplasma, showed cross-talk activity via binding to secondary sites within promoters for a protease gene and a tRNA gene cluster, leading to cell death (74). Overall, studies show that adding a new connection into an existing network can result in a large gene network perturbation; the higher in the TF hierarchy the re-wired connections are, the larger the perturbation (12,75). Thus, as horizontal gene transfer events involving TF genes happen constantly, there is a genetic pressure on the host regulatory circuitry to keep it stable and to prevent any regulatory perturbation in the recipient host cell (11,76–81).

Single molecule dynamics of a TF in different genetic backgrounds

The finding that the C protein binds with high affinity to both the promoter region of the R-M system and the racR site motivated us to study changes in single molecule dynamics in the presence or absence of these binding sites. In the absence of any known high-affinity binding sites in E. coli, C protein shows predominantly free diffusion; however, we found 15% of molecules to be essentially static, likely corresponding to strong binding to DNA sequences. Whether additional high-affinity sites exist is unknown. About a quarter of molecules showed dynamics that can be best described as constrained diffusion through the nucleoids, characteristic of any DNA-binding protein. The presence of the ‘off-target’ Rac site in the E. coli genome led to a modest but detectable increase in the static fraction, and interestingly, also an increase in slow motion, i.e. an increased number of molecules showing constrained motion. Thus, specific binding sites appear to not only increase static binding, but also target search via constrained diffusion. The presence of target sites on plasmids (between 15 and 20, based on plasmid copy number) led to a considerable increase in static motion, as expected, as well as in medium mobility, strongly decreasing the number of freely diffusing molecules. This is in line with our finding that specific binding sites on the genome increase general target search via constrained motion. Surprisingly, the presence of both target and off-target sites did not strongly change C protein dynamics compared with cells having target sites only, but in fact somewhat increased protein mobility. We cannot rule out that this effect is secondary, caused by cell filamentation due to induction of the ydaST genes. The strongest effect on chromosome-based motion was seen in Citrobacter cells, where we found the highest number of static as well as medium mobile molecules. Compared with the E. coli control cells having 60% freely diffusing molecules, only 30% showed free diffusion (based on SQD analyses), and 70% low mobility in Citrobacter cells. Thus, genomic target sites have a stronger effect on DNA-based mobility of C protein when compared with plasmid-borne target sites in E. coli cells. These analyses show that few high-affinity binding sites have a strong effect on TF mobility, and suggest that target search via constrained motion through the genome has a higher efficiency for finding target sites than for target sites on nucleoid-excluded plasmids (assuming that the shift of C protein abundance towards the cell periphery in the presence of plasmids observed in this study reflects the general rule of nucleoid occlusion for medium/high copy number plasmids).

Novel insights into the ydaS-racR region with RacR as the master regulator

The Rac prophage was the first defective lambdoid prophage discovered in the E. coli K-12 genome, and has lost 60% of its original DNA content. Most sequenced E. coli genomes have at least part of the rac locus (34,82). The rac region comprises only 29 genes, although some of them have not yet been characterized and their genetic relationships have not been determined (83). It seems that the RacR repressor is a master regulator of the Rac expression network; its gene, together with the neighboring genes ydaS and ydaT, forms an operon in the E. coli genome that has been reported as essential (32–34). The Keio collection of E. coli single-gene deletion mutants does not contain a ΔracR mutant (84), consistent with the fact that such a mutation is possible only in a ΔydaST background. In the course of our work, we found a very close analogy between the intergenic region (IGR) located between the racR and ydaST genes and the lysis/lysogeny decision region between the cI and cro genes of bacteriophage lambda (35), as also noted by others (34,85). We confirm here that RacR shares functional similarities with the λCI repressor, and that YdaS resembles the λCro protein, which promotes the transcriptional switch into the λ phage lytic cycle. In addition, RacR transcripts initiating from PracR3 are leaderless, similar to the λcI transcript (86). YdaS acts as an autorepressor, but it seems that RacR inhibits ydaS expression more strongly, even though the racR promoters appear to be weaker than the single ydaS promoter (Figures 2, 5 and 7). Efficient ydaS inhibition was seen at very low RacR concentration (both in vitro and in vivo; Figures 2 and 7), which may indicate that RacR’s affinity is greater for operator 1 within the ydaS promoter sequence (Figure 6A), similar to CI/Cro operator (OR) binding (35). The strong ydaS promoter also controls expression of YdaT, as we did not detect any promoter activity upstream of the ydaT gene (not shown).
Interestingly, ydaS promoter activity in vivo is barely detectable (even in the Δrac strain), in contrast to our in vitro transcription data, where we see a very strong ydaS promoter signal. Thus, additional cellular factors could be involved in regulating ydaS promoter activity. We tested some global TFs, such as HNS and FIS, and did not find any effect on the ydaS promoter (not shown), but other TFs or regulatory elements might be engaged.

Based on the analogy to the intergenic region of λCI and λCro, as well as other observations indicating three possible binding sites for the RacR repressor (34), we used footprinting methods to test their exact location (Figure 5). Indeed, we found multiple binding sites in the form of three inverted repeats, which differ slightly from one another (Figure 6A), just like the three λ operators OR for CI and Cro (35). It seems likely that RacR binding also relies on cooperative interactions with its operators, just like λCI, as well as on positive and negative feedback loops involving the YdaS anti-repressor. However, unlike in the CI/Cro system, YdaS does not seem to repress racR expression (while λCro inhibits cI) (Figure 2), which suggests that even though the promoter and operator organization is similar to that of the λ system, regulation of the racR/ydaS region is distinct from that of λ phage. In addition, the racR operator 3 site overlaps the coding sequence of the racR gene.
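As a purely illustrative aside, the idea of operator-like inverted repeats separated by a short spacer can be sketched computationally. The Python snippet below is a minimal sketch and not the method used in this study, where the sites were mapped by footprinting. The function names, the arm and spacer lengths, and the toy sequence are all hypothetical; the toy sequence is invented and does not correspond to the racR-ydaS intergenic region.

```python
# Minimal sketch: scan a DNA string for inverted repeats (two arms that are
# reverse complements of each other, separated by a short spacer).
# All names, defaults and the toy sequence are illustrative assumptions.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_inverted_repeats(seq, arm=6, min_spacer=0, max_spacer=6, max_mismatch=0):
    """Return (start, spacer, left_arm, right_arm, mismatches) tuples.

    A hit is reported when the right arm matches the reverse complement of
    the left arm with at most `max_mismatch` mismatching positions.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n - 2 * arm + 1):
        left = seq[start:start + arm]
        target = reverse_complement(left)
        for spacer in range(min_spacer, max_spacer + 1):
            r_start = start + arm + spacer
            r_end = r_start + arm
            if r_end > n:
                break  # longer spacers would run past the end of the sequence
            right = seq[r_start:r_end]
            mismatches = sum(a != b for a, b in zip(right, target))
            if mismatches <= max_mismatch:
                hits.append((start, spacer, left, right, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: ACGTCA, a 4-nt spacer, then TGACGT.
    toy = "TTACGTCAAAAATGACGTCC"
    print(find_inverted_repeats(toy, arm=6))
    # -> [(2, 4, 'ACGTCA', 'TGACGT', 0)]
```

Raising `max_mismatch` above zero would tolerate imperfect repeats, in the spirit of operators that differ slightly from one another; any real analysis would need arm lengths and thresholds chosen for the sequence at hand.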

Recent studies on the E. coli CP-933P lambdoid cryptic prophage, which shares similarity with the Rac prophage (including the racR-ydaST operon), also suggest an analogy to the λ genetic switch (36). PaaR2 seems to play the role of the RacR TF, with the additional role of regulating a downstream toxin-antitoxin system that is not present in the Rac prophage. There, PaaR2 binding sites are present in the intergenic region between the paaR2 and ydaS genes, forming four inverted repeats that do not overlap the coding sequence. Interestingly, the YdaT-like protein of prophage CP-933P does not have any toxic effects (36). This is unlike the ydaT gene of Rac (18,33,34), which could be cloned solely under expression-silencing conditions in certain E. coli genetic contexts (not shown). We have shown previously that Rac YdaT directly leads to cell filamentation, which does not result from induction of the SOS response (18). Thus, again, Rac prophage regulation and the gene functions it encodes are distinct from other systems described so far, and require further studies to reveal their peculiarities.

Supplementary Material

gkac914_Supplemental_File

ACKNOWLEDGEMENTS

We thank Drs Thomas K. Wood and Yunxue Guo for the gift of strains/plasmids. We also acknowledge the excellent help of Marian Sektas and Bob Blumenthal in critically reading our manuscript.

Contributor Information

Aleksandra Wisniewska, Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland.

Ewa Wons, Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland.

Katarzyna Potrykus, Department of Bacterial Molecular Genetics, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland.

Rebecca Hinrichs, SYNMIKRO, LOEWE Center for Synthetic Microbiology, Philipps Universität Marburg, Germany; Department of Chemistry, Philipps Universität Marburg, Hans-Meerwein-Strasse 6, 35032 Marburg, Germany.

Katarzyna Gucwa, Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland.

Peter L Graumann, SYNMIKRO, LOEWE Center for Synthetic Microbiology, Philipps Universität Marburg, Germany; Department of Chemistry, Philipps Universität Marburg, Hans-Meerwein-Strasse 6, 35032 Marburg, Germany.

Iwona Mruk, Department of Microbiology, Faculty of Biology, University of Gdansk, Wita Stwosza 59, Gdansk 80-308, Poland.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

National Science Centre (Poland) [UMO-2019/35/B/NZ2/00701 to I.M.]; SMT studies were funded by a grant from the Deutsche Forschungsgemeinschaft and the state of Hessen (Research consortium MOSLA) (to P.L.G.). Funding for open access charge: National Science Centre (Poland) [UMO-2019/35/B/NZ2/00701].

Conflict of interest statement. None declared.

REFERENCES

  • 1. Martínez-Antonio A., Janga S.C., Thieffry D. Functional organisation of Escherichia coli transcriptional regulatory network. J. Mol. Biol. 2008; 381:238–247.
  • 2. van Hijum S.A., Medema M.H., Kuipers O.P. Mechanisms and evolution of control logic in prokaryotic transcriptional regulation. Microbiol. Mol. Biol. Rev. 2009; 73:481–509.
  • 3. Seshasayee A.S., Sivaraman K., Luscombe N.M. An overview of prokaryotic transcription factors: a summary of function and occurrence in bacterial genomes. Subcell. Biochem. 2011; 52:7–23.
  • 4. Todeschini A.L., Georges A., Veitia R.A. Transcription factors: specific DNA binding and specific gene regulation. Trends Genet. 2014; 30:211–219.
  • 5. Chen J., Boyaci H., Campbell E.A. Diverse and unified mechanisms of transcription initiation in bacteria. Nat. Rev. Microbiol. 2021; 19:95–109.
  • 6. Browning D.F., Busby S.J. The regulation of bacterial transcription initiation. Nat. Rev. Microbiol. 2004; 2:57–65.
  • 7. Martínez-Antonio A., Salgado H., Gama-Castro S., Gutiérrez-Ríos R.M., Jiménez-Jacinto V., Collado-Vides J. Environmental conditions and transcriptional regulation in Escherichia coli: a physiological integrative approach. Biotechnol. Bioeng. 2003; 84:743–749.
  • 8. Badis G., Berger M.F., Philippakis A.A., Talukder S., Gehrke A.R., Jaeger S.A., Chan E.T., Metzler G., Vedenko A., Chen X. et al. Diversity and complexity in DNA recognition by transcription factors. Science. 2009; 324:1720–1723.
  • 9. Wunderlich Z., Mirny L.A. Different gene regulation strategies revealed by analysis of binding motifs. Trends Genet. 2009; 25:434–440.
  • 10. Nagy-Staron A., Tomasek K., Caruso Carter C., Sonnleitner E., Kavčič B., Paixão T., Guet C.C. Local genetic context shapes the function of a gene regulatory network. eLife. 2021; 10:e65993.
  • 11. Rowland M.A., Abdelzaher A., Ghosh P., Mayo M.L. Crosstalk and the dynamical modularity of feed-forward loops in transcriptional regulatory networks. Biophys. J. 2017; 112:1539–1550.
  • 12. Baumstark R., Hänzelmann S., Tsuru S., Schaerli Y., Francesconi M., Mancuso F.M., Castelo R., Isalan M. The propagation of perturbations in rewired bacterial gene networks. Nat. Commun. 2015; 6:10105.
  • 13. Taylor T.B., Shepherd M.J., Jackson R.W., Silby M.W. Natural selection on crosstalk between gene regulatory networks facilitates bacterial adaptation to novel environments. Curr. Opin. Microbiol. 2022; 67:102140.
  • 14. Ishihama A. Prokaryotic genome regulation: multifactor promoters, multitarget regulators and hierarchic networks. FEMS Microbiol. Rev. 2010; 34:628–645.
  • 15. Lozada-Chávez I., Janga S.C., Collado-Vides J. Bacterial regulatory networks are extremely flexible in evolution. Nucleic Acids Res. 2006; 34:3434–3445.
  • 16. Andersson S.G., Zomorodipour A., Andersson J.O., Sicheritz-Pontén T., Alsmark U.C., Podowski R.M., Näslund A.K., Eriksson A.S., Winkler H.H., Kurland C.G. The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature. 1998; 396:133–140.
  • 17. Gao Y., Lim H.G., Verkler H., Szubin R., Quach D., Rodionova I., Chen K., Yurkovich J.T., Cho B.K., Palsson B.O. Unraveling the functions of uncharacterized transcription factors in Escherichia coli using ChIP-exo. Nucleic Acids Res. 2021; 49:9696–9710.
  • 18. Negri A., Jąkalski M., Szczuka A., Pryszcz L.P., Mruk I. Transcriptome analyses of cells carrying the type II Csp231I restriction-modification system reveal cross-talk between two unrelated transcription factors: C protein and the Rac prophage repressor. Nucleic Acids Res. 2019; 47:9542–9556.
  • 19. Rezulak M., Borsuk I., Mruk I. Natural C-independent expression of restriction endonuclease in a C protein-associated restriction-modification system. Nucleic Acids Res. 2016; 44:2646–2660.
  • 20. Bogdanova E., Djordjevic M., Papapanagiotou I., Heyduk T., Kneale G., Severinov K. Transcription regulation of the type II restriction-modification system AhdI. Nucleic Acids Res. 2008; 36:1429–1442.
  • 21. Bogdanova E., Zakharova M., Streeter S., Taylor J., Heyduk T., Kneale G., Severinov K. Transcription regulation of restriction-modification system Esp1396I. Nucleic Acids Res. 2009; 37:3354–3366.
  • 22. Cesnaviciene E., Mitkaite G., Stankevicius K., Janulaitis A., Lubys A. Esp1396I restriction-modification system: structural organization and mode of regulation. Nucleic Acids Res. 2003; 31:743–749.
  • 23. Mruk I., Rajesh P., Blumenthal R.M. Regulatory circuit based on autogenous activation-repression: roles of C-boxes and spacer sequences in control of the PvuII restriction-modification system. Nucleic Acids Res. 2007; 35:6935–6952.
  • 24. Kita K., Tsuda J., Nakai S.Y. C.EcoO109I, a regulatory protein for production of EcoO109I restriction endonuclease, specifically binds to and bends DNA upstream of its translational start site. Nucleic Acids Res. 2002; 30:3558–3565.
  • 25. Semenova E., Minakhin L., Bogdanova E., Nagornykh M., Vasilov A., Heyduk T., Solonin A., Zakharova M., Severinov K. Transcription regulation of the EcoRV restriction-modification system. Nucleic Acids Res. 2005; 33:6942–6951.
  • 26. Tao T., Bourne J.C., Blumenthal R.M. A family of regulatory genes associated with type II restriction-modification systems. J. Bacteriol. 1991; 173:1367–1375.
  • 27. Ives C.L., Sohail A., Brooks J.E. The regulatory C proteins from different restriction-modification systems can cross-complement. J. Bacteriol. 1995; 177:6313–6315.
  • 28. Negri A., Werbowy O., Wons E., Dersch S., Hinrichs R., Graumann P.L., Mruk I. Regulator-dependent temporal dynamics of a restriction-modification system's gene expression upon entering new host cells: single-cell and population studies. Nucleic Acids Res. 2021; 47:9542–9556.
  • 29. Mruk I., Blumenthal R.M. Real-time kinetics of restriction-modification gene expression after entry into a new host cell. Nucleic Acids Res. 2008; 36:2581–2593.
  • 30. Morozova N., Sabantsev A., Bogdanova E., Fedorova Y., Maikova A., Vedyaykin A., Rodic A., Djordjevic M., Khodorkovskii M., Severinov K. Temporal dynamics of methyltransferase and restriction endonuclease accumulation in individual cells after introducing a restriction-modification system. Nucleic Acids Res. 2016; 44:790–800.
  • 31. Werbowy O., Kaczorowski T. Plasmid pEC156, a naturally occurring Escherichia coli genetic element that carries genes of the EcoVIII restriction-modification system, is mobilizable among Enterobacteria. PLoS One. 2016; 11:e0148355.
  • 32. Sorek R., Zhu Y., Creevey C.J., Francino M.P., Bork P., Rubin E.M. Genome-wide experimental determination of barriers to horizontal gene transfer. Science. 2007; 318:1449–1452.
  • 33. Bindal G., Krishnamurthi R., Seshasayee A.S.N., Rath D. CRISPR-Cas-mediated gene silencing reveals RacR to be a negative regulator of YdaS and YdaT toxins in Escherichia coli K-12. mSphere. 2017; 2:e00483-17.
  • 34. Krishnamurthi R., Ghosh S., Khedkar S., Seshasayee A.S.N. Repression of YdaS toxin is mediated by transcriptional repressor RacR in the cryptic rac prophage of Escherichia coli K-12. mSphere. 2017; 2:e00392-17.
  • 35. Ptashne M. Principles of a switch. Nat. Chem. Biol. 2011; 7:484–487.
  • 36. Jurėnas D., Fraikin N., Goormaghtigh F., De Bruyn P., Vandervelde A., Zedek S., Jové T., Charlier D., Loris R., Van Melderen L. Bistable expression of a toxin-antitoxin system located in a cryptic prophage of Escherichia coli O157:H7. mBio. 2021; 12:e0294721.
  • 37. Guo Y., Quiroga C., Chen Q., McAnulty M.J., Benedik M.J., Wood T.K., Wang X. RalR (a DNase) and RalA (a small RNA) form a type I toxin-antitoxin system in Escherichia coli. Nucleic Acids Res. 2014; 42:6448–6462.
  • 38. Mruk I., Kaczorowski T. A rapid and efficient method for cloning genes of type II restriction-modification systems by use of a killer plasmid. Appl. Environ. Microbiol. 2007; 73:4286–4293.
  • 39. Guzman L.M., Belin D., Carson M.J., Beckwith J. Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J. Bacteriol. 1995; 177:4121–4130.
  • 40. Sambrook J., Fritsch E.F., Maniatis T. Molecular Cloning: A Laboratory Manual. 1986; 2nd edn. NY: Cold Spring Harbor Laboratory Press.
  • 41. Mruk I., Liu Y., Ge L., Kobayashi I. Antisense RNA associated with biological regulation of a restriction-modification system. Nucleic Acids Res. 2011; 39:5622–5632.
  • 42. Hernández-Tamayo R., Oviedo-Bocanegra L.M., Fritz G., Graumann P.L. Symmetric activity of DNA polymerases at and recruitment of exonuclease ExoR and of PolA to the Bacillus subtilis replication forks. Nucleic Acids Res. 2019; 47:8521–8536.
  • 43. Oviedo-Bocanegra L.M., Hinrichs R., Rotter D.A.O., Dersch S., Graumann P.L. Single molecule/particle tracking analysis program SMTracker 2.0 reveals different dynamics of proteins within the RNA degradosome complex in Bacillus subtilis. Nucleic Acids Res. 2021; 49:e112.
  • 44. Sorokin V., Severinov K., Gelfand M.S. Systematic prediction of control proteins and their DNA binding sites. Nucleic Acids Res. 2009; 37:441–451.
  • 45. Shevtsov M.B., Streeter S.D., Thresh S.J., Swiderska A., McGeehan J.E., Kneale G.G. Structural analysis of DNA binding by C.Csp231I, a member of a novel class of R-M controller proteins regulating gene expression. Acta Crystallogr. D Biol. Crystallogr. 2015; 71:398–407.
  • 46. Shultzaberger R.K., Chen Z., Lewis K.A., Schneider T.D. Anatomy of Escherichia coli sigma70 promoters. Nucleic Acids Res. 2007; 35:771–788.
  • 47. Crooks G.E., Hon G., Chandonia J.M., Brenner S.E. WebLogo: a sequence logo generator. Genome Res. 2004; 14:1188–1190.
  • 48. Stracy M., Schweizer J., Sherratt D.J., Kapanidis A.N., Uphoff S., Lesterlin C. Transient non-specific DNA binding dominates the target search of bacterial DNA-binding proteins. Mol. Cell. 2021; 81:1499–1514.
  • 49. Hammar P., Walldén M., Fange D., Persson F., Baltekin O., Ullman G., Leroy P., Elf J. Direct measurement of transcription factor dissociation excludes a simple operator occupancy model for gene regulation. Nat. Genet. 2014; 46:405–408.
  • 50. Waldo G.S., Standish B.M., Berendzen J., Terwilliger T.C. Rapid protein-folding assay using green fluorescent protein. Nat. Biotechnol. 1999; 17:691–695.
  • 51. Bouet J.Y., Funnell B.E. Plasmid localization and partition in Enterobacteriaceae. EcoSal Plus. 2019; 8: 10.1128/ecosalplus.ESP-0003-2019.
  • 52. Hu P., Janga S.C., Babu M., Díaz-Mejía J.J., Butland G., Yang W., Pogoutse O., Guo X., Phanse S., Wong P. et al. Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins. PLoS Biol. 2009; 7:e96.
  • 53. Gao Y., Yurkovich J.T., Seo S.W., Kabimoldayev I., Dräger A., Chen K., Sastry A.V., Fang X., Mih N., Yang L. et al. Systematic discovery of uncharacterized transcription factors in Escherichia coli K-12 MG1655. Nucleic Acids Res. 2018; 46:10682–10696.
  • 54. Pérez-Rueda E., Collado-Vides J. The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12. Nucleic Acids Res. 2000; 28:1838–1847.
  • 55. Fang X., Sastry A., Mih N., Kim D., Tan J., Yurkovich J.T., Lloyd C.J., Gao Y., Yang L., Palsson B.O. Global transcriptional regulatory network for Escherichia coli robustly connects gene expression to transcription factor activities. Proc. Natl. Acad. Sci. U.S.A. 2017; 114:10286–10291.
  • 56. Shimada T., Ogasawara H., Kobayashi I., Kobayashi N., Ishihama A. Single-target regulators constitute the minority group of transcription factors in Escherichia coli K-12. Front. Microbiol. 2021; 12:697803.
  • 57. Yan K.K., Fang G., Bhardwaj N., Alexander R.P., Gerstein M. Comparing genomes to computer operating systems in terms of the topology and evolution of their regulatory control networks. Proc. Natl. Acad. Sci. U.S.A. 2010; 107:9186–9191.
  • 58. Kroner G.M., Wolfe M.B., Freddolino P.L. Lrp regulates one-third of the genome via direct, cooperative, and indirect routes. J. Bacteriol. 2019; 201:e00411-18.
  • 59. Ishihama A., Shimada T., Yamazaki Y. Transcription profile of Escherichia coli: genomic SELEX search for regulatory targets of transcription factors. Nucleic Acids Res. 2016; 44:2058–2074.
  • 60. Ishihama A. Prokaryotic genome regulation: a revolutionary paradigm. Proc. Jpn. Acad. Ser. B Phys. Biol. Sci. 2012; 88:485–508.
  • 61. Baumgart L.A., Lee J.E., Salamov A., Dilworth D.J., Na H., Mingay M., Blow M.J., Zhang Y., Yoshinaga Y., Daum C.G. et al. Persistence and plasticity in bacterial gene regulation. Nat. Methods. 2021; 18:1499–1505.
  • 62. Suter D.M. Transcription factors and DNA play hide and seek. Trends Cell Biol. 2020; 30:491–500.
  • 63. de Jonge W.J., Patel H.P., Meeussen J.V.W., Lenstra T.L. Following the tracks: how transcription factor binding dynamics control transcription. Biophys. J. 2022; 121:1583–1592.
  • 64. Mruk I., Kobayashi I. To be or not to be: regulation of restriction-modification systems and other toxin-antitoxin systems. Nucleic Acids Res. 2014; 42:70–86.
  • 65. Stewart A.J., Plotkin J.B. The evolution of complex gene regulation by low-specificity binding sites. Proc. Biol. Sci. 2013; 280:20131313.
  • 66. Payne J.L., Wagner A. The robustness and evolvability of transcription factor binding sites. Science. 2014; 343:875–877.
  • 67. Mauro S.A., Pawlowski D., Koudelka G.B. The role of the minor groove substituents in indirect readout of DNA sequence by 434 repressor. J. Biol. Chem. 2003; 278:12955–12960.
  • 68. Koudelka G.B., Mauro S.A., Ciubotaru M. Indirect readout of DNA sequence by proteins: the roles of DNA sequence-dependent intrinsic and extrinsic forces. Prog. Nucleic Acid Res. Mol. Biol. 2006; 81:143–177.
  • 69. McGeehan J.E., Streeter S.D., Thresh S.J., Taylor J.E., Shevtsov M.B., Kneale G.G. Structural analysis of a novel class of R-M controller proteins: C.Csp231I from Citrobacter sp. RFL231. J. Mol. Biol. 2011; 409:177–188.
  • 70. Mruk I., Blumenthal R.M. Tuning the relative affinities for activating and repressing operators of a temporally regulated restriction-modification system. Nucleic Acids Res. 2009; 37:983–998.
  • 71. Klimuk E., Bogdanova E., Nagornykh M., Rodic A., Djordjevic M., Medvedeva S., Pavlova O., Severinov K. Controller protein of restriction-modification system Kpn2I affects transcription of its gene by acting as a transcription elongation roadblock. Nucleic Acids Res. 2018; 46:10810–10826.
  • 72. Friedlander T., Prizak R., Guet C.C., Barton N.H., Tkačik G. Intrinsic limits to gene regulation by global crosstalk. Nat. Commun. 2016; 7:12307.
  • 73. Wagner A. Adaptive gene misregulation. Genetics. 2021; 217:iyaa044.
  • 74. Fisunov G.Y., Evsyutina D.V., Manuvera V.A., Govorun V.M. Binding site of restriction-modification system controller protein in Mollicutes. BMC Microbiol. 2017; 17:26.
  • 75. Bhardwaj N., Kim P.M., Gerstein M.B. Rewiring of transcriptional regulatory networks: hierarchy, rather than connectivity, better reflects the importance of regulators. Sci. Signal. 2010; 3:ra79.
  • 76. San Millan A., Toll-Riera M., Qi Q., MacLean R.C. Interactions between horizontally acquired genes create a fitness cost in Pseudomonas aeruginosa. Nat. Commun. 2015; 6:6845.
  • 77. Miyazaki R., Yano H., Sentchilo V., van der Meer J.R. Physiological and transcriptome changes induced by Pseudomonas putida acquisition of an integrative and conjugative element. Sci. Rep. 2018; 8:5550.
  • 78. Harrison E., Guymer D., Spiers A.J., Paterson S., Brockhurst M.A. Parallel compensatory evolution stabilizes plasmids across the parasitism-mutualism continuum. Curr. Biol. 2015; 25:2034–2039.
  • 79. Dorman C.J. Regulatory integration of horizontally-transferred genes in bacteria. Front. Biosci. (Landmark Ed.). 2009; 14:4103–4112.
  • 80. Lercher M.J., Pál C. Integration of horizontally transferred genes into regulatory interaction networks takes many million years. Mol. Biol. Evol. 2008; 25:559–567.
  • 81. Davids W., Zhang Z. The impact of horizontal gene transfer in shaping operons and protein interaction networks–direct evidence of preferential attachment. BMC Evol. Biol. 2008; 8:23.
  • 82. Casjens S. Prophages and bacterial genomics: what have we learned so far. Mol. Microbiol. 2003; 49:277–300.
  • 83. Ghatak S., King Z.A., Sastry A., Palsson B.O. The y-ome defines the 35% of Escherichia coli genes that lack experimental evidence of function. Nucleic Acids Res. 2019; 47:2446–2454.
  • 84. Baba T., Ara T., Hasegawa M., Takai Y., Okumura Y., Baba M., Datsenko K.A., Tomita M., Wanner B.L., Mori H. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2006; 2:2006.0008.
  • 85. Jobling M.G. Ectopic expression of the ydaS and ydaT genes of the cryptic prophage Rac of Escherichia coli K-12 may be toxic but do they really encode toxins?: a case for using genetic context to understand function. mSphere. 2018; 3:e00163.
  • 86. Balakin A.G., Skripkin E.A., Shatsky I.N., Bogdanov A.A. Unusual ribosome binding properties of mRNA encoding bacteriophage lambda repressor. Nucleic Acids Res. 1992; 20:563–571.


Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
