Abstract
Bacterial nucleoid-associated proteins play important roles in chromosome organization and global gene regulation. We find that Lsr2 of Mycobacterium tuberculosis is a unique nucleoid-associated protein that binds AT-rich regions of the genome, including genomic islands acquired by horizontal gene transfer and regions encoding major virulence factors, such as the ESX secretion systems, the lipid virulence factors PDIM and PGL, and the PE/PPE families of antigenic proteins. Comparison of genome-wide binding data with expression data indicates that Lsr2 binding results in transcriptional repression. Domain-swapping experiments demonstrate that Lsr2 has an N-terminal dimerization domain and a C-terminal DNA-binding domain. Nuclear magnetic resonance analysis of the DNA-binding domain of Lsr2 and its interaction with DNA reveals a unique structure and a unique mechanism that enables Lsr2 to discriminately target AT-rich sequences through interactions with the minor groove of DNA. Taken together, we provide evidence that mycobacteria have employed a structurally distinct molecule with an apparently different DNA recognition mechanism to achieve a function similar to the Enterobacteriaceae H-NS, likely coordinating global gene regulation and virulence in this group of medically important bacteria.
Keywords: H-NS, virulence
The bacterial chromosome is organized into a compact structure composed of topologically independent loops in part as a consequence of interactions with nucleoid-associated proteins (NAPs) (1). H-NS, one of the most abundant NAPs in the Enterobacteriaceae (2), is thought to stabilize the loops by forming long patches that bridge distant DNA to form loop structures of variable size (3). In addition to its role in chromatin organization and compaction, H-NS is thought to play a role in global gene regulation. H-NS exists essentially as a homodimer and binds DNA nonspecifically, but has a preference for AT-rich or curved DNA (4, 5). High-affinity binding sites have also been reported, and the current paradigm is that they serve as initiation sites for nucleation of H-NS to form higher-order nucleoprotein structures (6). H-NS is responsible for binding and repressing >400 genes in Salmonella (4, 7) and in Escherichia coli (8, 9), many of which are DNA sequences obtained through horizontal gene transfer and involved in adaptive stress responses and virulence (10). Numerous phenotypes associated with hns mutations have been described, and the effects of H-NS on gene expression are largely inhibitory (2, 11), which is partially explained by the ability of H-NS to bridge adjacent helices of DNA (12, 13), causing either the trapping or the occlusion of RNA polymerase in the promoter regions (2, 14). H-NS homologs are widespread in the Gram-negative α-, β-, and γ-proteobacteria but have not been identified in Gram-positive bacteria or in any other groups of bacteria, leaving it unclear how these bacterial species regulate the genes that they obtain through genetic exchange.
Lsr2 is a small, basic protein that is highly conserved in mycobacteria and related actinomycetes (15). Previous studies showed that Lsr2 is involved in several cellular processes including cell-wall lipid biosynthesis (16, 17) and antibiotic resistance (18). We recently demonstrated by in vitro biochemical experiments that, like H-NS, Lsr2 of Mycobacterium tuberculosis (M. tb) binds DNA in a sequence-independent manner and is capable of bridging distant DNA segments (19). Moreover, we showed through in vivo complementation assays that Lsr2 is a functional analog of H-NS—specifically, that lsr2 fully complements independent phenotypes associated with hns mutations in E. coli (15). These results suggest that Lsr2 may play a role in M. tb that is equivalent to that of H-NS. However, the in vivo binding sites of Lsr2 in the mycobacterial genomes and the scope of the biological function of Lsr2 remain unknown. In addition, because Lsr2 and H-NS share little sequence homology, the molecular basis for their functional similarity is not understood. In this study, we have addressed these questions by performing a high-resolution, genome-wide mapping of Lsr2 binding sites and by determining the tertiary structure of the Lsr2 DNA-binding domain.
Results
Lsr2 Binds AT-Rich Sequences in Mycobacterial Genomes.
To gain more insight into the biological function of Lsr2, we mapped the Lsr2 binding sites in two mycobacterial species (M. tb and Mycobacterium smegmatis) by performing chromatin immunoprecipitation (ChIP) of in vivo cross-linked Lsr2–DNA complexes, followed by microarray (ChIP-chip) analysis on a 244,000-feature oligonucleotide tiling array. Lsr2 coprecipitated with 21% of the M. tb genome (840 of 4009 protein-encoding ORFs; Dataset S1) and 13% of the genome in M. smegmatis (904 of 6716 ORFs; Dataset S2). Like H-NS (4, 7), Lsr2 binds genome regions with low GC content, irrespective of the position relative to the ORF (Fig. 1 A and B). Lsr2 preferentially bound regions with a GC content of ∼47% or less, compared to the genome average GC content of 65–67% of these species.
Fig. 1.
Lsr2 binds preferentially to high-AT-content regions of the genome, including horizontally transferred genes and virulence-associated genes in M. tb. Results of ChIP-chip analysis of M. tb (A) or M. smegmatis (B) are organized according to GC content. Individual box plots were generated for the Lsr2 log2-binding ratios encompassing all probes (60 bp each) from the microarray at each specific GC-content value. The band near the middle of the box is the 50th percentile (the median), and the bottom and top of the box are the 25th and 75th percentiles, respectively. The whiskers represent the minimum and maximum of all of the data. (C–E) Lsr2-binding peaks (black) in the M. tb genome measured by ChIP-chip for (C) Rv0986-Rv0989c and Rv2336-Rv2339, which are horizontally acquired genes in M. tb (20) that exhibit lower-than-average GC content. The normalized GC content is calculated by subtracting the median GC content (65.6%) across the genome; (D) the ESX-1 (Rv3864-Rv3883c) and ESX-2 (Rv3884c-Rv3895c) regions, with effector genes esxA, B, C, and D highlighted (red); and (E) PDIM/PGL locus (tesA-Rv2962c), with genes bound by Lsr2 highlighted (red).
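The organization of the box plots described in the legend can be illustrated with a minimal sketch. It assumes each probe is available as a (sequence, log2 ratio) pair; the function names and the toy probes are hypothetical and are not taken from the authors' analysis pipeline.

```python
from collections import defaultdict
from statistics import median


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def summarize_by_gc(probes):
    """Group log2 binding ratios by rounded probe GC content.

    `probes` is an iterable of (sequence, log2_ratio) pairs.
    Returns {gc_bin: (minimum, median, maximum)} ordered by GC bin.
    """
    bins = defaultdict(list)
    for seq, log2_ratio in probes:
        bins[round(gc_percent(seq))].append(log2_ratio)
    return {gc: (min(v), median(v), max(v)) for gc, v in sorted(bins.items())}


# Hypothetical toy probes (4 bp rather than 60 bp), for illustration only.
toy = [("ATAT", 2.0), ("ATTA", 1.0), ("GCGC", -0.5), ("ATGC", 0.1)]
summary = summarize_by_gc(toy)
```

Each entry of `summary` holds the whisker ends and the median for one GC-content value; the 25th and 75th percentiles that form the box would be computed per bin in the same way.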
As foreign DNA acquired by horizontal gene transfer often displays GC content different from the rest of the genome (21), we investigated whether Lsr2 has a preference for foreign genes. Of the 76 potential horizontally acquired genes in the M. tb complex (genes common to M. tb and M. bovis but absent from other mycobacterial species) (20), Lsr2 bound 44 (57.9%) (Dataset S1), including the well-studied Rv0986-Rv0989c region (22) (Fig. 1C); this enrichment is statistically significant (hypergeometric P-value, 1.1e−12). The average GC content of these 44 genes is significantly lower than that of the remaining 32 genes not bound by Lsr2 (P < 0.001) (Fig. S1). The 76 horizontally transferred genes are distributed in 48 genomic islands (20), and Lsr2 binds one or more genes in 34 of these islands (70.8%); in total, it binds 101 of the 267 genes in the 48 islands (37.8%) (Dataset S1). Together, these results suggest that Lsr2 has a binding preference for horizontally acquired AT-rich DNA.
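The hypergeometric test used here can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the background is taken to be all 4009 protein-encoding ORFs, of which 840 are bound, and the function name is hypothetical. The value obtained may therefore differ from the reported P-value if a different background was used.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    # Exact integer arithmetic first, then a single conversion to float.
    return float(Fraction(tail, total))


# Counts quoted in the text; the 4009-ORF background is an assumption.
p_hgt = hypergeom_upper_tail(population=4009, successes=840, draws=76, observed=44)
```

The same function applies to any overlap question of this form, for example the number of genes bound in both species among a shared set of orthologs.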
Comparing the genome-wide binding data to previously published expression microarray data for the M. smegmatis lsr2 mutant (18), we found that Lsr2 bound to 26 of the 41 (63%) genes displaying an equal or greater than twofold increase in expression levels in the mutant (Dataset S2). Conversely, none of the 18 genes found to have an equal or greater than twofold decrease in expression levels in the mutant were bound by Lsr2, indicating that, like H-NS, Lsr2 is a pleiotropic factor that has a negative effect on gene expression. Consistent with this hypothesis, Lsr2 was previously shown to bind the promoter sequence of the mps operon and repress its expression, preventing the production of glycopeptidolipids in M. smegmatis (17, 19). The reasons for the relatively low degree of correlation between previously reported expression data (18) and the genes targeted by Lsr2 identified in this study (Dataset S2) are currently unknown. A comparison of ChIP data with the expression microarray data in M. tb was complicated by the fact that Lsr2 is suggested to be essential in M. tb; attempts by independent groups to delete lsr2 were unsuccessful (18, 23).
Lsr2 Binds Genes Involved in Virulence and Immunogenicity of M. tb.
A total of 2557 gene orthologs were identified between M. tb and M. smegmatis on the basis of bidirectional best hits in a reciprocal Blastp analysis. From these orthologs, 401 and 272 genes were bound by Lsr2 in M. tb and M. smegmatis, respectively, and 110 genes were bound by Lsr2 in both species (Dataset S3). The overlap is statistically significant with the hypergeometric P-value of 4.13e−26.
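The bidirectional-best-hit criterion can be sketched as below. The sketch assumes that the best Blastp hit of every gene has already been extracted in each direction; the function name and the gene identifiers are placeholders, not results from this study.

```python
def bidirectional_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted pairs (a, b) where a's best hit is b and b's best hit is a."""
    return sorted(
        (a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a
    )


# Hypothetical best-hit tables (placeholder gene IDs, not real data).
mtb_to_msmeg = {"geneA": "geneX", "geneB": "geneY", "geneC": "geneY"}
msmeg_to_mtb = {"geneX": "geneA", "geneY": "geneC"}
orthologs = bidirectional_best_hits(mtb_to_msmeg, msmeg_to_mtb)
```

In this toy case `geneB` is excluded because its best hit, `geneY`, points back to `geneC` instead; only reciprocal pairs are retained as orthologs.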
A survey of the annotated functional categories of M. tb genes (24) reveals that Lsr2 binds a number of genes involved in intermediary metabolism/respiration and in cell-wall/cell processes, which are proportional to their distribution in the genome (Table S1). Lsr2 binds genes involved in energy metabolism (ATP synthase atpB, -E, -F, and -H), aerobic respiration (cytochrome C oxidase subunits ctaC, -D, -E), cell-wall peptidoglycan synthesis (dacB2, murA, murI, ponA1, ponA2), and mycolic acid synthesis (fabG1, fbpC, pcaA, umaA) (Dataset S1). Genes involved in chromosomal DNA replication (dnaA, recF, dnaE1, nrdE), transcription (sigA, rho), and protein synthesis (rrf, rrs, rrl, rplI, N, O, X, infC, rpmB2, rpmH, rpsT) are also bound by Lsr2. In addition, Lsr2 binds genes involved in stress response (ahpC, cspA, dnaJ1, sodA) and many regulatory genes including two-component systems (mprB, trcR, senX3) and the whiB family of transcription factors (whiB1–4).
ChIP-chip data also reveal that Lsr2 binds several virulence-associated genes in M. tb. The M. tb genome contains gene clusters encoding five ESX-family type VII protein-secretion systems (25–27). ESX-1 is essential for M. tb virulence (25), and a role of ESX-5 in virulence has been demonstrated in M. marinum (28). Lsr2 binds multiple genes in four of the five ESX regions, all except ESX-4 (Fig. 1D and Dataset S1). A second locus involved in ESX-1 secretion (Rv3614c-espA) (29) is also a target of Lsr2 (Dataset S1). The equivalent ESX-1 region in M. smegmatis (MSMEG_0055-MSMEG_0083) that is involved in DNA transfer (30, 31) is also bound by Lsr2 (Datasets S2 and S3). Other well-established virulence factors include the cell-wall lipid phthiocerol dimycocerosates (PDIMs) (32, 33) and their closely related phenolic glycolipids (PGLs) (34). Lsr2 binds multiple genes at the biosynthetic locus of PDIM/PGL (35) (Fig. 1E).
Approximately 10% of the coding capacity of the M. tb genome encodes two large families of proteins: the acidic asparagine- or glycine-rich proteins, referred to as PE (n = 99) and PPE (n = 69) proteins (24). Remarkably, Lsr2 binds over half of the total PE/PPE genes (89 of 168), which is more than double the number expected by chance (Table S1). Lsr2 binds overwhelmingly to PPE genes (54 of 69) and PE genes that do not belong to the PE_PGRS subclass (22/36). By contrast, Lsr2 binds fewer PE_PGRS genes (13/63), which is likely explained by the high glycine content (up to 50%) of PE_PGRS proteins that makes their corresponding genes GC-rich. PE/PPE proteins are surface-exposed antigens characterized by extensive repetitive homologous sequences (36, 37). Considerable sequence polymorphism has been reported for members of the PE/PPE proteins among the M. tb complex and clinical strains of M. tb (38). As such, PE/PPE proteins are thought to represent a source of antigenic variation (24, 39). The binding of Lsr2 to the majority of PE/PPE genes suggests that this factor may negatively affect the expression of these antigenic proteins to modulate interactions with the host.
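The "more than double" comparison can be checked with a short calculation. It assumes that the chance expectation is derived from the genome-wide bound fraction (840 of 4009 ORFs); Table S1 may have been computed differently, and the variable names are illustrative only.

```python
def expected_bound(n_family, n_bound_genome, n_genome):
    """Number of family members expected to be bound if binding were random."""
    return n_family * n_bound_genome / n_genome


# Counts quoted in the text; the genome-wide background is an assumption.
expected = expected_bound(n_family=168, n_bound_genome=840, n_genome=4009)
fold = 89 / expected

# Fraction bound within each subclass, from the counts given in the text.
bound_fraction = {
    "PPE": 54 / 69,
    "PE (non-PGRS)": 22 / 36,
    "PE_PGRS": 13 / 63,
}
```

Under this assumption roughly 35 PE/PPE genes would be bound by chance, so the 89 observed correspond to about a 2.5-fold enrichment, whereas the PE_PGRS subclass is bound at close to the genome-wide rate.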
Lsr2 Has an N-Terminal Dimerization Domain and a C-Terminal DNA-Binding Domain.
The functional similarities between Lsr2 and H-NS could not be predicted from their primary sequence because Lsr2 exhibits <20% sequence identity with H-NS and has a different predicted secondary structure (15). H-NS has two functional domains: an N-terminal dimerization domain and a C-terminal DNA-binding domain connected by an unstructured linker (2). Less is known about Lsr2, but it has been shown to form dimers in vivo (16). To map the functional domains of Lsr2, six chimeric protein constructs containing different combinations of the N- and C-terminal sequences of Lsr2 and H-NS were generated, and their ability to complement phenotypes associated with hns mutations in E. coli and lsr2 mutations in M. smegmatis was tested (15). Three constructs (C1, C4, and C6) could complement several phenotypes in hns mutants but were ineffective at complementing an lsr2 mutation in M. smegmatis (Table 1). One construct (C5) containing the N-terminal region (residues 1–65) of Lsr2 and the C-terminal region of H-NS (residues 77–137) complemented all phenotypes associated with hns and lsr2 mutations (Table 1). To determine whether repression by the chimeric proteins was specific to H-NS-regulated genes, the binding specificity of HA-tagged chimeric proteins in E. coli was analyzed by ChIP followed by quantitative real-time PCR. C1, C5, and C6 coprecipitated with known H-NS-binding sites (proV, bglG, yjcF, and xapR) with significantly higher levels of enrichment than the sites known not to interact with H-NS (narZ and phnE) (8) (Fig. 2A). These results indicate that Lsr2 contains an N-terminal dimerization domain between residues 1–65 and a C-terminal DNA-binding domain within residues 51–112. Moreover, the purified N-domain protein (residues 1–65) formed dimers but failed to bind DNA, whereas the purified C-domain (51–112) bound DNA but was deficient in dimer formation (Fig. 2B–D). 
Thus, Lsr2 and H-NS share a similar overall domain organization, which provides a molecular explanation for their functional similarity.
Table 1.
Summary of in vivo complementation experiments of chimeric proteins of Lsr2 and H-NS
Chimeric protein* | Composition | E. coli hns mutant: mucoidy | E. coli hns mutant: motility | E. coli hns mutant: salicin metabolism | E. coli hns mutant: hemolytic activity | M. smegmatis lsr2 mutant: colony morphology |
C1 | H-NS (aa 1–89)-Lsr2 (aa 51–112) | + | + | + | + | − |
C2 | Lsr2 (aa 1–50)-H-NS (aa 77–137) | − | + | − | − | − |
C3 | Lsr2 (aa 1–50)-H-NS (aa 65–137) | − | + | − | − | − |
C4 | H-NS (aa 1–89)-Lsr2 (aa 74–112) | − | + | + | + | − |
C5 | Lsr2 (aa 1–65)-H-NS (aa 77–137) | + | + | + | + | + |
C6 | Lsr2 (aa 1–65)-H-NS (aa 65–137) | + | + | + | + | − |
+, phenotype complemented; −, not complemented.
*Expression of the chimeric proteins in E. coli and M. smegmatis was confirmed by Western blot analysis (Fig. S1B).
Fig. 2.
Structure–function analysis of Lsr2. (A) Binding of Lsr2-H-NS chimeric proteins to E. coli genes. The binding of chimeric proteins (C1–C6; see text for description) to H-NS-regulated genes (proV, bglG, yjcF, and xapR) and genes not regulated by H-NS (narZ and phnE) was assayed by ChIP and RT–PCR. (B and C) Cross-linking of purified Lsr2 proteins. Glutaraldehyde (1%) was added to purified protein (6 μg/sample). Aliquots were removed at the indicated time points (min) and analyzed by Western blotting with an anti-His antibody. (B) N-terminal domain Lsr2 protein (residues 1–65). (C) C-terminal domain Lsr2 protein (residues 51–112). (D) Electrophoretic mobility shift assays. DNA fragment P104lmo (50 ng) was incubated with the indicated amounts (μg) of N- and C-terminal Lsr2 proteins and analyzed on 4% polyacrylamide gel.
DNA-Binding Domain of Lsr2 Exhibits a Unique Structure.
The structure of the C-domain (Lsr2C, residues 66–112) was solved by nuclear magnetic resonance (NMR) methods. The resonance assignments and structural statistics are shown in Fig. S2 and Table S2, respectively. The domain consists of two α-helices (α1, residues 78–89; α2, residues 102–112) linked by a long loop (residues 90–101) (Fig. 3 A and B). The first nine residues (66–74) are completely flexible. The two helices are perpendicular to and packed against each other through hydrophobic interactions among residues Ile83, Arg84, Ala87, Val94, Ile100, and Val104 and aromatic stacking between Trp86 and Tyr108. The structure of Lsr2C is unique and distinct from that of H-NS, which consists of two short β-strands and an α-helix linked by a loop (40, 41). A protein structure database search using DALI did not identify any structure with a z-score over 3.0.
Fig. 3.
Solution structure of Lsr2C and interactions between Lsr2C and AT-rich DNA. (A) Superimposition of backbone traces for the ensemble of 20 structures of Lsr2C. (B) Ribbon representation of the Lsr2C mean structure; the flexible N-terminal region (residues 66–74) is not included. Side-chains of residues that form the hydrophobic core are shown in red, except those of W86 and Y108, which are shown in orange. (C) Overlay of 2D 1H-15N HSQC spectra of free (blue) and DNA-bound Lsr2C (red; concentration ratio 1:1). Residues showing significant NH chemical shift changes are labeled. The side-chain NH peak of Arg is labeled “Rsc.” NH peaks of residues labeled in green disappeared upon DNA binding. (D) Overlay of the fingerprint region showing intraresidue 1H′–H6/H8 NOE peaks of 2D 1H NOESY spectra of free (blue) and Lsr2C-bound DNA (red; concentration ratio 1:1). Intraresidue 1H′–H6/H8 NOE peaks of free DNA are labeled by base type and number; DNA residues undergoing obvious chemical shift changes are indicated in green. The sequence and secondary structure of the 27-mer DNA are shown above, with residues affected by Lsr2C binding indicated in green. (E) Combined 1H and 15N chemical shift differences ($\Delta\delta_{\mathrm{comb}} = [\Delta\delta_{\mathrm{HN}}^{2} + (\Delta\delta_{\mathrm{N}}/6.5)^{2}]^{1/2}$) between free and DNA-bound Lsr2C are plotted against residue number (blue bar). Residues with a missing NH peak in the DNA-bound state are indicated by red bars. (F) 1H chemical shift differences (Δδ) for 1H′ (blue bar) and H6/H8 (yellow bar) between free and Lsr2C-bound DNA are plotted against residue number.
The DNA-binding site of Lsr2C was mapped by NMR titration experiments using a 27-mer DNA containing 9 consecutive A-T base pairs (42) (Fig. 3D). Comparison of 2D 1H-15N HSQC spectra of Lsr2C, free and in complex with DNA, reveals that significant chemical shift changes (Δδcomb > 0.25 ppm) occur at residues Gly73, Ala74, Ser80–Glu85, Ser95, Ile100, Ala102, and Asp103 (Fig. 3 C and E). NH signals of residues Thr96–Arg99 gradually disappeared as DNA concentration increased, presumably due to intermediate exchange on the NMR time scale. These residues, constituting the DNA-binding sites, are located mainly on the α1 helix and the nearby linker loop. Comparison of 2D 1H NOESY spectra of the DNA, free or in complex with Lsr2C, reveals that residues with significant intraresidue 1H′-H6/H8 NOE peak shift (1H′ or H6/H8 Δδ > 0.025 ppm) are T3, A5-T9, T18, and A21–T24 (Fig. 3 D and F and Fig. S3).
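The combined chemical shift difference defined in the Fig. 3 legend, together with the 0.25-ppm cutoff used above, can be expressed as a short sketch. The function names and the example shift values are hypothetical and do not reproduce the measured data.

```python
from math import sqrt


def combined_shift(delta_hn, delta_n, n_scale=6.5):
    """Combined 1H/15N shift difference (ppm): sqrt(dHN^2 + (dN / n_scale)^2)."""
    return sqrt(delta_hn ** 2 + (delta_n / n_scale) ** 2)


def significant_residues(shifts, threshold=0.25):
    """Residues whose combined shift difference exceeds `threshold` (ppm).

    `shifts` maps residue label -> (delta_hn_ppm, delta_n_ppm).
    """
    return [
        residue
        for residue, (d_hn, d_n) in shifts.items()
        if combined_shift(d_hn, d_n) > threshold
    ]


# Hypothetical shift differences for two placeholder residues.
example = {"resA": (0.30, 0.65), "resB": (0.02, 0.13)}
perturbed = significant_residues(example)
```

Dividing the 15N difference by 6.5 places it on a scale comparable to the 1H difference, so that neither nucleus dominates the combined value.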
On the basis of the mapped binding interfaces, a structure model for the Lsr2C/DNA complex was calculated using HADDOCK 2.0 (43), and the result revealed that Lsr2 can bind DNA in two orientations by grabbing either edge of the minor groove like a clamp (Fig. 4). In one orientation, Lsr2 clamps the A5–T9 region of one edge, whereas it occupies the A21–A25 region of another edge in the other orientation (Fig. S4). There are two major components of Lsr2 involved in DNA binding. Residues Arg97-Gly98-Arg99 of the linker loop adopt an extended conformation and are inserted into and oriented parallel to the minor groove, and the two Arg side-chains point away from each other and occupy a region covering five A-T base pairs. This resembles the central Arg-Gly-Arg core conformation of the DNA-binding AT-hook motif (i.e., Pro-Arg-Gly-Arg-Pro, flanked by positively charged residues) of the mammalian nonhistone chromatin protein HMGA (44) (Fig. 4A). HMGA proteins bind AT-rich DNA and are involved in regulation of chromatin structure and gene expression (44). The conformation of the Arg-Gly-Arg core in the AT-hook motif and the nature of its interactions with bases of A-T pairs are the main determinants for HMGA binding to the minor groove of AT-rich DNA (44). Therefore, the presence of an AT-hook Arg-Gly-Arg core-like conformation in the Lsr2 structure provides a molecular explanation for the preferential binding of Lsr2 to AT-rich DNA and is consistent with the previous finding that poly(dI-dC), with hydrogen bond patterns in minor grooves identical to that of AT tracts, is preferentially bound by Lsr2 (19).
Fig. 4.
Structure model of Lsr2C/DNA complex generated by HADDOCK 2.0. (A) Ribbon representation of Lsr2C/DNA complex in which Lsr2C clamps the A21–A25 edge of DNA minor groove. Residues of Lsr2C involved in DNA binding are in pink, and residues of DNA affected by protein binding are in yellow. Side-chains of Arg77, Ser80, Arg84, Ser95, Arg97, and Arg99 are in red. An AT-hook motif of HMG-I (Protein Data Bank code: 2ezd) is superimposed onto the Lsr2C linker loop with the backbone of residues 7–15 and two Arg side-chains shown in blue. (B) A different view of the same complex. The electrostatic potential surface of Lsr2C is shown, and covalent bonds between heavy atoms are illustrated for DNA. Blue: positively charged residues; red: negatively charged residues; gray: uncharged residues.
Outside of the minor groove, side-chains of Arg77, Ser80, Arg84, and Ser95 interact with the sugar–phosphate backbone on either edge of the minor groove. These additional interactions increase the binding affinity of Lsr2 for DNA, which is in the micromolar range on the basis of NMR titration experiments, whereas that of short AT-hook peptides (e.g., Pro-Arg-Gly-Arg-Pro) is in the millimolar range (45). As a result, Arg77 and Arg84, along with Arg97-Gly98-Arg99 of the linker loop, form a positively charged cleft and act like a clamp to grab one strand of DNA (Fig. 4B), representing a unique mechanism of DNA recognition. These residues, particularly Arg84 and Arg97-Gly98-Arg99, are highly conserved among Lsr2 homologs (15).
Discussion
In this study, we provide evidence that mycobacteria have employed a structurally distinct molecule (Lsr2) with an apparently different DNA-recognition mechanism to achieve an H-NS equivalent function. Lsr2 is similar in function to H-NS in several aspects, including its ability to target AT-rich sequences and silence gene expression by trapping or occluding RNA polymerase at the promoter region via DNA looping or bridging activity (2, 19). As a consequence of their binding preference, both proteins have a predilection for xenogeneic (foreign-derived) genetic material, including multiple loci important for virulence. Our structural analysis suggests that Lsr2 targets AT-rich DNA through interactions with the minor groove of DNA. There are conflicting data concerning the DNA recognition mechanism of H-NS. Although previous biochemical studies indicate that H-NS binds DNA through interactions with the major groove (46, 47), a recent NMR analysis of interactions between the DNA-binding domain of H-NS and an H-NS high-affinity, AT-rich sequence suggests that H-NS binds to the minor groove of DNA (48). However, H-NS binds DNA via an electropositive surface formed by four residues (Thr109, Arg113, Thr114, and Ala116) of the linker loop (48), which appears to be different from the DNA recognition mechanism of Lsr2. Therefore, although H-NS and Lsr2 share a similar overall domain organization, the two proteins are distinct from one another in both tertiary structure and DNA-recognition mechanism. As such, Lsr2 represents a previously undescribed class of bacterial nucleoid-associated proteins. The fact that phylogenetically distant bacteria (Gram-negative and high-GC Gram-positive bacteria) employ distinct molecules to achieve similar functions appears to be an example of convergent evolution at the molecular level.
Our ChIP-on-chip analysis using high-resolution oligonucleotide tiling arrays revealed a correlation between the percentage of GC content and Lsr2 binding. Lsr2 preferentially targets genome regions with a GC content of ∼47% or less, which is essentially the same as that of H-NS-repressed genes; the average GC content of an H-NS-repressed ORF in Salmonella is 46.8% (4, 7). This provides an explanation for our previous finding that Lsr2 fully complements independent phenotypes associated with hns mutations in E. coli and that H-NS is capable of complementing an lsr2 mutation in M. smegmatis (15). Similar to H-NS (4, 7), Lsr2 binding is not restricted to promoter regions, and many of the Lsr2-binding sites are within coding regions, suggesting that, like H-NS (49), Lsr2 can silence gene expression by polymerizing along DNA and bridging adjacent helices, a biochemical property that has previously been demonstrated for Lsr2 (19). The functional similarity of Lsr2 and H-NS is further explained by our finding that Lsr2 and H-NS share equivalent domain composition and that these domains can be swapped to generate functional chimeric molecules despite the fact that these two proteins share <20% identity in sequences and exhibit different tertiary structures (15).
The DNA-binding domain of Lsr2 exhibits a unique structure and appears to have some attributes of the eukaryotic chromatin protein HMGA. HMGA proteins are required for the assembly of higher-order transcription enhancer complexes critical for the transcriptional activation of a number of important genes involved in diverse cellular processes (50). Each HMGA protein possesses a set of three AT hooks, which are unstructured while free in solution. However, each assumes a planar, crescent-shaped conformation when bound to DNA, which is dictated by the shape of a narrow minor groove of the AT-rich DNA substrate (44). As such, HMGA proteins recognize substrate structure rather than nucleotide sequence, and the presence of multiple AT-hook peptides is necessary to confer high-affinity binding (44). Our finding that Lsr2 contains an AT-hook-like motif with a similar conformation in its DNA-binding domain (Fig. 4A) provides a molecular mechanism to explain the preferential binding of Lsr2 to AT-rich DNA sequences. Unlike HMGA proteins, however, the linker loop containing the Arg-Gly-Arg motif in Lsr2 is relatively ordered in structure even in the absence of DNA, which is likely mediated by hydrophobic interactions among residues of the hydrophobic core (Fig. 3B). In addition, the affinity of Lsr2 is predicted to be achieved by additional interactions provided by residues on α-helices, rather than by the presence of multiple AT-hook peptides as shown in HMGA (44).
Our results suggest that Lsr2 plays a role in M. tb equivalent to that of H-NS in Gram-negative bacteria. Lsr2 likely plays a role in silencing laterally acquired foreign genes and/or in coordinating global gene regulation. Although many H-NS repressed genes are regulated by environmental conditions such as pH, osmolarity, and growth temperature, expression of H-NS appears to be relatively constant under a range of environmental conditions (51, 52), suggesting that gene regulation by H-NS is not mediated by a change in the protein abundance. It has been postulated that H-NS may undergo structural and functional alteration under these environmental conditions (2, 11); however, the correlation between the observed structural changes with H-NS-dependent gene expression has been less than perfect (10, 11). Instead, it was recently proposed that the primary role of H-NS is to silence AT-rich DNA, presumably as a mechanism of defense against foreign sequences (10). In E. coli and Salmonella, silencing is the “default” state, and the repression is relieved only when H-NS is displaced from the gene by some alterations in nucleoid structure or competition by a binding factor (10). Within this context, it is not surprising that an “H-NS-like” molecule exists in mycobacteria and related actinomycetes, which are characterized by the high GC content of their genomes (∼70%). Like H-NS, the primary function of Lsr2 could be to exploit the high GC bias of the mycobacterial genome to silence horizontally acquired AT-rich sequences. Unlike H-NS, however, expression of Lsr2 is not constitutive but rather is up-regulated under a number of environmental stress conditions, including nutrient starvation, long-term hypoxia, and antibiotic exposures (18, 53–56), conditions thought to be encountered by M. tb during latent infection in the host (53, 54). 
Our ChIP-chip data show that Lsr2 binds genes involved not only in virulence and antigenicity, but also in essential cellular processes such as DNA replication, transcription, and protein synthesis. Thus, it is conceivable that Lsr2 could function as a master regulator of M. tb latency, which is characterized by absent or markedly reduced bacterial metabolism. Under the hostile conditions within the host, lsr2 is up-regulated, and Lsr2 binds to multiple sites in the chromosome and inhibits the expression of genes involved in virulence, antigenicity, and metabolism, allowing M. tb to enter and remain in latency. Future studies to test this hypothesis are warranted; such studies could identify Lsr2 as an attractive target for unique drug development for the control of latent tuberculosis infection.
Experimental Procedures
For ChIP-on-chip experiments, HA-tagged Lsr2 present in cell lysates was precipitated with an anti-HA antibody. Input and ChIP DNA were amplified and labeled with monofunctional reactive Cy3 or Cy5 dyes on the basis of the T7-based protocol (57). Subsequently, labeled ChIP cRNA and input cRNA were hybridized to a 244,000-feature M. tb H37Rv or M. smegmatis mc²155 whole-genome tiling array and analyzed. Genetic complementation experiments were performed as previously described (15). ChIP and quantitative real-time PCR experiments with chimeric proteins in E. coli were performed as described (15). The procedures for in vitro cross-linking experiments and gel-mobility shift assays were described previously (19). The C-domain of Lsr2 (Lsr2C, residues 66–112) was obtained by partial trypsin digestion of the full-length Lsr2 protein on a Ni-NTA column followed by gel-filtration purification. NMR experimental data were collected at 298 K on Bruker Avance 500- or 800-MHz spectrometers with a triple-resonance cryoprobe. A complete description is included in SI Experimental Procedures.
Acknowledgments
This work was supported by funding from the Canadian Institutes of Health Research (MOP-15107 to J.L. and MOP-86683 to W.W.N.) and by the 973 Program of China (2009CB521703 to B.X.). H.v.B. was supported by the Netherlands Organization for Scientific Research (825.06.033).
Footnotes
The authors declare no conflict of interest.
Data deposition: The atomic coordinates, chemical shifts, and restraints for the reported NMR structure of Lsr2C have been deposited in the Protein Data Bank under accession no. 2kng. The ChIP-chip data have been deposited in the NCBI GEO database under accession no. GSE18652.
This article contains supporting information online at www.pnas.org/cgi/content/full/0913551107/DCSupplemental.
References
- 1. Luijsterburg MS, Noom MC, Wuite GJ, Dame RT. The architectural role of nucleoid-associated proteins in the organization of bacterial chromatin: A molecular perspective. J Struct Biol. 2006;156:262–272. doi: 10.1016/j.jsb.2006.05.006.
- 2. Dorman CJ. H-NS: A universal regulator for a dynamic genome. Nat Rev Microbiol. 2004;2:391–400. doi: 10.1038/nrmicro883.
- 3. Noom MC, Navarre WW, Oshima T, Wuite GJ, Dame RT. H-NS promotes looped domain formation in the bacterial chromosome. Curr Biol. 2007;17:R913–R914. doi: 10.1016/j.cub.2007.09.005.
- 4. Navarre WW, et al. Selective silencing of foreign DNA with low GC content by the H-NS protein in Salmonella. Science. 2006;313:236–238. doi: 10.1126/science.1128794.
- 5. Owen-Hughes TA, et al. The chromatin-associated protein H-NS interacts with curved DNA to influence DNA topology and gene expression. Cell. 1992;71:255–265. doi: 10.1016/0092-8674(92)90354-f.
- 6. Bouffartigues E, Buckle M, Badaut C, Travers A, Rimsky S. H-NS cooperative binding to high-affinity sites in a regulatory element results in transcriptional silencing. Nat Struct Mol Biol. 2007;14:441–448. doi: 10.1038/nsmb1233.
- 7. Lucchini S, et al. H-NS mediates the silencing of laterally acquired genes in bacteria. PLoS Pathog. 2006;2:e81. doi: 10.1371/journal.ppat.0020081.
- 8. Grainger DC, Hurd D, Goldberg MD, Busby SJ. Association of nucleoid proteins with coding and non-coding segments of the Escherichia coli genome. Nucleic Acids Res. 2006;34:4642–4652. doi: 10.1093/nar/gkl542.
- 9. Oshima T, Ishikawa S, Kurokawa K, Aiba H, Ogasawara N. Escherichia coli histone-like protein H-NS preferentially binds to horizontally acquired DNA in association with RNA polymerase. DNA Res. 2006;13:141–153. doi: 10.1093/dnares/dsl009.
- 10. Navarre WW, McClelland M, Libby SJ, Fang FC. Silencing of xenogeneic DNA by H-NS-facilitation of lateral gene transfer in bacteria by a defense system that recognizes foreign DNA. Genes Dev. 2007;21:1456–1471. doi: 10.1101/gad.1543107.
- 11. Atlung T, Ingmer H. H-NS: A modulator of environmentally regulated gene expression. Mol Microbiol. 1997;24:7–17. doi: 10.1046/j.1365-2958.1997.3151679.x.
- 12. Dame RT, Wyman C, Goosen N. H-NS mediated compaction of DNA visualised by atomic force microscopy. Nucleic Acids Res. 2000;28:3504–3510. doi: 10.1093/nar/28.18.3504.
- 13. Dame RT, Noom MC, Wuite GJ. Bacterial chromatin organization by H-NS protein unravelled using dual DNA manipulation. Nature. 2006;444:387–390. doi: 10.1038/nature05283.
- 14. Dame RT. The role of nucleoid-associated proteins in the organization and compaction of bacterial chromatin. Mol Microbiol. 2005;56:858–870. doi: 10.1111/j.1365-2958.2005.04598.x.
- 15. Gordon BR, Imperial R, Wang L, Navarre WW, Liu J. Lsr2 of Mycobacterium represents a novel class of H-NS-like proteins. J Bacteriol. 2008;190:7052–7059. doi: 10.1128/JB.00733-08.
- 16. Chen JM, et al. Roles of Lsr2 in colony morphology and biofilm formation of Mycobacterium smegmatis. J Bacteriol. 2006;188:633–641. doi: 10.1128/JB.188.2.633-641.2006.
- 17. Kocíncová D, et al. Spontaneous transposition of IS1096 or ISMsm3 leads to glycopeptidolipid overproduction and affects surface properties in Mycobacterium smegmatis. Tuberculosis (Edinb). 2008;88:390–398. doi: 10.1016/j.tube.2008.02.005.
- 18. Colangeli R, et al. Transcriptional regulation of multi-drug tolerance and antibiotic-induced responses by the histone-like protein Lsr2 in M. tuberculosis. PLoS Pathog. 2007;3:e87. doi: 10.1371/journal.ppat.0030087.
- 19. Chen JM, et al. Lsr2 of Mycobacterium tuberculosis is a DNA-bridging protein. Nucleic Acids Res. 2008;36:2123–2135. doi: 10.1093/nar/gkm1162.
- 20. Becq J, et al. Contribution of horizontally acquired genomic islands to the evolution of the tubercle bacilli. Mol Biol Evol. 2007;24:1861–1871. doi: 10.1093/molbev/msm111.
- 21. Jang J, Becq J, Gicquel B, Deschavanne P, Neyrolles O. Horizontally acquired genomic islands in the tubercle bacilli. Trends Microbiol. 2008;16:303–308. doi: 10.1016/j.tim.2008.04.005.
- 22. Rosas-Magallanes V, et al. Horizontal transfer of a virulence operon to the ancestor of Mycobacterium tuberculosis. Mol Biol Evol. 2006;23:1129–1135. doi: 10.1093/molbev/msj120.
- 23. Park KT, et al. Demonstration of allelic exchange in the slow-growing bacterium Mycobacterium avium subsp. paratuberculosis, and generation of mutants with deletions at the pknG, relA, and lsr2 loci. Appl Environ Microbiol. 2008;74:1687–1695. doi: 10.1128/AEM.01208-07.
- 24. Cole ST, et al. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature. 1998;393:537–544. doi: 10.1038/31159.
- 25. Abdallah AM, et al. Type VII secretion: Mycobacteria show the way. Nat Rev Microbiol. 2007;5:883–891. doi: 10.1038/nrmicro1773.
- 26. Brodin P, Rosenkrands I, Andersen P, Cole ST, Brosch R. ESAT-6 proteins: Protective antigens and virulence factors? Trends Microbiol. 2004;12:500–508. doi: 10.1016/j.tim.2004.09.007.
- 27. Gey van Pittius NC, et al. The ESAT-6 gene cluster of Mycobacterium tuberculosis and other high G+C Gram-positive bacteria. Genome Biol. 2001;2:RESEARCH0044. doi: 10.1186/gb-2001-2-10-research0044.
- 28. Abdallah AM, et al. A specific secretion system mediates PPE41 transport in pathogenic mycobacteria. Mol Microbiol. 2006;62:667–679. doi: 10.1111/j.1365-2958.2006.05409.x.
- 29. Fortune SM, et al. Mutually dependent secretion of proteins required for mycobacterial virulence. Proc Natl Acad Sci USA. 2005;102:10676–10681. doi: 10.1073/pnas.0504922102.
- 30. Coros A, Callahan B, Battaglioli E, Derbyshire KM. The specialized secretory apparatus ESX-1 is essential for DNA transfer in Mycobacterium smegmatis. Mol Microbiol. 2008;69:794–808. doi: 10.1111/j.1365-2958.2008.06299.x.
- 31. Flint JL, Kowalski JC, Karnati PK, Derbyshire KM. The RD1 virulence locus of Mycobacterium tuberculosis regulates DNA transfer in Mycobacterium smegmatis. Proc Natl Acad Sci USA. 2004;101:12598–12603. doi: 10.1073/pnas.0404892101.
- 32. Camacho LR, Ensergueix D, Perez E, Gicquel B, Guilhot C. Identification of a virulence gene cluster of Mycobacterium tuberculosis by signature-tagged transposon mutagenesis. Mol Microbiol. 1999;34:257–267. doi: 10.1046/j.1365-2958.1999.01593.x.
- 33. Cox JS, Chen B, McNeil M, Jacobs WR Jr. Complex lipid determines tissue-specific replication of Mycobacterium tuberculosis in mice. Nature. 1999;402:79–83. doi: 10.1038/47042.
- 34. Reed MB, et al. A glycolipid of hypervirulent tuberculosis strains that inhibits the innate immune response. Nature. 2004;431:84–87. doi: 10.1038/nature02837.
- 35. Onwueme KC, Vos CJ, Zurita J, Ferreras JA, Quadri LE. The dimycocerosate ester polyketide virulence factors of mycobacteria. Prog Lipid Res. 2005;44:259–302. doi: 10.1016/j.plipres.2005.07.001.
- 36. Sampson SL, et al. Expression, characterization and subcellular localization of the Mycobacterium tuberculosis PPE gene Rv1917c. Tuberculosis (Edinb). 2001;81:305–317. doi: 10.1054/tube.2001.0304.
- 37. Cascioferro A, et al. PE is a functional domain responsible for protein translocation and localization on mycobacterial cell wall. Mol Microbiol. 2007;66:1536–1547. doi: 10.1111/j.1365-2958.2007.06023.x.
- 38. Fleischmann RD, et al. Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. J Bacteriol. 2002;184:5479–5490. doi: 10.1128/JB.184.19.5479-5490.2002.
- 39. Choudhary RK, et al. PPE antigen Rv2430c of Mycobacterium tuberculosis induces a strong B-cell response. Infect Immun. 2003;71:6338–6343. doi: 10.1128/IAI.71.11.6338-6343.2003.
- 40. Shindo H, et al. Solution structure of the DNA binding domain of a nucleoid-associated protein, H-NS, from Escherichia coli. FEBS Lett. 1995;360:125–131. doi: 10.1016/0014-5793(95)00079-o.
- 41. Shindo H, et al. Identification of the DNA binding surface of H-NS protein from Escherichia coli by heteronuclear NMR spectroscopy. FEBS Lett. 1999;455:63–69. doi: 10.1016/s0014-5793(99)00862-5.
- 42. Ulyanov NB, Bauer WR, James TL. High-resolution NMR structure of an AT-rich DNA sequence. J Biomol NMR. 2002;22:265–280. doi: 10.1023/a:1014987532546.
- 43. Dominguez C, Boelens R, Bonvin AM. HADDOCK: A protein-protein docking approach based on biochemical or biophysical information. J Am Chem Soc. 2003;125:1731–1737. doi: 10.1021/ja026939x.
- 44. Huth JR, et al. The solution structure of an HMG-I(Y)-DNA complex defines a new architectural minor groove binding motif. Nat Struct Biol. 1997;4:657–665. doi: 10.1038/nsb0897-657.
- 45. Geierstanger BH, Volkman BF, Kremer W, Wemmer DE. Short peptide fragments derived from HMG-I/Y proteins bind specifically to the minor groove of DNA. Biochemistry. 1994;33:5347–5355. doi: 10.1021/bi00183a043.
- 46. Tippner D, Wagner R. Fluorescence analysis of the Escherichia coli transcription regulator H-NS reveals two distinguishable complexes dependent on binding to specific or nonspecific DNA sites. J Biol Chem. 1995;270:22243–22247. doi: 10.1074/jbc.270.38.22243.
- 47. Tippner D, Afflerbach H, Bradaczek C, Wagner R. Evidence for a regulatory function of the histone-like Escherichia coli protein H-NS in ribosomal RNA synthesis. Mol Microbiol. 1994;11:589–604. doi: 10.1111/j.1365-2958.1994.tb00339.x.
- 48. Sette M, et al. Sequence-specific recognition of DNA by the C-terminal domain of Escherichia coli nucleoid-associated protein H-NS. J Biol Chem. 2009;284:30453–30462. doi: 10.1074/jbc.M109.044313.
- 49. Dame RT, Wyman C, Wurm R, Wagner R, Goosen N. Structural basis for H-NS-mediated trapping of RNA polymerase in the open initiation complex at the rrnB P1. J Biol Chem. 2002;277:2146–2150. doi: 10.1074/jbc.C100603200.
- 50. Reeves R, Beckerbauer L. HMGI/Y proteins: Flexible regulators of transcription and chromatin structure. Biochim Biophys Acta. 2001;1519:13–29. doi: 10.1016/s0167-4781(01)00215-9.
- 51. Hinton JC, et al. Expression and mutational analysis of the nucleoid-associated protein H-NS of Salmonella typhimurium. Mol Microbiol. 1992;6:2327–2337. doi: 10.1111/j.1365-2958.1992.tb01408.x.
- 52. Free A, Dorman CJ. Coupling of Escherichia coli hns mRNA levels to DNA synthesis by autoregulation: Implications for growth phase control. Mol Microbiol. 1995;18:101–113. doi: 10.1111/j.1365-2958.1995.mmi_18010101.x.
- 53. Betts JC, Lukey PT, Robb LC, McAdam RA, Duncan K. Evaluation of a nutrient starvation model of Mycobacterium tuberculosis persistence by gene and protein expression profiling. Mol Microbiol. 2002;43:717–731. doi: 10.1046/j.1365-2958.2002.02779.x.
- 54. Rustad TR, Harrell MI, Liao R, Sherman DR. The enduring hypoxic response of Mycobacterium tuberculosis. PLoS One. 2008;3:e1502. doi: 10.1371/journal.pone.0001502.
- 55. Stewart GR, et al. Dissection of the heat-shock response in Mycobacterium tuberculosis using mutants and microarrays. Microbiology. 2002;148:3129–3138. doi: 10.1099/00221287-148-10-3129.
- 56. Wong DK, Lee BY, Horwitz MA, Gibson BW. Identification of fur, aconitase, and other proteins expressed by Mycobacterium tuberculosis under conditions of low and high concentrations of iron by combined two-dimensional gel electrophoresis and mass spectrometry. Infect Immun. 1999;67:327–336. doi: 10.1128/iai.67.1.327-336.1999.
- 57. Liu CL, Schreiber SL, Bernstein BE. Development and validation of a T7 based linear amplification for genomic DNA. BMC Genomics. 2003;4:19. doi: 10.1186/1471-2164-4-19.