Abstract
A systematic search for motifs associated with CcrM DNA methylation sites revealed four long (>100-bp) motifs (CIR sequences) present in up to 21 copies in Caulobacter crescentus. The CIR1 and CIR2 motifs exhibit a conserved inverted repeat organization, with a CcrM site in the center of one of the repeats.
Methylation of DNA performs key functions in eukaryotic and prokaryotic cells. Bacterial adenine DNA methylation usually occurs in restriction-modification systems, which differentiate between self and non-self DNA (26). Two prominent bacterial methyltransferases, however, are not part of restriction-modification systems: Dam in Escherichia coli and other γ-proteobacteria (4, 6) and CcrM in Caulobacter crescentus and other α-proteobacteria (22). Dam and CcrM regulate gene transcription and the timing of DNA replication initiation and can be important for virulence (7, 8, 20).
Dam is not essential in E. coli. Regulation of transcription by Dam methylation in E. coli requires sequences in addition to the GATC methylation site. Two well-studied examples are phase variation in the pyelonephritis-associated pili (pap) operon (9, 25) and the outer membrane protein antigen 43 promoter (9). In both cases, regulation depends on specific Dam methylation sites, which are distinguished by their surrounding sequence.
In contrast, CcrM is an essential gene in the α-proteobacteria C. crescentus (22), Brucella abortus (20), Sinorhizobium meliloti, and Agrobacterium tumefaciens (12). DNA methylation in C. crescentus regulates transcription in the promoter for ccrM itself (23) and the P1 promoter of ctrA, a global transcriptional regulator (19). Therefore, we sought to determine whether the CcrM recognition site, GANTC, is associated with conserved motifs. We identified four large (>100-bp) intergenic motifs in C. crescentus that contain conserved CcrM sites. Two of these motifs and several other motifs in other α-proteobacteria share three features: (i) they are composed of two inverted repeats; (ii) a CcrM site is in the center of one of the inverted repeats; and (iii) a conserved central linker joins the two inverted repeats. These novel motifs in α-proteobacteria may mediate regulatory functions of CcrM.
Genome sequences were downloaded from GenBank (ftp://ftp.ncbi.nih.gov/GenBank/genomes/Bacteria) and processed with the Genome-Tools package (http://genome-tools.sourceforge.net) (13). Sequence alignments were done with the CLUSTALW 1.82 software program (24) and BLAST (1). Consensus RNA secondary structures were predicted by using ConStruct 2.0 (14, 15), which uses the RNAfold 1.4 algorithm (10, 16, 28). Default settings were used for CLUSTALW and ConStruct.
We examined 15 bp of sequence centered on each CcrM site (5 bp upstream and downstream of each GANTC) in C. crescentus. Excluding those which were associated with known transposases or insertion elements (17), four 15-mers occurred more than four times in intergenic sequences (Table 1; also shown are results for other α-proteobacteria). Sequence conservation around each of these 15-mers extended to over 100 bp (for alignments, see supplementary materials at http://caulobacter.stanford.edu/CIR). Using BLASTN to identify matches to each long motif, we found that only one or two matches do not contain CcrM sites. These long conserved motifs are therefore called Caulobacter CcrM-associated intergenic repeat 1 (CIR1) to CIR4. Two of these motifs, Caulobacter CIR1 and CIR2 (present in 21 and 16 copies, respectively) (Fig. 1A and B), appear to be conserved in other bacteria; only these two motifs in C. crescentus and related motifs in other α-proteobacteria are discussed below.
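The 15-mer tabulation described above can be sketched roughly as follows. This is a minimal illustration, not the Genome-Tools code used for the analysis: the function names are hypothetical, the variable base of GANTC is masked as N so that variants group together as they are written in Table 1, and only the strand as read is counted, since strand handling is not specified in the text.

```python
import re
from collections import Counter

# CcrM recognition site GANTC, where N is any of the four bases.
# Two GANTC sites cannot overlap, so a plain (non-lookahead) pattern suffices.
CCRM_SITE = re.compile(r"GA[ACGT]TC")

def centered_kmers(seq, flank=5):
    """Yield (site_start, kmer) for each GANTC site that has full flanks.

    The kmer is flank + 5 + flank bases long (15 with the default flank),
    with the variable middle base of the site written as 'N' (an assumption).
    """
    seq = seq.upper()
    for match in CCRM_SITE.finditer(seq):
        start = match.start() - flank
        end = match.start() + 5 + flank
        if start < 0 or end > len(seq):
            continue  # site too close to a sequence end for a full window
        window = seq[start:end]
        yield match.start(), window[:flank + 2] + "N" + window[flank + 3:]

def count_centered_kmers(seq, flank=5):
    """Count how often each site-centered kmer occurs in seq."""
    return Counter(kmer for _, kmer in centered_kmers(seq, flank))

# Toy input (hypothetical, not genome data): two copies of one 15-mer
# that differ only at the N position of the CcrM site.
toy = "AGATGGAGTCATCTA" + "CCC" + "AGATGGACTCATCTA"
counts = count_centered_kmers(toy)
repeated = {kmer: n for kmer, n in counts.items() if n > 1}
```

In a real run the input would be restricted to intergenic sequence, and the repeat threshold would be raised to match the "more than four times" criterion used in the text.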
TABLE 1.
List of repeated 15-mers centered on CcrM sites in α-proteobacteria
| Organism | Sequence | No. of perfect matches (a) | No. with one mismatch (b) | No. intergenic (c) | Approximate size (bp) (d) | Name |
|---|---|---|---|---|---|---|
| C. crescentus | AGATG GANTC ATCTA | 21 | 3 | 24 | 120 | Caulobacter CIR1 |
| | AAACG GANTC GTTTT | 11 | 3 | 13 | 120 | Caulobacter CIR2 |
| | AGAGG GANTC TGTTT | 7 | 1 | 8 | 240 | Caulobacter CIR3 |
| | GCCCA GANTC ATCCG | 8 | 11 | 13 | 120 | Caulobacter CIR4 |
| A. tumefaciens | CACAG GANTC CAGCC | 19 | 17 | 34 | 120 | Agrobacterium CIR1 |
| | GAAAA GANTC TTCCC | 2 | 4 | 6 | 120 | Agrobacterium CIR1 |
| | AGATC GANTC GCGGT | 2 | 4 | 6 | 140 | Agrobacterium CIR2 |
| B. melitensis | TGATT GANTC AGATC | 28 | 12 | 39 | 110 | Brucella CIR1 |
| | TAACA GANTC GCCGG | 9 | 14 | 20 | 110 | Brucella CIR2 |
| | AAACG GANTC GTTGG | 9 | 9 | 17 | 110 | Brucella CIR2 |
| | TGTTG GANTC ACAAA | 7 | 0 | 7 | 160 | Brucella CIR3 |
| | TGTTG GANTC AGACC | 6 | 4 | 10 | 160 | Brucella CIR3 |
| | CGTTT GANTC CCGGC | 5 | 4 | 9 | 160 | Brucella CIR3 |
| | GTTTT GANTC CGATC | 4 | 0 | 4 | 160 | Brucella CIR4 |
| | GTTTT GANTC TGTGA | 4 | 0 | 4 | 160 | Brucella CIR4 |
| M. loti (e) | GACCT GANTC TCAAC | 16 | 11 | 25 | 170 | Mesorhizobium CIR1 |
| | AGGCC GANTC GAAAA | 6 | 10 | 13 | 170 | Mesorhizobium CIR1 |
| | GGTTC GANTC CTGTC | 4 | 2 | 6 | 70 | Mesorhizobium CIR2 |
| | GGTTC GANTC CCGCT | 3 | 4 | 7 | 120 | Mesorhizobium CIR3 |
| R. prowazekii | TTTTA GANTC AATAT | 2 | 5 | 5 | 20 | None |
| | TTAAA GANTC TGTAA | 2 | 4 | 3 | 15 | None |
| | TATTT GANTC AATCT | 2 | 2 | 3 | 440 | None |
| | TTACT GANTC TAAGA | 2 | 2 | 2 | 440 | None |
| S. meliloti | TGATC GANTC CAAGA | 10 | 17 | 26 | 120 | Sinorhizobium CIR1 |
| | GCCGC GANTC CCCTT | 10 | 13 | 22 | 160 | Sinorhizobium CIR2 |
| | GCCGC GANTC TCCTT | 8 | 18 | 24 | 160 | Sinorhizobium CIR2 |
| | GCCGC GANTC CCTTC | 5 | 12 | 16 | 160 | Sinorhizobium CIR2 |
(a) The number of exact matches to the sequence in the entire genome, excluding plasmids (and megaplasmids in S. meliloti). Both chromosomes of B. melitensis were included.
(b) The number of matches to the sequence with exactly one mismatch in the first or last five base pairs (no mismatches allowed in the central CcrM site).
(c) That is, the number of exact-match and one-mismatch occurrences which are in intergenic regions, based on the middle of the CcrM site.
(d) The approximate size of the conserved region in base pairs surrounding the sequence, as judged by manual examination.
(e) That is, Mesorhizobium loti.
FIG. 1.
DNA sequence alignments for Caulobacter and Brucella CIR1 and CIR2 sequences. Sequences were identified by BLASTN on the entire genome sequence, and full (not truncated) matches were identified manually. Alignments are shown for Caulobacter CIR1 (A), Caulobacter CIR2 (B), Brucella CIR1 (C), and Brucella CIR2 (D). Nucleotides are color coded, with A in red, C in blue, G in black, and T in green. Sequences are annotated on the left with the chromosomal coordinate of the first (leftmost) base shown and on the right with the length of sequence shown. Negative coordinates indicate sequences that have been reversed and complemented. In panels C and D, an “I” indicates the sequence is from chromosome I, and an “II” indicates the sequence is from chromosome II. Asterisks above the sequences indicate strictly conserved bases. The gray bars at the bottom of the alignments indicate the level of conservation, with the tallest bars meaning strict conservation in all sequences and no bar meaning no conservation. The location of the conserved CcrM site is highlighted with a black box. Arrows in panel A highlight a hybrid CIR1/CIR2 sequence.
The local gene organization around each Caulobacter CIR1 and CIR2 sequence is shown in Table 2. CIR1 and CIR2 are often shortly downstream of flanking open reading frames (ORFs) (Fig. 2); the stop codon is often the beginning of the CIR1 or CIR2 consensus sequence (a distance of 1 in Table 2). In these cases, CIR1 and CIR2 have not truncated the flanking ORFs (based on BLASTP [1] compared to the GenBank [5] nonredundant database, only 1 of 26 ORFs whose stop codon is supplied by a CIR1 or CIR2 sequence is truncated). The identities of the flanking ORFs suggest no function for the CIR sequences.
TABLE 2.
List of ORFs flanking CIR1 and CIR2 sequences in Caulobacter (a)
| CIR | Coordinates (b) | Left dir (c) | Left name (d) | Left dist (e) | Left description (f) | Right dir | Right name | Right dist | Right description |
|---|---|---|---|---|---|---|---|---|---|
| CIR1 | 533765 … 533876 | − | CC0505 | 89 | Hypothetical protein | − | CC0506 | 1 | Coproporphyrinogen III oxidase, aerobic |
| | 539421 … 539536 | + | CC0512 | 1 | Hypothetical protein | + | CC0513 | 68 | Hypothetical protein |
| | 1118969 … 1119084 | + | CC0995 | 1 | TonB-dependent receptor, putative | + | CC0996 | 48 | Hypothetical protein |
| | 1437139 … 1437254 | + | CC1292 | 18 | IS298, transposase OrfB | − | CC1293 | 1 | DNA-binding response regulator |
| | 1750998 … 1751113 | + | CC1584 | 1 | Peptidyl-prolyl cis-trans isomerase, cyclophilin-type | − | CC1585 | 57 | Hypothetical protein |
| | 1755130 … 1755245 | + | CC1588 | 1 | Queuine tRNA ribosyltransferase | − | CC1589 | 1 | Hypothetical protein |
| | 2185689 … 2185804 | + | CC1982 | 1 | Hypothetical protein | − | CC1983 | 1 | DNA repair protein RecN |
| | 2289497 … 2289612 | + | CC2078 | 1 | Hypothetical protein | − | CC2079 | 1 | Hypothetical protein |
| | 2397000 … 2397115 | + | CC2187 | 1 | Conserved hypothetical protein | − | CC2188 | 8 | Transcriptional regulator, MarR family |
| | 2876316 … 2876431 | + | CC2658 | 14 | Hypothetical protein | − | CC2659 | 58 | Hypothetical protein |
| | 3116816 … 3116931 | + | CC2893 | 17 | Hypothetical protein | + | CC2894 | 52 | Bleomycin resistance family protein |
| | 3857996 … 3858111 | + | CC3604 | 1 | NADH-ubiquinone oxidoreductase 39-kDa subunit precursor | − | CC3605 | 91 | Bacitracin resistance protein |
| | 404110 … 403995 | + | CC0383 | 1 | Hypothetical protein | − | CC0384 | 18 | Hydrolase, alpha/beta hydrolase fold family |
| | 933153 … 933038 | + | CC0836 | 1 | Sensor histidine kinase, putative | − | CC0837 | 210 | Hypothetical protein |
| | 954735 … 954620 | + | CC0857 | 154 | GGDEF family protein | − | CC0858 | 7 | Hypothetical protein |
| | 1392417 … 1392302 | + | CC1234 | 1 | Hypothetical protein | + | CC1235 | 71 | Null |
| | 1619157 … 1619042 | + | CC1469 | 20 | Hypothetical protein | − | CC1470 | 224 | Acetyltransferase, GNAT family |
| | 1794840 … 1794725 | + | CC1620 | 18 | GMP synthase | − | CC1621 | 1 | Transcriptional regulator, LysR family |
| | 2802429 … 2802314 | + | CC2590 | 29 | Excinuclease ABC, subunit A | + | CC2591 | 71 | Null |
| | 3068980 … 3068865 | + | CC2841 | 20 | Hypothetical protein | − | CC2842 | 1 | Methyl-accepting chemotaxis protein McpO |
| | 3091603 … 3091488 | + | CC2867 | 113 | Virulence-associated protein, putative | − | CC2868 | 1 | NeuB protein, putative |
| CIR2 | 396801 … 396688 | − | CC0375 | 51 | Conserved hypothetical protein | − | CC0376 | 81 | Hypothetical protein |
| | 725348 … 725235 | + | CC0655 | 18 | Sensory box/GGDEF family protein | − | CC0656 | 110 | S4 domain protein |
| | 786567 … 786458 | − | CC0716 | 54 | l-Lysine 2,3-aminomutase, putative | − | CC0717 | 1 | Nodulin-related protein |
| | 1289526 … 1289413 | − | CC1135 | 178 | Amylosucrase | − | CC1136 | 1 | TonB-dependent receptor |
| | 3065634 … 3065521 | − | CC2838 | 54 | Conserved hypothetical protein | − | CC2839 | 1 | Hypothetical protein |
| | 3091490 … 3091376 | + | CC2867 | 1 | Virulence-associated protein, putative | − | CC2868 | 114 | NeuB protein, putative |
| | 3231006 … 3230893 | + | CC3011 | 71 | BolA protein | + | CC3012 | 40 | Cytidine deaminase |
| | 3300768 … 3300655 | − | CC3071 | 14 | Hypothetical protein | − | CC3072 | 120 | Hypothetical protein |
| | 3300880 … 3300767 | − | CC3071 | 126 | Hypothetical protein | − | CC3072 | 8 | Hypothetical protein |
| | 605631 … 605744 | + | CC0562 | 1 | TonB-dependent receptor | − | CC0563 | 115 | TonB-dependent receptor |
| | 779009 … 779122 | − | CC0710 | 89 | Hypothetical protein | + | CC0711 | 12 | Hypothetical protein |
| | 1152773 … 1152886 | − | CC1019 | 21 | Major facilitator family transporter | + | CC1020 | −46 | Hypothetical protein |
| | 2427390 … 2427499 | + | CC2224 | 23 | Cyclohexadienyl dehydrogenase | − | CC2225 | 1 | Hypothetical protein |
| | 3060541 … 3060654 | − | CC2833 | 35 | Conserved hypothetical protein | + | CC2834 | 300 | Carbamoyl-phosphate synthase, small subunit |
| | 3351841 … 3351954 | − | CC3117 | 5 | Hypothetical protein | − | CC3118 | 18 | Transcriptional regulator, TetR family |
| | 3616712 … 3616825 | + | CC3367 | 1 | Transcriptional regulator, AraC family | − | CC3368 | 1 | Conserved hypothetical protein |
(a) Left refers to the ORF with smaller coordinates; right refers to the ORF with larger coordinates. dir, direction; dist, distance.
(b) Coordinates for the CIR sequence according to the published genome sequence.
(c) A “+” direction means transcription goes from lower to higher coordinate numbers; a “−” means transcription goes from higher to lower coordinate numbers.
(d) The systematic name assigned to the ORF, according to the published genome sequence.
(e) The number of nucleotides separating the closest extremes of the CIR and ORF sequences. A distance of 1 means the stop codon of the ORF is the start of the CIR sequence.
(f) The annotation of the ORF, according to the published genome sequence.
FIG. 2.
Local genomic organization around repeated intergenic sequences. The orientation of genes flanking IRU/ERIC sequences in E. coli and CIR1 and CIR2 sequences in C. crescentus (“C.”) and B. melitensis (“B.”) is summarized. The vertical bar indicates the position of the intergenic repeat sequence. The term “overlap with ORF” means that the intergenic repeat sequence extends into the coding sequence of at least one of the flanking ORFs.
Because Caulobacter CIR1 and CIR2 are close to flanking genes, we expect them to be at least partially transcribed. Both motifs are composed of two 52-bp inverted repeat sequences (arms) separated by a 12-bp linker and thus, if transcribed, are predicted to form two long stem-loops joined by the linker (Fig. 3A). Of 38 differences between CIR1 and CIR2, 20 are compensatory changes preserving potential base pairing. The linkers are conserved and nonpalindromic, allowing CIR1 and CIR2 to be oriented. The CcrM site is in the middle of one of the arms (blue circles in Fig. 3A). The presence of exactly one CcrM site seems important: only 2 of 37 CIR1 and CIR2 sequences have CcrM sites in both arms. Additionally, the arms (each individually inverted repeats) are nearly inverted repeats of each other, but one arm contains a single difference which destroys what would otherwise be a complementary GANTC site.
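The relationship between the two arms can be illustrated with a small sketch. It rests on one certain fact, that the reverse complement of a GANTC site is again a GANTC site, so a perfect inverted repeat of an arm carrying a CcrM site would carry a second site. The sequences and function names below are hypothetical toy examples, not actual CIR arm sequences.

```python
import re

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
CCRM_SITE = re.compile(r"GA[ACGT]TC")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def arm_mismatches(left_arm, right_arm):
    """Positions where right_arm departs from the perfect inverted repeat
    (reverse complement) of left_arm."""
    expected = reverse_complement(left_arm)
    if len(expected) != len(right_arm):
        raise ValueError("arms must have equal length")
    return [i for i, (e, r) in enumerate(zip(expected, right_arm.upper()))
            if e != r]

def has_ccrm_site(seq):
    """True if the sequence contains at least one GANTC site."""
    return CCRM_SITE.search(seq.upper()) is not None

# Toy arm (hypothetical) containing one CcrM site, GAGTC.
left = "AGATGGAGTCATCTA"

# A perfect inverted repeat of the arm necessarily carries a second site.
perfect_right = reverse_complement(left)   # TAGATGACTCCATCT

# A single base change in the other arm is enough to remove that site.
variant_right = "TAGATGACTTCATCT"
```

Here `has_ccrm_site(perfect_right)` is true while `has_ccrm_site(variant_right)` is false, and `arm_mismatches(left, variant_right)` reports the one differing position, mirroring the single difference described above.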
FIG. 3.
Predicted consensus RNA secondary structures for putatively transcribed Caulobacter (A) and Brucella (B) CIR1 and CIR2 motifs. Structures were predicted based on combined alignments of CIR1 and CIR2 motifs in each bacterium. Colored lines connecting paired bases indicate the probabilities of base pairing as follows: red, high probability; magenta, intermediate probability; and blue, low probability. The location of the potentially transcribed GANTC site is circled in light blue. The sequence shown in panel A corresponds to the one labeled 539421 in Fig. 1A; the sequence in panel B corresponds to the one labeled 68730-I in Fig. 1C.
Two 110-bp motifs in Brucella melitensis (Brucella CIR1 and CIR2, present in 39 and 35 copies, respectively) are strikingly similar to the Caulobacter CIR1 and CIR2 motifs (Fig. 1C and D). The Brucella CIR1 and CIR2 motifs (i) are composed of two inverted repeat arms joined by a central linker (Fig. 3B), (ii) have a CcrM site in the center of one of the inverted repeats, (iii) have a conserved central linker, and (iv) sometimes provide stop codons for flanking ORFs (data not shown). The Brucella CIR1 motif is also often downstream of flanking ORFs (Fig. 2). These ORFs are not related to the flanking ORFs in Caulobacter: when the best BLAST hit in C. crescentus was identified for each of the 137 ORFs flanking Brucella CIR1 and CIR2 sequences, only two of the hits were ORFs flanking Caulobacter CIR1 or CIR2 sequences, whereas three would be expected by random chance. Thus, flanking ORFs again provide no suggestions for CIR functions.
Potentially related CIR motifs in other α-proteobacteria are diagrammed in Fig. 4 (for full sequences and alignments, see supplementary materials). The Mesorhizobium CIR1 motif is shorter than those in Caulobacter and Brucella, and the central linker is different. However, it is also composed of two inverted repeats (arms) with a conserved CcrM site in the center of one arm. The Sinorhizobium CIR1 is composed of two inverted repeats, but the conserved CcrM site is within the central linker, whose sequence differs from the Caulobacter and Brucella linkers. However, two motifs previously identified in S. meliloti, RIME1 and RIME2 (for Rhizobium-specific intergenic mosaic elements 1 and 2) (18), also have two inverted repeat arms joined by a central linker. The linker sequence in RIME1 is similar to the Caulobacter and Brucella CIR1 and CIR2 linker, but RIME1 has no conserved CcrM site in its arms. The lack of conserved CcrM sites in RIME1 and RIME2 explains why these sequences were not found by our searches. We found only a previously identified 440-bp motif associated with CcrM sites in Rickettsia prowazekii, with no resemblance to other CIR sequences. Notably, R. prowazekii lacks a CcrM homolog.
FIG. 4.
Schematic diagram of CIR and related sequences in α-proteobacteria. Boxes with the same color and arrow markings represent sequences conserved between different CIRs. Half arrows pointing in opposite directions indicate complementary sequences that may form stem-loop secondary structures if transcribed. Conserved CcrM sites are indicated by a light blue circle. The central linker in red (orientation indicated by the full arrow is arbitrary) is conserved between Caulobacter CIR1 and CIR2, Brucella CIR1 and CIR2, and Sinorhizobium RIME1, but not the other sequences. CcrM sites are conserved at the loop within arms in the Caulobacter CIR1 and CIR2 motifs, the Brucella CIR1 and CIR2 motifs, and the Mesorhizobium CIR1 motif.
A similar search in E. coli for Dam-associated motifs yielded only three 14-mers (the Dam recognition site is 4 bp instead of 5 bp). These were associated with the IS5 transposase, the 23S rRNA gene cluster, and an Rhs element (for “rearrangement hot spot,” a large, protein-coding repeat element) (27). Accordingly, no previously identified repeated intergenic sequence in E. coli K-12 is associated with Dam sites. The Caulobacter and Brucella CIR1 and CIR2 motifs resemble IRU/ERIC sequences in E. coli (11, 21). IRU/ERIC sequences are ∼120 bp long, highly conserved, palindromic, and present in similar numbers. IRU/ERIC sequences were also found by sequence analysis; they are transcribed and have detectable transcriptional termination activity. However, gene regulation is probably not their primary role because this does not explain their extensive conservation (2, 11, 21). By a similar argument, then, gene regulation is likely not the primary function of the CIR sequences.
The IRU/ERIC sequences differ from CIR1 and CIR2 in important ways, however. IRU/ERIC sequences have no consensus methylation sites, appear usually between genes in an operon (Fig. 2), and have a single conserved stem-loop in their predicted RNA secondary structure (11, 21). No other previously identified repeated intergenic sequences outside of α-proteobacteria are analogous to the Caulobacter and Brucella CIR1 and CIR2 motifs; these CIR motifs are thus a new class of repeated intergenic sequences.
Like repeated intergenic sequences in other bacteria, the function of the CIR motifs is unknown. The association with methylation sites is novel, suggesting that understanding them may shed light on the functions of CcrM methylation. Their predilection for the end of genes suggests involvement in gene regulation, but they are not similar to known transcriptional terminators, and this would not explain their conservation. Their high conservation suggests a maintenance process, such as gene conversion (as has been postulated for the IRU/ERIC sequences). The GC content of the Caulobacter CIR1 and CIR2 sequences is 44.8% ± 6.3% (all other intergenic sequences are 64.8% ± 11.5%), which suggests a foreign origin. However, they are not similar to known transposases or insertion elements. Furthermore, these sequences may be modular, since there is one hybrid Caulobacter CIR1/CIR2 sequence (arrows in Fig. 1A; Fig. 4), and several other CIR sequences seem to have variants based on different arm sequences (see supplementary materials). Since repeated sequences seem to be found ubiquitously in intergenic sequences in all organisms (3), further characterization of CIR motifs and other intergenic sequences, both upstream and downstream of genes, is essential for understanding genome function and evolution.
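A GC-content comparison of the kind quoted above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the ± values were computed, so the sketch assumes they are sample standard deviations across sequences, and the function names and toy sequences are hypothetical.

```python
from statistics import mean, stdev

def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)

def gc_summary(seqs):
    """Mean and sample standard deviation of GC% over a set of sequences.

    Assumption: '±' is treated as a sample standard deviation; at least
    two sequences are required.
    """
    values = [gc_percent(s) for s in seqs]
    return mean(values), stdev(values)

# Toy sequences (hypothetical), standing in for CIR copies and for
# the remaining intergenic regions.
cir_like = ["ATGCATAT", "AATTGCAT", "ATATGGCA"]
other_intergenic = ["GCGCGCAT", "GGCCGATC", "CGCGGCTA"]

cir_mean, cir_sd = gc_summary(cir_like)
other_mean, other_sd = gc_summary(other_intergenic)
```

Applied to the real CIR1 and CIR2 copies and to all other intergenic regions, a summary of this form would give the two mean ± deviation figures being compared.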
Acknowledgments
This work was supported by National Institutes of Health grant GM51426 and NIH grant 2T32GM07365 to the Medical Scientist Training Program (S.L.C.).
REFERENCES
- 1. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
- 2. Bachellier, S., E. Gilson, M. Hofnung, and C. W. Hill. 1996. Repeated sequences, p. 2012-2040. In F. C. Neidhardt, R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger (ed.), Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed., vol. 2. ASM Press, Washington, D.C.
- 3. Bao, Z., and S. R. Eddy. 2002. Automated de novo identification of repeat sequence families in sequenced genomes. Genome Res. 12:1269-1276.
- 4. Barbeyron, T., K. Kean, and P. Forterre. 1984. DNA adenine methylation of GATC sequences appeared recently in the Escherichia coli lineage. J. Bacteriol. 160:586-590.
- 5. Benson, D. A., I. Karsch-Mizrachi, D. J. Lipman, J. Ostell, B. A. Rapp, and D. L. Wheeler. 2000. GenBank. Nucleic Acids Res. 28:15-18.
- 6. Brooks, J. E., R. M. Blumenthal, and T. R. Gingeras. 1983. The isolation and characterization of the Escherichia coli DNA adenine methylase (dam) gene. Nucleic Acids Res. 11:837-851.
- 7. Garcia-Del Portillo, F., M. G. Pucciarelli, and J. Casadesus. 1999. DNA adenine methylase mutants of Salmonella typhimurium show defects in protein secretion, cell invasion, and M cell cytotoxicity. Proc. Natl. Acad. Sci. USA 96:11578-11583.
- 8. Heithoff, D. M., R. L. Sinsheimer, D. A. Low, and M. J. Mahan. 1999. An essential role for DNA adenine methylation in bacterial virulence. Science 284:967-970.
- 9. Henderson, I. R., P. Owen, and J. P. Nataro. 1999. Molecular switches: the ON and OFF of bacterial phase variation. Mol. Microbiol. 33:919-932.
- 10. Hofacker, I. L., W. Fontana, P. F. Stadler, S. Bonhoeffer, M. Tacker, and P. Schuster. 1994. Fast folding and comparison of RNA secondary structures. Monatsh Chem. 125:167-188.
- 11. Hulton, C. S., C. F. Higgins, and P. M. Sharp. 1991. ERIC sequences: a novel family of repetitive elements in the genomes of Escherichia coli, Salmonella typhimurium, and other enterobacteria. Mol. Microbiol. 5:825-834.
- 12. Kahng, L. S., and L. Shapiro. 2001. The CcrM DNA methyltransferase of Agrobacterium tumefaciens is essential, and its activity is cell cycle regulated. J. Bacteriol. 183:3065-3075.
- 13. Lee, W., and S. L. Chen. 2002. Genome-Tools: a flexible package for genome sequence analysis. BioTechniques 33:1334-1341.
- 14. Luck, R., S. Graf, and G. Steger. 1999. ConStruct: a tool for thermodynamic controlled prediction of conserved secondary structure. Nucleic Acids Res. 27:4208-4217.
- 15. Luck, R., G. Steger, and D. Riesner. 1996. Thermodynamic prediction of conserved secondary structure: application to the RRE element of HIV, the tRNA-like element of CMV and the mRNA of prion protein. J. Mol. Biol. 258:813-826.
- 16. McCaskill, J. S. 1990. The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 29:1105-1119.
- 17. Nierman, W. C., T. V. Feldblyum, M. T. Laub, I. T. Paulsen, K. E. Nelson, J. A. Eisen, J. F. Heidelberg, M. R. Alley, N. Ohta, J. R. Maddock, I. Potocka, W. C. Nelson, A. Newton, C. Stephens, N. D. Phadke, B. Ely, R. T. DeBoy, R. J. Dodson, A. S. Durkin, M. L. Gwinn, D. H. Haft, J. F. Kolonay, J. Smit, M. B. Craven, H. Khouri, J. Shetty, K. Berry, T. Utterback, K. Tran, A. Wolf, J. Vamathevan, M. Ermolaeva, O. White, S. L. Salzberg, J. C. Venter, L. Shapiro, C. M. Fraser, and J. Eisen. 2001. Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98:4136-4141.
- 18. Osteras, M., J. Stanley, and T. M. Finan. 1995. Identification of Rhizobium-specific intergenic mosaic elements within an essential two-component regulatory system of Rhizobium species. J. Bacteriol. 177:5485-5494.
- 19. Reisenauer, A., and L. Shapiro. 2002. DNA methylation affects the cell cycle transcription of the CtrA global regulator in Caulobacter. EMBO J. 21:4969-4977.
- 20. Robertson, G. T., A. Reisenauer, R. Wright, R. B. Jensen, A. Jensen, L. Shapiro, and R. M. Roop II. 2000. The Brucella abortus CcrM DNA methyltransferase is essential for viability, and its overexpression attenuates intracellular replication in murine macrophages. J. Bacteriol. 182:3482-3489.
- 21. Sharples, G. J., and R. G. Lloyd. 1990. A novel repeated DNA sequence located in the intergenic regions of bacterial chromosomes. Nucleic Acids Res. 18:6503-6508.
- 22. Stephens, C., A. Reisenauer, R. Wright, and L. Shapiro. 1996. A cell cycle-regulated bacterial DNA methyltransferase is essential for viability. Proc. Natl. Acad. Sci. USA 93:1210-1214.
- 23. Stephens, C. M., G. Zweiger, and L. Shapiro. 1995. Coordinate cell cycle control of a Caulobacter DNA methyltransferase and the flagellar genetic hierarchy. J. Bacteriol. 177:1662-1669.
- 24. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673-4680.
- 25. van der Woude, M., B. Braaten, and D. Low. 1996. Epigenetic phase variation of the pap operon in Escherichia coli. Trends Microbiol. 4:5-9.
- 26. Wilson, G. G. 1988. Cloned restriction-modification systems: a review. Gene 74:281-289.
- 27. Zhao, S., C. H. Sandt, G. Feulner, D. A. Vlazny, J. A. Gray, and C. W. Hill. 1993. Rhs elements of Escherichia coli K-12: complex composites of shared and unique components that have different evolutionary histories. J. Bacteriol. 175:2799-2808.
- 28. Zuker, M., and P. Stiegler. 1981. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 9:133-148.




