Author manuscript; available in PMC: 2020 Oct 7.
Published in final edited form as: Mol Microbiol. 2008 Mar 25;68(3):600–614. doi: 10.1111/j.1365-2958.2008.06172.x

Small non-coding RNAs in Caulobacter crescentus

Stephen G Landt 1, Eduardo Abeliuk 1,2,3, Patrick T McGrath 1,4, Joseph A Lesley 1, Harley H McAdams 1, Lucy Shapiro 1,*
PMCID: PMC7540941  NIHMSID: NIHMS1569693  PMID: 18373523

Summary

Small non-coding RNAs (sRNAs) are active in many bacterial cell functions, including regulation of the cell’s response to environmental challenges. We describe the identification of 27 novel Caulobacter crescentus sRNAs by analysis of RNA expression levels assayed using a tiled Caulobacter microarray and a protocol optimized for detection of sRNAs. The principal analysis method involved identification of sets of adjacent probes with unusually high correlation between the individual intergenic probes within the set, suggesting the presence of a sRNA. Among the validated sRNAs, two are candidate transposase gene antisense RNAs. The expression of 10 of the sRNAs is regulated by entry into stationary phase, carbon starvation, or rich versus minimal media. The expression of four of the novel sRNAs changes as the cell cycle progresses. One of these shares a promoter motif with several genes expressed at the swarmer-to-stalked cell transition, while another appears to be controlled by the CtrA global transcriptional regulator. The probe correlation analysis approach reported here is of general use for large-scale sRNA identification for any sequenced microbial genome.

Introduction

A significant fraction of the bacterial transcriptome consists of RNA species other than mRNA, rRNA and tRNA. Small non-coding RNAs (sRNAs) perform a variety of functions, often as post-transcriptional regulators of pathways that modulate growth and development in response to environmental fluctuations (Repoila et al., 2003; Lenz et al., 2004; Babitzke and Romeo, 2007). In Escherichia coli, approximately 80 sRNAs, between 50 and 400 bases in size, have been identified (Argaman et al., 2001; Rivas and Eddy, 2001; Wassarman et al., 2001; Kawano et al., 2005). The function of the majority of these sRNAs is still unknown. Less exhaustive searches have verified smaller numbers of sRNAs in a range of other genomes (Ostberg et al., 2004; Livny et al., 2005; Pichon and Felden, 2005; Livny et al., 2006; Del Val et al., 2007; Mandin et al., 2007).

Comparative sequence analysis to identify conserved elements in intergenic regions (IGRs) has been used in the identification of the majority of sRNAs that have been biochemically confirmed. Additional sequence signatures, including intergenic promoter elements, sequences predicted to form stable RNA secondary structures, and intergenic rho-independent terminator elements have also been used for the detection of candidate sRNAs (Argaman et al., 2001; Rivas and Eddy, 2001; Livny et al., 2005). All of these methods are biased towards sRNAs that are conserved and expressed under the control of well-understood, often organism-specific, expression signatures. In addition, the less biased shotgun cloning approach has identified highly expressed sRNA candidates in E. coli that were missed by comparative analyses (Vogel et al., 2003).

We have used a different unbiased strategy to identify sRNAs in Caulobacter crescentus, using a custom tiled Affymetrix microarray combined with an experimental protocol and analysis scheme optimized for sRNA detection. C. crescentus is a freshwater oligotrophic α-proteobacterium that divides asymmetrically to produce a stalked cell, which is competent to initiate DNA replication, and a motile swarmer cell, which is unable to initiate DNA replication until it differentiates into a stalked cell (Fig. 3I). Only four sRNAs, all highly conserved, have been identified previously in C. crescentus (Winzeler et al., 1997; Brown, 1999; Keiler et al., 2000; Barrick et al., 2005). A lack of multiple genome sequences for related species has limited the use of comparative genomics for sRNA identification (Del Val et al., 2007). We have identified and validated 23 new C. crescentus sRNAs based on correlated intergenic expression, and another four based on high levels of intergenic expression. Among these 27 novel sRNAs are four whose expression is under cell cycle control and 10 that are differentially expressed under different environmental or nutritional conditions. Our method can be used for the large-scale identification of sRNAs in any microbe for which a suitably tiled microarray is available.

Fig. 3.

Cell cycle regulated sRNAs. A,C,E,G show cell cycle profiles from the CauloHI1 microarray assay (above) and Northern blots (below) for RNA isolated at indicated times after synchrony, as described in Experimental procedures. In the representation of the microarray data, the Y-axes indicate relative expression levels normalized so that the minimum value is 1. In the Northern blots, sRNA sizes are shown in nucleotides. For the sRNA in the intergenic region between CC1316 and tRNA Asn in (G), the arrow indicates the transcript identified by the probe correlation analysis. B, D, F, H show diagrams of IGRs containing sRNAs. Arrows indicate direction of transcription. The distance in nucleotides between the sRNA and each adjacent gene is indicated, measured as described for Fig. 2. In (D) and (H), the sequences of the sRNAs between CC3552 and CC3555 and between CC1316 and tRNA_Asn are shown, with conserved regulatory motifs boxed. Upstream arrows indicate estimated sRNA transcriptional start sites; downstream arrows show stems of predicted rho-independent terminator hairpins. Motif cc_10 is the consensus binding site for CtrA (Quon et al., 1996; McGrath et al., 2007). (I) shows a schematic of the C. crescentus cell cycle. Grey shading indicates times when CtrA is present during the cell cycle (Quon et al., 1996; Domian et al., 1997).

Results

Identification of candidate sRNAs using whole-genome tiling microarrays

The CauloHI1 Affymetrix whole-genome tiling microarray design is described at http://www.stanford.edu/group/caulobacter/CauloHI1 (McGrath et al., 2007). This custom C. crescentus genome Affymetrix microarray has 25-mer probes in perfect matched (PM)/mismatched (MM) pairs tiled every 5 bp on both strands of IGRs. We synchronized cultures of wild-type strain CB15N (Evinger and Agabian, 1977) and isolated RNA from 12 samples collected at 15 min intervals during a single cell cycle. To enhance detection sensitivity by enriching for sRNAs and removing cross-reactive species, the RNA was size-selected, retaining molecules ranging from ~35–500 nucleotides in length. The size-selected RNA was hybridized directly to the microarrays and hybridization was detected using an antibody that recognizes RNA : DNA hybrids (Zhang et al., 2003).
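To give a sense of how densely a short transcript is sampled by this design, the following minimal sketch counts the 25-mer probes that can lie entirely within a transcript when probes are tiled every 5 bp on one strand. The function names are hypothetical, and the count assumes that one probe starts exactly at the first base of the transcript, so it should be read as an upper bound rather than a description of the actual CauloHI1 layout.

```python
def probe_center(probe_start, probe_length=25):
    """Coordinate of the central nucleotide of a probe (13th base of a 25-mer)."""
    return probe_start + probe_length // 2


def probes_within(transcript_length, probe_length=25, step=5):
    """Upper bound on the number of probes lying entirely inside a transcript.

    Assumption (illustrative only): one probe starts exactly at the
    transcript's first base; other tiling offsets can give one probe fewer.
    """
    if transcript_length < probe_length:
        return 0
    return (transcript_length - probe_length) // step + 1


# Probe counts per strand for a few transcript lengths (in nucleotides).
for length in (58, 90, 190):
    print(length, probes_within(length))
```

Under these assumptions, even a 58-nucleotide transcript can be covered by several overlapping probes, which is what makes a correlation-based analysis of adjacent probes feasible for short RNAs.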

Identification of candidate RNAs

For every probe on the microarray, the signals for the 12 different microarray experiments form a vector with 12 elements. The vectors for any two probes assaying regions of RNA expressed as part of the same transcript should be correlated if the transcription levels for that RNA vary over the cell cycle. We computed the matrix of Pearson correlation coefficients for all probe pairs in every IGR. Before computing the correlation matrices, we pre-processed the data to reduce probe-specific noise (see Experimental procedures).

As only ~15% of mRNAs are estimated to be cell cycle regulated (Laub et al., 2000), we expected that the majority of sRNAs would not exhibit significant variation over the cell cycle. Thus, we expected the correlation signal for probe pairs within an RNA transcript that is constitutively expressed to be negligible. However, we found that if we did not normalize for interchip variation before correlating probe intensities (see Experimental procedures), we observed a correlation signal from expressed probes that are not cell cycle controlled. This signal is the result of systematic experimental variation in the isolation and detection of RNA from each sample that affects all RNAs in the same sample similarly. We exploited this signal to identify non-cell cycle-regulated sRNAs. We subsequently normalized for these systematic interchip variations (see Experimental procedures) to identify sRNAs whose expression is under cell cycle control.

We devised an algorithm, based on analysis of correlation matrices, to find sets of highly correlated probes within these matrices. Our algorithm assigns likelihood scores to all sets of probes within an IGR, which are used to identify the best sRNA candidate in each IGR and to rank all candidate sRNAs. The algorithm is described in Experimental procedures.
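The correlation step described above can be illustrated with a short sketch. It computes the Pearson correlation matrix for the probes of one IGR and then scans for the contiguous run of probes whose pairwise correlations most exceed a threshold. The scoring rule, the threshold of 0.5, the minimum run length and all function names are assumptions made for illustration; this is not the likelihood scoring described in Experimental procedures.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length signal vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat signal carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(signals):
    """signals[i] is the vector of one probe across all samples (12 here)."""
    n = len(signals)
    return [[pearson(signals[i], signals[j]) for j in range(n)] for i in range(n)]


def best_correlated_run(corr, threshold=0.5, min_probes=3):
    """Return (start, end, score) for the best run of adjacent probes [start, end).

    Illustrative score: sum over probe pairs in the run of (r - threshold),
    so weakly correlated neighbours lower the score of a run.
    """
    n = len(corr)
    best = None
    for start in range(n):
        for end in range(start + min_probes, n + 1):
            score = sum(corr[i][j] - threshold
                        for i in range(start, end)
                        for j in range(i + 1, end))
            if best is None or score > best[2]:
                best = (start, end, score)
    return best
```

Because every probe in a transcript shares the same sample-to-sample variation, a run of adjacent probes with high mutual correlation stands out against background probes even when the transcript itself is not cell cycle regulated.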

Figure 1 shows the expression profiles and correlation matrices for two highly ranked sRNAs. In both of these cases, the probe correlation analysis identified regions of strong signal and predicted 5′-transcript boundaries within four nucleotides of the boundary determined by 5′ RACE analysis and within 17 nucleotides (Fig. 1A) and 14 nucleotides (Fig. 1B) of the 3′ boundary predicted using Transterm (Kingsford et al., 2007) to estimate rho-independent terminators. For the candidate sRNA located between CC2171 and CC2172, though there appear to be two non-responsive probes within the expressed region, the correlation analysis predicted a single transcript based on the presence of large numbers of adjacent correlated probes, and this was confirmed by Northern blotting (Table 1, Fig. 5). Table S2 provides a comparison of transcript boundaries predicted by correlation analysis with those determined by independent methods (5′ RACE analysis, transcriptional terminator prediction, and Northern blots) for 23 validated transcripts.

Fig. 1.

Transcripts identified by probe correlation analysis.

(A) shows the transcript analysis of the sRNA between CC2171 and CC2172; (B) shows the transcript analysis of the sRNA between CC1840 and CC1841. In each panel the upper figure shows levels of RNA expression across the indicated intergenic region. The x-axis gives the location in the intergenic region, relative to the final nucleotide in the upstream ORF, of the central (13th) nucleotide in each 25-mer probe. Each differently coloured data set is from the series of probes tiled across the IGR at a different 15 min time point in the cell cycle over 165 min. Arrows indicate experimentally determined transcript boundaries. The middle figure in both panels shows colour coded correlation values, calculated using un-normalized data, for all signal vector-pairs in the indicated IGR. Red denotes high positive correlation, and blue denotes high negative correlation. Vertical black bars indicate transcript boundaries predicted by the correlation analysis. The values below the correlation chart in each panel compare sRNA transcript boundary estimates from the probe correlation analysis with those from 5′ RACE experiments or transcriptional terminator predictions.

Table 1.

sRNAs in Caulobacter crescentus.

Intergenic Region Left gene Right gene sRNA strand Estimated start (a) Estimated stop (b) Length (Northern) (c) Induction Comment
Known sRNAs
  CC0266_CC0267 281985 282044 na 4.5S RNA
  CC2563_CC2564 2774736 2774342 na RNase P
  CC3245_CC3246 3507103 3506914 na 6S RNA
  tmRNA na tmRNA
Cell cycle regulated sRNAs
  CC2167_CC2168 2371759-T 2371848 90/83 Expressed at swarmer-to-stalked cell transition
  CC3552_CC3555 (d) 3804150 3804253 104 Expressed at swarmer-to-stalked cell transition
  CC0196_CC0197 210769-T 210642 131 Expressed in early stalked cells
  CC1316_tRNA Asn 1465670 1465727 58 Expressed in pre-divisional cells
sRNAs expressed under specific growth conditions
  CC0741_CC0743 (d) 814107-T 814181 76 Stationary phase
  CC0804_CC0805A 894721-T 894546 176 Stationary phase
  CC3412_CC3413 3654964 3655112 143 Carbon starvation
  CC2642_CC2643 2863197-T 2863292 96 Minimal media
  CC3510_CC3511 3755518-T 3755383 131/66 Minimal media
  CC3664_CC3666A (d) 3920055 3920190 135/87 Minimal media
  CC3664_CC3666B 3920305 3920409 105 Minimal media
  CC0804_CC0805B 894394-T 894602 209 Rich media Antisense: 5’ end CC0804
  CC0848_CC0849 (d) 944024 944119 96 Rich media
  CC1840_CC1841 2033966-T 2033872 92 Rich media Promoter motif identified (McGrath et al., 2007)
sRNAs antisense to the 5’ end of transposase genes
  ISCc2 transposases (see Fig. 2) 144 The same region encoding a 144 bp sRNA is found upstream of 5 annotated ISCc2 transposase genes (see Fig. 2)
  CC2739_CC2740 2955126 2955037 84 Antisense: 5’ end IS1111A/ IS1328/IS1533 transposase (Fig. 2)
Other sRNAs
  CC0088_CC0089 96585-T 96459 119
  CC0446_CC0447 464843 464754 90 Antisense: 5’ end CC0447
  CC0618_CC0619 682188-T 682282 95
  CC0734_CC0735 800923 800809 128
  CC0745_CC0746 818302-T 818364 63
  CC0792_CC0793 874583-T 874673 91
  CC1017_CC1018 1149991-T 1150327 336/303/281
  CC1469_CC1470 1619154-T 1619343 190/133/108
  CC2171_CC2172 2378333-T 2378518 190
  CC3212_CC3213 3471772 3471894 123/93
  CC3513_CC3514 3759303-T 3759475 nd Antisense: 5’ end CC3513

(←) indicates transcription from the minus strand; (→) indicates transcription from the plus strand.

a. Start sites that were determined using 5′ RACE are shown in boldface. ‘-T’ indicates that the RACE products are TAP-dependent and thus likely to indicate sites of transcription initiation. In cases where RACE was not performed, if a rho-independent terminator was predicted, the 5′ boundary was estimated from the location of the terminator and the length of the sRNA, as determined by Northern blot. Otherwise, start sites were estimated from the correlation analysis. The 5′ sRNA boundaries for sRNAs between CC0446 and CC0447 and between CC2739 and CC2740 overlap mRNAs and were not considered in the correlation analysis. Start sites were estimated manually for these two transcripts.

b. Boldface indicates that stop sites are estimated from the 3′ end of the stem belonging to a rho-independent terminator predicted with > 90% confidence using Transterm. In cases where a terminator was not predicted, if a transcriptional start site was determined by RACE, the 3′ boundary was estimated from the location of the start site and the length of the sRNA, as determined by Northern blot. Otherwise, stop sites were estimated from the correlation analysis.

c. na indicates that Northern blots were not performed. nd indicates that no band was observed by Northern blotting.

d. These sRNAs were not identified by correlation analysis but were instead detected as described in Fig. S1.
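The boundary estimates described in footnotes a and b amount to simple coordinate arithmetic, sketched below. The sketch assumes 1-based inclusive genome coordinates and uses hypothetical function names; the exact convention used to build Table 1 is not stated, although the CC3552_CC3555 and CC0446_CC0447 entries are consistent with this arithmetic.

```python
def estimate_start(stop, length, strand):
    """5' boundary from a terminator-derived 3' end and a Northern blot length.

    Assumes 1-based inclusive coordinates (an assumption, not stated in Table 1).
    """
    if strand == "+":
        return stop - length + 1
    if strand == "-":
        return stop + length - 1
    raise ValueError("strand must be '+' or '-'")


def estimate_stop(start, length, strand):
    """3' boundary from a RACE-mapped start site and a Northern blot length."""
    if strand == "+":
        return start + length - 1
    if strand == "-":
        return start - length + 1
    raise ValueError("strand must be '+' or '-'")


# Plus-strand example using the CC3552_CC3555 values from Table 1.
print(estimate_start(3804253, 104, "+"))  # 3804150
# Minus-strand example using the CC0446_CC0447 values from Table 1.
print(estimate_start(464754, 90, "-"))    # 464843
```

Because Northern blot sizes are approximate, boundaries obtained this way should be treated as estimates rather than mapped ends.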

Fig. 5.

Northern blots of constitutively expressed sRNAs. RNA was isolated from cells grown in M2G minimal media. Asterisks indicate sRNA bands consistent with boundary predictions made by correlation analysis in conjunction with 5′ RACE and Transterm terminator predictions (Table 1 and Table S2). DNA markers are shown at left. Arrows indicate direction of transcription.

sRNAs in C. crescentus

Table 1 shows 27 novel sRNAs which are predicted to be synthesized from their own transcription initiation sites (see Experimental procedures) and are not predicted to code for proteins by the Glimmer and Genemark ORF prediction algorithms (Lukashin and Borodovsky, 1998; Delcher et al., 2007). Of these 27, 26 were verified by Northern blots (Figs 2–5) and another was verified by 5′ RACE (see below). The sRNA 5′ and 3′ boundaries are from 5′ RACE experiments and/or rho-independent terminator predictions (Table 1, bold entries). Fourteen of the 27 novel sRNAs were among the top 100 in the correlation rankings, out of ~1800 ranked IGRs. Among the 100 highest ranking sRNA candidates, 18 were evaluated by Northern blotting and 5′ RACE, and 16 (14 novel and two previously identified) of these were validated (Table S1 and Fig. S1, Group A). We evaluated seven of 150 sRNA candidates ranked between 101 and 250, and only three novel sRNAs were validated (Table S1 and Fig. S1, Group B). Thus, the population of highest scoring candidates is enriched for validated sRNAs.
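The enrichment statement can be made concrete by comparing the fraction of evaluated candidates that were validated in each rank group. The sketch below uses only the counts given in the paragraph above; the variable and function names are hypothetical, and with only seven candidates evaluated in the lower group the comparison is indicative rather than statistically rigorous.

```python
from fractions import Fraction


def validation_rate(validated, evaluated):
    """Fraction of evaluated candidates that were validated."""
    return Fraction(validated, evaluated)


top_100 = validation_rate(16, 18)   # ranks 1-100
next_150 = validation_rate(3, 7)    # ranks 101-250

print(f"ranks 1-100: {float(top_100):.0%}")
print(f"ranks 101-250: {float(next_150):.0%}")
```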

Fig. 2.

Transposase genes associated with predicted cis-antisense sRNAs.

A. Diagrams show a common 144 nucleotide (nt) cis-antisense sRNA adjacent to five independently positioned ISCc2 transposase genes (four of which are expressed from the positive strand and one of which is expressed from the negative strand). In each case, the 144 nt sRNA is 13 nt upstream of the transposase coding region. Below is shown an 84 nt cis-antisense sRNA whose transcript overlaps the IS1111A transposase gene mRNA by 32 nt. Arrows indicate direction of transcription. Sizes of overlapping regions and distances between transcripts are from identified transcriptional start sites (McGrath et al., 2007), indicated by the asterisk, or the predicted start codon of the encoded ORF.

Genome locations are in parentheses.

B. Northern blots showing expression of cis-antisense sRNAs. DNA markers are shown at left.

Ten of the novel 27 sRNAs were not ranked among the 250 highest scoring candidates (see Fig. S1 for a detailed description of the analysis scheme and our criteria for identifying the novel sRNAs). Five of these 10 were found in a search for stretches of consecutive highly expressed probes (Fig. S1, Group D and Table S1), and one was identified in an analysis of IGRs containing known transcription factor binding sites (Fig. S1, Group E). Three sRNAs (Fig. S1, Group C) were never considered in the correlation analysis because they were located in IGRs incorrectly annotated to contain an ORF, while one, located between CC0848 and CC0849, was overlooked because a higher scoring candidate was identified elsewhere in the same IGR.

The probe signal correlation analysis also detected three sRNAs previously identified as 4.5S RNA (Winzeler et al., 1997), the RNA subunit of RNase P (Brown, 1999), and 6S RNA (Barrick et al., 2005). A fourth documented sRNA, the tmRNA, SsrA (Keiler et al., 2000), was detected, but it was tiled at too low a probe density for precise boundary estimation.

One sRNA, between CC3513 and CC3514 (Table 1), was undetectable in Northern blots, and it had low signal levels in the microarray assay, yet it was highly ranked in the probe correlation analysis (Table 2). Using tobacco acid pyrophosphatase (TAP)-mediated 5′ RACE (Bensing et al., 1996), we identified a TAP-dependent transcription initiation site 25 nucleotides into a 190 nucleotide region that is conserved in Caulobacter species K31, the closest sequenced relative to C. crescentus. A 13 bp hairpin overlaps the predicted 3′ boundary and may serve as a terminator. Although this sRNA was poorly expressed under our experimental conditions, its detection demonstrates the sensitivity of the probe correlation analysis method for sRNA identification.

Table 2.

Top 30 intergenic regions with the highest correlation scores.

Rank Intergenic region Strand Estimated start Estimated stop Length (bp) RNA class Score
  1 CC0681_CC0682 (a) − 748046 747702 > 345 sRNA candidate 1606.5
  2 CC2336_CC2337 − 2539899 2539655 > 245 5’ UTR 920.8
  3 CC1749_CC1750 + 1925003 1925322 > 320 5’ UTR- Cobalamin riboswitch 822.1
  4 CC2563_CC2564 − 2774736 2774342 395 sRNA- RNaseP 789.9
  5 CC0659_CC0660 − 731249 730860 390 Repeat element 645.9
  6 CC3355_CC3356 − 3603521 3603307 215 5’ UTR- Glycine riboswitch 581.3
  7 CC2624_CC2625 − 2842353 2842034 > 320 rRNA 5’ leader 567.5
  8 CC0641_CC0642 − 709507 708973 > 535 sRNA 544.0
  9 CC0495_CC0496 + 516251 516555 > 305 5’ UTR 543.4
10 CC0481_CC0482 + 502949 503233 255 5’ UTR- Cobalamin riboswitch 536.0
11 CC0996_CC0997 + 1119432 1119636 205 Newly annotated mRNA 514.2
12 CC0680_CC0681 (a) − 747411 747097 > 315 sRNA candidate 471.2
13 CC2922_CC2923 − 3144469 3144150 320 sRNA 401.6
14 CC2171_CC2172 + 2378337 2378501 165 sRNA 389.8
15 CC0661_CC0662 − 732770 732471 300 sRNA 378.2
16 CC1676_CC1677 + 1850888 1851057 > 170 5’ UTR 355.1
17 CC0924_CC0925 + 1025217 1025361 > 145 5’ UTR 341.8
18 CC3085_CC3086 − 3316272 3316068 205 Repeat element 330.3
19 CC3065_CC3066 + 3291812 3291931 120 Repeat element 316.1
20 CC3202_CC3203 − 3458277 3458048 > 230 5’ UTR 285.4
21 CC3513_CC3514 + 3759251 3759475 225 sRNA 274.0
22 CC0875_CC0876 + 970188 970437 > 250 5’ UTR 261.2
23 CC0161_CC0162 + 170746 170925 180 5’ UTR 256.6
24 CC0639_CC0640 − 707855 707526 > 330 Newly annotated mRNA 248.3
25 CC0007_CC0008 + 5496 5640 > 145 5’ UTR 239.7
26 CC2815_CC2816 − 3037029 3036795 235 sRNA 239.3
27 CC3249_CC3250 + 3512505 3512674 170 5’ UTR 235.8
28 CC3705_CC3706 + 3961157 3961351 > 195 3’ UTR 230.9
29 CC0241_CC0242 − 256210 256056 155 3’ UTR 228.9
30 CC3412_CC3413 + 3654966 3655120 155 sRNA 208.5

Start and stop sites are estimated by correlation analysis. > indicates that expression continues into an adjacent ORF and an exact transcript length could not be determined.

a. Expression from these two intergenic regions, along with expression from the intergenic region between CC0682 and CC0683, appears to be part of the same transcript. Updated ORF predictions indicate that CC0680–CC0682 were previously misannotated as ORFs. Multiple large transcripts (> 1 kb) were detected by Northern blot and a tobacco acid pyrophosphatase (TAP)-independent start site was identified by 5′ RACE. However, because of the large size and heterogeneity of these transcripts, we refer to these as sRNA candidates, rather than as validated sRNAs.
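The start, stop and length columns of Table 2 can be cross-checked against one another with simple arithmetic. The sketch below assumes 1-based inclusive coordinates and that a start coordinate greater than the stop coordinate indicates transcription from the minus strand; both are assumptions for illustration, and the function names are hypothetical. For entries marked with >, the computed span is only a lower bound on the transcript length.

```python
def span_length(start, stop):
    """Length in bp of the region between start and stop, inclusive."""
    return abs(stop - start) + 1


def strand_from_coordinates(start, stop):
    """Infer the strand from coordinate order (assumed convention)."""
    return "+" if stop >= start else "-"


# Rank 4 (CC2563_CC2564, RNase P) and rank 14 (CC2171_CC2172) from Table 2.
for start, stop in ((2774736, 2774342), (2378337, 2378501)):
    print(start, stop, strand_from_coordinates(start, stop), span_length(start, stop))
```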

cis-antisense sRNAs

cis-antisense sRNAs are often found in plasmids, phage and transposons (Brantl, 2007). These sRNAs are expressed from the strand opposite an ORF and overlap the mRNA at either the 5′ or 3′ end. sRNA:mRNA base-pairing interactions can have a variety of positive and negative regulatory consequences (Opdyke et al., 2004; Brantl, 2007). We identified two transposase-associated sRNAs that probably belong to this class (Table 1). An identical 144 nucleotide TAP-dependent sRNA is transcribed from the opposite strand of five independent ISCc2 transposase genes (Fig. 2). An 84-nucleotide sRNA is transcribed from the opposite strand of the CC2740 transposase gene, overlapping the CC2740 mRNA by 32 nucleotides (Fig. 2). This configuration is similar to that observed for antisense sRNAs which function in transposition silencing (Simons and Kleckner, 1983; Arini et al., 1997).

Additional sRNAs were identified with boundaries overlapping annotated protein coding transcripts. A sRNA located between CC0446 and CC0447 overlaps the first 23 nucleotides of the CC0447 coding region, which encodes a putative beta-N-acetylhexosaminidase (Fig. 5). Two sRNAs are located between CC0804 and CC0805. The 212 nucleotide sRNA transcribed from the (+) strand (transcript B), which is preferentially expressed in rich media, overlaps 88 nucleotides of the 5′ end of the CC0804 transcript that encodes a proline dehydrogenase (Fig. 4D). The second sRNA, transcript A, is preferentially transcribed in stationary phase (Fig. 4A). It is 176 nucleotides long and overlaps the 212 nucleotide sRNA by 58 nucleotides on the opposite strand. This overlap may allow a sRNA:sRNA regulatory interaction. The boundaries of seven more sRNAs are within 25 nucleotides of a predicted coding region on the opposite strand (Figs 3–5).

Fig. 4.

sRNAs expressed under specific growth conditions.

A. Two sRNAs were induced by entry into stationary phase. E, RNA isolated from cells grown in M2G media in exponential phase; ES, from cells in early stationary phase (OD = 1.0); S, from cells incubated an additional 24 h after reaching OD = 1.0.

B. The expression of a sRNA induced by carbon starvation. M2 indicates RNA taken from cells grown in M2G media, then washed and re-suspended in M2 media (M2G lacking glucose) and incubated for another 10 min. M2G indicates RNA taken from cells treated identically but washed and re-suspended in M2G media for the same amount of time.

C and D. Four sRNAs were upregulated in minimal media with glucose as the sole carbon source (C). Three sRNAs were upregulated in rich PYE media (D). M, RNA isolated from cells in exponential phase grown in M2G minimal media; P, RNA isolated from cells in exponential phase grown in PYE rich media. Asterisks indicate the sRNA bands on the Northern blots that are consistent with predictions made by probe correlation analysis in conjunction with 5′ RACE and Transterm terminator predictions. DNA markers are shown at left. Schematics are as described in Fig. 3.

Differentially expressed sRNAs

Fourteen of the 27 novel sRNAs in Table 1 were differentially expressed among the experimental conditions examined and have been categorized as cell cycle regulated or as induced in stationary phase, glucose starvation, rich media or minimal media.

Cell cycle-regulated sRNAs.

Approximately 15% of C. crescentus mRNAs have cell cycle-dependent expression patterns (Laub et al., 2000), as does the sRNA, tmRNA (Keiler and Shapiro, 2003a). Four of the novel sRNAs exhibited cell cycle-dependent expression profiles by both microarray assay and Northern blots (Fig. 3).

On Northern blots, the candidate sRNA expressed from the IGR between CC2167 and CC2168 migrated as two bands of 90 and 83 nucleotides in length (Fig. 3A). The sRNAs corresponding to each of these bands accumulated at the swarmer-to-stalked cell transition. RACE identified a TAP-dependent product and a single TAP-independent product seven nucleotides shorter (data not shown), suggesting that the shorter band is generated from a 5′ processing event (Fig. 3B).

Another sRNA, located between CC3552 and CC3555, also accumulated at the swarmer-to-stalked cell transition (Fig. 3C). Northern blots revealed a band of 104 nucleotides, and a strong terminator sequence signal is located near the predicted 3′ boundary (Fig. 3D). Directly upstream of this sRNA, there is a near perfect match to a motif upstream of a group of 26 diverse genes that are transcribed at the swarmer-to-stalked cell transition (McGrath et al., 2007).

Northern blots of the IGR between CC0196 and CC0197 showed a sRNA of 131 nucleotides whose expression peaked in stalked cells (Fig. 3E). 5′ RACE identified a TAP-dependent transcriptional start site, and the 3′ end of this transcript contains a strongly predicted terminator, yielding a transcript size that agrees with the band observed by Northern blots (Table 1 and Fig. 3F).

A temporally regulated sRNA between CC1316 and tRNA-Asn accumulates in pre-divisional cells (Fig. 3G). Examination of this region by Northern blot revealed that a transcript of 58 nucleotides accumulates in a pattern consistent with the microarray data (Fig. 3G). There is a predicted terminator (Fig. 3H) at a location consistent with the size of the RNA band on the Northern blot and the start site predicted by correlation analysis (Table S2). A sequence perfectly matching the consensus CtrA binding site (Quon et al., 1996; McGrath et al., 2007) is found 25 bp upstream of the predicted start site (Fig. 3H). CtrA is an essential cell cycle transcriptional regulator that directly controls over 95 genes involved in polar organelle development and cell division (Laub et al., 2002). Chromatin immunoprecipitation analysis previously showed that CtrA binds directly to the CtrA motif in this IGR (Laub et al., 2002), and the peak accumulation of the 58 nucleotide sRNA is at the time of appearance of the CtrA protein in the pre-divisional cell (Fig. 3I). This suggests that transcription of this sRNA is controlled by CtrA. Two less abundant bands of 221 nucleotides (not shown) and 148 nucleotides were also detected that did not show cell cycle regulated expression (Fig. 3G).

Stationary phase and carbon starvation induced sRNAs.

The expression profiles of six sRNAs showed a peak at the initial time points after synchronization and then declined rapidly. Many genes associated with responses to environmental stresses show this pattern in microarray gene expression assays, probably owing to stress from the synchronization procedure. In E. coli, a general stress response is induced in stationary phase (Lange and Hengge-Aronis, 1991) coincident with the accumulation of several sRNAs (Argaman et al., 2001; Wassarman et al., 2001; Vogel et al., 2003). To explore the basis for these C. crescentus sRNA stress response patterns, we compared Northern blot expression levels of all six sRNAs in log phase and in stationary phase. Two of the six sRNAs accumulated in stationary phase (Fig. 4A). The sRNA in the IGR between CC0741 and CC0743 accumulated in early stationary phase, and the sRNA between CC0804 and CC0805 (transcript A) accumulated in both early and late stationary phase.

These six sRNAs were also examined for a response to carbon starvation. Cultures were grown to mid-exponential phase in minimal media containing glucose as the sole carbon source and then washed and transferred to minimal media lacking glucose. Northern blot analysis was carried out for all six sRNAs. The 143 nucleotide sRNA located between CC3412 and CC3413 showed a significant increase in the sRNA level within 10 min of glucose removal (Fig. 4B), suggesting that this sRNA may mediate a rapid adaptive response to carbon starvation.

Minimal or rich media-specific sRNAs.

For seven of the sRNAs shown in Table 1, transcript levels differed significantly between cells grown in minimal media (M2 minimal media supplemented with 0.2% glucose, M2G) and cells grown in rich media (peptone yeast extract, PYE). Northern blots for these seven sRNAs (Fig. 4C and D) show that when cells are grown in M2G (where glucose is the sole carbon source), four sRNAs (one located in the IGR between CC2642 and CC2643, one between CC3510 and CC3511, and two between CC3664 and CC3666) accumulated to higher levels than in cells grown in PYE (where amino acids are the primary carbon source). Three novel sRNAs accumulate specifically in cells grown in PYE. They are located in the IGRs between CC0804 and CC0805 (transcript B), between CC0848 and CC0849, and between CC1840 and CC1841.

Additional sRNAs

Conditions affecting the expression of 11 of the 27 sRNAs shown in Table 1 were not identified, so these sRNAs are categorized as ‘Other’ in Table 1. Northern blots for 10 of these sRNAs are in Fig. 5. In six cases, the Northern blots showed a single predominant band consistent with boundaries estimated from 5′ RACE, rho-independent terminator prediction and probe correlation analysis. In the four cases where multiple bands were observed, the 5′ ends of the sRNAs were mapped by RACE. For the sRNA located between CC0618 and CC0619, RACE identified a TAP-dependent start site consistent with the probe signals and the 95 nucleotide transcript seen in the Northern blot. For the sRNAs located between CC1017 and CC1018 and between CC1469 and CC1470, RACE identified TAP-dependent start sites consistent with the sizes of the largest bands observed on Northern blots. The presence of shorter, TAP-independent, RACE products suggests that the lower bands were the result of 5′ processing events. For the sRNA between CC3212 and CC3213, we identified TAP-independent 5′ ends consistent with both the bands observed by Northern blot and the expression profile in the microarray data, although we did not identify a TAP-dependent product indicative of a transcription initiation site. As the upstream ORF is encoded on the opposite strand, this is unlikely to be a processed fragment of an mRNA, but rather a sRNA that is subjected to 5′ processing.

Discussion

We have identified and validated 27 novel sRNAs in C. crescentus by combining sRNA identification using tiled microarray probe correlation analysis with 5′ RACE analysis, transcriptional terminator prediction, and Northern blot analysis. This approach for global identification of sRNAs is applicable to any sequenced microbial species. The method requires inclusion of dense intergenic tiling in the microarrays designed for the species. The alternative of sRNA identification by comparative genomics analysis is only applicable for species where sequences of several closely related species are available. Only eight of the 27 novel sRNAs reported here are conserved in Caulobacter species K31, the closest sequenced relative of C. crescentus.

The sensitivity of our sRNA identification approach results from four considerations: (i) 5 bp tiling of the Affymetrix CauloHI1 microarray in the IGRs gave many independent sRNA measurements even within short RNAs (dense tiling also facilitates estimation of sRNA boundaries); (ii) size fractionation of the input RNA samples increased the representation of sRNAs and reduced cross-hybridization based on comparison of probe signals for sRNA-containing IGRs from our data with signals observed for the same IGRs in data sets obtained from reverse transcription of unfractionated total RNA (P.T. McGrath and H.H. McAdams, unpubl. data); (iii) RNA detection sensitivity was optimized by direct hybridization to the microarray coupled with detection using an antibody recognizing RNA:DNA hybrids (Zhang et al., 2003); and (iv) the application of the probe correlation analysis prior to the normalization of the data enabled detection of sRNAs whose expression level is independent of time in the cell cycle.

In addition to 27 novel sRNAs, we identified four previously known sRNAs. Additional candidate sRNAs were identified in possible UTR sequences, but these were not investigated in detail in this study. For example, the 30 highest ranked sRNA transcripts from the probe correlation analysis, shown in Table 2, include 14 fragments from 5′ and 3′ UTRs. Some of these are riboswitch elements which have been previously reported (Vogel et al., 2003; Kawano et al., 2005). Northern blot analysis showed that two of the UTR candidates shown in Table 2 (located between CC0241 and CC0242 and between CC3705 and CC3706) are discrete RNA species (data not shown) and thus may have a cellular function. In addition, three of the 30 highest ranked sRNA transcripts (Table 2) from the probe correlation analysis, as well as others in the 200 highest ranked RNA candidates, are expressed from palindromic repeat elements in IGRs. Several families of palindromic elements have been identified in the C. crescentus genome (Chen and Shapiro, 2003; P.T. McGrath and H.H. McAdams, unpublished). Northern blot analysis using probes for two families represented multiple times in the correlation rankings confirmed that they are transcribed.

Ten of the 27 novel C. crescentus sRNAs in Table 1 exhibit a differential response to nutritional or stationary phase challenge. Of these, expression of four is upregulated in minimal media (M2G), while three are upregulated in rich media (PYE). A sRNA gene between CC1840 and CC1841, which is downregulated in minimal media, has a promoter motif that is conserved in 13 genes with mRNA expression profiles seen for stress-response genes (McGrath et al., 2007). These sRNAs may be involved in adaptation to the different growth rates observed in rich media as compared with minimal media, or they may be involved in adaptation to levels of specific nutrients that differ between the two media. Global transcript profiles of C. crescentus grown in M2G or PYE media using oligonucleotide microarrays revealed 119 genes with mRNA levels significantly higher in M2G than in PYE media, and 88 genes with mRNA levels higher in PYE than in M2G (Hottes et al., 2004). The mRNAs encoding membrane proteins of the TonB-dependent receptor and ABC-transporter families are highly represented among this group of differentially expressed genes (Hottes et al., 2004). sRNAs are known to play a critical role in adapting the protein composition of the membrane to environmental conditions (Antal et al., 2005; Guillier and Gottesman, 2006; Sharma et al., 2007). Additionally, C. crescentus possesses an ortholog of the Hfq RNA-binding protein found in many bacterial species (Sun et al., 2002), including all species where substantial numbers of sRNAs have been identified (Ostberg et al., 2004). Hfq is an important cofactor for the post-transcriptional regulatory function of many sRNAs that mediate adaptation to environmental conditions (Gottesman et al., 2006). We have found that several of the novel sRNAs described here bind C. crescentus Hfq, and an hfq mutant shows impaired survival under several different stress conditions (data not shown). These observations suggest that these sRNAs are involved in environmental adaptation.

The expression of four of the novel sRNAs is cell cycle regulated, and for two of these, the promoter regions contain previously identified regulatory motifs. McGrath et al. identified transcription start sites of 769 C. crescentus genes and 10 novel promoter motifs upstream of many cell cycle-regulated genes (McGrath et al., 2007). One of these motifs, identified as cc_7, found in the promoter region of 26 genes expressed at the swarmer-to-stalked cell transition, is present in the promoter region of the sRNA between CC3552 and CC3555 that is transcribed at the same time. A recent report suggests that the cc_7 motif is bound by two ECF sigma factors, SigT and SigU (Alvarez-Martinez et al., 2007). A SigT deletion strain shows reduced viability under conditions of osmotic shock and oxidative stress (Alvarez-Martinez et al., 2007), suggesting a role for this sRNA in these stress conditions. The 58 nucleotide sRNA between CC1316 and tRNA Asn is controlled by the CtrA cell cycle regulatory protein. Evidence supporting this role for CtrA includes: (i) a consensus CtrA binding motif is upstream of this sRNA’s predicted transcriptional start site; (ii) the promoter region binds CtrA in chromatin immunoprecipitation experiments (Laub et al., 2002); and (iii) this sRNA accumulates only when the transiently expressed CtrA is present (Domian et al., 1997). In addition, a previously identified cell cycle-regulated C. crescentus sRNA, the tmRNA, SsrA, is required to synchronize the start of chromosome replication with other temporally ordered cell cycle events (Keiler and Shapiro, 2003b). Thus, sRNA-mediated cell cycle regulatory pathways may play a significant role in C. crescentus cell cycle control, and the identification of these sRNAs will help clarify these functional pathways.

Experimental procedures

Bacterial strains and growth conditions

All experiments were performed with C. crescentus strain CB15N (NA1000) (Evinger and Agabian, 1977), grown at 28°C in either PYE or M2G (Ely, 1991). For carbon starvation, cells were grown to OD 0.4 in M2G, washed three times in complete M2 media lacking glucose and re-suspended in the same media.

RNA isolation and detection by Affymetrix microarray

Cells were grown to OD 0.4 in M2G and synchronized using standard procedures (Evinger and Agabian, 1977). Aliquots were taken at 15 min intervals over ~1 cell cycle (165 min), pelleted and immediately frozen. Total RNA was isolated using Trizol (Invitrogen) according to the manufacturer’s protocol. Seventy-five micrograms of total RNA was size-fractionated on 5% PAGE 1× Tris borate EDTA (TBE) gels alongside RNA size standards. Lanes containing standards were stained with ethidium bromide. RNA ranging in size from ~35–500 nucleotides was excised and eluted using the Elutrap Electro-elution system (Schleicher and Schuell) in 1× TBE buffer for 6 h at 200 V. Eluted RNA was precipitated and re-suspended in DEPC-treated water.

Size-fractionated RNA (2 μg) was fragmented by incubation for 30 min at 95°C in 50 mM Tris (pH 7.9)/100 mM NaCl/10 mM MgCl2, followed by ethanol precipitation and re-suspension in 50 μl of DEPC-treated water. RNA was hybridized to the CauloHI1 microarray (McGrath et al., 2007). Hybridizations were carried out as described by Zhang et al. (2003), except that the temperature was 50°C. Detection used a proprietary antibody that recognizes RNA:DNA hybrids (Digene Corporation) and was carried out as described by Zhang et al. (2003), except that the antibody matrix solution contained 0.1% Tween-20 and 0.1× Superblock T20 phosphate buffered saline (PBS) Blocking buffer (Pierce) and 11 μg of antibody was used for each array. There was one microarray assay for each cell cycle time point.

Microarray data pre-processing

As a pre-processing step, the Affymetrix GCOS software was used to remove the probe signals that were anomalous. We then analysed the probe-pair signals, each computed as the difference between the PM and MM probe signals of that probe pair, in the microarray data sets for all 12 cell cycle time points and for every tiled IGR on the array. Any probe-pair signal that was a negative outlier, more than five standard deviations below the mean signal level for its array, was removed from all the microarray data sets. For the majority of the C. crescentus IGRs, probe pairs on the CauloHI1 microarray are spaced every five nucleotides. Where probes were not spaced five nucleotides apart, or where probes were removed during pre-processing, we used linear interpolation to produce a data set of uniformly spaced points.
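The outlier removal and interpolation steps can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the data layout (one list of PM minus MM signals per array) and the choice to hold edge values constant outside the retained probes are all assumptions made for illustration.

```python
def drop_outlier_probes(signals, array_means, array_sds, n_sd=5.0):
    """Return indices of probes kept after negative-outlier removal.

    signals: one list per array, each holding the PM - MM signal per probe.
    array_means, array_sds: mean and standard deviation of each whole array.
    A probe is dropped from ALL arrays if it is a negative outlier on any one.
    """
    keep = []
    for j in range(len(signals[0])):
        is_outlier = any(
            row[j] < mean - n_sd * sd
            for row, mean, sd in zip(signals, array_means, array_sds)
        )
        if not is_outlier:
            keep.append(j)
    return keep


def interpolate(xs, ys, x):
    """Piecewise-linear interpolation on ascending xs.

    Assumption: values outside the range of xs are held at the edge value.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            frac = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + frac * (ys[i + 1] - ys[i])


def resample_igr(positions, signals, keep, spacing=5):
    """Resample the kept probes of one IGR onto a uniform grid."""
    grid = list(range(positions[0], positions[-1] + 1, spacing))
    xs = [positions[j] for j in keep]
    resampled = []
    for row in signals:
        ys = [row[j] for j in keep]
        resampled.append([interpolate(xs, ys, x) for x in grid])
    return grid, resampled
```

In this sketch a probe that fails the five-standard-deviation test on a single array is discarded everywhere, which mirrors the statement that the signal was removed from all data sets.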

Detection of non-cell cycle regulated sRNAs

In our analysis, we used the difference between the PM and MM probe signals as the indicator of RNA abundance. For every microarray, there is systematic noise that affects this signal level in a microarray-wide manner, as well as noise arising from the conditions at the individual probes (e.g. spurious cross-hybridization). The systematic variations arise from small differences in nominally identical sample preparation and microarray processing. Formal analyses of these effects, with error models that include both systematic and non-systematic error sources, are given in Ideker et al. (2000) and Li and Wong (2001). We found that if the probe correlation analysis method was performed before normalizing the microarrays, these systematic variations, which affected all probes within the same sRNA on each microarray identically, produced a correlation signal adequate to identify even non-cell cycle regulated sRNAs.
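A small numerical illustration may help to show why un-normalized data can reveal a constitutively expressed sRNA. The Python sketch below is a toy model, not data from this study: the array-wide scale factors, the probe-specific deviations and the idealized normalization (division by the known scale factor) are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


# Hypothetical array-wide scale factors for the 12 time points.
SCALES = [0.7, 1.3, 0.9, 1.1, 1.2, 0.8, 1.0, 1.25, 0.75, 1.15, 0.85, 1.05]
# Hypothetical probe-specific deviations, chosen to be mutually uncorrelated.
DEV_A = [20, -20] * 6
DEV_B = [20, 20, -20, -20] * 3


def simulate_probe(level, deviations, scales):
    """Signal of one probe on each array for a constant true RNA level."""
    return [s * (level + d) for s, d in zip(scales, deviations)]


# Two probes inside the same constitutively expressed sRNA.
probe_a = simulate_probe(1000.0, DEV_A, SCALES)
probe_b = simulate_probe(600.0, DEV_B, SCALES)
r_raw = pearson(probe_a, probe_b)

# Idealized normalization: divide out the known array-wide factor.
norm_a = [v / s for v, s in zip(probe_a, SCALES)]
norm_b = [v / s for v, s in zip(probe_b, SCALES)]
r_norm = pearson(norm_a, norm_b)

print(f"correlation before normalization: {r_raw:.3f}")
print(f"correlation after normalization:  {r_norm:.3f}")
```

In this toy model the shared array-wide factor drives both probes together, so their raw signals are strongly correlated, while removing that factor leaves only the uncorrelated probe-specific deviations.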

Small-RNA scoring algorithm

For each probe position, the probe-pair signals from each time point were treated as a vector with 12 elements that we call the signal vector. We computed the matrix of Pearson correlation coefficients for all signal vector pairs within every IGR, and we observed that the empirical distribution of the correlation coefficient between signal vectors within IGRs is a ‘bell-shaped’ curve that is approximately symmetric around zero (see Fig. S2). From this empirical probability density function, we estimated the probability P that the correlation between any two signal vectors would be above a threshold correlation t purely by chance. For an adjacent series of L probes there are L(L-1)/2 pairwise combinations. If these L probes are all bound by the same sRNA, we expect that over a series of experimental conditions, the correlation between pairs of the signal vectors within the sRNA footprint will be significantly higher on average than between random signal pairs. That hypothesis is the basis for our candidate sRNA detection and ranking procedure. For each set of five or more consecutive probes within an IGR, we counted the number k of signal vector pairs with a correlation above a threshold t. We then computed the probability that k out of the N = L(L-1)/2 probe-pair combinations would have a correlation of at least t, using the binomial distribution function B(k; N, P), where N is the total number of probe-pair combinations in the set, P is the probability of a single pair having a correlation above t, and k is the number of signal vector pairs in the set with correlation above t. The probability P was determined from the empirical probability distribution function for random pairwise correlations (Fig. S2). The resulting value p = B(k; N, P) was transformed to a score s = -log(p). We scored every possible set of five or more consecutive probes within each IGR, and we considered the highest scoring of these transcripts within each IGR as a candidate sRNA. We considered sets of five or more probes because this corresponds to a sRNA of ~50 nucleotides, which is near the size of the smallest known sRNAs (Hershberg et al., 2003). The ranked list of the 300 highest ranking sRNA candidates is available at http://www.stanford.edu/group/caulobacter/smallRNA/. Rankings were found to be relatively insensitive to variations in the choice of the correlation threshold t. All IGRs having more than 100 nucleotides were also examined to identify highly expressed sRNAs with low scores in the correlation analysis (Fig. S1). (These highly expressed sRNAs could saturate the microarray assay and thus yield an anomalously low correlation score.)
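The scoring procedure can be sketched in Python as follows. This is an illustrative reading of the description above rather than the implementation used in this study. Two points are assumptions: the binomial term is taken as the upper-tail probability of observing at least k correlated pairs, and the logarithm is taken to base 10. All function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length signal vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # treat constant vectors as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def chance_probability(background_correlations, t):
    """Empirical estimate of P: fraction of background pairs above t."""
    hits = sum(1 for r in background_correlations if r > t)
    return hits / len(background_correlations)


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p). Assumed reading of B(k; N, P)."""
    return sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def score_window(vectors, t, p_chance):
    """Score one set of consecutive probes (one signal vector per probe)."""
    pairs = list(combinations(vectors, 2))
    n = len(pairs)  # equals L(L-1)/2
    k = sum(1 for a, b in pairs if pearson(a, b) > t)
    prob = max(binomial_tail(k, n, p_chance), 1e-300)  # guard against log(0)
    return -math.log10(prob)  # base 10 is an assumption


def best_window(vectors, t, p_chance, min_len=5):
    """Return (score, start, end) of the top-scoring window in one IGR."""
    best = None
    for start in range(len(vectors) - min_len + 1):
        for end in range(start + min_len, len(vectors) + 1):
            s = score_window(vectors[start:end], t, p_chance)
            if best is None or s > best[0]:
                best = (s, start, end)
    return best
```

For an IGR with n probes this exhaustive search scores on the order of n² windows, each requiring a pairwise correlation pass, so caching the correlation matrix per IGR would be a natural refinement for genome-scale use.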

To eliminate signals arising from UTR fragments, the 769 transcriptional start sites (McGrath et al., 2007), transcriptional terminator predictions, and intergenic expression patterns from the two previous Affymetrix microarray studies in C. crescentus (Hu et al., 2005; McGrath et al., 2007) were used to estimate boundaries of adjacent mRNAs and identify overlap with sRNA candidates. If no sRNA : mRNA overlap was observed, and the predicted sRNA boundaries were > 40 nucleotides from adjacent ORFs, or if the expression patterns observed for the sRNA candidate and the adjacent mRNA differed over the course of the cell cycle, the sRNA was considered to be independently transcribed.

Annotation-related considerations

The CauloHI1 microarray was designed based on the C. crescentus genome annotation at the time of completion of the sequence (Nierman et al., 2001). As annotation tools have improved over the intervening years, we used contemporary tools to predict ORFs and in the analysis of the CauloHI1 data sets. ORFs were predicted for the C. crescentus genome using Glimmer3 (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Caulobacter_crescentus/NC_002696.Glimmer3) (Delcher et al., 2007) and GeneMark.hmm (http://opal.biology.gatech.edu/GeneMark/prokaryotes_database/index.cgi) (Lukashin and Borodovsky, 1998). These new ORF predictions differ somewhat from the annotation used in the CauloHI1 chip design. There were 202 DNA regions formerly identified as ORFs that are classified as IGRs in our revised genome analysis (Table S4). As the regions now reclassified as IGRs are not tiled with probes at 5 bp spacing on the CauloHI1 chip, we examined any sets of probe pairs present in these regions for evidence of sRNAs (Fig. S1). Three sRNAs were identified and experimentally verified in this set of IGRs (Group C in Table S1). Rho-independent terminators were predicted with TransTerm (http://transterm.cbcb.umd.edu/cgi-bin/transterm/predictions.pl) (Kingsford et al., 2007). Terminators predicted with a confidence of > 90% were considered valid.

Microarray data set normalization and identification of cell cycle-regulated sRNAs

To identify temporally regulated sRNAs, the 12 microarray data sets were normalized so that every data set had the same mean expression value. Using these normalized data sets and the sRNA boundary estimates in Table 1, the cell cycle expression profiles for candidate sRNAs were computed. Values for each time point were computed by averaging the probe signals within the predicted sRNA. The sRNA temporal expression profiles were analysed for evidence of cell cycle dependent expression patterns. We required sRNAs to have at least a threefold difference between minimum and maximum expression level over the cell cycle to be classified as temporally regulated.
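The normalization and threefold criterion can be sketched in Python as follows. The sketch rests on assumptions not stated in the text: each array is rescaled multiplicatively to the grand mean, the fold change is taken as the ratio of maximum to minimum profile value, and profiles with a non-positive minimum are reported as not regulated because the ratio is then undefined. Function names are hypothetical.

```python
def normalize_to_common_mean(arrays):
    """Rescale each array (list of probe signals) to a shared mean.

    Assumption: multiplicative scaling to the mean of the per-array means.
    """
    means = [sum(a) / len(a) for a in arrays]
    target = sum(means) / len(means)
    return [[v * target / m for v in a] for a, m in zip(arrays, means)]


def expression_profile(arrays, probe_indices):
    """Average the probes inside the predicted sRNA at each time point."""
    return [
        sum(a[i] for i in probe_indices) / len(probe_indices) for a in arrays
    ]


def is_temporally_regulated(profile, fold=3.0):
    """Apply the threefold minimum-to-maximum criterion to one profile."""
    lo, hi = min(profile), max(profile)
    if lo <= 0:
        return False  # ratio undefined; conservative choice in this sketch
    return hi / lo >= fold
```

A typical use would normalize the 12 time-point arrays once, then compute a profile for each candidate from the probe indices that fall within its estimated boundaries.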

Northern blotting

Northern blots were performed with DNA oligonucleotide probes (Table S3) 5′ end-labelled with 32P-ATP. Low Molecular Weight DNA ladder (New England Biolabs) was used to estimate the sizes of RNA bands. Total RNA (5–10 μg) was separated on 6% PAGE-urea gels and transferred to Hybond N+ membranes (Amersham) in 0.5× TBE using the TransBlot Semi-dry Transfer Apparatus (Bio-Rad) according to the manufacturer’s instructions. Blots were prehybridized in ULTRAhyb-Oligo buffer (Ambion) according to the manufacturer’s instructions.

RACE

5′ RACE was performed as described by Argaman et al. (2001). Sequences for all adapters and PCR primers are given in Table S3. Forty PCR cycles were performed at 53–55°C melting temperatures using 0.5–4 μl reverse transcription reaction and Platinum Taq High Fidelity polymerase (Invitrogen).

Supplementary Material

Supplemental 18373523

Acknowledgements

This work was supported by NIH grants GM51426 and GM32506 to L.S. and by DOE grants DE-FG03ER63219A001 and DE-FG02ER63219 to H.H.M. The author S.G.L. was supported by the Stanford NIH Genome Training Program.

Footnotes

Supplementary material

This material is available as part of the online article from: http://www.blackwell-synergy.com/doi/abs/10.1111/j.1365-2958.2008.06172.x

(This link will take you to the article abstract).

Please note: Blackwell Publishing is not responsible for the content or functionality of any supplementary materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Accession numbers

The accession number for the genome sequence of Caulobacter species K31 is AATH00000000.

References

  1. Alvarez-Martinez CE, Lourenco RF, Baldini RL, Laub MT, and Gomes SL (2007) The ECF sigma factor sigma(T) is involved in osmotic and oxidative stress responses in Caulobacter crescentus. Mol Microbiol 66: 1240–1255.
  2. Antal M, Bordeau V, Douchin V, and Felden B. (2005) A small bacterial RNA regulates a putative ABC transporter. J Biol Chem 280: 7901–7908.
  3. Argaman L, Hershberg R, Vogel J, Bejerano G, Wagner EG, Margalit H, and Altuvia S. (2001) Novel small RNA-encoding genes in the intergenic regions of Escherichia coli. Curr Biol 11: 941–950.
  4. Arini A, Keller MP, and Arber W. (1997) An antisense RNA in IS30 regulates the translational expression of the transposase. Biol Chem 378: 1421–1431.
  5. Babitzke P, and Romeo T. (2007) CsrB sRNA family: sequestration of RNA-binding regulatory proteins. Curr Opin Microbiol 10: 156–163.
  6. Barrick JE, Sudarsan N, Weinberg Z, Ruzzo WL, and Breaker RR (2005) 6S RNA is a widespread regulator of eubacterial RNA polymerase that resembles an open promoter. RNA 11: 774–784.
  7. Bensing BA, Meyer BJ, and Dunny GM (1996) Sensitive detection of bacterial transcription initiation sites and differentiation from RNA processing sites in the pheromone-induced plasmid transfer system of Enterococcus faecalis. Proc Natl Acad Sci USA 93: 7794–7799.
  8. Brantl S. (2007) Regulatory mechanisms employed by cis-encoded antisense RNAs. Curr Opin Microbiol 10: 102–109.
  9. Brown JW (1999) The ribonuclease P database. Nucleic Acids Res 27: 314.
  10. Chen SL, and Shapiro L. (2003) Identification of long intergenic repeat sequences associated with DNA methylation sites in Caulobacter crescentus and other alphaproteobacteria. J Bacteriol 185: 4997–5002.
  11. Del Val C, Rivas E, Torres-Quesada O, Toro N, and Jimenez-Zurdo JI (2007) Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics. Mol Microbiol 66: 1080–1091.
  12. Delcher AL, Bratke KA, Powers EC, and Salzberg SL (2007) Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23: 673–679.
  13. Domian IJ, Quon KC, and Shapiro L. (1997) Cell type-specific phosphorylation and proteolysis of a transcriptional regulator controls the G1-to-S transition in a bacterial cell cycle. Cell 90: 415–424.
  14. Ely B. (1991) Genetics of Caulobacter crescentus. Methods Enzymol 204: 372–384.
  15. Evinger M, and Agabian N. (1977) Envelope-associated nucleoid from Caulobacter crescentus stalked and swarmer cells. J Bacteriol 132: 294–301.
  16. Gottesman S, McCullen CA, Guillier M, Vanderpool CK, Majdalani N, Benhammou J, et al. (2006) Small RNA regulators and the bacterial response to stress. Cold Spring Harb Symp Quant Biol 71: 1–11.
  17. Guillier M, and Gottesman S. (2006) Remodelling of the Escherichia coli outer membrane by two small regulatory RNAs. Mol Microbiol 59: 231–247.
  18. Hershberg R, Altuvia S, and Margalit H. (2003) A survey of small RNA-encoding genes in Escherichia coli. Nucleic Acids Res 31: 1813–1820.
  19. Hottes AK, Meewan M, Yang D, Arana N, Romero P, McAdams HH, and Stephens C. (2004) Transcriptional profiling of Caulobacter crescentus during growth on complex and minimal media. J Bacteriol 186: 1448–1461.
  20. Hu P, Brodie EL, Suzuki Y, McAdams HH, and Andersen GL (2005) Whole-genome transcriptional analysis of heavy metal stresses in Caulobacter crescentus. J Bacteriol 187: 8437–8449.
  21. Ideker T, Thorsson V, Siegel AF, and Hood LE (2000) Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data. J Comput Biol 7: 805–817.
  22. Kawano M, Reynolds AA, Miranda-Rios J, and Storz G. (2005) Detection of 5′- and 3′-UTR-derived small RNAs and cis-encoded antisense RNAs in Escherichia coli. Nucleic Acids Res 33: 1040–1050.
  23. Keiler KC, and Shapiro L. (2003a) tmRNA in Caulobacter crescentus is cell cycle regulated by temporally controlled transcription and RNA degradation. J Bacteriol 185: 1825–1830.
  24. Keiler KC, and Shapiro L. (2003b) TmRNA is required for correct timing of DNA replication in Caulobacter crescentus. J Bacteriol 185: 573–580.
  25. Keiler KC, Shapiro L, and Williams KP (2000) tmRNAs that encode proteolysis-inducing tags are found in all known bacterial genomes: a two-piece tmRNA functions in Caulobacter. Proc Natl Acad Sci USA 97: 7778–7783.
  26. Kingsford CL, Ayanbule K, and Salzberg SL (2007) Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biol 8: R22.
  27. Lange R, and Hengge-Aronis R. (1991) Identification of a central regulator of stationary-phase gene expression in Escherichia coli. Mol Microbiol 5: 49–59.
  28. Laub MT, McAdams HH, Feldblyum T, Fraser CM, and Shapiro L. (2000) Global analysis of the genetic network controlling a bacterial cell cycle. Science 290: 2144–2148.
  29. Laub MT, Chen SL, Shapiro L, and McAdams HH (2002) Genes directly controlled by CtrA, a master regulator of the Caulobacter cell cycle. Proc Natl Acad Sci USA 99: 4632–4637.
  30. Lenz DH, Mok KC, Lilley BN, Kulkarni RV, Wingreen NS, and Bassler BL (2004) The small RNA chaperone Hfq and multiple small RNAs control quorum sensing in Vibrio harveyi and Vibrio cholerae. Cell 118: 69–82.
  31. Li C, and Wong WH (2001) Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci USA 98: 31–36.
  32. Livny J, Fogel MA, Davis BM, and Waldor MK (2005) sRNAPredict: an integrative computational approach to identify sRNAs in bacterial genomes. Nucleic Acids Res 33: 4096–4105.
  33. Livny J, Brencic A, Lory S, and Waldor MK (2006) Identification of 17 Pseudomonas aeruginosa sRNAs and prediction of sRNA-encoding genes in 10 diverse pathogens using the bioinformatic tool sRNAPredict2. Nucleic Acids Res 34: 3484–3493.
  34. Lukashin AV, and Borodovsky M. (1998) GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res 26: 1107–1115.
  35. McGrath PT, Lee H, Zhang L, Iniesta AA, Hottes AK, Tan MH, et al. (2007) High-throughput identification of transcription start sites, conserved promoter motifs and predicted regulons. Nat Biotechnol 25: 584–592.
  36. Mandin P, Repoila F, Vergassola M, Geissmann T, and Cossart P. (2007) Identification of new noncoding RNAs in Listeria monocytogenes and prediction of mRNA targets. Nucleic Acids Res 35: 962–974.
  37. Nierman WC, Feldblyum TV, Laub MT, Paulsen IT, Nelson KE, Eisen JA, et al. (2001) Complete genome sequence of Caulobacter crescentus. Proc Natl Acad Sci USA 98: 4136–4141.
  38. Opdyke JA, Kang JG, and Storz G. (2004) GadY, a small-RNA regulator of acid response genes in Escherichia coli. J Bacteriol 186: 6698–6705.
  39. Ostberg Y, Bunikis I, Bergstrom S, and Johansson J. (2004) The etiological agent of Lyme disease, Borrelia burgdorferi, appears to contain only a few small RNA molecules. J Bacteriol 186: 8472–8477.
  40. Pichon C, and Felden B. (2005) Small RNA genes expressed from Staphylococcus aureus genomic and pathogenicity islands with specific expression among pathogenic strains. Proc Natl Acad Sci USA 102: 14249–14254.
  41. Quon KC, Marczynski GT, and Shapiro L. (1996) Cell cycle control by an essential bacterial two-component signal transduction protein. Cell 84: 83–93.
  42. Repoila F, Majdalani N, and Gottesman S. (2003) Small non-coding RNAs, co-ordinators of adaptation processes in Escherichia coli: the RpoS paradigm. Mol Microbiol 48: 855–861.
  43. Rivas E, and Eddy SR (2001) Noncoding RNA gene detection using comparative sequence analysis. BMC Bioinformatics 2: 8.
  44. Sharma CM, Darfeuille F, Plantinga TH, and Vogel J. (2007) A small RNA regulates multiple ABC transporter mRNAs by targeting C/A-rich elements inside and upstream of ribosome-binding sites. Genes Dev 21: 2804–2817.
  45. Simons RW, and Kleckner N. (1983) Translational control of IS10 transposition. Cell 34: 683–691.
  46. Sun X, Zhulin I, and Wartell RM (2002) Predicted structure and phyletic distribution of the RNA-binding protein Hfq. Nucleic Acids Res 30: 3662–3671.
  47. Vogel J, Bartels V, Tang TH, Churakov G, Slagter-Jager JG, Huttenhofer A, and Wagner EG (2003) RNomics in Escherichia coli detects new sRNA species and indicates parallel transcriptional output in bacteria. Nucleic Acids Res 31: 6435–6443.
  48. Wassarman KM, Repoila F, Rosenow C, Storz G, and Gottesman S. (2001) Identification of novel small RNAs using comparative genomics and microarrays. Genes Dev 15: 1637–1651.
  49. Winzeler E, Wheeler R, and Shapiro L. (1997) Transcriptional analysis of the Caulobacter 4.5 S RNA ffs gene and the physiological basis of an ffs mutant with a Ts phenotype. J Mol Biol 272: 665–676.
  50. Zhang A, Wassarman KM, Rosenow C, Tjaden BC, Storz G, and Gottesman S. (2003) Global analysis of small RNA and mRNA targets of Hfq. Mol Microbiol 50: 1111–1124.
