Journal of Bacteriology. 2008 Oct 31;191(1):23–31. doi: 10.1128/JB.01017-08

Bioinformatics Resources for the Study of Gene Regulation in Bacteria

Julio Collado-Vides 1,*, Heladia Salgado 1, Enrique Morett 2, Socorro Gama-Castro 1, Verónica Jiménez-Jacinto 1, Irma Martínez-Flores 1, Alejandra Medina-Rivera 1, Luis Muñiz-Rascado 1, Martín Peralta-Gil 1, Alberto Santos-Zavaleta 1
PMCID: PMC2612416  PMID: 18978060

Genomics, which has been identified as the science of the century, is dramatically changing the historically weak relationship between experimental and theoretical biology. The addition to the Journal of Bacteriology of a section for computational biology marks a turning point in the history of this dialogue. This minireview is focused on the computational biology of gene regulation in bacteria, defined as the extensive use of bioinformatics tools to increase our understanding of the regulation of gene expression.

The study of gene regulation has been radically affected by the elucidation of full-genome DNA sequences and the subsequent development of high-throughput methodologies for deciphering their expression. Before the genomics era, most research was focused on individual biological systems. A large number of our colleagues, contributors to this journal, have devoted much of their academic careers to the understanding of individual regulatory units describing operons, regulators, and promoters and their roles in the physiology of the cell. These contributions have provided fundamental information to support the most recent efforts for the integrative knowledge of the cell that all the genomics sciences are achieving.

Genomics offers for the microbiologist studying gene regulation the opportunity to understand individual systems in the context of the whole cell. These integrative sciences have also changed the landscape of data available for new discoveries in the evolution of gene regulation. The major challenge of the genomics era is dealing with large amounts of data at all molecular levels and being able to generate integrated biological knowledge from the data. Bioinformatics is essential to progress in this direction, as it provides what is necessary to deal with large amounts of data: databases, algorithms to generate genomic answers to standard questions, overviews, and navigation capabilities, as well as statistical methods to perform and validate analyses.

Current knowledge of gene regulation in prokaryotes is quite uneven. It ranges from the constantly increasing number of fully sequenced genomes for which very little experimental work has been performed, including those of the many organisms that cannot yet be grown in the laboratory and of the little-studied archaea, to the few highly characterized bacterial genomes, such as those of Bacillus subtilis and Pseudomonas aeruginosa, with Escherichia coli K-12 being by far the best-known bacterium and free-living organism. Figure 1 shows this strongly unequal distribution of knowledge, using the number of publications on gene regulation for the different microbial genomes as an example.

FIG. 1.

Number of published articles per organism. We searched through PubMed by using a collection of keywords that we regularly use to gather information for RegulonDB across different bacteria. The profile requires the name of the organism to be in the title.
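As a rough illustration of the kind of search described in this caption, the following sketch builds a PubMed query that requires the organism name in the title and asks the NCBI E-utilities `esearch` endpoint for a hit count only. The keyword list is hypothetical, since the actual RegulonDB curation keywords are not given here, and the sketch only constructs the URL without contacting the server.

```python
from urllib.parse import urlencode

# Hypothetical keywords; the real RegulonDB keyword profile is not listed in the text.
KEYWORDS = ["transcription", "promoter", "operon", "regulon"]


def build_pubmed_term(organism, keywords=KEYWORDS):
    """Require the organism in the title and any keyword in title or abstract."""
    keyword_clause = " OR ".join(f'"{k}"[Title/Abstract]' for k in keywords)
    return f'"{organism}"[Title] AND ({keyword_clause})'


def build_esearch_url(organism, keywords=KEYWORDS):
    """Build an E-utilities esearch URL that requests only the number of hits."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    params = {
        "db": "pubmed",
        "term": build_pubmed_term(organism, keywords),
        "rettype": "count",
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    for organism in ("Escherichia coli", "Bacillus subtilis"):
        print(build_esearch_url(organism))
```

Running the same query for each organism and comparing the counts would give a profile of the kind plotted in Fig. 1, under the assumption that the keyword list is a fair proxy for the curators' search.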

The first exhaustive historical set of regulated promoters and their associated transcription factors (TFs) and TF DNA-binding sites (TFBSs) gathered around 120 σ70 and σ54 promoters of E. coli K-12 (16, 29). This information and the experience obtained were the seeds for what is now RegulonDB (http://regulondb.ccg.unam.mx/), the original source of expert-curated knowledge on the regulation of transcription initiation and operon organization in E. coli K-12. It contains what is currently the major electronically encoded regulatory network for any free-living organism (25). This information is also contained in EcoCyc, the E. coli model organism database, with added curated knowledge on metabolism and transport (http://ecocyc.org/). We estimate that currently around 25% of the interactions of the full cellular regulatory network of transcription initiation have been assembled. RegulonDB should be conceived of not as a database but as an environment for genomic regulatory investigations, linked to bioinformatics tools that facilitate analyses of upstream regions, together with data sets and tools for microarray analyses and, more recently, direct access to full papers supporting its knowledge. We are not only curating up-to-date original papers, but have also initiated “active annotation,” to use Jean-Michel Claverie's terminology, to map promoters experimentally with greater precision by using a high-throughput strategy.

This minireview has been organized taking into account the fact that most of the readers of the Journal of Bacteriology are experimentalists. We start with a few examples of lessons on gene regulation from bioinformatics, focusing on promoters, their definition and regulation, and operon structure. The second section summarizes how RegulonDB has been useful to experimentalists, as well as its role as the “gold standard” for implementing bioinformatics predictive methods, topological analyses of the network, and models of the cell. The last section offers a compendium of links to bioinformatics resources on gene regulation in bacteria, illustrating their usage with flowcharts associated with questions on the regulation of gene expression that are entailed in the use of RegulonDB and associated bioinformatics resources. We illustrate the very efficient text analysis obtained using Textpresso to access more than 2,400 full papers that support the E. coli regulatory network. A cautionary note: the examples of the three sections are biased toward cases in E. coli and its RegulonDB regulatory database. This bias is natural, since first of all, our direct experience is with RegulonDB, and second, especially for bacterial-gene regulation (Fig. 1), we may well quote Fred Neidhardt (87): “Not everyone is mindful of it, but all cell biologists have two organisms of interest: the one they are studying and Escherichia coli!”

LESSONS FROM OVERVIEWS IN GENE REGULATION: PROMOTER DEFINITION, PROXIMAL-SITE REQUIREMENT FOR RNAP-σ70, AND OPERON STRUCTURE

Historically, knowledge of gene regulation started with a model of the lac operon, its cis-regulatory elements, and the notion of allosterism in E. coli (43, 65). Quickly, this model, based on repression, had to be expanded to accommodate the more elaborated positive mechanisms of gene regulation. Since then, we have witnessed a gradual expansion in the diversity of knowledge about the molecular anatomy of gene regulation and the rich mechanisms that together compose the decision machinery of the cell.

The discovery of a conserved motif in E. coli promoters, the −10 box, also called the Pribnow box, is a striking example of how a pattern that was visually discovered in as few as seven DNA sequences has remained valid (75). This was an early contribution of bioinformatics to the study of gene regulation. Certainly, the identification of a promoter as the physical region for the binding of the RNA polymerase (RNAP) results from the combination of experimental mapping of transcription initiation and pattern recognition, initially by visual inspection and now performed by multiple-alignment methods (37, 39). Promoters in this sense come from a combination of experimental and bioinformatics evidence.

We should keep in mind, though, that gathering a large collection of data in biology does not guarantee that we can make sense of it or that new knowledge will emerge. We illustrate this by showing what we have learned from an exhaustive genomic collection of components of the regulatory network, that is, regulated promoters and the relative distances to their corresponding activator and repressor binding sites, and with a second example demonstrating how a very simple analysis of operon structure enabled the development of a method to predict operons in E. coli and then in any bacterial genome.

As mentioned above, in 1991, we gathered knowledge about around 120 promoters in E. coli K-12 (16, 29), and we have continued since then, increasing the data set (see http://regulondb.ccg.unam.mx/html/Database_summary.jsp for the history of the accumulated curated knowledge). One of the clearest lessons from that collection effort was the conclusion that regulation of transcription initiation in the case of the σ70 holoenzymes (Eσ70) always requires a proximal DNA site (a proximal site is defined in terms of its position relative to the transcription initiation site, so that a direct contact of the TF with RNAP is assumed) (53, 74). Seventeen years later, version 6.2 of RegulonDB (July 2008) has 1,754 promoters. Of these, 697 are σ70 promoters, 421 of which have at least one characterized binding site for a TF. This collection of promoters has 1,382 associated binding sites with coordinate positions. The definition of proximal sites in that initial review was from −65 to +20 (16); however, single activation and coactivation were later reported from −90 with cyclic AMP receptor protein (CRP) (12, 13, 116). We also learned, after 1991, that the flexibility of the α C-terminal domain of RNAP expands the proximal region to around −100, supporting direct contact with CRP (5, 57). Thus, in principle, the range of positions enabling direct contact with RNAP can be set from −95 to +20. Analyses with current data show that only 26 promoters, accounting for less than 5.9% of all regulated promoters, currently lack sites within this proximal range. In principle, there should be no promoter subject to regulation from only remote positions, other than σ54 promoters (67), and we will not discuss them in detail here. The distribution of proximal sites in this range is shown in Fig. 2. We can observe the same general tendencies discussed in the 1991 review, with repressors distributed across all of the proximal region, with the downstream −30 to +20 interval being dominant. 
Repressors prevent RNAP from interacting with the proximal region, between −30 and +20, or prevent interactions of activators at the −40, −50, and −60 central positions. The purB gene, cotranscribed with hflD, is repressed by PurR with only one operator, located at +892.5. This operator could interfere with transcription initiation or act as a roadblock and obstruct the progress of the transcribing polymerase (30, 33).
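The proximal-site criterion discussed above can be expressed as a simple check on the central positions of binding sites relative to the transcription start site. The sketch below is only an illustration under stated assumptions: the input format (a dictionary from promoter name to a list of site centers) is hypothetical, and apart from the purB operator at +892.5 mentioned in the text, the promoters and positions are made up.

```python
# Proximal range proposed in the text, relative to the transcription start site (+1).
PROXIMAL_START, PROXIMAL_END = -95, 20


def is_proximal(center):
    """True if a site's central position lies within the proximal range."""
    return PROXIMAL_START <= center <= PROXIMAL_END


def promoters_without_proximal_site(sites_by_promoter):
    """Return regulated promoters whose characterized sites are all remote."""
    return sorted(
        name
        for name, centers in sites_by_promoter.items()
        if centers and not any(is_proximal(c) for c in centers)
    )


# Toy input: only the purB operator position comes from the text.
example = {
    "purB": [892.5],
    "toyP1": [-41.5, -61.5],
    "toyP2": [-120.0, 5.0],
}

remote_only = promoters_without_proximal_site(example)
fraction = len(remote_only) / len(example)
print(remote_only, f"{100 * fraction:.1f}% of regulated promoters")
```

Applied to the full set of regulated σ70 promoters, the same check would reproduce the kind of count reported above for promoters lacking a site in the proximal range.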

FIG. 2.

Distribution of TFBSs. RegulonDB version 6.2 has 697 σ70 promoters, 421 of which have at least one characterized binding site for a TF. The figure displays the distribution of central positions of activator and repressor DNA-binding sites in the −95 to +20 interval. The percentage of promoters was divided by the number of activator or repressor DNA-binding sites with the center position within each interval of 10 bp. This figure can be compared to Fig. 2 in reference 16.
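A distribution of this kind can be obtained by assigning each site's central position to a 10-bp interval and expressing the counts as percentages. The following is a minimal sketch; the exact bin boundaries used for the figure are not stated, so starting the bins at −95 is an assumption, as are the function names and the toy positions.

```python
from collections import Counter


def bin_start(center, width=10, lo=-95, hi=20):
    """Lower edge of the bin holding `center`, or None if outside [lo, hi]."""
    if not lo <= center <= hi:
        return None
    return lo + int((center - lo) // width) * width


def site_distribution(centers, width=10):
    """Percentage of in-range sites falling in each bin, keyed by bin lower edge."""
    bins = (bin_start(c, width) for c in centers)
    counts = Counter(b for b in bins if b is not None)
    total = sum(counts.values())
    return {b: 100.0 * n / total for b, n in sorted(counts.items())}


# Toy activator site centers; the remote site at +892.5 is ignored by the binning.
activator_centers = [-41.5, -42.0, -71.5, 892.5]
for start, pct in site_distribution(activator_centers).items():
    print(f"[{start}, {start + 10}): {pct:.1f}%")
```

Computing the distribution separately for activator and repressor sites allows the two profiles to be compared.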

As shown in Fig. 2, all activators, not only CRP, tend to prefer interacting near positions −40 and −70. The strongly diminished occupancy around −50 has been shown to be due to the binding of the α C-terminal domain of RNAP (4, 83). On the other hand, some activators overlap the promoter and can bind downstream of the +1 site. TFs of the MerR family activate transcription binding at a palindrome located between the −35 and −10 elements of the promoter (71, 93). Other regulators activate transcription through auxiliary binding sites located between −10 and +20 (38). A striking, well-documented case is activation from +1848 by IHF, a TF known to bend the DNA to favor transcription (1).

Understanding gene regulation requires a detailed knowledge of operon structure. Thanks to the efforts of Mary Berlyn, around 1998 we populated RegulonDB with an initial set of putative operons that we have since been expanding using both experimental information and predictions based on the rules learned from the original sets. Figure 3 shows the distribution of intergenic distances within operons (572 in 2002 versus 1,839 in 2008) and its contrasting set of distances of genes at boundaries of operons in the same direction of transcription (346 in 2002 versus 1,311 in 2008). This striking, clear-cut distribution of very short distances within operons as opposed to intergenic regions upstream of operons was the basis for the prediction of operons even for genes without an assigned function in the E. coli genome, and subsequently in many bacterial genomes (66, 84). As mentioned below, a high-quality curated corpus of knowledge such as this one has supported the development of bioinformatics methods capable of predicting many aspects of the regulatory network.

FIG. 3.

Intergenic distances of genes within and at transcription unit boundaries. The sharply different distributions of these distances enabled the use of a direct method to predict transcription units in the complete E. coli genome. This figure is very similar to Fig. 3 in reference 84.
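The idea that short intergenic distances indicate cotranscription can be sketched as follows. This is a deliberate simplification: the published methods (66, 84) rely on the observed distance distributions rather than a fixed cutoff, and the `max_gap` value, the `Gene` record, and the toy coordinates below are illustrative assumptions only.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def predict_transcription_units(genes, max_gap=50):
    """Group adjacent same-strand genes separated by at most `max_gap` bp."""
    units = []
    for gene in sorted(genes, key=lambda g: g.start):
        prev = units[-1][-1] if units else None
        same_unit = (
            prev is not None
            and prev.strand == gene.strand
            # Overlapping genes give a negative gap and are kept together.
            and gene.start - prev.end - 1 <= max_gap
        )
        if same_unit:
            units[-1].append(gene)
        else:
            units.append([gene])
    return [[g.name for g in unit] for unit in units]


toy_genes = [
    Gene("a", 100, 400, "+"),
    Gene("b", 397, 900, "+"),    # overlaps a by 4 bp
    Gene("c", 1200, 1500, "+"),  # 299 bp after b
    Gene("d", 1510, 1800, "-"),  # opposite strand
]
print(predict_transcription_units(toy_genes))
```

Because the rule uses only coordinates and strand, it can be applied to genes without an assigned function, which is what makes this type of approach transferable to other genomes.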

We recall how the definitions of promoter elements are currently the basis of more elaborate bioinformatics methods for pattern recognition. We briefly discuss two examples in which gathering large amounts of data for individual cases has been fruitful in studying the biology of gene regulation: the examples of (i) promoters and their regulation and (ii) operon structure. It is true, though, that these are large collections of “the same types of stamps.” The major challenge in genomics, as we have said, is nonetheless a different one, that of integrating a particular system or gene and its product with expression patterns of similarly regulated genes as the cellular environment changes, for instance.

HOW AN ELECTRONIC CORPUS ON GENE REGULATION HAS BEEN USEFUL FOR EXPERIMENTALISTS

Here, we illustrate how experimental and bioinformatics scientists studying gene regulation, not only in E. coli but also in many other organisms, have made extensive use of RegulonDB to gain insights into several aspects of gene regulation. We note that RegulonDB contains detailed, accurate, and up-to-date information about operon organization, regulatory DNA sites for TFs, promoters, terminators, and RNA regulatory elements, either experimentally determined or predicted. Together, these elements constitute the known transcriptional regulatory network of E. coli. The information for this section was obtained primarily from questions addressed to the database by users and from a search of the literature for articles citing or making use of RegulonDB. We are certain that these questions are valid for other databases and for any other bacterial species. Table 1 provides a list of selected articles published since the creation of RegulonDB, emphasizing the purposes for which they have used RegulonDB. People studying gene expression or modulon architectures by using microarrays or proteomics data, for many wild-type and mutant strains grown under different conditions, have taken into account the operon structure in E. coli to make sense of experimental data. For instance, in work by Yooseph et al. (114), the 7.7 million Global Ocean Sampling sequences were analyzed using the collection of transcription units in RegulonDB for statistical analysis and to identify same-operon gene pairs. RegulonDB has also been used for the identification of regulatory binding sites and for determining how this information correlates with gene expression in wild-type and TF mutant strains. Promoter and TF binding site mapping by genomic strategies (chromatin immunoprecipitation [ChIP]-chip), and even proteomics studies, have also relied on this database as the source of primary information. For example, in the genome-wide location of RNAP promoter sites by ChIP-chip, the 961 promoters identified in RegulonDB were used to set the 26% negative-detection rate (35).
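One plausible way to use a curated promoter set as a reference for a genome-wide experiment is to ask what fraction of known promoters has no detected peak nearby. The sketch below illustrates that reading only; the matching window, the function name, and the toy coordinates are assumptions and are not taken from the cited study (35).

```python
from bisect import bisect_left


def undetected_fraction(known_positions, peak_positions, window=100):
    """Fraction of known promoters with no peak within `window` bp."""
    peaks = sorted(peak_positions)
    missed = 0
    for pos in known_positions:
        # Index of the first peak at or beyond the left edge of the window.
        i = bisect_left(peaks, pos - window)
        if i == len(peaks) or peaks[i] > pos + window:
            missed += 1
    return missed / len(known_positions)


# Toy coordinates, for illustration only.
known = [1000, 5000, 9000, 12000]
peaks = [1040, 9200, 11950]
print(f"{100 * undetected_fraction(known, peaks):.0f}% of known promoters not detected")
```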

TABLE 1.

Uses of RegulonDB

Description Reference
Experimental
    Gene expression analysis (microarray)
        Operons improve estimation for cDNA microarrays 113
        Large-scale validation of regulation from expression profiles 22
        Cross talk between the plasmid and the chromosome 32
        Microarray of a Shewanella oneidensis etrA mutant 3
        Clustering gene expression data with error information 100
        Affymetrix microarray coexpression of genes in operons 31
        Quantitative description of large-scale microarray data 86
        Analysis of the NsrR regulon 23
        The Rcs phosphorelay and intrinsic antibiotic resistance 52
        Growth defects and cross-regulation of gene expression 92
        σS-dependent genes and their promoters 50
        DNA adenine methyltransferase and gene regulation 89
        The CRP regulon: in vitro and in vivo transcriptional profiling 115
        Genome-wide expression analysis of FNR regulation 46
        Microarrays of genes in quorum sensing 19
        Transcriptome analysis of E. coli 101
        An extended regulon of the methionine repressor 62
        Transcriptome polymorphism in E. coli/Shigella species 54,88
    Global RNA half-life analysis and patterns of transcript degradation
        Early osmostress gene expression using microarrays 111
        Transcriptome determination of transcription regulators 47
        Constraint-based in silico models of E. coli 80
    Analysis of ChIP-chip data
        RNA polymerase binding sites by ChIP-chip 35
    Analysis of protein abundance
        Protein abundance profiling of the cytosol 42
    Analysis of regulatory mechanisms
        Outer membrane vesicle and membrane instability 64
    DNA sequence annotation
        The symbiotic plasmid of Rhizobium etli CFN42 27
    Analysis of specific biological systems
        Mutant release factor 1 and 16S rRNA maturation 45
        The PTS system of Vibrio fischeri 108
    Analysis of gene expression dynamics
        SoxRS-dependent transcriptional networks 7
    Metagenomics
        The Global Ocean Sampling: expanding protein families 114
Bioinformatics
    Regulatory-network prediction
        Prediction of new members of regulons 96
        Predicting transcriptional regulatory interactions 107
    Promoter prediction
        Recognition and prediction of σ70 promoters 56
        Improved prediction of transcription start sites 28
        Improving promoter prediction in E. coli 11
    Transcription factor prediction
        Predicted transcriptional regulators in E. coli 73
    DNA-binding site prediction
        Genomic prediction of transcriptional regulatory sites 98
        Binding sites in bacterial genomes 55
        TF binding sites in E. coli and Streptomyces coelicolor 51
    Operon prediction
        Operon prediction in Pyrococcus furiosus 103
        Operon prediction without expt 6
        A Bayesian network approach to operon prediction 8
    Analysis of promoters
        Computational promoter analyses in 32 genomes 91
    Orthologous regulatory identification
        Orthologous TFs have different functions 77
        Transcription in archaea 49
    Architecture of operons, promoters, or DNA-binding sites
        Gene expression with combinatorial promoters 17
        Positional distribution of DNA motifs in promoter regions 14
        Transcriptional units and operons in Bacillus and E. coli 70
    Evolutionary analysis of elements involved in transcription
        Evolution of noncoding DNA in prokaryotic genomes 82
        Evolution of DNA regulatory regions for proteogamma bacteria 78
        Co-evolution of TFs depends on mode of regulation 36
    Prediction of gene/protein biological function
        Gene Function Predictor based on context analysis 21
        Protein function and linkages based on genome organization 95
        Inference of functional relationships from predicted operons 44
    Inferring the role of transcription factors
        Inferring the role of TFs in regulatory networks 106
    Source for other databases
        Multidimensional annotation of the E. coli K-12 genome 48
        Regulatory interactions in gammaproteobacterial genomes 72
Modeling and topology of the network
    Modeling the overall dynamics of the network
        Genetic and metabolic regulatory networks of Bacillus 26
        Reconstruction of microbial transcriptional regulatory networks 34
    Metabolic network and connectivity analyses
        Hierarchy and modularity in metabolic networks 79
        Low-degree metabolites in biological networks 85
    Regulatory network analysis
        Logical types of network control in gene expression 63
        Dynamics in regulatory networks from four kingdoms 2
        Gene expression in negatively autoregulated circuits 61
    Novel topological concepts
        Network motifs in the regulation network of E. coli 90

The corpus of knowledge on gene regulation has been essential for diverse implementations relying on bioinformatics. As depicted in Table 1, this corpus provides the means to generate and test predictions for a large collection of regulatory elements, such as promoters, TFs, and TFBSs; complete genomic repertoires of TFs and operons; and even unidentified network interactions. It has also served the purpose of modeling the overall dynamics of the network and for proposing novel biological concepts, such as the network motifs reported by the group of Uri Alon (90) or the notion of hierarchical and modular networks described by Ravasz and colleagues (79).

Annotated genomes and ingenious ways to transfer knowledge, with all the associated risks of assumptions of orthologous relationships, enable us nonetheless to estimate that the wiring of the regulatory network is different across organisms, especially among bacteria (59, 60). We know better now that the evolutionary origin of regulatory interactions in bacteria depends on gene duplication and specialization, operon reorganization, binding-site duplications, and horizontal gene transfer (76, 97).

A COMPENDIUM OF BIOINFORMATICS RESOURCES FOR STUDYING BACTERIAL GENE REGULATION AND INTRODUCTORY PROTOCOLS

Designing a representation of the rich and variable knowledge of biological systems in order to encode it into a formal database management system, together with the corresponding data-gathering and curation processes, is one of the major infrastructure-building efforts that characterize computational biology. This is apparent if one examines the year's first issue of Nucleic Acids Research, which is devoted to databases. We did not find an integrated collection of databases and bioinformatics tools devoted to gene regulation; therefore, we have gathered here an exhaustive selection of resources specifically dealing with gene regulation in bacteria. Based on the compendium gathered by Galperin in 2008 (24) plus EcoliHub (http://www.ecolicommunity.org/) and BIOPAX pathway databases (94), we identified approximately 100 different resources (from 240) that were directly related to prokaryotic-gene regulation. Our compendium groups sites, among others, for TFs and gene regulation (e.g., devoted specifically to the AraC/XylR families [102] or to TFs [110]); for RNAs; and for biological pathways and regulatory networks, microarray databases, and some other related themes, such as signal transduction pathways (http://genomics.ornl.gov/mist/), protein-protein interactions, genome databases, the published literature, and metadatabases. Users should be aware that we did not specify for each resource the date of its last update, which is quite variable. This compendium should be a useful resource for those interested in searching or analyzing regulatory features across bacterial genomes. The name of the site and its URL address, together with a short description and a list of tools available, can now be accessed at http://regulondb.ccg.unam.mx/Additional_resources.jsp.

Several resources have easy-to-follow documentation, tutorials, and demonstrations to help the user. Even though bioinformaticians devoted to database construction and maintenance invest considerable effort in the design and implementation of user-friendly interfaces, it is not uncommon for first-time users to have trouble finding the best way to address their questions of interest. Let us take the following question as an example: “If I have a gene, how can I find out all that is known about its regulation and operon organization?” This simple question generates a rather complex and rich answer, including the sequences, coordinates, and regulatory effects of every single TFBS and all promoters of the gene, as well as its operon organization. Figure 4 shows a flowchart that explains how this information can be obtained in a few navigation steps, which in this case all occur within RegulonDB. Many other resources and databases display this same information differently. For instance, the PRODORIC genome browser has the ability to zoom in to the level of the DNA sequence, showing the binding sites and promoters (69). RegulonDB is linked to Gene Expression Tools (GetTools) (40), as well as to the Regulatory Sequences Analysis Tools (RSAT) website (99), which contains a suite of tools built to predict and analyze regulatory regions for 663 available microbial genomes, in addition to 62 eukaryotic ones. These tools are designed to answer questions about groups of genes suspected to be coregulated, a very common situation whenever a research group has performed a microarray experiment or obtained a group of genes from a ChIP-chip (or ChIP-sequence) experiment. Figure 5 shows a flowchart for a given set of genes from a ChIP-chip experiment with LexA (109). RSAT generates a collection of upstream sequences given the gene set as input, while RegulonDB contains a position-specific matrix that was derived from the collection of experimentally characterized binding sites (TFBSs). 
These matrices can be used to scan sequences in order to predict putative target TFBSs in the whole set of upstream regions of the genome and can then be displayed in a graph. The RSAT team has just published several protocols for a variety of similar questions. The main goal of these introductory protocols is to illustrate and motivate the use of bioinformatics resources for the study of gene regulation by illustrative questions, showing in comprehensible flowcharts an easy way to answer them. In work reported by Defrance et al. (18), flowcharts and protocols describe in detail how to discover the TFBSs common to a regulon obtained from RegulonDB, or any other bacterial database, in fact. The whole interaction network of E. coli reported in RegulonDB can also be analyzed (10). Given a set of names of commonly expressed genes from, for instance, a microarray or ChIP-chip experiment with any bacterial genome, using RSAT one can obtain the upstream regions and search for common motifs or TF binding sites and cis-regulatory modules (104).
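The first step of such a protocol, collecting the upstream region of each gene in a set, can be sketched in a few lines. This is not RSAT's implementation; it is a minimal illustration that assumes 1-based inclusive gene coordinates, a genome held as a single string, and an arbitrary default length of 400 bp.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, gene_start, gene_end, strand, length=400):
    """Return up to `length` bp immediately upstream of a gene.

    Coordinates are 1-based and inclusive, with gene_start < gene_end
    on the forward strand regardless of the gene's orientation.
    """
    if strand == "+":
        lo = max(0, gene_start - 1 - length)
        return genome[lo:gene_start - 1]
    # Reverse-strand gene: upstream lies to the right of gene_end.
    hi = min(len(genome), gene_end + length)
    return reverse_complement(genome[gene_end:hi])


toy_genome = "ACGTTGCAAGGCTTAC"
print(upstream_region(toy_genome, 9, 12, "+", length=4))  # forward-strand gene
print(upstream_region(toy_genome, 5, 8, "-", length=4))   # reverse-strand gene
```

The resulting sequences are read in the direction of transcription, so that motif positions can be compared across genes on either strand.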

FIG. 4.

Flowchart for gathering all regulation information for a single gene. Navigation options are shown, starting from the main page of RegulonDB with the name of a gene, melA or melR in this example. The MelR-CRP complex regulon is shown.

FIG. 5.

Flowchart for ChIP-chip data and genes with similar DNA-binding site motifs. The example uses as input a set of genes from a ChIP-chip experiment with LexA (109). RSAT (99, 105) was used to obtain the collection of upstream sequences, given the ChIP-chip gene set. The position-specific scoring matrix (PSSM) for LexA was obtained by selecting Downloads → Data sets → Matrix alignment from the RegulonDB main menu. Then, it was pasted into RSAT to run a matrix scan. This program will search, given a threshold, for predicted sites in the complete set of upstream regions of the genome, and the results can be automatically obtained in a graphic display by using the feature map program.
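The principle of scanning sequences with a position-specific scoring matrix and a threshold can be illustrated with a small sketch. It is not the RSAT matrix-scan program: it assumes a uniform background, a fixed pseudocount, uppercase A/C/G/T input, and forward-strand scanning only, and the aligned sites below are made-up toy sequences rather than real LexA sites.

```python
import math


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log-odds matrix (one dict per column) from equal-length aligned sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(width):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Yield (offset, score) for each window scoring at or above `threshold`."""
    width = len(pssm)
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(col[base] for col, base in zip(pssm, window))
        if score >= threshold:
            yield offset, score


# Toy aligned sites, for illustration only.
toy_sites = ["CTGT", "CTGT", "CTGA", "CTGT"]
toy_pssm = build_pssm(toy_sites)
for offset, score in scan("AACTGTAA", toy_pssm, threshold=4.0):
    print(offset, round(score, 2))
```

Raising the threshold reduces the number of predicted sites at the cost of missing weaker matches, which is why the choice of threshold is left to the user in this type of analysis.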

EcoCyc contains a large collection of graphic and text displays, including a genome browser that shows genes by gene ontology class, operons, and all elements of gene regulation in a region of the genome, and it provides a network display of regulatory interactions, in addition to the Omics viewers that enable the user to display pathways and regulatory networks based on an input file, for example (48). Graphic displays of a set of genes can be obtained for several genomes with the PRODONET tool (http://www.prodonet.tu-bs.de/). For many bacteria, given a set of genes, their functional classes can be obtained (http://www.jprogo.de/). Several graphics tools are available to show the genomic context of a gene within its genome, as well as the contexts of its orthologs within several related genomes (see, for example, GeConT [15] or http://img.jgi.doe.gov/cgi-bin/pub/main.cgi). The number of plausible elaborated questions is certainly quite large. We invite the interested reader to use the compendium of resources and to see additional flowcharts by visiting RegulonDB at http://regulondb.ccg.unam.mx/Flow_charts.jsp. We believe these will serve as examples that the user can modify and use to find, intuitively, equivalent usages in other bacterial databases.

ACCESS TO SPECIFIC GENE REGULATION LITERATURE AND FULL PAPERS REQUIRES A SEPARATE MENTION

Textpresso, a powerful text-mining engine for studying the scientific literature (68), was implemented for RegulonDB (http://regulondb.ccg.unam.mx/Textpresso/), and 2,472 full-text papers, 3,125 abstracts, and more than 4,200 curator notes can be directly searched (81). This valuable tool allows the experimental researcher to search through categories, keywords, and ontology classes with the specific gene, promoter, operon, or TF of interest through the knowledge space of full papers that support the electronically encoded transcriptional regulatory network of E. coli.

CONCLUSIONS

As we have mentioned, genomics has changed the focus from individual systems to an understanding of the whole cell. The study of gene regulation, as well as almost any other aspect of modern molecular and cellular biology, requires bioinformatics tools and methods to manipulate and analyze the large amounts of available information to eventually generate a more integrated perspective and knowledge of the cell. The global analysis of TFBSs and their positions relative to transcription initiation illustrates a vivid combination of details of individual systems and the search for a unified understanding based on the need for bound TFs to interact with RNAP. This is one example, among many others, of what genomic perspectives offer as opportunities and challenges. To paraphrase Whitehead (112), integration of gene regulation involves both the cautious gathering of details and a passion for understanding.

The bioinformatics infrastructure for microbial-gene regulation involves an important effort to maintain and update the constantly increasing body of knowledge in this field resulting from the accumulation of experiments performed in many laboratories through the years. In fact, this paper celebrates the 10 years since the first publication of RegulonDB (41). Of course, no matter how much human effort databases involve, they remain tips of icebergs compared to what is found in each paper. New methodologies are helping to manage in more intelligent ways the large amounts of information, such as computational access to query a full-text specific corpus of literature or the scientific community's participation in EcoliWiki within the EcoliHub (http://www.ecolicommunity.org/), and many others.

This review will fulfill its purpose if it facilitates the appreciation by experimentalists of the usefulness of databases and programs devoted to the study of gene regulation. The emphasis of this review has been on the use of bioinformatics resources, not on their implementation or the challenges ahead, which are numerous. For instance, what is the best way to curate experiments as new high-throughput technologies emerge? There is a lot of work to be done in order to expand the prominent databases beyond the level of transcription initiation and to integrate the evolving knowledge about other levels of gene regulation, for instance, those of single molecules and single-cell experiments (20). Furthermore, feedback loops of the type supporting multistationarity and stochasticity (20, 58) or those responsible for the return of systems to their initial state, have not yet been systematically described in databases (however, see Goelzer et al. [26] for further information). A major challenge, given the increasing number of sequenced genomes, will be to generate predictive tools that can expand our knowledge of a few bacteria to a similar extent for the many more organisms for which there is currently much less experimental support (9).

The study of the regulation of gene expression in bacterial systems will no doubt make important contributions in this century of genomics, both to the understanding at the molecular level of the capabilities of the basic unit of life, a single cell, and to the many potential technological applications. Multidisciplinary teams and future, younger generations are welcome to this enterprise.

Acknowledgments

We dedicate this article to our colleagues and collaborators who have contributed to RegulonDB through the 10 years since its first publication. The experimental work behind RegulonDB has been mainly carried out by L. Olvera, M. Olvera, and A. Mendoza. J.C.-V. also recognizes long-term collaborations with Jacques van Helden, Rick Gourse, Robert Gunsalus, and Jim Hu, as well as important discussions with Jaime Mora. We acknowledge the comments and suggestions by the editor and two anonymous reviewers, which motivated major modifications to the previous version of this review.

This work was funded by the National Institutes of Health, grant numbers R01 GM071962-05 and GM077678, and by UNAM, PAPIIT grant number IN214905.

Footnotes

Published ahead of print on 31 October 2008.

REFERENCES

  • 1. Abouhamad, W. N., and M. D. Manson. 1994. The dipeptide permease of Escherichia coli closely resembles other bacterial transport systems and shows growth-phase-dependent expression. Mol. Microbiol. 14:1077-1092.
  • 2. Balleza, E., E. R. Alvarez-Buylla, A. Chaos, S. Kauffman, I. Shmulevich, and M. Aldana. 2008. Critical dynamics in genetic regulatory networks: examples from four kingdoms. PLoS ONE 3:e2456.
  • 3. Beliaev, A. S., D. K. Thompson, M. W. Fields, L. Wu, D. P. Lies, K. H. Nealson, and J. Zhou. 2002. Microarray transcription profiling of a Shewanella oneidensis etrA mutant. J. Bacteriol. 184:4612-4616.
  • 4. Belyaeva, T. A., J. A. Bown, N. Fujita, A. Ishihama, and S. J. Busby. 1996. Location of the C-terminal domain of the RNA polymerase alpha subunit in different open complexes at the Escherichia coli galactose operon regulatory region. Nucleic Acids Res. 24:2242-2251.
  • 5. Belyaeva, T. A., V. A. Rhodius, C. L. Webster, and S. J. Busby. 1998. Transcription activation at promoters carrying tandem DNA sites for the Escherichia coli cyclic AMP receptor protein: organisation of the RNA polymerase alpha subunits. J. Mol. Biol. 277:789-804.
  • 6. Bergman, N. H., K. D. Passalacqua, P. C. Hanna, and Z. S. Qin. 2007. Operon prediction for sequenced bacterial genomes without experimental information. Appl. Environ. Microbiol. 73:846-854.
  • 7. Blanchard, J. L., W. Y. Wholey, E. M. Conlon, and P. J. Pomposiello. 2007. Rapid changes in gene expression dynamics in response to superoxide reveal SoxRS-dependent and independent transcriptional networks. PLoS ONE 2:e1186.
  • 8. Bockhorst, J., M. Craven, D. Page, J. Shavlik, and J. Glasner. 2003. A Bayesian network approach to operon prediction. Bioinformatics 19:1227-1235.
  • 9. Bonneau, R., M. T. Facciotti, D. J. Reiss, A. K. Schmid, M. Pan, A. Kaur, V. Thorsson, P. Shannon, M. H. Johnson, J. C. Bare, W. Longabaugh, M. Vuthoori, K. Whitehead, A. Madar, L. Suzuki, T. Mori, D. E. Chang, J. Diruggiero, C. H. Johnson, L. Hood, and N. S. Baliga. 2007. A predictive model for transcriptional control of physiology in a free living cell. Cell 131:1354-1365.
  • 10. Brohee, S., K. Faust, G. Lima-Mendez, O. Sand, R. Janky, G. Vanderstocken, Y. Deville, and J. van Helden. 2008. NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways. Nucleic Acids Res. 36:W444-W451.
  • 11. Burden, S., Y. X. Lin, and R. Zhang. 2005. Improving promoter prediction for the NNPP2.2 algorithm: a case study using Escherichia coli DNA sequences. Bioinformatics 21:601-607.
  • 12. Busby, S., and R. H. Ebright. 1999. Transcription activation by catabolite activator protein (CAP). J. Mol. Biol. 293:199-213.
  • 13. Busby, S., D. West, M. Lawes, C. Webster, A. Ishihama, and A. Kolb. 1994. Transcription activation by the Escherichia coli cyclic AMP receptor protein. Receptors bound in tandem at promoters can interact synergistically. J. Mol. Biol. 241:341-352.
  • 14. Casimiro, A. C., S. Vinga, A. T. Freitas, and A. L. Oliveira. 2008. An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance. BMC Bioinform. 9:89.
  • 15. Ciria, R., C. Abreu-Goodger, E. Morett, and E. Merino. 2004. GeConT: gene context analysis. Bioinformatics 20:2307-2308.
  • 16. Collado-Vides, J., B. Magasanik, and J. D. Gralla. 1991. Control site location and transcriptional regulation in Escherichia coli. Microbiol. Rev. 55:371-394.
  • 17. Cox, R. S., III, M. G. Surette, and M. B. Elowitz. 2007. Programming gene expression with combinatorial promoters. Mol. Syst. Biol. 3:145.
  • 18. Defrance, M., R. Janky, O. Sand, and J. van Helden. 2008. Using RSAT oligo-analysis and dyad-analysis tools to discover regulatory signals in nucleic sequences. Nat. Protoc. 3:1589-1603.
  • 19. DeLisa, M. P., C. F. Wu, L. Wang, J. J. Valdes, and W. E. Bentley. 2001. DNA microarray-based identification of genes controlled by autoinducer 2-stimulated quorum sensing in Escherichia coli. J. Bacteriol. 183:5239-5247.
  • 20. Elf, J., G. W. Li, and X. S. Xie. 2007. Probing transcription factor dynamics at the single-molecule level in a living cell. Science 316:1191-1194.
  • 21. Enault, F., K. Suhre, and J. M. Claverie. 2005. Phydbac “Gene Function Predictor”: a gene annotation tool based on genomic context analysis. BMC Bioinform. 6:247.
  • 22. Faith, J. J., B. Hayete, J. T. Thaden, I. Mogno, J. Wierzbowski, G. Cottarel, S. Kasif, J. J. Collins, and T. S. Gardner. 2007. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 5:e8.
  • 23. Filenko, N., S. Spiro, D. F. Browning, D. Squire, T. W. Overton, J. Cole, and C. Constantinidou. 2007. The NsrR regulon of Escherichia coli K-12 includes genes encoding the hybrid cluster protein and the periplasmic, respiratory nitrite reductase. J. Bacteriol. 189:4410-4417.
  • 24. Galperin, M. Y. 2008. The Molecular Biology Database Collection: 2008 update. Nucleic Acids Res. 36:D2-D4.
  • 25. Gama-Castro, S., V. Jimenez-Jacinto, M. Peralta-Gil, A. Santos-Zavaleta, M. I. Penaloza-Spinola, B. Contreras-Moreira, J. Segura-Salazar, L. Muniz-Rascado, I. Martinez-Flores, H. Salgado, C. Bonavides-Martinez, C. Abreu-Goodger, C. Rodriguez-Penagos, J. Miranda-Rios, E. Morett, E. Merino, A. M. Huerta, L. Trevino-Quintanilla, and J. Collado-Vides. 2008. RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation. Nucleic Acids Res. 36:D120-D124.
  • 26. Goelzer, A., F. B. Brikci, I. Martin-Verstraete, P. Noirot, P. Bessieres, S. Aymerich, and V. Fromion. 2008. Reconstruction and analysis of the genetic and metabolic regulatory networks of the central metabolism of Bacillus subtilis. BMC Syst. Biol. 2:20.
  • 27. Gonzalez, V., P. Bustos, M. A. Ramirez-Romero, A. Medrano-Soto, H. Salgado, I. Hernandez-Gonzalez, J. C. Hernandez-Celis, V. Quintero, G. Moreno-Hagelsieb, L. Girard, O. Rodriguez, M. Flores, M. A. Cevallos, J. Collado-Vides, D. Romero, and G. Davila. 2003. The mosaic structure of the symbiotic plasmid of Rhizobium etli CFN42 and its relation to other symbiotic genome compartments. Genome Biol. 4:R36.
  • 28. Gordon, J. J., M. W. Towsey, J. M. Hogan, S. A. Mathews, and P. Timms. 2006. Improved prediction of bacterial transcription start sites. Bioinformatics 22:142-148.
  • 29. Gralla, J. D., and J. Collado-Vides. 1996. Organization and function of transcription regulatory elements, p. 1232-1245. In F. C. Neidhardt, R. Curtiss III, J. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. Reznikoff, M. Schaechter, H. E. Umbarger, and M. Riley (ed.), Cellular and molecular biology: Escherichia coli and Salmonella, 2nd ed. ASM Press, Washington, DC.
  • 30. Green, S. M., T. Malik, I. G. Giles, and W. T. Drabble. 1996. The purB gene of Escherichia coli K-12 is located in an operon. Microbiology 142:3219-3230.
  • 31. Harr, B., and C. Schlotterer. 2006. Comparison of algorithms for the analysis of Affymetrix microarray data as evaluated by co-expression of genes in known operons. Nucleic Acids Res. 34:e8.
  • 32. Harr, B., and C. Schlotterer. 2006. Gene expression analysis indicates extensive genotype-specific crosstalk between the conjugative F-plasmid and the E. coli chromosome. BMC Microbiol. 6:80.
  • 33. He, B., J. M. Smith, and H. Zalkin. 1992. Escherichia coli purB gene: cloning, nucleotide sequence, and regulation by purR. J. Bacteriol. 174:130-136.
  • 34. Herrgard, M. J., M. W. Covert, and B. O. Palsson. 2004. Reconstruction of microbial transcriptional regulatory networks. Curr. Opin. Biotechnol. 15:70-77.
  • 35. Herring, C. D., M. Raffaelle, T. E. Allen, E. I. Kanin, R. Landick, A. Z. Ansari, and B. O. Palsson. 2005. Immobilization of Escherichia coli RNA polymerase and location of binding sites by use of chromatin immunoprecipitation and microarrays. J. Bacteriol. 187:6166-6174.
  • 36. Hershberg, R., and H. Margalit. 2006. Co-evolution of transcription factors and their targets depends on mode of regulation. Genome Biol. 7:R62.
  • 37. Hertz, G. Z., and G. D. Stormo. 1999. Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics 15:563-577.
  • 38. Hudson, J. M., and M. G. Fried. 1991. The binding of cyclic AMP receptor protein to two lactose promoter sites is not cooperative in vitro. J. Bacteriol. 173:59-66.
  • 39. Huerta, A. M., and J. Collado-Vides. 2003. σ70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals. J. Mol. Biol. 333:261-278.
  • 40. Huerta, A. M., J. D. Glasner, R. M. Gutierrez-Rios, F. R. Blattner, and J. Collado-Vides. 2002. GETools: gene expression tool for analysis of transcriptome experiments in E. coli. Trends Genet. 18:217-218.
  • 41. Huerta, A. M., H. Salgado, D. Thieffry, and J. Collado-Vides. 1998. RegulonDB: a database on transcriptional regulation in Escherichia coli. Nucleic Acids Res. 26:55-59.
  • 42. Ishihama, Y., T. Schmidt, J. Rappsilber, M. Mann, F. U. Hartl, M. J. Kerner, and D. Frishman. 2008. Protein abundance profiling of the Escherichia coli cytosol. BMC Genomics 9:102.
  • 43. Jacob, F., and J. Monod. 1961. Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 3:318-356.
  • 44. Janga, S. C., J. Collado-Vides, and G. Moreno-Hagelsieb. 2005. Nebulon: a system for the inference of functional relationships of gene products from the rearrangement of predicted operons. Nucleic Acids Res. 33:2521-2530.
  • 45. Kaczanowska, M., and M. Ryden-Aulin. 2004. Temperature sensitivity caused by mutant release factor 1 is suppressed by mutations that affect 16S rRNA maturation. J. Bacteriol. 186:3046-3055.
  • 46. Kang, Y., K. D. Weber, Y. Qiu, P. J. Kiley, and F. R. Blattner. 2005. Genome-wide expression analysis indicates that FNR of Escherichia coli K-12 regulates a large number of genes of unknown function. J. Bacteriol. 187:1135-1160.
  • 47. Kao, K. C., Y. L. Yang, R. Boscolo, C. Sabatti, V. Roychowdhury, and J. C. Liao. 2004. Transcriptome-based determination of multiple transcription regulator activities in Escherichia coli by using network component analysis. Proc. Natl. Acad. Sci. USA 101:641-646.
  • 48. Karp, P. D., I. M. Keseler, A. Shearer, M. Latendresse, M. Krummenacker, S. M. Paley, I. Paulsen, J. Collado-Vides, S. Gama-Castro, M. Peralta-Gil, A. Santos-Zavaleta, M. I. Penaloza-Spinola, C. Bonavides-Martinez, and J. Ingraham. 2007. Multidimensional annotation of the Escherichia coli K-12 genome. Nucleic Acids Res. 35:7577-7590.
  • 49. Kyrpides, N. C., and C. A. Ouzounis. 1999. Transcription in archaea. Proc. Natl. Acad. Sci. USA 96:8545-8550.
  • 50. Lacour, S., and P. Landini. 2004. σS-dependent gene expression at the onset of stationary phase in Escherichia coli: function of σS-dependent genes and identification of their promoter sequences. J. Bacteriol. 186:7186-7195.
  • 51. Laing, E., K. Sidhu, and S. J. Hubbard. 2008. Predicted transcription factor binding sites as predictors of operons in Escherichia coli and Streptomyces coelicolor. BMC Genomics 9:79.
  • 52. Laubacher, M. E., and S. E. Ades. 2008. The Rcs phosphorelay is a cell envelope stress response activated by peptidoglycan stress and contributes to intrinsic antibiotic resistance. J. Bacteriol. 190:2065-2074.
  • 53. Law, E. C., N. J. Savery, and S. J. Busby. 1999. Interactions between the Escherichia coli cAMP receptor protein and the C-terminal domain of the alpha subunit of RNA polymerase at class I promoters. Biochem. J. 337:415-423.
  • 54. Le Gall, T., P. Darlu, P. Escobar-Paramo, B. Picard, and E. Denamur. 2005. Selection-driven transcriptome polymorphism in Escherichia coli/Shigella species. Genome Res. 15:260-268.
  • 55. Li, H., V. Rhodius, C. Gross, and E. D. Siggia. 2002. Identification of the binding sites of regulatory proteins in bacterial genomes. Proc. Natl. Acad. Sci. USA 99:11772-11777.
  • 56. Li, Q. Z., and H. Lin. 2006. The recognition and prediction of σ70 promoters in Escherichia coli K-12. J. Theor. Biol. 242:135-141.
  • 57. Lloyd, G. S., S. J. Busby, and N. J. Savery. 1998. Spacing requirements for interactions between the C-terminal domain of the alpha subunit of Escherichia coli RNA polymerase and the cAMP receptor protein. Biochem. J. 330:413-420.
  • 58. Losick, R., and C. Desplan. 2008. Stochasticity and cell fate. Science 320:65-68.
  • 59. Lozada-Chavez, I., S. C. Janga, and J. Collado-Vides. 2006. Bacterial regulatory networks are extremely flexible in evolution. Nucleic Acids Res. 34:3434-3445.
  • 60. Madan Babu, M., S. A. Teichmann, and L. Aravind. 2006. Evolutionary dynamics of prokaryotic transcriptional regulatory networks. J. Mol. Biol. 358:614-633.
  • 61. Maithreye, R., R. R. Sarkar, V. K. Parnaik, and S. Sinha. 2008. Delay-induced transient increase and heterogeneity in gene expression in negatively auto-regulated gene circuits. PLoS ONE 3:e2972.
  • 62. Marincs, F., I. W. Manfield, J. A. Stead, K. J. McDowall, and P. G. Stockley. 2006. Transcript analysis reveals an extended regulon and the importance of protein-protein co-operativity for the Escherichia coli methionine repressor. Biochem. J. 396:227-234.
  • 63. Marr, C., M. Geertz, M. T. Hutt, and G. Muskhelishvili. 2008. Dissecting the logical types of network control in gene expression profiles. BMC Syst. Biol. 2:18.
  • 64. McBroom, A. J., A. P. Johnson, S. Vemulapalli, and M. J. Kuehn. 2006. Outer membrane vesicle production by Escherichia coli is independent of membrane instability. J. Bacteriol. 188:5385-5392.
  • 65. Monod, J., J. P. Changeux, and F. Jacob. 1963. Allosteric proteins and cellular control systems. J. Mol. Biol. 6:306-329.
  • 66. Moreno-Hagelsieb, G., and J. Collado-Vides. 2002. A powerful non-homology method for the prediction of operons in prokaryotes. Bioinformatics 18(Suppl. 1):S329-S336.
  • 67. Morett, E., and M. Buck. 1988. NifA-dependent in vivo protection demonstrates that the upstream activator sequence of nif promoters is a protein binding site. Proc. Natl. Acad. Sci. USA 85:9401-9405.
  • 68. Muller, H. M., E. E. Kenny, and P. W. Sternberg. 2004. Textpresso: an ontology-based information retrieval and extraction system for biological literature. PLoS Biol. 2:e309.
  • 69. Munch, R., K. Hiller, A. Grote, M. Scheer, J. Klein, M. Schobert, and D. Jahn. 2005. Virtual Footprint and PRODORIC: an integrative framework for regulon prediction in prokaryotes. Bioinformatics 21:4187-4189.
  • 70. Okuda, S., S. Kawashima, K. Kobayashi, N. Ogasawara, M. Kanehisa, and S. Goto. 2007. Characterization of relationships between transcriptional units and operon structures in Bacillus subtilis and Escherichia coli. BMC Genomics 8:48.
  • 71. Outten, C. E., F. W. Outten, and T. V. O'Halloran. 1999. DNA distortion mechanism for transcriptional activation by ZntR, a Zn(II)-responsive MerR homologue in Escherichia coli. J. Biol. Chem. 274:37517-37524.
  • 72. Perez, A. G., V. E. Angarica, A. T. Vasconcelos, and J. Collado-Vides. 2007. Tractor_DB (version 2.0): a database of regulatory interactions in gamma-proteobacterial genomes. Nucleic Acids Res. 35:D132-D136.
  • 73. Perez-Rueda, E., and J. Collado-Vides. 2000. The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12. Nucleic Acids Res. 28:1838-1847.
  • 74. Pinkney, M., and J. G. Hoggett. 1988. Binding of the cyclic AMP receptor protein of Escherichia coli to RNA polymerase. Biochem. J. 250:897-902.
  • 75. Pribnow, D. 1975. Nucleotide sequence of an RNA polymerase binding site at an early T7 promoter. Proc. Natl. Acad. Sci. USA 72:784-788.
  • 76. Price, M. N., P. S. Dehal, and A. P. Arkin. 2008. Horizontal gene transfer and the evolution of transcriptional regulation in Escherichia coli. Genome Biol. 9:R4.
  • 77. Price, M. N., P. S. Dehal, and A. P. Arkin. 2007. Orthologous transcription factors in bacteria have different functions and regulate different genes. PLoS Comput. Biol. 3:1739-1750.
  • 78. Rajewsky, N., N. D. Socci, M. Zapotocky, and E. D. Siggia. 2002. The evolution of DNA regulatory regions for proteo-gamma bacteria by interspecies comparisons. Genome Res. 12:298-308.
  • 79. Ravasz, E., A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A. L. Barabasi. 2002. Hierarchical organization of modularity in metabolic networks. Science 297:1551-1555.
  • 80. Reed, J. L., and B. O. Palsson. 2003. Thirteen years of building constraint-based in silico models of Escherichia coli. J. Bacteriol. 185:2692-2699.
  • 81. Rodriguez-Penagos, C., H. Salgado, I. Martinez-Flores, and J. Collado-Vides. 2007. Automatic reconstruction of a bacterial regulatory network using Natural Language Processing. BMC Bioinform. 8:293.
  • 82. Rogozin, I. B., K. S. Makarova, D. A. Natale, A. N. Spiridonov, R. L. Tatusov, Y. I. Wolf, J. Yin, and E. V. Koonin. 2002. Congruent evolution of different classes of non-coding DNA in prokaryotic genomes. Nucleic Acids Res. 30:4264-4271.
  • 83. Ross, W., S. E. Aiyar, J. Salomon, and R. L. Gourse. 1998. Escherichia coli promoters with UP elements of different strengths: modular structure of bacterial promoters. J. Bacteriol. 180:5375-5383.
  • 84. Salgado, H., G. Moreno-Hagelsieb, T. F. Smith, and J. Collado-Vides. 2000. Operons in Escherichia coli: genomic analyses and predictions. Proc. Natl. Acad. Sci. USA 97:6652-6657.
  • 85. Samal, A., S. Singh, V. Giri, S. Krishna, N. Raghuram, and S. Jain. 2006. Low degree metabolites explain essential reactions and enhance modularity in biological networks. BMC Bioinform. 7:118.
  • 86. Sangurdekar, D. P., F. Srienc, and A. B. Khodursky. 2006. A classification based framework for quantitative description of large-scale microarray data. Genome Biol. 7:R32.
  • 87. Schaechter, M., and F. Neidhardt. 1987. Introduction, p. 1-2. In F. C. Neidhardt, J. Ingraham, K. B. Low, B. Magasanik, M. Schaechter, and H. E. Umbarger (ed.), Cellular and molecular biology: Escherichia coli and Salmonella, 1st ed. ASM Press, Washington, DC.
  • 88. Selinger, D. W., R. M. Saxena, K. J. Cheung, G. M. Church, and C. Rosenow. 2003. Global RNA half-life analysis in Escherichia coli reveals positional patterns of transcript degradation. Genome Res. 13:216-223.
  • 89. Seshasayee, A. S. 2007. An assessment of the role of DNA adenine methyltransferase on gene expression regulation in E. coli. PLoS ONE 2:e273.
  • 90. Shen-Orr, S. S., R. Milo, S. Mangan, and U. Alon. 2002. Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 31:64-68.
  • 91. Sinoquet, C., S. Demey, and F. Braun. 2008. Large-scale computational and statistical analyses of high transcription potentialities in 32 prokaryotic genomes. Nucleic Acids Res. 36:3332-3340.
  • 92. Soupene, E., W. C. van Heeswijk, J. Plumbridge, V. Stewart, D. Bertenthal, H. Lee, G. Prasad, O. Paliy, P. Charernnoppakul, and S. Kustu. 2003. Physiological studies of Escherichia coli strain MG1655: growth defects and apparent cross-regulation of gene expression. J. Bacteriol. 185:5611-5626.
  • 93. Stoyanov, J. V., J. L. Hobman, and N. L. Brown. 2001. CueR (YbbI) of Escherichia coli is a MerR family regulator controlling expression of the copper exporter CopA. Mol. Microbiol. 39:502-511.
  • 94. Stromback, L., and P. Lambrix. 2005. Representations of molecular pathways: an evaluation of SBML, PSI MI and BioPAX. Bioinformatics 21:4401-4407.
  • 95. Strong, M., P. Mallick, M. Pellegrini, M. J. Thompson, and D. Eisenberg. 2003. Inference of protein function and protein linkages in Mycobacterium tuberculosis based on prokaryotic genome organization: a combined computational approach. Genome Biol. 4:R59.
  • 96. Tan, K., G. Moreno-Hagelsieb, J. Collado-Vides, and G. D. Stormo. 2001. A comparative genomics approach to prediction of new members of regulons. Genome Res. 11:566-584.
  • 97. Teichmann, S. A., and M. M. Babu. 2004. Gene regulatory network growth by duplication. Nat. Genet. 36:492-496.
  • 98. Thieffry, D., H. Salgado, A. M. Huerta, and J. Collado-Vides. 1998. Prediction of transcriptional regulatory sites in the complete genome sequence of Escherichia coli K-12. Bioinformatics 14:391-400.
  • 99. Thomas-Chollier, M., O. Sand, J. V. Turatsinze, R. Janky, M. Defrance, E. Vervisch, S. Brohee, and J. van Helden. 2008. RSAT: regulatory sequence analysis tools. Nucleic Acids Res. 36:W119-W127.
  • 100. Tjaden, B. 2006. An approach for clustering gene expression data with error information. BMC Bioinform. 7:17.
  • 101. Tjaden, B., R. M. Saxena, S. Stolyar, D. R. Haynor, E. Kolker, and C. Rosenow. 2002. Transcriptome analysis of Escherichia coli using high-density oligonucleotide probe arrays. Nucleic Acids Res. 30:3732-3738.
  • 102. Tobes, R., and J. L. Ramos. 2002. AraC-XylS database: a family of positive transcriptional regulators in bacteria. Nucleic Acids Res. 30:318-321.
  • 103. Tran, T. T., P. Dam, Z. Su, F. L. Poole II, M. W. Adams, G. T. Zhou, and Y. Xu. 2007. Operon prediction in Pyrococcus furiosus. Nucleic Acids Res. 35:11-20.
  • 104. Turatsinze, J. V., M. Thomas-Chollier, M. Defrance, and J. van Helden. 2008. Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules. Nat. Protoc. 3:1578-1588.
  • 105. van Helden, J. 2003. Regulatory sequence analysis tools. Nucleic Acids Res. 31:3593-3596.
  • 106. Veber, P., C. Guziolowski, M. Le Borgne, O. Radulescu, and A. Siegel. 2008. Inferring the role of transcription factors in regulatory networks. BMC Bioinform. 9:228.
  • 107. Veiga, D. F., F. F. Vicente, M. F. Nicolas, and A. T. Vasconcelos. 2008. Predicting transcriptional regulatory interactions with artificial neural networks applied to E. coli multidrug resistance efflux pumps. BMC Microbiol. 8:101.
  • 108. Visick, K. L., T. M. O'Shea, A. H. Klein, K. Geszvain, and A. J. Wolfe. 2007. The sugar phosphotransferase system of Vibrio fischeri inhibits both motility and bioluminescence. J. Bacteriol. 189:2571-2574.
  • 109. Wade, J. T., N. B. Reppas, G. M. Church, and K. Struhl. 2005. Genomic analysis of LexA binding reveals the permissive nature of the Escherichia coli genome and identifies unconventional target sites. Genes Dev. 19:2619-2630.
  • 110. Wall, M. E., W. S. Hlavacek, and M. A. Savageau. 2004. Design of gene circuits: lessons from bacteria. Nat. Rev. Genet. 5:34-42.
  • 111. Weber, A., and K. Jung. 2002. Profiling early osmostress-dependent gene expression in Escherichia coli using DNA microarrays. J. Bacteriol. 184:5502-5507.
  • 112. Whitehead, A. N. 1967. Adventures of ideas. Free Press, New York, NY.
  • 113. Xiao, G., B. Martinez-Vaz, W. Pan, and A. B. Khodursky. 2006. Operon information improves gene expression estimation for cDNA microarrays. BMC Genomics 7:87.
  • 114. Yooseph, S., G. Sutton, D. B. Rusch, A. L. Halpern, S. J. Williamson, K. Remington, J. A. Eisen, K. B. Heidelberg, G. Manning, W. Li, L. Jaroszewski, P. Cieplak, C. S. Miller, H. Li, S. T. Mashiyama, M. P. Joachimiak, C. van Belle, J. M. Chandonia, D. A. Soergel, Y. Zhai, K. Natarajan, S. Lee, B. J. Raphael, V. Bafna, R. Friedman, S. E. Brenner, A. Godzik, D. Eisenberg, J. E. Dixon, S. S. Taylor, R. L. Strausberg, M. Frazier, and J. C. Venter. 2007. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol. 5:e16.
  • 115. Zheng, D., C. Constantinidou, J. L. Hobman, and S. D. Minchin. 2004. Identification of the CRP regulon using in vitro and in vivo transcriptional profiling. Nucleic Acids Res. 32:5874-5893.
  • 116. Zhou, Y., T. J. Merkel, and R. H. Ebright. 1994. Characterization of the activating region of Escherichia coli catabolite gene activator protein (CAP). II. Role at class I and class II CAP-dependent promoters. J. Mol. Biol. 243:603-610.

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)
