Nucleic Acids Research. 2009 Mar 17;37(9):2926–2939. doi: 10.1093/nar/gkp164

Expression profiling of hypothetical genes in Desulfovibrio vulgaris leads to improved functional annotation

Dwayne A Elias 1,*, Aindrila Mukhopadhyay 2, Marcin P Joachimiak 2, Elliott C Drury 1, Alyssa M Redding 2, Huei-Che B Yen 1, Matthew W Fields 3, Terry C Hazen 4, Adam P Arkin 2, Jay D Keasling 2, Judy D Wall 1
PMCID: PMC2685097  PMID: 19293273

Abstract

Hypothetical (HyP) and conserved HyP genes account for >30% of sequenced bacterial genomes. For the sulfate-reducing bacterium Desulfovibrio vulgaris Hildenborough, 347 of the 3634 genes were annotated as conserved HyP (9.5%) along with 887 HyP genes (24.4%). Given the large fraction of the genome, it is plausible that some of these genes serve critical cellular roles. The study goals were to determine which genes were expressed and provide a more functionally based annotation. To accomplish this, expression profiles of 1234 HyP and conserved genes were used from transcriptomic datasets of 11 environmental stresses, complemented with shotgun LC–MS/MS and AMT tag proteomic data. Genes were divided into putatively polycistronic operons and those predicted to be monocistronic, then classified by basal expression levels and grouped according to changes in expression for one or multiple stresses. One thousand two hundred and twelve of these genes were transcribed with 786 producing detectable proteins. There was no evidence for expression of 17 predicted genes. Except for the latter, monocistronic gene annotation was expanded using the above criteria along with matching Clusters of Orthologous Groups. Polycistronic genes were annotated in the same manner with inferences from their proximity to more confidently annotated genes. Two targeted deletion mutants were used as test cases to determine the relevance of the inferred functional annotations.

INTRODUCTION

The application of genome sequencing and sequence annotation to numerous bacterial species has yielded a ‘road map’ for several avenues of research. These include the incorporation of gene expression changes at both the mRNA and protein levels (1–5) with physiological and metabolic studies to deduce the behavior of the microbe as a whole, a field now called Systems Microbiology. Other approaches to discern function include genetic manipulations such as gene/protein tagging for the identification and visualization of protein complexes (6–8) and deletion mutagenesis (9–12) for confirming the function(s) of a given gene or protein. One aspect that has come to light through the sequencing of more than 780 completed bacterial and archaeal genomes (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi), with an additional 2179 ongoing (http://www.genomesonline.org/gold.cgi), is that approximately one-third of all of the genes within a given genome are typically predicted to encode hypothetical (HyP) and conserved HyP proteins (13). HyP proteins are defined as those with no significant sequence similarity (i.e. homology) to any characterized or uncharacterized predicted proteins, while conserved hypothetical (CHyP) proteins are those that have significant similarity to a predicted protein in another species or strain without direct evidence of the expression of the gene as defined by TIGR (http://cmr.jcvi.org/tigr-scripts/CMR/CmrHomePage.cgi).

This lack of similarity to well-characterized genes and proteins increases the interest in these groups of sequences, as they may well be important to specialized cellular physiology and metabolism, may possess unique functions, or complete tentatively assigned pathways. In the case of HyP, their potential existence is only supported by the output of gene prediction programs such as GLIMMER (14,15). More recently however, computational tools for determining plausible functions of gene products have been developed. These include methods such as genomic context analysis (16–19) and phylogenomic profiling (20,21). Such tools are still of limited application for HyP or CHyP due to the lack of sequence homology for the gene or protein of interest within a well-characterized, sequenced genome. Hence, a working knowledge of the function(s) of the gene products encoded is expected to lead to a greater mechanistic understanding of the metabolic capabilities of a target microorganism and possibly to a better physiological knowledge of other organisms through gene sequence and neighbourhood association.

As a first step to the identification of gene function, it is important to determine if the genes are actually expressed, if expression results in production of a functional protein, and under which conditions the genes and proteins may be differentially regulated or influenced. Experiments to obtain expression information have applied a combination of transcriptomic and proteomic methods. In Haemophilus influenzae, 54 HyP genes of a total 1744 genomic loci were found to be expressed with both methods of analysis and used to assign specific functions for 16 genes and general roles for another 27 genes (13). More recently, the metal-reducing bacterium Shewanella oneidensis MR-1 was the focus of three investigations (22–24). In this organism, 40% of the sequenced genome consists of genes predicted to encode HyP and CHyP. The first study of S. oneidensis MR-1 by Kolker and co-workers (24) showed the actual transcription of >500 HyP and CHyP genes with at least a general functional assignment for 240 of them. In a related study by Elias et al. (22) of only HyP, comprehensive MS-based proteomics data were queried across many culture conditions, and the results confirmed the existence of 262 predicted proteins. Additionally, inferences were made for the subcellular localization and function from differential expression in these discrete culturing conditions (22). A study of the CHyP in this same organism by Elias et al. (23) confirmed protein production from 758 such genes with improved functional assignments that were also inferred in part from the culturing conditions of expression (23).

The sulfate-reducing bacterium (SRB) Desulfovibrio vulgaris Hildenborough has a sequenced genome (25), and several physiological and metabolic studies have taken advantage of this information (3,26–29). This bacterium is well known as a model SRB (30) and is known to reduce and immobilize metals and radionuclides (31–33) making it of interest for bioremediation efforts. Additionally, D. vulgaris Hildenborough was recently the focus of an effort to assign putative functions to predicted HyP from proteomics data obtained through LC-MS/MS analysis of cultures grown with different electron donors (34). These assignments were based not only on proteomics data but also relied on a number of computational methods and resulted in the re-annotation of 20 CHyP genes and the confirmation of gene expression for 129 HyP genes.

In the current study, we have incorporated the previously published results and expanded the analyses. Transcriptomics data derived from 11 different stresses as well as corresponding shotgun LC-MS/MS proteomics data from selected stress conditions were used to query the expression of each HyP or CHyP gene. Some of these data have already been published as part of a genome-wide transcriptional response to a particular stress condition. These published data include responses to heat shock (26), high salt (28), nitrate or nitrite exposure (35,36), stationary versus exponential growth (2), high pH (37), effect of deleting the Fur gene (10) and O2 exposure (38). We have incorporated the HyP and CHyP microarray and proteomics data from these studies as well as currently unpublished results. The latter conditions include peroxide stress in the wild type and a mutant lacking the perR gene; a strain lacking pDV1, the 202 kb native plasmid found in D. vulgaris Hildenborough, versus the wild-type D. vulgaris that has pDV1; a co-culture with a methanogen versus D. vulgaris alone, acidic conditions, cold stress, chromium exposure, NaCl adaptation and Fe(II)-limitation. The microarray datasets for all HyP and CHyP genes in published studies have been made publicly available (http://genomics.lbl.gov/supplemental/DvH-hypotheticals-090223/).

These data were compiled and the genes categorized by basal expression rates, by the presence versus the absence of differential gene expression in response to a particular stress, by whether the response was specific to a single stress or seen in multiple stress conditions, and by the operon status of the genes (monocistronic or polycistronic). For the latter, differential expression in a given stress that coordinated with the rest of the operon was also considered. Additionally, bioinformatic information such as Clusters of Orthologous Groups (COG) and subcellular location was used for functional annotation. This may be the first such comprehensive investigation utilizing mRNA microarrays and proteomics to infer a more robust functional annotation of HyP and CHyP genes from such a large number of stress conditions.

MATERIALS AND METHODS

Biomass production

Desulfovibrio vulgaris Hildenborough (ATCC 29579) was grown in a defined lactate (60 mM)/sulfate (50 mM) medium, LS4D (28), under a variety of different stress conditions as have been reported (28,35,36). The chilled samples were harvested via centrifugation, flash frozen in liquid nitrogen and stored at −80°C until analysis.

Culture maintenance

D. vulgaris cultures from the American Type Culture Collection (ATCC) were grown to mid log phase in 1 liter of LS4D, checked for purity by the appearance of anaerobic SRB colonies on LS4D plates as well as the absence of colonies on aerobic glucose plates, dispensed into 2 ml cryogenic vials (Nalgene) with 0.5 ml 30% (vol/vol) glycerol and frozen at −80°C until used as previously described (28). To minimize phenotypic drift from repetitive culturing, all experiments used cells that were started from frozen stocks and were fewer than three subcultures from the original ATCC culture. All experiments, inoculations and transfers were performed in an anaerobic glove chamber (Coy Laboratory Products Inc., Grass Lake, MI) with an atmosphere of ∼5% CO2, 5% H2 and 90% N2.

Microarray transcriptomic and data analysis

DNA microarrays using 70-mer oligonucleotide probes covering 3482 of the 3534 annotated protein-coding sequences of the D. vulgaris genome were constructed as previously described (39). Of the 52 genes not found on the microarrays, 14 were either HyP or CHyP (under Expression Category; Supplementary Tables 5 and 6). Total RNA extraction, purification and labelling with the fluorophore Cy5-dUTP (Amersham Biosciences, Piscataway, NJ) were performed independently on each cell sample using previously described protocols (38). Genomic DNA was extracted from D. vulgaris cultures and hybridized as previously described (36). Microarray data analyses were performed using gene models from NCBI. All mRNA changes were assessed with total genomic DNA (gDNA) as a control for each of the experimental and control hybridizations. Log2 ratios of mRNA to gDNA hybridized to gene oligonucleotides and z-scores were computed as previously described (9). A mean |log2 ratio| cutoff across time points of ≥1.2 and an accompanying |z-score| of ≥2 were used to identify the genes whose expression changed significantly.


Proteomics and proteomic data analyses

Shotgun LC-MS/MS analysis

Sample preparation, chromatography and mass spectrometry for shotgun LC–MS/MS proteomics were performed as described previously (35). Briefly, frozen cell pellets from triplicate 50 ml cultures were thawed and pooled prior to cell lysis. Cells were lysed via sonication in lysis buffer, composed of 4 M urea with 500 mM triethylammonium bicarbonate, pH 8.5 (Sigma-Aldrich, St Louis, MO), and the clarified lysate was used as total cellular protein. Sample denaturation, reduction, blocking, digestion and labelling with isobaric reagents were performed according to the manufacturer's directions (Applied Biosystems, Framingham, MA). Strong cation exchange chromatography was used to separate the iTRAQ-labelled samples into 21–23 salt fractions. Fractions were desalted using C18 MacroSpin Columns (Nest Group, Southborough, MA), dried and separated on a PepMap100 C18 reverse phase nano-LC-MS column (Dionex-LC Packings, Sunnyvale, CA) using an Ultimate HPLC with Famos Autosampler and Switchos Micro Column Switching Module coupled with an ESI-QTOF mass analyser (QSTAR® Hybrid Quadrupole TOF, Applied Biosystems) as previously described (26). A 2 h gradient from 0 to 25% acetonitrile was used. Two product ion scans were collected from each cycle with a 1 s accumulation time. A threshold of 50 counts was required for ions to be selected for fragmentation. Parent ions and their isotopes were excluded from further selection for 1 min, with a mass tolerance of 100 ppm.

Collected mass spectra were analysed using Analyst 1.1 with ProQuant 1.1, ProGroup 1.0.6 (Applied Biosystems) and MASCOT version 2.1 (Matrix Science, Inc, Boston, USA). A FASTA file containing all the putative open reading frame (ORF) sequences of D. vulgaris, obtained from Microbes Online (http://www.microbesonline.org) (40), was used to form the theoretical search database along with the common impurities trypsin, keratin, cytochrome c and bovine serum albumin. The same search parameters were used in both programs; namely, trypsin was designated as the cleavage enzyme, a maximum of one missed cleavage was allowed, mass tolerances of 0.1 for mass spectrometry and 0.15 for tandem mass spectrometry were allowed, and charge states from +2 to +4 were searched. Only proteins identified by at least two unique peptides in at least one of the data sets at greater than 95% confidence by both ProQuant and MASCOT were considered.
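The acceptance rule in the preceding paragraph can be illustrated with a minimal sketch. It assumes that the 95% confidence requirement is applied per peptide and that each peptide record carries one confidence value from each search engine; the `PeptideHit` record and its field names are hypothetical and are not part of ProQuant or MASCOT output.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    # Hypothetical record layout, for illustration only.
    protein: str          # locus tag, e.g. "DVU0001"
    peptide: str          # peptide sequence
    dataset: str          # name of the stress data set
    proquant_conf: float  # percent confidence from the first engine
    mascot_conf: float    # percent confidence from the second engine


def confident_proteins(hits, min_unique=2, min_conf=95.0):
    """Return proteins with at least `min_unique` distinct peptides in at
    least one data set, counting only peptides above `min_conf` in both
    search engines."""
    peptides = {}  # (protein, dataset) -> set of accepted peptide sequences
    for h in hits:
        if h.proquant_conf > min_conf and h.mascot_conf > min_conf:
            peptides.setdefault((h.protein, h.dataset), set()).add(h.peptide)
    return {prot for (prot, _), seqs in peptides.items()
            if len(seqs) >= min_unique}
```

Under this reading, a protein seen with one peptide in each of two data sets is not accepted, because the two-peptide requirement must be met within a single data set.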

AMT tag analysis

Whole-cell lysis via bead beating and whole-cell lysate tryptic digestion were performed as described previously (22,23). Separation of insoluble (i.e. membrane bound/associated) from soluble proteins in whole cell lysates was achieved with ultracentrifugation (356 000 × g, 10 min, 4°C) as described elsewhere (22,23). The capillary LC system and controller, in-house manufactured mixer, capillary column selector and sample loop for manual injections as well as separations are also as previously described (22,23).

All samples were analysed as previously described (22,23,41). The collision-induced dissociation (CID) tandem mass spectra from the LC-ion trap mass spectrometer measurements were analysed with SEQUEST (42) using the protein sequences deduced from the D. vulgaris Hildenborough genome sequence (25). All samples were analysed by a 9.4-T FTICR-MS (Bruker Daltonics) as described previously (43). Mass spectra were acquired with ∼10⁵ resolution.

High-stringency constraints were used in the filtering of the data to maximize peptide identification confidence as described previously (22,23). All peptides were required to be fully or partially tryptic. In order to gauge the confidence of MS peak matching from the FTICR data to the SEQUEST™ result, an algorithm to determine the quality of the match score, termed the ‘discriminant score’, was used (23). This scoring system computed a measure of confidence for each observation of each peptide via an extension of the approach described by Aebersold and co-workers (44). This incorporated the predicted central normalized elution time (NET) values instead of filtering out low-confidence peptides solely using observed and predicted NET values. It also utilized several SEQUEST™ scoring parameters (peptide cleavage state, difference in observed and computed mass, difference in observed and predicted NET and other indicators) to compute a confidence score for each peptide identification. This eliminated a fixed score threshold, e.g. a SEQUEST™ Xcorr value of two, to filter peptides for inclusion in a database. The advantage was that a discriminant-based score is less likely to discard peptide identifications than a score based upon threshold criteria. Incorporation of NET data improved peptide identification confidence by ∼10% compared to not using elution information (45). At least one high-confidence ‘unique’ peptide (i.e. mapping to only one possible parent protein) and a total of two peptides were required for protein identification in each AMT tag analysis.

The FTICR data were processed using the PRISM data analysis system as described previously (22). Since the separation systems for both FTICR and LCQ analyses were identical, peptide confirmation was based on both the calculated (from the mass tag database) and measured mass (from the FTICR analysis) of the peptide matching to within 6 ppm and the elution times matching to within 5%.
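The two tolerances above can be expressed as a short check. This is a sketch under stated assumptions rather than the PRISM implementation: elution times are taken to be normalized elution times on a 0 to 1 scale, so that "within 5%" is read as an absolute difference of 0.05, and the function and parameter names are hypothetical.

```python
def ppm_error(measured_mass, calculated_mass):
    """Signed mass error in parts per million."""
    return (measured_mass - calculated_mass) / calculated_mass * 1e6


def matches_mass_tag(measured_mass, measured_net, tag_mass, tag_net,
                     max_ppm=6.0, max_net_diff=0.05):
    """True when a measured feature agrees with a database mass tag in both
    mass (within max_ppm) and normalized elution time (within max_net_diff,
    an assumed interpretation of the 5% window)."""
    mass_ok = abs(ppm_error(measured_mass, tag_mass)) <= max_ppm
    net_ok = abs(measured_net - tag_net) <= max_net_diff
    return mass_ok and net_ok
```

For a peptide of calculated mass 1500.000, a 6 ppm window corresponds to roughly ±0.009 mass units, so a feature measured at 1500.006 would pass the mass criterion while one at 1500.012 would not.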

Expression categorization of HyP and CHyP proteins

Each HyP or CHyP gene was identified and sorted as monocistronic or part of a polycistronic operon. This distinction allowed functional inferences for the latter, based on their expression responses to stresses and their association with characterized genes in the same operon.

Each of these two groups of genes was then categorized solely on the basis of the microarray expression profiles. The first categorization divided genes into those that exhibited ‘high expression’, where the basal expression levels were within the top 1/8th (12.5%) of all gene transcript levels, and those with ‘normal expression’, where basal expression was below the top 12.5%. The basal gene expression level was determined by calculating the mean log2 ratio of mRNA to gDNA hybridization intensities, normalized as described earlier, for all microarray experiments. With this method, a more negative log2 value (e.g. −14.9) indicated a smaller degree of absolute expression while a less negative number (e.g. −10.5) indicated a more highly expressed gene (9). Genes were categorized as ‘not expressed’ if their second highest observed mean log2 ratio of mRNA to gDNA hybridization intensity on any individual array was <−14.0, an arbitrary cutoff determined by visual inspection of probability density distributions for HyP + CHyP genes compared to annotated genes. Each of the basal expression groups was then further subdivided into the following differential gene expression categories: (1) expressed genes that lacked differential expression in response to any of the stress conditions, (2) those that showed differential expression to only one stress, (3) those that showed differential expression to multiple stresses (‘multiple stress response proteins’) and (4) ‘not expressed’, which included those genes that showed no expression under any of the tested conditions.
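The basal expression rules described above can be sketched as follows. The sketch assumes that each gene is represented by a list of per-array log2(mRNA/gDNA) values and that the top 1/8th cutoff is taken as the lowest mean among the top eighth of genes; the function names are hypothetical and the code is not the authors' pipeline.

```python
from statistics import mean

NOT_EXPRESSED_CUTOFF = -14.0  # log2(mRNA/gDNA) cutoff quoted in the text


def top_eighth_cutoff(basal_levels):
    """Lowest basal level that still falls within the top 1/8th of genes."""
    ranked = sorted(basal_levels, reverse=True)
    n_top = max(1, len(ranked) // 8)
    return ranked[n_top - 1]


def basal_category(per_array_log2, high_cutoff):
    """Classify one gene from its per-array log2 ratios (needs >= 2 arrays)."""
    second_highest = sorted(per_array_log2, reverse=True)[1]
    if second_highest < NOT_EXPRESSED_CUTOFF:
        return "not expressed"
    if mean(per_array_log2) >= high_cutoff:
        return "high expression"
    return "normal expression"
```

Using the second highest array value means that a single array above the cutoff is not sufficient for a gene to count as expressed.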

Differential gene expression in response to either single or multiple stresses was determined by the observation of a minimum |log2 R| value ≥1.2 and a corresponding |z-score| ≥2 as compared to the control condition. Because samples were analysed at several time points after the induction of the stress condition, we required that this differential level of expression be observed in at least 20% of the time points to be further considered. If these parameters were met for only one of the stress conditions, then the gene was placed into the ‘single stress response’ category. If the gene met these criteria in two or more of the 11 stress conditions, then it was placed in the ‘multiple stress response’ category. In either case, the current annotation was retained for genes not meeting this criterion since every conceivable growth condition was not tested, making it premature to classify these predicted genes as ‘non-coding gene region’.
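As an illustration of these criteria, the sketch below classifies one gene from its time-course data. It assumes that each stress condition is represented by a list of (log2 R, z-score) pairs, one per time point; this data layout and the function names are hypothetical.

```python
def responds_to_stress(time_points, min_abs_log2r=1.2, min_abs_z=2.0,
                       min_fraction=0.20):
    """True when enough time points pass both thresholds for one stress."""
    if not time_points:
        return False
    n_significant = sum(
        1 for log2r, z in time_points
        if abs(log2r) >= min_abs_log2r and abs(z) >= min_abs_z
    )
    return n_significant / len(time_points) >= min_fraction


def stress_response_category(profiles):
    """Classify a gene from a mapping of stress name -> time-point list."""
    n_stresses = sum(responds_to_stress(tp) for tp in profiles.values())
    if n_stresses == 0:
        return "no differential expression"
    if n_stresses == 1:
        return "single stress response"
    return "multiple stress response"
```

Both thresholds must be met at the same time point, so a large fold change with a low z-score does not count towards the 20% requirement.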

Deletion mutagenesis

Specific HyP or CHyP genes were selected for targeted deletion based upon the microarray datasets. The first target was the monocistronic HyP gene DVUA0095, which is on the native plasmid of the organism and responded only to chromate stress. The second was a pair of genes (DVU0303 and DVU0304) currently annotated as an operon on the main chromosome. The expression of these HyP genes was predicted and found to be directly or indirectly influenced by the Fur regulon (10) and exhibited differential expression in virtually all stress conditions tested.

Deletion cassette construction

Deletion cassettes were constructed by a method similar to the molecular bar-coding methods described for Saccharomyces cerevisiae (46,47). PCR primer sets were designed to amplify ∼800 bp up- and downstream of the selected ORF with unique barcode sequences between the common sequences and Kmr sequences (Supplementary Table 1). The PCR mixtures, marker exchange procedures, transformation and mutant selection procedures including Southern-blot analyses were performed as previously described (10). The three segments (the regions up- and downstream of the gene of interest and the kanamycin cassette) were individually amplified by PCR and then ligated by a fourth SOEing PCR. This mutagenic cassette was then introduced into D. vulgaris via electroporation, where a double recombination event replaced the target gene with the drug-resistance gene.

Physiological assessment of mutants

The deletion mutants were tested for growth compared to wild-type D. vulgaris under the same conditions used in the original stress experiments from which the microarray and proteomic datasets were generated, along with growth in LS4D medium at pH 7.2 (28) as a control. Amendments and modifications for the stress conditions included the addition of 250 mM NaCl (salt stress), lowering the pH to 5.5 (acid stress), addition of 100 mM or 150 mM sodium nitrate (nitrate stress), 1 mM or 2 mM sodium nitrite (nitrite stress) and 0.2 mM, 0.4 mM or 0.45 mM potassium chromate (chromate stress). Optical density (A600) measurements were taken periodically up to 400 h in duplicate cultures.

Sequence analysis

Protein sequence similarity was determined using FastBLAST (48) with an e-value threshold of 0.01 and an effective database size equal to 2.23 × 10⁷. D. vulgaris Hildenborough protein sequences obtained from RefSeq (release 28 March 2008) (49) were searched against the non-redundant protein database from NCBI (as of May 15, 2008) (ftp://ftp.ncbi.nih.gov/blast/). Operon predictions (50) and COG (51) assignments were based on MicrobesOnline (40) data from the 7 April 2008 release [including the November 2007 release of the NCBI Conserved Domain Database (52)].

Homology searches and putative functional assignments

Several publicly available in silico tools were utilized in an attempt to assign a more detailed putative function to each of the HyP or CHyP genes that were expressed according to the microarray experiments. The first tool used was PSORTb version 2.0.4 (http://www.psort.org/psortb/) that was set for Gram negative organisms (53,54). This tool predicts the subcellular location of a given protein by estimating the presence and number of trans-membrane helices, the presence of signal pathway motifs as well as other parameters. These tools were used along with the final localization estimate in conjunction with the microarray and proteomic data to give the most accurate functional annotation possible. The second method was TMHMM (55) (http://www.cbs.dtu.dk/services/TMHMM-2.0/) that predicts the number of transmembrane helices and determines if the protein is inside or outside the cytoplasm of the cell and was used to corroborate the findings with PSORTb. Other bioinformatics tools such as those used previously (34) were attempted for several HyP and CHyP genes other than those already reported (34), but the results were ambiguous and so they were not pursued for this work.

Statistical comparisons of basal expression distributions

Probability density plots were created in the statistical software environment R (http://www.r-project.org/) with probability densities estimated by smoothing with a Gaussian kernel. Statistical tests for differences in expression level distributions were computed in R using the two-sided Mann–Whitney test with a continuity correction in the normal approximation for the P-value.
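For readers who want to see what this test computes, the following is a minimal pure-Python sketch of a two-sided Mann–Whitney test using the normal approximation with tie and continuity corrections. It is intended to mirror the general form of the calculation, not to reproduce the R code used in the study, and the function name is hypothetical.

```python
import math


def mann_whitney_two_sided(x, y):
    """Return (U statistic for x, two-sided P-value) from the normal
    approximation with a continuity correction and a tie correction."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0      # average of ranks i+1 .. j
        t = j - i                         # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u, 1.0

    diff = u - mu
    correction = 0.5 if diff > 0 else (-0.5 if diff < 0 else 0.0)
    z = (diff - correction) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return u, min(1.0, p)
```

In this setting, `x` and `y` would hold the basal log2 expression values of the two gene groups being compared, for example HyP and CHyP genes versus better annotated genes.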

RT–PCR HyP and CHyP basal expression

Eight genes were selected for reverse transcription PCR (RT–PCR) in order to provide a biological verification of the microarray results. These genes were selected across the range of average basal expression levels, with emphasis on the lower end, so that if all these genes were expressed according to RT–PCR, then the assumption could be made that most if not all of the other genes were expressed as well. The positive control was the constitutively expressed dsrC gene (DVU2776) with an average basal expression rate of −9.7, which is above the top 1/8th cutoff of −11.8 and so places it in the ‘highly expressed’ category. The test genes included the chromosomal genes DVU1127 (−17.1), DVU1721 (−16.6), DVU1723 (−16.6), DVU2456 (−7.6) and DVU2880 (−16.4) as well as the native plasmid genes DVUA0070 (−9.5), DVUA0144 (−15.0) and DVUA0146 (−11.4). Two negative test genes were also included. These were DVU1526, for which expression has yet to be detected via either microarrays or proteomics, and DVUA0044, which was not on the microarrays and has also not been detected by proteomics. D. vulgaris cells were cultured and harvested as above. Total RNA was isolated immediately as described earlier and DNA removed using three treatments of the ‘DNA-free’ DNAse removal kit (Applied Biosystems). To ensure the DNA was removed, PCR amplification of DVU0847 (adenylyl-sulphate reductase, α-subunit) and DVU2776 (dissimilatory sulfite reductase, γ-subunit) was performed and yielded no PCR product (data not shown). cDNA was produced using the ImProm II Reverse Transcription System A3800 (Promega). PCR reactions were then conducted for the eight test genes, the two negative controls and two positive controls (DVU2776 and DVU0847). The primers were designed to amplify as much of the gene sequence as possible without any upstream or downstream sequence (Supplementary Table 2).

RESULTS AND DISCUSSION

Global detection of HyP and CHyP gene-expression products

The sequenced genome of D. vulgaris shows an expected 887 HyP and 347 CHyP genes for a total of 1234 possible gene products. In general, mRNA was confidently detected for 1212 of these genes using microarrays, thus indicating actual expression of the gene (Table 1). Additionally, shotgun LC–MS/MS and AMT tag proteomic analysis further confirmed the expression of 786 proteins from HyP or CHyP genes. This represents the detection of gene expression for over 99% of the annotated HyP and over 95% of the CHyP genes with a complementary 471 (53%) of the HyP and 306 (88%) of the CHyP genes detected at the protein level (Table 1; Supplementary Table 3). In comparison, a recent report detailed the expression of 129 HyP and CHyP D. vulgaris genes via proteome analysis, with possible functional reassignments using in silico approaches (34).

Table 1.

HyP and CHyP genes with evidence of expression

| Operon status | Current annotation | Number of possible genes (a) | Transcript identified (b) | Protein identified (c) |
|---|---|---|---|---|
| Polycistronic | Hypothetical | 327 | 324 | 227 |
| Polycistronic | Conserved hypothetical | 220 | 211 | 194 |
| Monocistronic | Hypothetical | 560 | 557 | 247 |
| Monocistronic | Conserved hypothetical | 127 | 120 | 113 |

(a) Computational identification of putative open reading frames was as previously described (25,63).

(b) Transcript evidence obtained from microarray experiments reported by VIMSS/ESPP efforts (2,10,26,28,36,38).

(c) Proteins identified by shotgun LC–MS/MS and/or AMT tag proteomics as previously described (35,64,65).
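The transcript coverage figures quoted in the text can be re-derived from the Table 1 counts with a short tally. The sketch below copies only the possible-gene and transcript columns from the table (the protein column is left out); the dictionary layout and function names are hypothetical.

```python
# (operon status, annotation) -> (possible genes, transcripts identified),
# copied from Table 1.
TABLE_1 = {
    ("polycistronic", "HyP"): (327, 324),
    ("polycistronic", "CHyP"): (220, 211),
    ("monocistronic", "HyP"): (560, 557),
    ("monocistronic", "CHyP"): (127, 120),
}


def tally(group_index):
    """Sum both columns by operon status (index 0) or annotation (index 1)."""
    totals = {}
    for key, (possible, transcribed) in TABLE_1.items():
        entry = totals.setdefault(key[group_index], [0, 0])
        entry[0] += possible
        entry[1] += transcribed
    return totals


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


by_annotation = tally(1)
by_operon = tally(0)
for name, (possible, transcribed) in by_annotation.items():
    print(name, possible, transcribed, percent(transcribed, possible))
```

The tallies recover the 887 HyP and 347 CHyP genes and the 1212 transcribed genes given in the text, with transcript coverage of 99.3% for HyP and 95.4% for CHyP.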

Reverse transcription PCR was conducted in order to corroborate the findings of the microarray and proteomic analysis. Eight HyP and CHyP genes were selected with five being at the lower end of the basal expression range. Two genes (DVU1526 and DVUA0044) were included as negative test cases since all data to date suggest that DVU1526 is not expressed, while DVUA0044 was not on the microarrays and was not detected by either proteomic method. The well-annotated DVU2776 (dsrC) served as the positive control. The RT–PCR revealed that all eight of the test genes yielded bands at the predicted sizes along with the positive control, while the two negative test genes yielded no bands (Figure 1). These data were consistent with the microarray and proteomic data. Furthermore, because five of the eight test cases were among the lowest recorded basal expression rates, the results give increased confidence that at least the large majority of genes deemed to be expressed according to the microarray and proteomics data are indeed expressed.

Figure 1.

RT–PCR confirmation of microarray expression data. Agarose gel showing the results of RT–PCR reactions in order to confirm the expression or lack of expression of various HyP and CHyP genes in D. vulgaris. Circled areas indicate the expected molecular weight band in each case. Eight such genes were selected over a range of average basal expression rates (expression category in brackets) while two genes that showed no expression to date were also selected (broken circles). In both cases, PCR with gDNA ensured that the primers performed as expected. The well-annotated DsrC (DVU2776) served as the cDNA and gDNA control.

Neither proteomic approach gave evidence for protein synthesis from 448 transcribed genes. We sought to determine whether any bias existed that might explain this omission; for example, was there a general difference in expression between well-annotated genes and the HyP and CHyP genes? Basal expression statistical profile comparisons were performed to answer this question. Overall, HyP and CHyP displayed lower gene-expression levels than well characterized proteins (P = 1.5 × 10⁻¹², two-sided Mann–Whitney test; Figure 2A). This was not surprising since the core metabolic genes required for survival are unlikely to be among the HyP and CHyP and so might be expected to be expressed at or above the average levels. However, on an individual basis, the HyP and CHyP genes have appeared amongst the most highly differentially expressed genes under particular stress conditions (2,10,26,28,35,36,38).

Figure 2.

Comparison of the average basal gene expression levels as quantified by microarray analysis. (A) A comparison of the 1212 expressed HyP and CHyP genes (solid line) versus the 2278 better annotated genes (broken line) showed that the former displayed significantly lower expression levels overall; P = 1.5 × 10⁻¹², two-sided Mann–Whitney test. (B) The 774 HyP and CHyP gene-expression products detected at the protein level (broken line) were significantly more abundant than the 438 proteins not confidently identified by either proteomics method (solid line); P = 3.0 × 10⁻⁵, two-sided Mann–Whitney test.

With respect to confident identification of HyP and CHyP by proteomics, those that were identified showed a higher gene-expression level than those not identified (P = 2.9 × 10⁻⁵, two-sided Mann–Whitney test; Figure 2B). Again, this is as expected, since the most abundant proteins should be preferentially identified. Additionally, a determination of any bias in the proteomic data revealed that in a comparison of HyP versus CHyP, monocistronic versus polycistronic, proteins above or below 100 amino acids in length and highly expressed versus all others, both methods underrepresented proteins <100 amino acids in length (Figure 3A and B). Given this information, one possibility was that there was a lack of tryptic cut sites yielding no peptides for mass spectrometric detection. However, a query for the presence of either a lysine (K) or arginine (R) revealed that this was the case in only 3 of the 448 genes that did not have a protein confidently identified by either method (Supplementary Table 4). Curiously, 12 HyP and CHyP genes were confidently identified at the protein level but not with mRNA. Further investigation found that only five of these genes were on the microarrays (DVU0522, DVU1148, DVU1748, DVU2022, DVUA0050) while the remaining seven genes (DVU0509, DVU0797, DVU0833, DVU1852, DVUA0052, DVUA0088, DVUA0145) were not (Supplementary Tables 5 and 6). This suggests that there may have been an issue with the microarray hybridization for these five genes or that transcription was below detection.
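The query for tryptic cut sites can be illustrated with a short sketch. The first function checks only for the presence of K or R, as described above. The second performs a simple in silico digest and additionally assumes the common convention that trypsin does not cleave before proline, which is not stated in the text; both function names are hypothetical.

```python
def has_tryptic_site(protein_seq):
    """True when the sequence contains at least one lysine or arginine."""
    return any(residue in "KR" for residue in protein_seq.upper())


def tryptic_peptides(protein_seq):
    """Cleave after K or R, except when the next residue is P (assumed rule)."""
    seq = protein_seq.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without K/R
    return peptides
```

A short protein with few cut sites can still yield peptides that are too long or too short for confident detection, which is one reason the presence of K or R alone does not guarantee identification.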

Figure 3.

Comparison of HyP and CHyP detection at the protein level using the (A) AMT tag and (B) shotgun LC–MS/MS approaches. The bars are additive between the number of proteins not detected (black bar) and those detected (grey bar). A comparison of each of the subgroups with both methods showed a bias against the detection of proteins under 100 amino acids in length.

Given this information and that one of the goals of this study was to assign a functional annotation to as many of the HyP and CHyP genes as possible, a separation of the polycistronic from monocistronic genes was performed. Such classifications allow the evaluation of the hypothesis that more of the polycistronic HyP and CHyP genes would be expressed compared to the monocistronic ones. We assumed a greater likelihood for a more accurate functional annotation if the gene were in a predicted operon and displayed similar stress response patterns to more confidently annotated genes in that same operon. Hence, monocistronic and polycistronic genes were treated separately for the remainder of the study.

Categorization of differential expression patterns under stress conditions

The first step in characterizing the 882 HyP and 330 CHyP genes that were transcribed was to categorize them according to observed expression patterns under one or more of the cultivation/stress conditions tested. Collectively, 44% of all of the expressed HyP and CHyP genes were found to be polycistronic with the remaining 56% being monocistronic (Table 1). Interestingly, despite the greater number of monocistronic genes (687) compared to polycistronic genes (548), the distribution amongst the expression categories was quite similar (Figure 4). Only 0.3% of the monocistronic and 0.2% of the polycistronic genes were highly expressed with no observed differential transcription. Highly expressed genes, differentially expressed or not, comprised 7.3% and 6.2% of these groups, respectively. Similarly, 10.9% of the monocistronic and 11.7% of the polycistronic genes that were not highly expressed responded to a single stress, while a considerably higher percentage (73.9% and 71.5%, respectively) responded to multiple stress conditions. This similarity in the category proportions between the two groups of genes continued in the ‘expressed’ (6.4% and 8.4%, respectively) and ‘not expressed’ (1.5% and 2.2%, respectively) categories. For each of the polycistronic genes not observed to be expressed (12 cases), the other genes in the operon were expressed under some condition. Therefore, these were not cases of an operon not being expressed, but rather of particular genes not showing expression. Reasons for these patterns are not clear. In fact, a preliminary assumption was that more of the monocistronic genes would not be expressed than genes within operons, since the latter would be more likely to be co-expressed with the rest of the operon. However, this was not the case since the percentage of genes in each category was similar as detailed above.
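The categorization described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the cutoff of |log2 R| ≥ 1.2 is borrowed from the Figure 5 legend, the ‘highly expressed’ flag is taken as a given input because its threshold is not restated in this section, and all gene names and values are hypothetical.

```python
from collections import Counter

def count_stress_responses(log2_ratios, threshold=1.2):
    """Count conditions whose |log2 R| meets the (assumed) cutoff."""
    return sum(1 for r in log2_ratios.values() if abs(r) >= threshold)

def categorize(expressed, highly_expressed, n_responses):
    """Assign one of the expression categories used in Figure 4 and Table 2."""
    if not expressed:
        return "not expressed"
    if n_responses == 0:
        response = "no stress response"
    elif n_responses == 1:
        response = "single-stress response"
    else:
        response = "multiple-stress response"
    if highly_expressed:
        return "highly expressed, " + response
    # Not highly expressed and no differential response: simply 'expressed'.
    return "expressed" if n_responses == 0 else response

# Hypothetical genes: (expressed?, highly expressed?, log2 R per condition)
genes = {
    "geneA": (True, False, {"NaCl": 1.5, "heat": 0.2, "cold": -0.1}),
    "geneB": (True, True, {"NaCl": 2.0, "heat": -1.8, "cold": 0.3}),
    "geneC": (True, False, {"NaCl": 0.1, "heat": 0.0, "cold": 0.4}),
    "geneD": (False, False, {}),
}
categories = {
    name: categorize(e, h, count_stress_responses(r))
    for name, (e, h, r) in genes.items()
}
for category, n in Counter(categories.values()).items():
    print(f"{category}: {100 * n / len(genes):.1f}%")
```

Applying the same tally separately to the monocistronic and polycistronic sets would give per-category percentages of the kind compared in this section.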

Figure 4.

Pie charts showing the stress response distribution of all (A) monocistronic HyP and CHyP genes (680) and (B) HyP and CHyP genes (557) predicted to be in polycistronic operons. In each case, the genes that were categorized as having a high basal expression are grey while the rest are black. Those not showing expression are not coloured. Genes displaying no stress response are solid colours while a single stress response is denoted by a striped pie slice and multiple stress responses are checkered. Those displaying a single-stress response accounted for 10% of the genes while >80% were differentially expressed in more than one stress.

Expression profiling and putative functional assignment to monocistronic genes

Monocistronic genes are arguably more difficult to functionally annotate, because the reference point of neighbouring genes is absent. However, the 81 monocistronic genes that responded to a single stress did give clues as to function; 65 encoded a HyP with the remaining 16 encoding a CHyP (Supplementary Table 5). In order to demonstrate the observed expression profiles, the stress responses for a randomly selected set of genes responding to a single stress condition are shown (Figure 5). It is interesting that even among these essentially unknown genes, several are found to be responsive to the same stress. Several exhibited differential expression when transcripts in stationary phase were compared to those in exponential cells or when transcripts from acid-treated cultures were compared to those from base-treated cultures. There were also cases where multiple HyP genes responded specifically to one stress, as in the case of chromate exposure. Both DVUA0095 (Figure 5B) and DVU1338 were up-regulated by chromate exposure while DVU2436 was down-regulated (Supplementary Table 5). No polycistronic genes were solely influenced by chromate. Based on these findings and the observation that D. vulgaris lacking pDV1 is less tolerant of chromate exposure (M. Fields, personal communication), DVUA0095, located on pDV1, was targeted for deletion to ascertain the cellular response to chromate in this mutant.

Figure 5.

Microarray expression profiling of several monocistronic (A) CHyP and (B) HyP genes from various stresses that displayed differential expression (log2 R ≥ 1.2) in a single stress. Stat vs Exp = stationary phase compared to exponential growth while all others are the listed stress condition compared to normal growth on lactate sulfate medium.

For the purposes of functional annotation, each of the HyP genes has been renamed to reflect the stress response influencing its expression along with any other in silico features as determined by the use of COGs, TMHMM and PSORTb (Supplementary Tables 5 and 6). For example, DVUA0095 was found to be up-regulated upon exposure to chromate. Results from in silico analyses predicted that it possesses three transmembrane helices but no signal peptide motifs, with a final score of 9.46 (out of 10.0) that it is associated with the cytoplasmic membrane. Given the lack of a signal peptide or assigned COG, we infer that the protein resides in the cytoplasmic membrane with the bulk of the protein facing the cytoplasm. Hence, this protein has been re-annotated to be a ‘chromate-induced, cytoplasmic membrane protein’. For others where no such structural features were predicted, the genes were simply renamed, e.g. ‘acid-induced protein’ or ‘heat-repressed protein’. Monocistronic genes that did not display a differential response to any of the stresses have simply been renamed as, e.g. ‘expressed protein in D. vulgaris’ or ‘expressed cytoplasmic membrane protein in D. vulgaris’ (e.g. DVU1006; Supplementary Table 5). The remaining monocistronic HyP and CHyP genes, representing 80% of those expressed, were differentially regulated in multiple stress conditions. Genes that responded by increasing or decreasing in three or more conditions were predicted to encode ‘general stress response proteins’. In a similar manner, a gene that responded to only two stresses was renamed by the observed responses such as a ‘NaCl induced, cold repressed protein’ (DVU3354) or a ‘cold and co-culture induced protein’ (DVU3130) (Supplementary Table 5). The eight genes that were not expressed under any of the conditions tested have been left with their original annotation, as opposed to being designated as a ‘non-coding region’, since not all conceivable cultivation or stress conditions have been tested to date.
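The renaming rules described in this section can be summarized in a short sketch. The function name, its arguments and the exact string formatting are assumptions made for illustration; the example names are those quoted in the text, and the handling of a predicted localization for genes responding to three or more stresses is not specified here and is therefore left out.

```python
def rename(original, expressed, responses, localization=None):
    """Build a stress-based annotation.

    responses maps a stress condition to 'induced' or 'repressed'.
    localization is an optional predicted location, e.g. 'cytoplasmic membrane'.
    """
    if not expressed:
        return original  # keep the original annotation
    loc = f"{localization} " if localization else ""
    if not responses:
        return f"expressed {loc}protein in D. vulgaris"
    if len(responses) >= 3:
        return "general stress response protein"
    if len(responses) == 1:
        (condition, direction), = responses.items()
        separator = ", " if localization else " "
        return f"{condition}-{direction}{separator}{loc}protein"
    # Exactly two stresses: name the gene by the observed responses.
    induced = [c for c, d in responses.items() if d == "induced"]
    repressed = [c for c, d in responses.items() if d == "repressed"]
    parts = []
    if induced:
        parts.append(" and ".join(induced) + " induced")
    if repressed:
        parts.append(" and ".join(repressed) + " repressed")
    return ", ".join(parts) + " protein"

print(rename("hypothetical protein", True, {"chromate": "induced"}, "cytoplasmic membrane"))
print(rename("hypothetical protein", True, {"NaCl": "induced", "cold": "repressed"}))
print(rename("hypothetical protein", False, {}))
```

Treating the rule as a function of the number and direction of responses keeps the annotation reproducible when further stress datasets are added.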

Expression profiling and putative functional assignment of polycistronic genes

The polycistronic HyP and CHyP genes often have a reference point for their plausible biochemical function based on their location within operons that include ORFs with more characterized, orthologous genes in other bacteria. One of the main criteria used to assign function to the HyP or CHyP genes was the similarity of the differential expression pattern of the gene to that of other genes within the operon. An additional criterion was the degree of nucleotide overlap with other genes within the operon that could suggest transcriptional coupling. Good examples of such scenarios were found in areas of the genome apparently containing temperate bacteriophages or their remnants, such as DVU2488–DVU2729 containing several predicted operons. Some temperate phages may be induced by catastrophic stress conditions or as cells enter the stationary phase of growth. In fact, stresses from high heat and the stationary phase of growth resulted in the differential expression of the greatest number of polycistronic genes that are likely to encode phage functions. One example is the apparent increased expression of a seven-gene operon likely to be involved in temperate phage activity during stationary phase versus exponential growth (Figure 6A). It is prudent to note that other changes in culture conditions usually coincide with these events, such as sulfide and acetate accumulation with a concomitant change in pH and a decrease in the specific growth rate.
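One way to quantify the similarity between the expression pattern of a HyP gene and those of its operon neighbours is a Pearson correlation across the stress conditions. The sketch below is an assumption about how such a comparison could be scored, not a description of the criterion applied in this study; the function names and log2 ratios are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / math.sqrt(var_a * var_b)

def mean_similarity_to_operon(target_profile, neighbour_profiles):
    """Average correlation of one gene's profile with its operon neighbours."""
    scores = [pearson(target_profile, p) for p in neighbour_profiles]
    return sum(scores) / len(scores)

# Hypothetical log2 ratios across five conditions, for illustration only.
hyp_gene = [0.1, 1.8, 2.2, -0.3, 0.4]
annotated_neighbours = [
    [0.0, 1.5, 2.0, -0.2, 0.5],
    [0.2, 2.1, 2.5, -0.4, 0.3],
]
print(f"mean r = {mean_similarity_to_operon(hyp_gene, annotated_neighbours):.2f}")
```

A high mean correlation would be consistent with co-transcription, whereas a low value would be one reason to question a predicted operon assignment.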

Figure 6.

Microarray expression profiling of hypothetical genes within operons allows for a putative functional assignment by using the profile and gene association. The condition shown is stationary phase vs exponential growth. (A) Up-regulation of a seven-gene operon containing DVU2710 (filled diamond; prophage protein), DVU2711 (filled square; major head subunit), DVU2712 (filled triangle; hypothetical protein), DVU2713 (open square; prophage protein), DVU2714 (open circle; prophage protein), DVU2715 (open diamond; conserved hypothetical) and DVU2716 (open triangle; tail sheath protein). (B) Up-regulation of a four-gene operon of DVU0192 (filled diamond; adenine-specific DNA methyltransferase) and DVU0194 (filled triangle; terminase) that are well-conserved while DVU0193 (filled square) and DVU0195 (filled circle) were annotated as hypothetical proteins.

Another example of HyP and CHyP genes within temperate phage operons was the four-gene operon of DVU0192 (adenine-specific DNA methyltransferase), DVU0194 (terminase) along with the HyP genes DVU0193 and DVU0195. In this case, all four genes responded coordinately to the onset of stationary phase compared to exponential growth (Figure 6B), suggesting that the two HyP genes could be involved in phage DNA metabolism.

Other cases were not as straightforward, such as with the predicted operon of DVU1639–DVU1642 (Supplementary Table 6) that contains two HyP and two CHyP genes. DVU1639 showed increased expression only in stationary phase whereas DVU1640 was not expressed. DVU1641 and DVU1642 were both up-regulated in multiple stresses and also showed Fur-regulation. These results call into question the validity of the operon assignment. The three transcribed genes were renamed based upon the microarray stress response data. Hence, DVU1639 was reannotated as a ‘stationary phase induced protein’, whereas DVU1641 and DVU1642 were reannotated as a ‘Fur influenced, multiple stress induced protein’ and a ‘multiple stress induced outer membrane protein’, respectively. The original annotation for DVU1640 remained.

Fur-influenced regulation of HyP and CHyP genes

D. vulgaris possesses three Fur regulator paralogs: DVU0942 (Fur), DVU3095 (PerR) and DVU1340 (Zur) (56). While little information is available on the latter in D. vulgaris, the global regulation roles of Fur and PerR have been explored in more depth. Gene deletions within D. vulgaris are available for the putative global regulators Fur (10) and PerR (unpublished results), and transcriptional analyses in a few stress conditions have been performed. In a number of bacteria, the Fur system regulates the uptake of ferric iron (57,58). Although Fur has also been shown to control the synthesis of specialized Fe(III) chelators known as siderophores (57,59,60), there is no evidence for siderophore production in D. vulgaris. In contrast, Fur is suggested to play a regulatory role in oxidative stress, motility, virulence and acid tolerance (57,58). PerR, the second of the potential global regulators, has been best studied in Bacillus subtilis (61,62). Experimental evidence supported a role for the PerR system in the cellular response of D. vulgaris to oxidative stresses such as peroxide or metal ion limitation through increased expression of rubrerythrin and rubredoxin genes (10). Bender and co-workers (10) reported that the HyP gene DVU2681 had the greatest increase in transcription of all genes in the Fur deletion mutant, while 6 of the 21 genes showing strong increases in transcription in the absence of Fur were HyP or CHyP genes.

In the current work, a number of HyP and CHyP genes appeared to be either differentially transcribed in Fur or PerR deletion mutants compared to the wild-type strain, or displayed stronger transcriptional responses in the deletion mutant. Expression changes occurring in the deletion mutants were inferred to result from the altered genetic background, particularly when the PerR deletion mutant was exposed to a 0.1% O2 stress or the Fur deletion mutant was exposed to high salt or high nitrate. A small proportion of genes apparently were influenced by both Fur and PerR (Table 2). Hence, in these cases, the prefix of ‘Fur-influenced’, ‘Per-influenced’ or ‘Fur- and Per-influenced’ was added to the annotation where appropriate (Supplementary Tables 5 and 6). While such numbers of proteins are probably not directly linked to either of these global regulatory systems, it is conceivable that indirect, cascading regulators could affect this number of genes.

Table 2.

Apparent HyP and CHyP influence by the global regulators Fur and Per

Expression category                              Fur    Per    Fur and Per
Polycistronic (535 total transcripts detected)
  Highly expressed, single-stress response         0      0      0
  Highly expressed, multiple-stress response       8      3      1
  Single-stress response                           0      2      0
  Multiple-stress response                       108     23      9
Monocistronic (677 total transcripts detected)
  Highly expressed, single-stress response         0      0      0
  Highly expressed, multiple-stress response       7      7      2
  Single-stress response                           3      3      2
  Multiple-stress response                       152     33     23
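As a reading aid for Table 2, the column totals and their share of the detected transcripts in each group can be computed as below. The counts are transcribed from the table; whether the three columns are mutually exclusive is not stated, so the percentages are only a rough indication, and the variable and function names are hypothetical.

```python
# Counts transcribed from Table 2, one tuple per expression category:
# (Fur, Per, Fur and Per)
TABLE_2 = {
    "Polycistronic": {
        "total": 535,
        "rows": [(0, 0, 0), (8, 3, 1), (0, 2, 0), (108, 23, 9)],
    },
    "Monocistronic": {
        "total": 677,
        "rows": [(0, 0, 0), (7, 7, 2), (3, 3, 2), (152, 33, 23)],
    },
}

def column_totals(rows):
    """Sum each regulator column over all expression categories."""
    return tuple(sum(column) for column in zip(*rows))

def summarize(table):
    """Return {group: {regulator: (count, percent of detected transcripts)}}."""
    summary = {}
    for group, data in table.items():
        totals = column_totals(data["rows"])
        summary[group] = {
            label: (count, round(100 * count / data["total"], 1))
            for label, count in zip(("Fur", "Per", "Fur and Per"), totals)
        }
    return summary

for group, columns in summarize(TABLE_2).items():
    for label, (count, percent) in columns.items():
        print(f"{group:14s} {label:12s} {count:4d} ({percent}%)")
```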

Validation of putative assignments with targeted deletion mutagenesis

In order to test the relevance of the stress responses recorded via microarrays and proteomics, as well as the inferred functional annotations applied to the HyP and CHyP genes in D. vulgaris during this study, two targeted deletions were constructed. A two-gene operon, DVU0303–DVU0304, predicted to be part of the Fur regulon (56) showed altered transcription rates in all conditions tested (Supplementary Table 6). The second targeted deletion was the monocistronic gene DVUA0095 that increased in expression only upon exposure to chromate (Figure 5B; Supplementary Table 5). Both deletion mutants and the wild type grew similarly in the control unamended medium containing 30 mM sodium lactate and 60 mM sodium sulfate at pH 7.2, pH 5.5 and when amended with 250 mM NaCl (Figure 7A–C). These results were as predicted for ΔDVUA0095 since it responded only to chromium exposure. In contrast, the growth results for Δ(DVU0303-DVU0304) were unexpected since both genes originally increased in transcription at pH 5.5 and in 250 mM NaCl. Nevertheless, such a lack of correlation between gene expression and cellular fitness in an imposed treatment is not uncommon (46).

Figure 7.

Growth curves of the two targeted deletion mutants [a deletion of genes DVU0303-0304, which were differentially expressed in 13 and 9 stresses, respectively (open circle), and a deletion of DVUA0095, which was only up-regulated in the presence of chromate (open triangle; log2 R = 4.2; see also Figure 2A)] and wild-type D. vulgaris (filled diamond) under various stress conditions. (A) Baseline lactate/sulfate growth, (B) pH 5.5, (C) 250 mM NaCl, (D) 100 mM NaNO3, (E) 1 mM NaNO2, (F) 2 mM NaNO2, (G) 0.2 mM K2CrO4, (H) 0.4 mM K2CrO4 and (I) 0.45 mM K2CrO4. Each stress experiment was conducted twice.

Other stress conditions included amendments of 100 mM NaNO3, or NaNO2 at 1 mM or 2 mM. Expression of DVU0303 in wild-type cultures increased with NaNO3 or NaNO2 (35,36). A slightly increased sensitivity of Δ(DVU0303-DVU0304) to these stressors was observed (Figure 7D–F). Changes in mutant growth rate and extent may suggest either a functional role in tolerance to the treatment or a general metabolic perturbation that impedes the production of cell material. Surprisingly, the ΔDVUA0095 mutant also grew more poorly relative to the wild type when exposed to the various nitrogen species (Figure 7E and F). Reasons for this phenotype are not clear, but could include oxidative stress and subsequent protein or DNA damage due to the unstable nature of nitrite.

The final stress conditions used to test the deletion mutants were three concentrations of K2CrO4 at 0.2 mM, 0.4 mM and 0.45 mM. When wild-type cultures were challenged with 0.45 mM K2CrO4, the transcriptional analysis showed large increases in expression of DVUA0095 as well as of DVU0303-DVU0304. At the lower concentration of 0.2 mM K2CrO4, there were no apparent effects on the growth of any of the strains (Figure 7G). At 0.4 mM K2CrO4, ΔDVUA0095 exhibited a lag phase of 75 h, compared to 25–30 h in Δ(DVU0303–DVU0304) and the wild type (Figure 7H), suggesting that the removal of this gene interfered with the response to this level of K2CrO4. At 0.45 mM K2CrO4, both ΔDVUA0095 and Δ(DVU0303-DVU0304) showed extended lag phases of 190 and 175 h, respectively, approximately twice that of the wild type (Figure 7I).

CONCLUSIONS

The main purpose of this study was to validate the expression of D. vulgaris genes annotated as HyP and CHyP and to infer additional functions when possible. Overall, 98% of the HyP and CHyP genes were found to be transcribed via microarrays with 63% also being translated. Among these, many displayed specific transcriptional responses to single stresses whereas others showed responses to multiple treatments. Some of these genes were also shown to be influenced by, or in regulons of, the global regulatory systems of Fur and/or PerR. The fact that these genes actually produce proteins increases the possibility that they may play some role in the responses to environmental perturbations.

Assessment of the ΔDVUA0095 and Δ(DVU0303–DVU0304) mutants confirmed a likely role for DVUA0095 in cellular responses to chromate as inferred from the microarray results. However, further complexity exists as revealed by some surprising phenotypes of the mutants as described earlier. Ongoing and future work is anticipated to include a detailed analysis of many HyP and CHyP genes, particularly those that displayed responses to only a single stress. This will include gene tagging with eventual protein complex elucidation, assessment of gene deletions and additional stress treatments. Through these efforts, a better understanding of the physiological and metabolic aspects of bacteria with sequenced genomes such as D. vulgaris will be achieved. This type of methodical analysis of unknown genes may well be useful as a means to derive more meaningful functional annotations in other organisms where a few or many RNA microarray and/or proteomic datasets exist. Reanalysis of these data can range from simply comparing HyP and CHyP transcript or proteome abundances in different culturing conditions to the more discrete stress conditions such as were used to generate the data for the present work. In addition, decreasing the lists of HyP and CHyP genes will improve annotations through inter-organismal comparisons.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

Environmental Stress Pathways Project and the Virtual Institute for Microbial Stress and Survival (http://vimss.lbl.gov), supported by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Genomics Program [GTL DE-AC02-05CH11231] between Lawrence Berkeley National Laboratory and the U.S. Department of Energy. Funding for open access charge: Environmental Stress Pathways Project.

Conflict of interest statement. None declared.

ACKNOWLEDGEMENTS

We thank Amy Kucken (UMC) for performing the RT–PCR work and Angela Norbeck (PNNL) for assistance with revision of the proteomics database. The AMT tag proteomics portion of this research was performed using the Environmental Molecular Sciences Laboratory (EMSL), a national scientific user facility sponsored by the Department of Energy's Office of Biological and Environmental Research located at Pacific Northwest National Laboratory.

REFERENCES

1. Elias DA, Yang F, Mottaz HM, Beliaev AS, Lipton MS. Enrichment of functional redox reactive proteins and identification by mass spectrometry results in several terminal Fe(III)-reducing candidate proteins in Shewanella oneidensis MR-1. J. Microbiol. Methods. 2007;68:367–375. doi: 10.1016/j.mimet.2006.09.023.
2. Clark ME, He Q, He Z, Huang KH, Alm EJ, Wan XF, Hazen TC, Arkin AP, Wall JD, Zhou JZ, et al. Temporal transcriptomic analysis as Desulfovibrio vulgaris Hildenborough transitions into stationary phase during electron donor depletion. Appl. Environ. Microbiol. 2006;72:5578–5588. doi: 10.1128/AEM.00284-06.
3. Zhang W, Gritsenko MA, Moore RJ, Culley DE, Nie L, Petritis K, Strittmatter EF, Smith RD, Brockman FJ. A proteomic view of Desulfovibrio vulgaris metabolism as determined by liquid chromatography coupled with tandem mass spectrometry. Proteomics. 2006;6:4286–4299. doi: 10.1002/pmic.200500930.
4. Beliaev AS, Thompson DK, Fields MW, Wu L, Lies DP, Nealson KH, Zhou J. Microarray transcription profiling of a Shewanella oneidensis etrA mutant. J. Bacteriol. 2002;184:4612–4616. doi: 10.1128/JB.184.16.4612-4616.2002.
5. Beliaev AS, Thompson DK, Khare T, Lim H, Brandt CC, Li G, Murray AE, Heidelberg JF, Giometti CS, Yates J, III, et al. Gene and protein expression profiles of Shewanella oneidensis during anaerobic growth with different electron acceptors. Omics. 2002;6:39–60. doi: 10.1089/15362310252780834.
6. Drepper T, Eggert T, Circolone F, Heck A, Kraub U, Guterl J, Wendorff M, Losi A, Gartner W, Jaeger K. Reporter proteins for in vivo fluorescence without oxygen. Nat. Biotechnol. 2007;25:443–445. doi: 10.1038/nbt1293.
7. Regoes A, Hehl AB. SNAP-tag™ mediated live cell labeling as an alternative to GFP in anaerobic organisms. Biotechnology. 2005;39:809–812. doi: 10.2144/000112054.
8. Butland G, Peregrín-Alvarez JM, Li J, Yang W, Yang X, Canadien V, Starostine A, Richards D, Beattie B, Krogan N, et al. Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature. 2005;433:531–537. doi: 10.1038/nature03239.
9. Wall JD, Arkin P, Balci NC, Rapp-Giles BJ. In: Microbial Sulfur Metabolism. Dahl C, Friedrich CG, editors. Berlin: Springer-Verlag; 2007. pp. 1–11.
10. Bender KS, Yen HB, Hemme CL, Yang Z, He Z, He Q, Zhou J, Huang KH, Alm EJ, Hazen TC, et al. Analysis of a ferric uptake regulator (Fur) mutant of Desulfovibrio vulgaris Hildenborough. Appl. Environ. Microbiol. 2007;73:5389–5400. doi: 10.1128/AEM.00276-07.
11. Beliaev AS, Saffarini DA. Shewanella putrefaciens mtrB encodes an outer membrane protein required for Fe(III) and Mn(IV) reduction. J. Bacteriol. 1998;180:6292–6297. doi: 10.1128/jb.180.23.6292-6297.1998.
12. Beliaev AS, Saffarini DA, McLaughlin JL, Hunnicutt D. MtrC, an outer membrane decahaem c cytochrome required for metal reduction in Shewanella putrefaciens MR-1. Mol. Microbiol. 2001;39:722–730. doi: 10.1046/j.1365-2958.2001.02257.x.
13. Kolker E, Makarova KS, Shabalina S, Picone AF, Purvine S, Holzman T, Cherny T, Armbruster D, Munson RS, Kolesov G, et al. Identification and functional analysis of hypothetical genes expressed in Haemophilus influenzae. Nucleic Acids Res. 2004;32:2353–2361. doi: 10.1093/nar/gkh555.
14. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL. Improved microbial gene identification with GLIMMER. Nucleic Acids Res. 1999;27:4636–4641. doi: 10.1093/nar/27.23.4636.
15. Salzberg SL, Delcher AL, Kasif S, White O. Microbial gene identification using interpolated Markov models. Nucleic Acids Res. 1998;26:544–548. doi: 10.1093/nar/26.2.544.
16. Doerks T, vonMering C, Bork P. Functional clues for hypothetical proteins based on genomic context analysis in prokaryotes. Nucleic Acids Res. 2004;32:6321–6326. doi: 10.1093/nar/gkh973.
17. Huynen M, Snel BWL, III, Bork P. Predicting protein function by genomic context: quantitative evaluation and qualitative inferences. Gen. Res. 2000;10:1204–1210. doi: 10.1101/gr.10.8.1204.
18. Lu P, Szafron D, Greiner R, Wishart DS, Fyshe A, Pearcy B, Poulin B, Eisner R, Ngo D, Lamb N. PA-GOSUB: a searchable database of model organism protein sequences with their predicted gene ontology molecular function and subcellular localization. Nucleic Acids Res. 2005;33:D147–D153. doi: 10.1093/nar/gki120.
19. vonMering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P. STRING: known and predicted protein–protein associations, integrated and transferred across organisms. Nucleic Acids Res. 2005;33:D433–D437. doi: 10.1093/nar/gki005.
20. Enault F, Suhre K, Poirot O, Abergel C, Claverie J. Phydbac (phylogenomic display of bacterial genes): an interactive resource for the annotation of bacterial genomes. Nucleic Acids Res. 2003;31:3720–3722. doi: 10.1093/nar/gkg603.
21. Sjölander K. Phylogenomic inference of protein molecular function: advances and challenges. Bioinformatics. 2004;20:170–179. doi: 10.1093/bioinformatics/bth021.
22. Elias DA, Monroe ME, Marshall MJ, Romine MF, Beliaev AS, Fredrickson JK, Anderson GA, Smith RD, Lipton MS. Global detection and characterization of hypothetical proteins in Shewanella oneidensis MR-1 using LC-MS based proteomics. Proteomics. 2005;5:3120–3130. doi: 10.1002/pmic.200401140.
23. Elias DA, Monroe ME, Smith RD, Fredrickson JK, Lipton MS. Confirmation of the expression of a large set of conserved hypothetical proteins in Shewanella oneidensis MR-1. J. Microbiol. Methods. 2006;66:223–233. doi: 10.1016/j.mimet.2005.11.009.
24. Kolker E, Picone AF, Galperin MY, Romine MF, Higdon R, Makarova KS, Kolker N, Anderson GA, Qiu X, Auberry KJ, et al. Global profiling of Shewanella oneidensis MR-1: expression of hypothetical genes and improved functional annotations. Proc. Natl Acad. Sci. USA. 2005;102:2099–2104. doi: 10.1073/pnas.0409111102.
25. Heidelberg JF, Seshadri R, Haveman SA, Hemme CL, Paulsen IT, Kolonay JF, Eisen JA, Ward N, Methe B, Brinkac LM, et al. The genome sequence of the anaerobic, sulfate-reducing bacterium Desulfovibrio vulgaris Hildenborough. Nat. Biotechnol. 2004;22:554–559. doi: 10.1038/nbt959.
26. Chhabra SR, He Q, Huang KH, Gaucher SP, Alm EJ, He Z, Hadi MZ, Hazen TC, Wall JD, Zhou J, et al. Global analysis of heat shock response in Desulfovibrio vulgaris Hildenborough. J. Bacteriol. 2006;188:1817–1828. doi: 10.1128/JB.188.5.1817-1828.2006.
27. Fournier M, Aubert C, Dermoun Z, Durand M, Moinier D, Dolla A. Response of the anaerobe Desulfovibrio vulgaris Hildenborough to oxidative conditions: proteome and transcript analysis. Biochimica. 2006;88:85–94. doi: 10.1016/j.biochi.2005.06.012.
28. Mukhopadhyay A, He Z, Alm EJ, Arkin AP, Baidoo EE, Borglin SC, Chen W, Hazen TC, He Q, Holman H, et al. Salt stress in Desulfovibrio vulgaris Hildenborough: an integrated genomics approach. J. Bacteriol. 2006;188:4068–4078. doi: 10.1128/JB.01921-05.
29. Zhang W, Culley DE, Scholten JCM, Hogan M, Vitiritti L, Brockman FJ. Global transcriptomic analysis of Desulfovibrio vulgaris on different electron donors. Anton. van Leeuwen. 2006;89:221–237. doi: 10.1007/s10482-005-9024-z.
30. Postgate JR. The Sulphate-Reducing Bacteria. 2nd. Cambridge, MA, USA: Cambridge University Press; 1984.
31. Elias DA, Krumholz LR, Wong D, Long PE, Suflita JM. Characterization of microbial activities and U reduction in a shallow aquifer contaminated by uranium mill tailings. Microb. Ecol. 2003;46:83–91. doi: 10.1007/s00248-002-1060-x.
32. Elias DA, Suflita JM, McInerney MJ, Krumholz LR. Periplasmic cytochrome c3 of Desulfovibrio vulgaris is directly involved in H2-mediated metal but not sulfate reduction. Appl. Environ. Microbiol. 2004;70:413–420. doi: 10.1128/AEM.70.1.413-420.2004.
33. Lovley DR, Phillips EJP. Reduction of chromate by Desulfovibrio vulgaris and its c3 cytochrome. Appl. Environ. Microbiol. 1994;60:726–728. doi: 10.1128/aem.60.2.726-728.1994.
34. Zhang W, Culley DE, Gritsenko MA, Moore RJ, Nie L, Scholten JCM, Petritis K, Strittmatter EF, Smith RD, et al. LC–MS/MS based proteomic analysis and functional inference of hypothetical proteins in Desulfovibrio vulgaris. Biochem. Biophys. Res. Commun. 2006;349:1412–1419. doi: 10.1016/j.bbrc.2006.09.019.
35. Redding AM, Mukhopadhyay A, Joyner DC, Hazen TC, Keasling JD. Study of nitrate stress in Desulfovibrio vulgaris Hildenborough using iTRAQ proteomics. Brief. Funct. Gen. Prot. 2006:1–11. doi: 10.1093/bfgp/ell025.
36. He Q, Huang KH, He Z, Alm EJ, Fields MW, Hazen TC, Arkin AP, Wall JD, Zhou J. Energetic consequences of nitrite stress in Desulfovibrio vulgaris Hildenborough inferred from global transcriptional analysis. Appl. Environ. Microbiol. 2006;72:4370–4381. doi: 10.1128/AEM.02609-05.
37. Stolyar S, He Q, Joachimiak MP, He Z, Yang ZK, Borglin SE, Joyner DC, Huang K, Alm E, Hazen TC, et al. Response of Desulfovibrio vulgaris to alkaline stress. J. Bacteriol. 2007;189:8944–8952. doi: 10.1128/JB.00284-07.
  • 38.Mukhopadhyay A, Redding AM, Joachimiak MP, Arkin AP, Borglin SE, Dehal PS, Chakraborty R, Geller JT, Hazen TC, He Q, et al. Cell-wide responses to low-oxygen exposure in Desulfovibrio vulgaris Hildenborough. J. Bacteriol. 2007;189:5996–6010. doi: 10.1128/JB.00368-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.He Z, Wu L, Fields MW, Zhou J. Use of microarrays with different probe sizes for monitoring gene expression. Appl. Environ. Microbiol. 2005;71:5154–5162. doi: 10.1128/AEM.71.9.5154-5162.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Alm EJ, Huang KH, Price MN, Koche RP, Keller K, Dubchak IL, Arkin AP. The MicrobesOnline web site for comparative genomics. Gen. Res. 2005;15:1015–1022. doi: 10.1101/gr.3844805. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Lipton MS, Pasa-Tolic L, Anderson GA, Anderson DJ, Auberry DL, Battista JR, Daly MJ, Fredrickson J, Hixson KK, Kostandarithes H, et al. Global analysis of the Deinococcus radiodurans proteome by using accurate mass tags. Proc Natl Acad Sci USA. 2002;99:11049–11054. doi: 10.1073/pnas.172170199. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Eng JK, McCormack AL, Yates JR III. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 1994;5:976–989. doi: 10.1016/1044-0305(94)80016-2.
  • 43.Harkewicz R, Belov ME, Anderson GA, Pasa-Tolic L, Masselon CD, Prior DC, Udseth HR, Smith RD. ESI-FTICR mass spectrometry employing data-dependent external ion selection and accumulation. J. Am. Soc. Mass Spectrom. 2002;13:144–154. doi: 10.1016/S1044-0305(01)00343-9.
  • 44.Keller A, Nesvizhskii AI, Kolker E, Aebersold R. Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 2002;74:5383–5392. doi: 10.1021/ac025747h.
  • 45.Strittmatter EF, Kangas LJ, Petritis K, Mottaz HM, Anderson GA, Shen Y, Jacobs JM, Camp DG II, Smith RD. Application of peptide LC retention time information in a discriminant function for peptide identification by tandem mass spectrometry. J. Prot. Res. 2004;3:760–769. doi: 10.1021/pr049965y.
  • 46.Giaever G, Chu AM, Ni L, Connelly C, Riles L, Veronneau S, Dow S, Lucau-Danila A, Anderson K, Andre B, et al. Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002;418:387–391. doi: 10.1038/nature00935.
  • 47.Shoemaker D, Lashkari DA, Morris D, Mittmann M, Davis RW. Quantitative phenotypic analysis of yeast deletion mutants using a highly parallel molecular bar-coding strategy. Nat. Gen. 1996;14:450–456. doi: 10.1038/ng1296-450.
  • 48.Price MN, Dehal PS, Arkin AP. FastBLAST: homology relationships for millions of proteins. PLoS One. 2008;3:e3589. doi: 10.1371/journal.pone.0003589.
  • 49.Pruitt KD, Maglott DR. RefSeq and Locus Link: NCBI gene-centered resources. Nucleic Acids Res. 2001;29:137–140. doi: 10.1093/nar/29.1.137.
  • 50.Price MN, Huang KH, Alm EJ, Arkin AP. A novel method for accurate operon predictions in all sequenced prokaryotes. Nucleic Acids Res. 2005;33:880–892. doi: 10.1093/nar/gki232.
  • 51.Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4:41–55. doi: 10.1186/1471-2105-4-41.
  • 52.Marchler-Bauer A, Panchenko AR, Shoemaker BA, Thiessen PA, Geer LY, Bryant SH. CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res. 2002;30:281–283. doi: 10.1093/nar/30.1.281.
  • 53.Gardy JL, Laird MR, Chen F, Rey S, Walsh CJ, Ester M, Brinkman FSL. PSORTb v.2.0: expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis. Bioinformatics. 2005;21:617–623. doi: 10.1093/bioinformatics/bti057.
  • 54.Gardy JL, Spencer C, Wang K, Ester M, Tusnady GE, Simon I, Hua S, deFays K, Lambert C, Nakai K, et al. PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria. Nucleic Acids Res. 2003;31:3613–3617. doi: 10.1093/nar/gkg602.
  • 55.Krogh A, Larsson B, von Heijne G, Sonnhammer ELL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 2001;305:567–580. doi: 10.1006/jmbi.2000.4315.
  • 56.Rodionov DA, Dubchak I, Arkin AP, Alm E, Gelfand MS. Reconstruction of regulatory and metabolic pathways in metal-reducing δ-proteobacteria. Gen. Biol. 2004;5:R90.1–R90.27. doi: 10.1186/gb-2004-5-11-r90.
  • 57.Escolar LA, Perez-Martin J, Lorenzo VD. Opening the iron box: transcriptional metalloregulation by the Fur protein. J. Bacteriol. 1999;181:6223–6229. doi: 10.1128/jb.181.20.6223-6229.1999.
  • 58.Hantke K. Iron and metal regulation in bacteria. Curr. Opin. Microbiol. 2001;4:172–177. doi: 10.1016/s1369-5274(00)00184-3.
  • 59.Rowland BM, Taber HW. Duplicate isochorismate synthase genes of Bacillus subtilis: regulation and involvement in the biosyntheses of menaquinone and 2,3-dihydroxybenzoate. J. Bacteriol. 1996;178:854–861. doi: 10.1128/jb.178.3.854-861.1996.
  • 60.Schneider R, Hantke K. Iron-hydroxamate uptake systems in Bacillus subtilis: identification of a lipoprotein as part of a binding protein-dependent transport system. Mol. Microbiol. 1993;8:111–121. doi: 10.1111/j.1365-2958.1993.tb01208.x.
  • 61.Bsat N, Herbig A, Casillas-Martínez L, Setlow P, Helmann JD. Bacillus subtilis contains multiple Fur homologues; identification of the iron uptake (Fur) and peroxide regulon (PerR) repressors. Mol. Microbiol. 1998;29:189–198. doi: 10.1046/j.1365-2958.1998.00921.x.
  • 62.Lee JW, Helmann JD. The PerR transcription factor senses H2O2 by metal-catalysed histidine oxidation. Nature. 2006;440:363–367. doi: 10.1038/nature04537.
  • 63.Heidelberg JF, Paulsen IT, Nelson KE, Gaidos EJ, Nelson WC, Read TD, Eisen JA, Seshadri R, Ward N, Methe B, et al. Genome sequence of the dissimilatory metal ion-reducing bacterium Shewanella oneidensis. Nat. Biotechnol. 2002;20:1118–1123. doi: 10.1038/nbt749.
  • 64.Luo Q, Hixson KK, Callister SJ, Lipton MS, Morris BEL, Krumholz LR. Proteome analysis of Desulfovibrio desulfuricans G20 mutants using the accurate mass and time (AMT) tag approach. J. Prot. Res. 2007;6:3042–3053. doi: 10.1021/pr070127o.
  • 65.Romine MF, Elias DA, Monroe ME, Auberry K, Fang R, Fredrickson JK, Anderson GA, Smith RD, Lipton MS. Validation of Shewanella oneidensis MR-1 small proteins by AMT tag-based proteome analysis. OMICS. 2004;8:239–254. doi: 10.1089/omi.2004.8.239.

Supplementary Materials

[Supplementary Data]
gkp164_index.html (757B, html)

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
