Proceedings of the National Academy of Sciences of the United States of America. 2005 Jul 25;102(31):10913–10918. doi: 10.1073/pnas.0504766102

The psychrophilic lifestyle as revealed by the genome sequence of Colwellia psychrerythraea 34H through genomic and proteomic analyses

Barbara A Methé *, Karen E Nelson *, Jody W Deming ‡,§, Bahram Momen, Eugene Melamud, Xijun Zhang, John Moult, Ramana Madupu *, William C Nelson *, Robert J Dodson *, Lauren M Brinkac *, Sean C Daugherty *, Anthony S Durkin *, Robert T DeBoy *, James F Kolonay *, Steven A Sullivan *, Liwei Zhou *, Tanja M Davidsen *, Martin Wu *, Adrienne L Huston **, Matthew Lewis *, Bruce Weaver *, Janice F Weidman *, Hoda Khouri *, Terry R Utterback *, Tamara V Feldblyum *, Claire M Fraser *
PMCID: PMC1180510  PMID: 16043709

Abstract

The completion of the 5,373,180-bp genome sequence of the marine psychrophilic bacterium Colwellia psychrerythraea 34H, a model for the study of life in permanently cold environments, reveals capabilities important to carbon and nutrient cycling, bioremediation, production of secondary metabolites, and cold-adapted enzymes. From a genomic perspective, cold adaptation is suggested in several broad categories involving changes to the cell membrane fluidity, uptake and synthesis of compounds conferring cryotolerance, and strategies to overcome temperature-dependent barriers to carbon uptake. Modeling of three-dimensional protein homology from bacteria representing a range of optimal growth temperatures suggests changes to proteome composition that may enhance enzyme effectiveness at low temperatures. Comparative genome analyses suggest that the psychrophilic lifestyle is most likely conferred not by a unique set of genes but by a collection of synergistic changes in overall genome content and amino acid composition.

Keywords: proteome, psychrophily, bioremediation, astrobiology, three-dimensional homology modeling


By volume, most of Earth's biosphere is cold and marine, with 90% of the ocean's waters at 5°C or colder. Fully 20% of Earth's surface environment is frozen, including permanently frozen soil (permafrost), terrestrial ice sheets (glacial ice), polar sea ice, and snow cover (1). Although a diversity of microorganisms can be recovered from these environments, only cold-adapted organisms can be active in them (2). Among the cold-adapted bacteria, the genus Colwellia (1) within the γ-proteobacteria, provides an unusual case: All characterized members are strictly psychrophilic (requiring temperatures of ≤20°C to grow on solid media) having been obtained from stably cold marine environments, including deep sea and Arctic and Antarctic sea ice (3). Many members of this genus produce extracellular polymeric substances relevant to biofilm formation and cryoprotection (4, 5) and enzymes capable of degrading high-molecular-weight organic compounds. These traits are likely to make Colwellia species important to carbon and nutrient cycling wherever they occur in the cold marine environment, from contaminated sediments to ice formations under study as analogs for possible habitats on a younger Earth (6) and other planets and moons (e.g., Mars and Europa) (1, 7).

Colwellia psychrerythraea 34H isolated from Arctic marine sediments (8) represents the type species of the genus Colwellia (3). It grows reliably in heterotrophic media over a temperature range of approximately -1°C to 10°C, with cardinal growth temperatures (optimum of 8°C, maximum of 19°C, and extrapolated minimum of -14.5°C) (5) ranking among the lowest for all characterized bacteria (2). Maximum cell yield is achieved at subzero temperature (-1°C), cells continue to swim in sugar solutions down to -10°C (9), and growth can occur under deep-sea pressures. C. psychrerythraea also produces extracellular polysaccharides and, in particular, cold-active enzymes with low temperature optima for activity and marked heat instability (10).

The features that define cold adaptation and our comprehension of them continue to evolve. Several current biochemical models of enzyme catalysis are predicated on an increased flexibility in certain regions of cold-active enzyme architecture and high activity coupled with a concomitant increase in thermolability (11). However, the adaptations to protein architecture essential to cold-active enzymes are still not well understood, and inquiries to unlock these adaptations continue to be an active area of investigation (12). Nonetheless, the biochemical properties of cold-active enzymes have made them attractive for exploitation in a number of biochemical, bioremediation, and industrial processes (13). Given these attractions (the known traits of C. psychrerythraea, its products, and its place in a genus of global occurrence across all manner of cold marine habitat), C. psychrerythraea was selected as a model organism for genomic studies of bacterial cold adaptation.

Materials and Methods

Sequencing, Gene Identification, and Genome Analysis. Cloning, sequencing, and assembly were as described for genomes sequenced by The Institute for Genomic Research (14, 15). Open reading frames (ORFs) [or coding sequences (CDSs)] likely to encode proteins were predicted by glimmer (16) (Table 1). This program, based on interpolated Markov models, was trained with ORFs >90 bp from the genomic sequence that had blastx hits to The Institute for Genomic Research nonredundant internal protein database with an expectation value better than 1 × 10⁻⁵, as well as with any C. psychrerythraea genes available in GenBank. All predicted proteins >30 aa were searched against a nonredundant protein database as described in ref. 14. Frameshifts and point mutations were detected and corrected where appropriate. Remaining frameshifts and point mutations are considered to be authentic and were annotated as “authentic frameshift” or “authentic point mutation,” or, in the case of multiple lesions within a single CDS, “degenerate.” Protein membrane-spanning domains were identified by toppred (17). The 5′ regions of each ORF were inspected to define initiation codons using homologies, position of ribosomal binding sites, and transcriptional terminators. Two sets of hidden Markov models were used to determine ORF membership in families and superfamilies: pfam 14.0 (18) and tigrfams 4.0 (19). pfam 14.0 hidden Markov models were also used with a constraint of a minimum of two hits to find repeated domains within proteins and mask them. Domain-based paralogous families were then built by performing all-versus-all searches on the remaining protein sequences using a modified version of a method described in ref. 15.

Table 1. General features of the C. psychrerythraea 34H genome.

Size, bp 5,373,180
G + C percentage 37.9
Predicted CDSs, n 4,937
Avg. size of CDS, bp 924
Percentage coding 85
rRNA operons (16S-23S-5S), n 9
tRNAs, n 88
Structural RNAs, n 1
CDSs similar to known protein, n 2,664
CDSs similar to proteins of unknown function*, n 543
Conserved hypothetical proteins, n 690
Hypothetical proteins, n 1,041
ρ-Independent terminators, n 584

*Unknown function: substantial sequence similarity to a named protein for which no function is currently attributed

Conserved hypothetical protein: sequence similarity to a translation of another ORF; however, no experimental evidence for the protein exists

Hypothetical protein: no substantial similarity to any sequenced protein
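
As a rough consistency check on Table 1, the coding percentage can be approximated from the CDS count, the average CDS size, and the genome size. The sketch below assumes that CDSs do not overlap, which the source does not state; the function name is illustrative only.

```python
# Values copied from Table 1.
GENOME_SIZE_BP = 5_373_180
PREDICTED_CDS = 4_937
AVG_CDS_SIZE_BP = 924


def percent_coding(n_cds: int, avg_cds_bp: float, genome_bp: int) -> float:
    """Approximate coding percentage, ignoring any overlap between CDSs."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return 100.0 * n_cds * avg_cds_bp / genome_bp


coding = percent_coding(PREDICTED_CDS, AVG_CDS_SIZE_BP, GENOME_SIZE_BP)
print(f"{coding:.1f}% coding")
```

Rounded to the nearest integer, this estimate agrees with the 85% reported in the table.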

The replicative origin was determined by colocalization of genes (dnaA and dnaN) often found near the origin in prokaryotic genomes and GC nucleotide skew [(G − C)/(G + C)] analysis (20). Regions of atypical nucleotide composition were identified by χ² analysis: The distribution of all 64 trinucleotides (3-mers) was computed for the complete genome in all six reading frames, followed by the 3-mer distribution in 2,000-bp windows. Windows overlapped by 1,000 bp. For each window, the χ² statistic on the difference between its 3-mer content and that of the whole genome was computed (see Fig. 2, which is published as supporting information on the PNAS web site). Information on additional comparative genomic analyses can be found as Supporting Text, which is published as supporting information on the PNAS web site.
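
The two composition statistics described here can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: it counts 3-mers on a single strand at every offset rather than in all six reading frames, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# All 64 trinucleotides in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=3)]


def gc_skew(seq: str) -> float:
    """GC skew, (G - C)/(G + C); returns 0.0 when there is no G or C."""
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def kmer_counts(seq: str, k: int = 3) -> list:
    """Counts of each 3-mer in KMERS order (single strand, every offset)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in KMERS]


def window_chi2(seq: str, window: int = 2000, step: int = 1000) -> list:
    """Chi-square of each window's 3-mer counts against genome-wide frequencies."""
    genome = kmer_counts(seq)
    total = sum(genome)
    if total == 0:
        raise ValueError("sequence too short to contain any 3-mer")
    freqs = [c / total for c in genome]
    results = []
    for start in range(0, len(seq) - window + 1, step):
        observed = kmer_counts(seq[start:start + window])
        n = sum(observed)
        chi2 = sum((o - n * f) ** 2 / (n * f)
                   for o, f in zip(observed, freqs) if f > 0)
        results.append((start, chi2))
    return results
```

A step of 1,000 bp with a 2,000-bp window reproduces the 1,000-bp overlap described in the text; windows with unusually large χ² values are candidates for atypical composition.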

Amino Acid Composition and Statistical Analysis. Twenty-two predicted proteomes from complete genomes were chosen for analysis to represent a range of genome G+C percentage content and optimal growth temperature (OGT) (Table 2), including several close mesophilic relatives of Colwellia (e.g., Shewanella oneidensis and Vibrio spp.), and more divergent lineages, such as the psychrophilic δ-proteobacterium Desulfotalea psychrophila (21) and Gram-positive bacteria. Because no complete genome sequence from a thermophilic representative of the γ-proteobacteria was available, other thermophilic representatives for which the complete sequence is available from the bacterial domain were included.

Table 2. A list of 22 organisms used for predicted proteome composition analyses.

Name OGT class G + C percentage Lineage
Caulobacter crescentus CB15 M 67.1 α-Proteobacteria
Corynebacterium glutamicum ATCC 13032 M 53.7 Actinomycetales
Escherichia coli O157:H7 M 50.5 γ-Proteobacteria
Desulfovibrio vulgaris Hildenborough M 63.2 δ-Proteobacteria
Haemophilus influenzae Rd KW20 M 38 γ-Proteobacteria
Listeria innocua CLIP11262 M 37.3 Firmicute
Listeria monocytogenes EGD M 37.9 Firmicute
Oceanobacillus iheyensis HTE831 M 35.7 Firmicute
Pasteurella multocida Pm70 M 40.3 γ-Proteobacteria
Pseudomonas aeruginosa PAO1 M 66.4 γ-Proteobacteria
Pseudomonas putida KT2440 M 61.4 γ-Proteobacteria
S. oneidensis MR-1 M 45.9 γ-Proteobacteria
Vibrio cholerae El Tor N16961 M 47 γ-Proteobacteria
Vibrio parahaemolyticus RIMD 2210633 M 45.3 γ-Proteobacteria
Vibrio vulnificus CMCP6 M 47 γ-Proteobacteria
Yersinia pestis KIM M 47.6 γ-Proteobacteria
C. psychrerythraea 34H P 37.9 γ-Proteobacteria
D. psychrophila LSv54 P 46.8 δ-Proteobacteria
Aquifex aeolicus VF5 T 43.5 Aquificaceae
Thermosynechococcus elongatus BP-1 T 53.9 Cyanobacteria
Thermotoga maritima MSB8 T 46.1 Thermotogales
Thermoanaerobacter tengcongensis MB4T T 37.5 Clostridia

The OGTs of the selected organisms range between 25°C and 37°C for mesophilic (M) genomes, 8°C and 10°C for psychrophilic (P) genomes, and 75°C and 85°C for thermophilic (T) genomes. The G + C percentage is the average G + C content of the complete genome. Lineage refers to the organism's phylogenetic placement based on 16S rRNA phylogenetic analysis.

Three-Dimensional Protein Homology Modeling. The set of predicted CDSs from C. psychrerythraea and 21 other completed genomes (Table 2) were searched against the Protein Data Bank to identify potential three-dimensional templates. Searches used the blastp algorithm with an e-value cutoff of <0.001. To ensure that homology models were of high quality, the following criteria were applied to sequence and template selection. Structural templates had to cover at least 80% of the length of the CDS, and minimum sequence identity between the template and CDS had to be >30% to avoid alignment errors. For multiple potential models for a given template from a genome, the one with greatest similarity to the template was selected. To lessen issues associated with paralogous model comparisons, the CDS used for the model was determined by bidirectional best matches to the C. psychrerythraea CDS.
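
The template-selection criteria above can be expressed as a simple filter. The sketch below is illustrative only: the `Hit` record and its field names are hypothetical, percent identity is used as a stand-in for "greatest similarity to the template," and the bidirectional-best-match step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One blastp match of a CDS against a structural template (hypothetical record)."""
    cds_id: str
    template_id: str
    evalue: float
    identity_pct: float  # percent sequence identity between CDS and template
    aligned_len: int     # number of CDS residues covered by the template
    cds_len: int         # full length of the CDS in residues


def passes_filters(hit: Hit, max_evalue: float = 1e-3,
                   min_coverage: float = 0.80,
                   min_identity: float = 30.0) -> bool:
    """Apply the e-value, coverage, and identity thresholds from the text."""
    coverage = hit.aligned_len / hit.cds_len
    return (hit.evalue < max_evalue
            and coverage >= min_coverage
            and hit.identity_pct > min_identity)


def best_model_per_template(hits: list) -> dict:
    """For each template, keep the passing hit with the highest identity."""
    best = {}
    for hit in filter(passes_filters, hits):
        current = best.get(hit.template_id)
        if current is None or hit.identity_pct > current.identity_pct:
            best[hit.template_id] = hit
    return best
```

Keeping one model per template per genome means that each template contributes at most one CDS from each organism to the downstream composition comparison.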

After these filtering steps, the remaining sequences were processed through a homology building pipeline by using target-template alignment (clustalw), backbone copy (ape) and side-chain building (scwrl) (22). A total of 2,026 models from 173 templates including 624,000 residues was constructed. Surface-exposed area was calculated for each residue in the model by using the stride program (23). From the surface composition analysis, modeled residues were subdivided into two categories: exposed and buried. If total exposed area for a residue was <10% of maximum exposed area for the residue type, the residue in question was defined as buried.
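
The buried/exposed rule can be written as a one-line threshold test. In this sketch the per-residue-type maximum areas are placeholder values chosen for illustration; the source does not list the maxima the authors used.

```python
def classify_residue(exposed_area: float, max_exposed_area: float,
                     buried_fraction: float = 0.10) -> str:
    """Label a residue 'buried' if its exposed area is below 10% of the maximum."""
    if max_exposed_area <= 0:
        raise ValueError("max_exposed_area must be positive")
    relative = exposed_area / max_exposed_area
    return "buried" if relative < buried_fraction else "exposed"


# Placeholder maxima in square angstroms (illustrative, not from the source).
MAX_AREA = {"SER": 150.0, "VAL": 170.0}

# (residue type, exposed area) pairs such as a surface-area program might report.
residues = [("SER", 12.0), ("VAL", 90.0), ("SER", 60.0)]
labels = [classify_residue(area, MAX_AREA[name]) for name, area in residues]
print(labels)
```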

Canonical Discriminant Analysis (CDA). The candisc procedure (with parametric, linear classification rules and prior probabilities proportional to sample size) of the sas system (sas/stat 9.1, SAS, Cary, NC) was used to perform CDAs. CDA was used to identify and rank amino acid proportions that could discriminate among the three OGT classes. To explain the total variance of a data set, CDA elucidates a number of canonical discriminant functions (CDFs) equal to the smaller of the number of independent variables and the number of classes minus one. This analysis included 20 independent variables (the proportions of the 20 amino acids) and three OGT classes, resulting in the computation of two CDFs, both of which were significant (based on large eigenvalues and P values < 0.05) for each data set.
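
The count of CDFs follows directly from the rule stated above. A minimal sketch, with a hypothetical helper name:

```python
def count_discriminant_functions(n_variables: int, n_classes: int) -> int:
    """Number of canonical discriminant functions: min(variables, classes - 1)."""
    if n_variables < 1 or n_classes < 2:
        raise ValueError("need at least one variable and two classes")
    return min(n_variables, n_classes - 1)


# 20 amino acid proportions and three OGT classes (P, M, T).
n_cdfs = count_discriminant_functions(20, 3)
print(n_cdfs)
```

With three classes, the number of classes is the limiting term, so adding or removing amino acid variables does not change the number of CDFs.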

Total canonical structure was chosen over total standardized canonical scores as an index describing the property and structure of the CDFs due to the presence of significant pairwise correlations among some of the independent variables. Total canonical structure indicates correlation between original variables and canonical scores of a given CDF and can therefore be considered variable loadings. Because of the presence of linear dependence of the variables, the analysis was conducted by removing one variable at random. Each variable was tested until the one with the least influence on the canonical loadings was detected.
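
The linear dependence arises because the 20 amino acid proportions of a proteome sum to one, so any one proportion is fully determined by the other 19. The sketch below illustrates this with a made-up uniform composition; the helper names are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def drop_variable(proportions: dict, dropped: str) -> dict:
    """Return the composition without one amino acid (19 remaining variables)."""
    return {aa: p for aa, p in proportions.items() if aa != dropped}


def recover_dropped(reduced: dict) -> float:
    """The missing proportion is fixed by the constraint that all 20 sum to 1."""
    return 1.0 - sum(reduced.values())


# Made-up composition for illustration: every amino acid at 5%.
uniform = {aa: 0.05 for aa in AMINO_ACIDS}
reduced = drop_variable(uniform, "W")
print(len(reduced), round(recover_dropped(reduced), 6))
```

Because the dropped variable carries no independent information, removing it makes the remaining 19 variables usable in the analysis without discarding any compositional signal.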

Results

General Genome Features and Comparisons. The C. psychrerythraea 34H genome consists of a single circular chromosome of 5,373,180 bp with 4,937 predicted CDSs. General genome features are presented in Table 1 (see also Fig. 2). The C. psychrerythraea genome offers a distinct phylogenetic framework for evaluating evolution of the psychrophilic lifestyle.

Membrane Fluidity. An important challenge to life at cold temperatures is the ability to maintain the cell membrane in a liquid-crystalline state (13). C. psychrerythraea genome analyses have revealed the presence of a suite of CDSs important to this function. CDSs predicted to function in polyunsaturated fatty acid synthesis (a well established cold adaptation) (24), including a putative operon related to polyketide-like polyunsaturated fatty acid synthases (CPS3104, CPS3103, CPS3102, and CPS3099) (25) were identified, as was a fatty acid cis/trans isomerase that would confer an ability to alter the ratio of cis- to trans-esterified fatty acids in phospholipids (CPS0087). Polyunsaturated fatty acid synthesis and increased cis-isomerization, for example, enhance membrane fluidity at low temperatures.

C. psychrerythraea genome analyses further elucidated several copies of genes vital to fatty acid and phospholipid biosynthesis (see Table 4, which is published as supporting information on the PNAS web site). The genome possesses a 3-oxoacyl-(acyl-carrier-protein) reductase (CPS2297), which can catalyze the first reduction step in fatty acid biosynthesis, and two putative copies (CPS0665 and CPS1608). One of these copies (CPS1608) is located in an operon with other fatty acid metabolism genes in an approximately 15-kilobase region of the genome populated by CDSs involved in fatty acid metabolism and branched-chain amino acid catabolism. The physical proximity of these CDSs to one another may indicate that the branched-chain portion resulting from branched-chain amino acid catabolism is incorporated into branched-chain fatty acid synthesis. The introduction of branched lipids into membrane architecture is a mechanism that reduces membrane viscosity at cold temperatures (24). The C. psychrerythraea genome also possesses multiple copies of putative β-keto-acyl carrier proteins (KAS-II and KAS-III) (see Table 4) central to fatty acid synthesis and control of straight and branched-chain lipid ratios in the cell membrane (26, 27).

Carbon, Energy, and Nitrogen Reserves. Genome analyses reveal the capacity of C. psychrerythraea to produce polyhydroxyalkanoate (PHA) compounds, a family of polyesters that serve as intracellular carbon and energy reserves, of which some forms have been linked to pressure adaptation (28). PHA compounds are of industrial interest for their thermoplastic and elastomeric properties and as sources for fine chemical synthesis (29, 30). The ability of C. psychrerythraea to produce PHA compounds is likely linked with its significant capacity to produce and degrade fatty acids, suggested in part by the expansion (multiple gene duplications) of acyl-CoA dehydrogenase and enoyl-CoA hydratase gene families, whose roles are central to the utilization of medium- and long-chain fatty acids that can be oxidized via the β-oxidation cycle and produce substrates for PHA biosynthesis (31). Because PHA composition depends in part on carbon sources and the manner in which they are catabolized (29–31), these gene family expansions may imply versatility in the nature of PHAs that C. psychrerythraea can synthesize. This versatility is further suggested by the presence of three copies of granule-associated protein genes (CPS4086, CPS4085, and CPS4084) physically located among genes devoted to PHA synthesis. Two of these copies have no homologs in other bacterial lineages.

Genomic investigations further disclosed an ability of C. psychrerythraea to synthesize and degrade polyamides similar to cyanophycin, protein-like polymers that function as nitrogen reserves. Polyamides are of industrial interest as possible biopolymer substitutes for polyacrylates (32). For C. psychrerythraea, the collective gene complement for biosynthesis of PHA and cyanophycin-like compounds may be of particular benefit to the psychrophilic lifestyle by ensuring intracellular reserves of nitrogen and carbon to aid in circumventing any cold-imposed limitations to carbon and nitrogen uptake (33).

Compatible Solutes. Genome analyses of C. psychrerythraea indicate an overall expansion in transporter families involved in the uptake of compatible solutes that may serve multiple roles, including osmoprotection and cryoprotection (34). The genome possesses at least five putative transporters involved in the movement of quaternary ammonium compounds of the betaine/carnitine/choline transporter family (CPS4027, CPS4009, CPS3860, CPS2003, and CPS1335) as well as homologs to the ATP-binding cassette transport system for direct uptake of glycine betaine (CPS4933, CPS4934, and CPS4935). Furthermore, two lineage-specific duplications of genes encoding betaine aldehyde dehydrogenase, choline dehydrogenase, and a BetI family regulator that function in the production and regulation of glycine betaine from the uptake of choline-containing compounds are also present.

The C. psychrerythraea genome possesses four copies of serine hydroxymethyltransferase (glyA) (CPS4031, CPS3844, CPS2427, and CPS0728), which catalyzes the interconversion of glycine and serine, and four copies of formyltetrahydrofolate deformylase (purU) (CPS4357, CPS4036, CPS3620, and CPS2482), which regulates intracellular concentrations of the tetrahydrofolate one-carbon pool. Collectively, these enzymes play critical roles in regulating key biosynthetic pathways, such as purine and lipid biosynthesis. Two of the glyA and purU copies are located on the C. psychrerythraea chromosome in putative operons that also encode CDSs for sarcosine oxidase. Sarcosine oxidase demethylates sarcosine to glycine and 5,10-methylenetetrahydrofolate, the substrates for glyA (35). Because sarcosine can be derived from the catabolism of the osmoprotectant and cryoprotectant betaine, the metabolism of choline, betaine, and sarcosine may be linked. C. psychrerythraea may derive benefits from these genes and metabolic links due to their dual influences on the production of protective compounds and as sources of carbon, nitrogen, and energy (see Supporting Text).

Extracellular Compounds. The synthesis of extracellular polysaccharides and degradative enzymes is important to the overall metabolism and possibly to the cold-adaptation of C. psychrerythraea in its environment. Extracellular polysaccharides can serve as cryoprotectants (4, 5), and extracellular enzyme production may represent another mechanism for overcoming threshold requirements for dissolved organic carbon in cold environments (5, 33). For instance, over half of the enzymes assigned to the degradation of proteins and peptides in the C. psychrerythraea genome are predicted to be localized external to the cytoplasm, among the highest percentage in any completed genome (see Table 5, which is published as supporting information on the PNAS web site). The genome further encodes an expansion of putative members of the extracellular factor subfamily of σ-70 transcription factors that have multiple roles, including regulating extracellular polysaccharide biosynthesis, and of paralogous families of glycosyl transferases, which are also likely to function in extracellular polysaccharide synthesis.

Unusual Capacities and Genes. Genome analyses of C. psychrerythraea demonstrate the presence of many well described CDSs related to DNA metabolism and protein synthesis (and such common behaviors as motility) implying that overall essential enzymatic functions inherent to these basic processes are similar to other proteobacteria. However, the presence of some CDSs with distant homology to γ-proteobacterial sequences and expansions of other gene families alludes to the existence of additional as yet undescribed mechanisms possibly related to cold adaptation, including posttranslational modifications (36) (see Table 6, which is published as supporting information on the PNAS web site).

Five CDSs encode common forms of cold-shock proteins (CPS4529, CPS2895, CPS0737, CPS0718, and CPS0148), of which four appear to be localized to the cytoplasm. The fifth (CPS0148) is predicted to contain an unusual protein architecture by the inclusion of three transmembrane-spanning regions. In addition, two CDSs (CPS2624 and CPS1911) present in C. psychrerythraea bear modest homology to cold-shock domain proteins from Vibrio spp. and S. oneidensis and may represent as yet uncharacterized proteins relevant to cold adaptation.

The C. psychrerythraea genome also contains a suite of CDSs with roles in the synthesis and catabolism of complex, high-molecular-weight organic compounds and possible C1 metabolism that ultimately facilitate a wide range of responses to its environment, emphasizing the versatile roles that C. psychrerythraea can play in carbon and nutrient cycles of cold environments (Supporting Text; see also Fig. 3, which is published as supporting information on the PNAS web site). These activities include not only the catabolism of complex compounds to provide energy and carbon sources but also mechanisms of detoxification relevant to the bioremediation of cold environments and other biotechnological applications. For instance, C. psychrerythraea possesses a homolog to the 2,4,6-trichlorophenol monooxygenase of Ralstonia eutropha (CPS2047) along with two homologs to reductive dehalogenases involved in the degradation of pentachlorophenol (CPS1905 and CPS1668) (37). Genome analyses also suggest the presence of putative dioxygenases (CPS1846 and CPS4358) and monooxygenases (CPS3582, CPS3527, and CPS1273) critical to the cleavage of ring-bearing compounds and the degradation of aliphatic compounds.

A particularly unusual finding in the C. psychrerythraea genome is the presence of CDSs involved in the biosynthesis and utilization of coenzyme F420. These coenzymes were first discovered in methanogens, where they are critical participants in methanogenesis (38). Since their discovery, homologs to coenzyme F420 sequences have been reported in only a few bacterial lineages. In Rhodococcus spp., homologs to coenzyme F420 have been linked to the degradation of polynitroaromatic compounds, such as 2,4-dinitrophenol (39). These findings suggest possible roles in C. psychrerythraea related to C1 or aromatic compound metabolism.

The ability to respond to reactive oxygen species is a vital function when undergoing aerobic metabolism and is likely to be of further importance in C. psychrerythraea because of the need to protect cell membrane polyunsaturated fatty acids, which are generally more susceptible than saturated fatty acids to oxidative damage (40). Genome analyses reveal an enhanced antioxidant capacity in C. psychrerythraea through the presence of a variety of CDSs encoding antioxidants, including three copies of catalase genes (CPS2441, CPS3328, and CPS1344). In addition to the typical iron- or manganese-containing superoxide dismutase (SOD) C. psychrerythraea also possesses a putative nickel-containing SOD (CPS0444), a form that has not been reported in proteobacterial lineages. An alternative SOD may provide a mechanism to neutralize reactive oxygen species while circumventing any environmentally imposed iron limitations.

The C. psychrerythraea genome includes two putative filamentous phage genomes (Fig. 1). Despite sharing nine identical CDSs and corresponding intergenic regions (conserved in sequence and gene order), the two phage genomes diverge from one another at both ends where putative repressor genes are found, indicative of a possible past recombination event in a circularized phage genome. The proximity of these phage genomes to integrases and transposases suggests their involvement in larger mobile genetic elements.

Fig. 1.

Scatter plot of the scores of CDF 1 and CDF 2 indicating separation based on OGT. (a) The scatter plot from the total primary residue analysis. (b) The scatter plot from the surface-exposed residue analysis. (c) The scatter plot from the buried residue analysis. ▪, psychrophiles; •, mesophiles; ▴, thermophiles.

Amino Acid Composition Comparisons with Other Bacterial Genomes. To successfully thrive in cold environments, psychrophiles must synthesize enzymes that perform effectively at low temperatures. Cold-temperature environments present several challenges, in particular reduced reaction rates, increased viscosity, and altered microscopic structure (including phase changes) of the surrounding medium. To cope with these conditions, cold-adapted enzymes have been found to exhibit an increase in enzyme turnover (kcat) or improvement of catalytic efficiency (kcat/KM) at a given temperature, relative to their mesophilic counterparts (11). These changes have been suggested to originate from localized increases in enzyme flexibility or “molecular plasticity” in critical locations of the protein architecture. This plasticity is believed to result in a lowering of the transition state barrier for the catalyzed reaction, relative to mesophilic counterparts, and may ultimately require lower thermodynamic stability (11–13).

The availability of a whole-genome sequence provides an important opportunity to investigate these phenomena from a proteome level in the bacterial domain. The predicted amino acid composition of the entire C. psychrerythraea proteome was compared with those from 21 other complete, predicted proteomes from bacteria across a range of OGTs and genome G+C percentage content (Table 2). The predicted proteomes of each organism were first compared based on their primary compositions. Next, predicted CDSs were examined for matches to known three-dimensional protein structures. The amino acid compositions of the surface and buried residues were then estimated for the subset of CDSs for which significant matches to three-dimensional structures could be determined. Amino acid composition for each data set was analyzed for statistically significant differences that may be related to OGT.
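
The primary-composition step amounts to pooling residue counts across all predicted proteins of a genome and converting them to proportions. The sketch below is a minimal illustration with made-up sequences; the function name is hypothetical, and it ignores any character that is not one of the 20 standard amino acids.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def proteome_composition(sequences) -> dict:
    """Proportion of each standard amino acid pooled over all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Two made-up peptides standing in for a predicted proteome.
toy_proteome = ["MSSK", "MADE"]
composition = proteome_composition(toy_proteome)
print(composition["S"], composition["M"])
```

The same function could be applied separately to the residues classified as surface-exposed or buried to obtain the other two data sets.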

The proportion of polar residues was among the highest, whereas charged amino acid proportions were among the lowest, for the two psychrophilic genomes in the surface composition. An overall decrease in nonpolar residues from the exposed composition was noted only for C. psychrerythraea (see Table 7, which is published as supporting information on the PNAS web site).

CDA (41, 42) was performed on each data set to identify and rank amino acid proportions that could discriminate among the three OGT groups: psychrophile, mesophile, or thermophile. CDA quantifies a set of underlying constructs, CDFs, that are the linear functions of the original variables (amino acids) minimizing within (thermal) group and maximizing among (thermal) group variations. The contribution of the original variables on the construction of these functions can then be quantified through the use of their loadings, which correspond to the correlation coefficients between the original variables and the function scores.
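
The loadings described above are ordinary Pearson correlations between each amino acid's proportions and the organisms' scores on a CDF. The sketch below assumes the canonical scores have already been computed elsewhere (for example, by a statistics package); the numbers are made up for illustration and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def canonical_loadings(variables: dict, scores) -> dict:
    """Correlation of each original variable with the scores of one CDF."""
    return {name: pearson_r(values, scores) for name, values in variables.items()}


# Made-up proportions for four organisms and their made-up CDF scores.
example_variables = {
    "Ser": [0.070, 0.065, 0.058, 0.050],
    "Glu": [0.055, 0.060, 0.068, 0.080],
}
example_scores = [2.0, 1.0, -0.5, -2.5]
loadings = canonical_loadings(example_variables, example_scores)
print({name: round(r, 2) for name, r in loadings.items()})
```

A loading near +1 or −1 indicates that the variable tracks the function closely, which is how variables are ranked by the absolute value of r.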

For each data set, CDA correctly grouped each organism based on amino acid composition into one of the three types of thermal classes (psychrophile, mesophile, or thermophile) (Fig. 1 and Table 3). Along CDF 1, the mesophile and psychrophile groups were much closer to one another relative to the thermophiles, and the greatest discrimination was between the thermophiles relative to the mesophiles and psychrophiles. Along CDF 2, the greatest discrimination occurred between the psychrophiles relative to the mesophiles and thermophiles (Fig. 1).

Table 3. Total canonical structure indicating correlation coefficient (r) between original variables (amino acids) and the scores of CDF 1 and CDF 2 for the primary, surface-exposed, and buried residue data sets.

Variable r (CDF 1) P (CDF 1) r (CDF 2) P (CDF 2) “R” group character
Primary
   Asp 0.68 0.0005 0.05 0.80 Acidic
   Glu -0.66 0.0009 0.03 0.89 Acidic
   His 0.62 0.0022 0.12 0.59 Basic
   Val -0.56 0.0072 0.25 0.25 Hydrophobic
   Ser 0.56 0.0073 -0.47 0.02 Polar
   Tyr -0.53 0.010 -0.10 0.64 Polar/aromatic
   Met 0.53 0.011 0.07 0.74 Hydrophobic
   Thr 0.50 0.017 -0.02 0.94 Polar
   Ala 0.47 0.026 0.19 0.39 Hydrophobic
   Lys -0.47 0.029 -0.16 0.47 Basic
   Gln 0.45 0.037 0.05 0.83 Polar
Surface-exposed
   Ser 0.73 0.0001 0.43 0.04 Polar
   Gln 0.71 0.0002 0.045 0.84 Polar
   Val -0.67 0.0006 -0.12 0.59 Hydrophobic
   Glu -0.58 0.0048 -0.04 0.85 Acidic
   Lys -0.56 0.0062 0.20 0.36 Basic
   Met 0.52 0.012 0.008 0.97 Hydrophobic
   His 0.45 0.03 -0.45 0.03 Basic
   Asp 0.17 0.44 -0.48 0.02 Acidic
   Arg 0.17 0.45 -0.42 0.05 Basic
Buried
   Tyr -0.84 <.0001 -0.04 0.84 Polar/aromatic
   Asp 0.79 <.0001 -0.05 0.82 Acidic
   His 0.72 0.0001 0.17 0.46 Basic
   Ser 0.60 0.0031 -0.54 0.009 Polar
   Thr 0.58 0.0043 -0.10 0.65 Polar
   Glu -0.57 0.0055 0.14 0.54 Acidic
   Ala 0.51 0.015 0.10 0.65 Hydrophobic
   Lys -0.50 0.019 -0.09 0.69 Basic
   Ile -0.42 0.048 -0.36 0.10 Hydrophobic

Data are ranked in descending order by the absolute value of the correlations for CDF 1 and only significant variables (P < 0.05) are shown. “R” group character refers to the chemical character of the amino acid residue side chain. Significant variables on CDF 1 are shown in normal font. Data in bold are significant on CDF 1 and CDF 2, and data in bold italic are significant on CDF 2 only. A positive correlation between a variable and CDF 1 for all data sets indicates an increase in that variable in the transition from thermophile to mesophile and psychrophile. A negative correlation between a variable and CDF 2 in the primary and buried data sets indicates an increase in that variable in the transition from mesophile and thermophile to psychrophile. A negative correlation between a variable and CDF 2 in the surface data set indicates a decrease in that variable in the transition from mesophile and thermophile to psychrophile.

CDF 1 from the primary data set was defined by significant correlations with aspartate (acidic), histidine (weakly basic), serine, threonine, glutamine (polar/noncharged), methionine, alanine (hydrophobic), glutamate (acidic), valine (hydrophobic), tyrosine (polar/aromatic), and lysine (basic). A negative correlation (indicating an increase in the relative proportion from thermophile to psychrophile) with serine was the only significant contributor to CDF 2 (Fig. 1 and Table 3). The buried data set had similar variables defining CDF 1 and CDF 2 when compared with the primary data set (Table 3).

The surface data set showed the greatest separation of the thermal classes along CDF 1 and, in particular, between the mesophiles and psychrophiles along CDF 2 (Fig. 1). Significant decreases from mesophile to psychrophile were noted for aspartate, arginine, and histidine, which is consistent with the overall decreases in charged amino acid composition (see Table 7). A trend toward the substitution of aspartate for glutamate was also noted, and an increase in serine content was again suggested in the transition from thermophile to psychrophile (Fig. 1 and Table 3).
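The compositional quantities discussed here (the proportion of charged residues and the balance of aspartate to glutamate) can be computed from a protein sequence with a minimal sketch like the one below. The function names and the toy sequence are hypothetical, and counting histidine among the charged residues is an assumption that follows the "Basic" label used in Table 3.

```python
from collections import Counter

# One-letter codes: Asp, Glu, Lys, Arg, His.
# Assumption: His is counted as charged, following its "Basic" label in Table 3.
CHARGED = frozenset("DEKRH")


def composition(seq):
    """Relative proportion of each residue in a protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def charged_fraction(seq):
    """Fraction of residues that are Asp, Glu, Lys, Arg, or His."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for aa in seq if aa in CHARGED) / len(seq)


def asp_to_glu_ratio(seq):
    """Ratio of Asp (D) to Glu (E) counts, or None if there is no Glu."""
    counts = Counter(seq.upper())
    if counts["E"] == 0:
        return None
    return counts["D"] / counts["E"]


# Hypothetical ten-residue toy sequence, for illustration only.
toy = "DDESSKRHAA"
toy_charged = charged_fraction(toy)
toy_ratio = asp_to_glu_ratio(toy)
toy_serine = composition(toy)["S"]
```

Applied separately to the surface and buried residues of each modeled protein, functions of this kind would yield the sort of per-genome variables that a discriminant analysis takes as input.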

The results of this analysis indicate more significant differences in amino acid composition between the thermophiles and either mesophiles or psychrophiles than between mesophiles and psychrophiles (Table 3). These results may in part reflect the limit of resolution of this study. Changes related to protein composition that could not be directly detected by this analysis, including the influence of noncovalent interactions (e.g., hydrogen bonds, van der Waals forces, and hydrophobic interactions) (43), may be important contributors to differences in protein thermostability, and the same set of changes is not likely to occur in every enzyme class. In addition, several of the organisms included as mesophiles in this study have psychrotolerant physiologies [e.g., S. oneidensis MR-1 (44) and the Listeria spp. (34)], which may further confound the identification of differences in amino acid composition between mesophiles and psychrophiles.

Nonetheless, even with the inclusion of divergent bacterial lineages, significant differences in amino acid composition between the three thermal classes could be identified, the results of which are consistent with several reported trends. Several studies of protein thermostability have suggested a decrease in polar residues and increase in charged amino acids as temperature increases (45–48), trends that were generally supported in this study. Increased serine in both psychrophilic genomes revealed in this study would contribute to increased polar surface amino acids. Of interest is the presence of four copies of glyA (which controls the interconversion of glycine and serine) in C. psychrerythraea, which suggests that these gene multiplications may play a role in maintaining proportions of this key amino acid. An apparent favoring of aspartic acid over glutamate, particularly on the surface of psychrophilic proteins, is consistent with studies of thermophilic proteins; in C. psychrerythraea, these substitutions would translate to a decrease in the unfolding transition temperature of proteins, effectively making them less heat-stable (46).

Discussion

The genome sequence of C. psychrerythraea has provided an important opportunity to better understand this organism's potential functions in the marine environment and to gain insight into adaptations that help define and influence the psychrophilic lifestyle. Genome analyses revealed a variety of metabolic capabilities and roles in carbon and nutrient cycling, including some that may be useful to bioremediation in cold environments.

From a genome-level perspective, adaptations potentially beneficial to life in cold environments can be seen in several broad categories. Several of the adaptive strategies appear to increase fitness by effectively overcoming multiple obstacles at low temperature, including temperature-dependent barriers to carbon and nitrogen uptake. These strategies are reflected in expansions of gene families related to cell membrane synthesis, a capacity for uptake or synthesis of compounds that in part may confer cryotolerance, including PHAs (which may also aid in pressure adaptation), cyanophycin-like compounds, and glycine betaine, as well as the capacity to produce copious quantities of extracellular enzymes and polysaccharides.

The three-dimensional protein homology modeling and CDA examination in this study have provided a comprehensive comparison of proteome composition across OGTs and divergent lineages in the bacterial domain to determine whether signals possibly related to thermal adaptation of proteins could be detected. Differences likely to enhance architectural changes to enzymes favoring their effectiveness at cold temperatures appear consistent with some previously reported trends. In particular, a trend toward increased polar residues (particularly serine), the substitution of aspartate for glutamate, and a general decrease in charged residues on the surface of proteins were noted. Each of these changes is consistent with prevailing theories that increased flexibility and reduced thermostability contribute to enzyme cold adaptation. However, effects such as those arising from noncovalent interactions are also likely contributors to the stability of enzymes at different temperatures, and modifications may differ depending on the class of enzyme.

Given the existence of psychrophiles in lineages across the tree of life, multiple mechanisms contributing to cold adaptation may exist. To date, genome analyses suggest that cold adaptation consists of a collection of synergistic changes in overall genome configuration reflected in terms of gene content and amino acid composition rather than the presence of a unique set of genes indicative of and responsible for conferring a psychrophilic lifestyle.

Supplementary Material

Supporting Information

Acknowledgments

We thank O. White, M. Heaney, S. Lo, M. Holmes, V. Sapiro, R. Karamchedu, and R. Deal for informatics, database, and software support; The Institute for Genomic Research faculty and sequencing core for expert advice and assistance; and L. E. Wells for phage-gene analysis. This work was supported by the United States Department of Energy Office of Biological and Environmental Research through the Microbial Genomes Program. J.W.D. acknowledges support from the National Aeronautics and Space Administration Astrobiology Institute.

Author contributions: B.A.M., K.E.N., J.W.D., and C.M.F. designed research; B.A.M., K.E.N., B.M., E.M., X.Z., J.M., R.M., W.C.N., R.J.D., L.M.B., S.C.D., A.S.D., R.T.D., J.F.K., S.A.S., L.Z., T.M.D., M.W., A.L.H., M.L., B.W., J.F.W., H.K., T.R.U., and T.V.F. performed research; J.W.D., B.M., E.M., X.Z., and J.M. contributed new reagents/analytic tools; B.A.M., K.E.N., J.W.D., B.M., E.M., X.Z., J.M., R.M., W.C.N., R.J.D., L.M.B., S.C.D., A.S.D., R.T.D., J.F.K., S.A.S., L.Z., T.M.D., M.W., A.L.H., M.L., B.W., J.F.W., H.K., T.R.U., and T.V.F. analyzed data; and B.A.M., K.E.N., J.W.D., B.M., E.M., J.M., and A.L.H. wrote the paper.

Abbreviations: CDS, coding sequence; OGT, optimal growth temperature; CDA, canonical discriminant analysis; CDF, canonical discriminant function; PHA, polyhydroxyalkanoate.

Data deposition: The annotated genome sequence has been deposited in the GenBank database (accession no. CP000083).

References

  • 1. Deming, J. W. & Eicken, H. (2005) Life in Ice (Cambridge Univ. Press, Cambridge, U.K.).
  • 2. Bowman, J. P. (2005) Adv. Microb. Ecol., in press.
  • 3. Deming, J. W. & Junge, K. (2005) Bergey's Manual of Systematic Bacteriology (Bergey's Manual Trust, East Lansing, MI), Vol. 2.
  • 4. Krembs, C., Deming, J. W., Junge, K. & Eicken, H. (2002) Deep-Sea Res. 49, 2163-2181.
  • 5. Huston, A. L. (2003) Ph.D. thesis (Univ. of Washington, Seattle).
  • 6. Kirschvink, J. L., Gaidos, E. J., Bertani, L. E., Beukes, N. J., Gutzmer, J., Maepa, L. N. & Steinberger, R. E. (2000) Proc. Nat. Acad. Sci. USA 97, 1400-1405.
  • 7. Deming, J. W. (2002) Cur. Opin. Microbiol. 3, 301-309.
  • 8. Huston, A. L., Krieger-Brockett, B. B. & Deming, J. W. (2000) Appl. Environ. Microbiol. 2, 383-388.
  • 9. Junge, K., Eicken, H. & Deming, J. W. (2003) Appl. Environ. Microbiol. 69, 4282-4284.
  • 10. Huston, A. L., Methé, B. & Deming, J. W. (2004) Appl. Environ. Microbiol. 70, 3321-3328.
  • 11. Georlette, D., Blaise, V., Collins, T., D'Amico, S., Gratia, E., Hoyoux, A., Marx, J. C., Sonan, G., Feller, G. & Gerday, C. (2004) FEMS Microbiol. Rev. 28, 25-42.
  • 12. Marx, J. C., Blaise, V., Collins, T., D'Amico, S., Delille, D., Gratia, E., Hoyoux, A., Huston, A. L., Sonan, G., Feller, G. & Gerday, C. (2004) Cell Mol. Biol. 50, 643-655.
  • 13. Feller, G. & Gerday, C. (2003) Nat. Rev. Microbiol. 1, 200-208.
  • 14. Methé, B. A., Nelson, K. E., Eisen, J. A., Paulsen, I. T., Nelson, W., Heidelberg, J. F., Wu, D., Wu, M., Ward, N., Beanan, M. J., et al. (2003) Science 302, 1967-1969.
  • 15. Heidelberg, J. F., Seshadri, R., Haveman, S. A., Hemme, C. L., Paulsen, I. T., Kolonay, J. F., Eisen, J. A., Ward, N., Methé, B. A., Brinkac, L. M., et al. (2004) Nat. Biotechnol. 22, 554-559.
  • 16. Delcher, A. L., Harmon, D., Kasif, S., White, O. & Salzberg, S. L. (1999) Nucleic Acids Res. 27, 4636-4641.
  • 17. Nielsen, H., Engelbrecht, J., Brunak, S. & vonHeijne, G. (1997) Int. J. Neural Syst. 8, 581-599.
  • 18. Bateman, A., Birney, E., Cerruti, L., Durbin, R., Etwiller, L., Eddy, S. R., Griffiths-Jones, S., Howe, K. L., Marshall, M. & Sonnhammer, E. L. L. (2002) Nucleic Acids Res. 20, 276-280.
  • 19. Haft, D. H. & Selengut, J. D. (2003) Nucleic Acids Res. 31, 41-43.
  • 20. Lobry, J. R. (1996) Mol. Biol. Evol. 13, 660-665.
  • 21. Rabus, R., Ruepp, A., Frickey, T., Rattei, T., Fartmann, B., Stark, M., Bauer, M., Zibat, A., Lombardot, T., Becker, I., et al. (2004) Environ. Microbiol. 6, 887-902.
  • 22. Bower, M. J., Cohen, F. E. & Dunbrack, R. L., Jr. (1997) J. Mol. Biol. 267, 1268-1282.
  • 23. Frishman, D. & Argos, P. (1995) Proteins 23, 566-579.
  • 24. Russell, N. J. (1997) Comp. Biochem. Physiol. 118, 489-493.
  • 25. Metz, J. G., Roessler, P., Facciotti, D., Levering, C., Dittrich, F., Lassner, M., Valentine, R., Lardizabal, K., Domergue, F., Yamada, A., et al. (2001) Science 293, 290-292.
  • 26. Lai, C. Y. & Cronan, J. E. (2004) J. Bacteriol. 186, 1869-1878.
  • 27. Choi, K.-H., Heath, R. J. & Rock, C. O. (2000) J. Bacteriol. 182, 365-370.
  • 28. Martin, D. D., Bartlett, D. H. & Roberts, M. F. (2002) Extremophiles 6, 507-514.
  • 29. Madison, L. L. & Huisman, G. W. (1999) Microbiol. Mol. Biol. Rev. 63, 21-53.
  • 30. Lee, S. Y., Park, S. H., Lee, Y. & Lee, S. H. (2001) Production of Chiral and Other Valuable Compounds from Microbial Polyesters (Wiley, Weinheim, Germany).
  • 31. Park, S. J. & Lee, S. Y. (2003) J. Bacteriol. 185, 5391-5397.
  • 32. Frey, K. M., Oppermann-Sanio, F. B., Schmidt, H. & Steinbuchel, A. (2002) Appl. Environ. Microbiol. 68, 3377-3384.
  • 33. Pomeroy, L. R. & Wiebe, W. J. (2001) Aquatic Microbiol. 23, 187-204.
  • 34. Wemekamp-Kamphuis, H. H., Sleator, R. D., Wouters, J. A., Hill, C. & Abee, T. (2004) Appl. Environ. Microbiol. 70, 2912-2918.
  • 35. Chlumsky, L. J., Zhang, L. & Jorns, M. S. (1995) J. Biol. Chem. 270, 18252-18259.
  • 36. Dalluge, J. J., Hamamoto, T., Horikoshi, K., Morita, R. Y., Stetter, K. O. & McCloskey, J. A. (1997) J. Bacteriol. 179, 1918-1923.
  • 37. Cai, M. & Xun, L. (2002) J. Bacteriol. 184, 4672-4680.
  • 38. Hagemeier, C. H., Shima, S., Thauer, R. K., Bourenkov, G., Bartunik, H. D. & Ermler, U. (2003) J. Mol. Biol. 332, 1047-1057.
  • 39. Heiss, G., Trachtmann, N., Abe, Y., Masahiro, T. & Kackmuss, H.-J. (2003) Appl. Environ. Microbiol. 69, 2748-2754.
  • 40. Barriere, C., Cento, D., Lebert, A., Leroy-Setrin, S., Berdague, J. L. & Talon, R. (2001) FEMS Microbiol. Lett. 201, 181-185.
  • 41. McLachlan, G. J. (1992) Discriminant Analysis and Statistical Pattern Recognition (Wiley, New York).
  • 42. Momen, B. & Zehr, J. P. (1998) Ecol. Appl. 8, 497-507.
  • 43. Lazardis, T., Archontis, G. & Karplus, M. (1995) Advances in Protein Chemistry (Academic, San Diego).
  • 44. Abboud, R., Popa, R., Souza-Egipsy, V., Giometti, C. S., Tollaksen, S., Mosher, J. J., Findlay, R. H. & Nealson, K. H. (2005) Appl. Environ. Microbiol. 71, 811-816.
  • 45. Haney, P. J., Badger, H. J., Buldak, G. L., Reich, C. I., Woese, C. R. & Olsen, G. J. (1999) Proc. Nat. Acad. Sci. USA 96, 3578-3583.
  • 46. Lee, D. Y., Kyeong-Ae, K., Yi, Y. G. & Key-Sun, K. (2004) Biochem. Biophys. Res. Com. 320, 900-906.
  • 47. Saunders, N. F., Thomas, T., Curmi, P. M., Mattick, J. S., Kuczek, E., Slade, R., Davis, J., Franzmann, P. D., Boone, D., Rusterholtz, K., et al. (2003) Genome Res. 13, 1580-1588.
  • 48. Farias, S. T. & Bonato, M. C. (2003) Gen. Mol. Res. 2, 383-393.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
