Abstract
The publicly available annotated archaeal genome sequences (23 complete and three partial annotations, October 2005) were searched for the presence of potential two-component open reading frames (ORFs) using gene category lists and BLASTP. A total of 489 potential two-component genes were identified. Two-component genes were found in 14 of the 21 Euryarchaeal sequences (October 2005) and in neither the Crenarchaeota nor the Nanoarchaeota. A total of 20 predicted protein domains were identified in the putative two-component ORFs that, in addition to the histidine kinase and receiver domains, also include sensor and signalling domains. The detailed structure of these putative proteins is shown, as is the distribution of each class of two-component genes in each species. Potential members of orthologous groups have been identified, as have any potential operons containing two or more two-component genes. The number of two-component genes in those Euryarchaeal species which have them seems to be linked more to lifestyle and habitat than to genome complexity, with most examples being found in Methanospirillum hungatei, Haloarcula marismortui, Methanococcoides burtonii and the mesophilic Methanosarcinales group. The large numbers of two-component genes in these species may reflect a greater requirement for internal regulation. Phylogenetic analysis of orthologous groups of five different protein classes, three probably involved in regulating taxis, suggests that most of these ORFs have been inherited vertically from an ancestral Euryarchaeal species and points to a limited number of key horizontal gene transfer events.
Keywords: histidine kinase, hybrid kinase, response regulator
Introduction
Two-component systems are one of the key means by which bacteria respond to environmental changes (Hoch 2000, Stock et al. 2000, Alves and Savageau 2003, Hellingwerf 2005). They are assumed to be of bacterial origin, having radiated into archaea and some eukaryotes by horizontal gene transfer (HGT) (Koretke et al. 2000). Two-component systems consist of a sensor and a response protein. The sensor protein is characterized by a histidine kinase (HK) made up of two main domains, a phosphoacceptor (HisKA) and a histidine kinase ATPase (HATPase) and, in many cases, other sensory domains are present (Galperin et al. 2001, Zhulin et al. 2003). The response protein (response-regulator, RR) is characterized by a response regulator domain that has a conserved aspartate residue. The histidine kinase autophosphorylates a conserved histidine residue in response to a signal and the phosphate group is then transferred to the conserved aspartate residue of the response-regulator. The transfer of the phosphate group to the response-regulator elicits a response causing a change in taxis, development or gene expression. Histidine kinases and response-regulators are sometimes found together in a single polypeptide known as a hybrid kinase (Hoch 2000, Stock et al. 2000).
The recognition of the Archaea as a distinct division of life has been strengthened by the availability of a number of complete genome sequences, representing three phyla (Euryarchaeota, Crenarchaeota and Nanoarchaeota). This has, in turn, enabled a more rigorous phylogenetic analysis based on the fusion of ribosomal protein sequences (Matte-Tailliez et al. 2002, Brochier et al. 2004, Bapteste et al. 2005, Makarova and Koonin 2005) and clusters of conserved orthologous genes (COGs) (Makarova and Koonin 2003). Analysis of genome sequences has revealed genes of bacterial origin in the genomes of archaea and vice versa (Nelson et al. 1999). The importance of HGT in the evolution of prokaryotes and the implications for phylogeny and definition of species is still being discussed (Ochman et al. 2000, Forterre et al. 2002, Boucher et al. 2003, Koonin 2003, Kurland et al. 2003, Lawrence and Hendrickson 2003).
For bacteria, it has been shown that the number of two-component genes possessed by an organism is related to the complexity of its genome, its physiology and the changeability of its habitat (Ashby 2004, Galperin 2005). The greater the value of any of those parameters, the greater the need for regulation of cellular activities.
The aim of this study was to analyze the complement of genes in archaeal genomes that could encode two-component proteins. The putative two-component proteins were classified by their domain structure, and the number of each class was determined for each species. Potential orthologous groups and those that may be part of operons, with two or more two-component genes, are indicated. Phylogenetic analysis of possible orthologous groups representing five classes of protein, three associated with taxis, is shown.
Materials and methods
Genome sequence data
The list of publicly available Euryarchaeal genome sequences is shown in Table 1, along with brief details of the habitat, physiology, genome size, putative number of open reading frames (ORFs) and the abbreviation used with gene sequences (Makarova and Koonin 2003). The sequences and annotations (up to October 2005) were accessed at the Integrated Microbial Genomes (IMG) server (http://img.jgi.doe.gov/pub/main.cgi) and HaloLex (http://www.halolex.mpg.de/). The identity of potential two-component genes was determined by reference to the published assignments located at http://www.tigr.org/tdb/ (Bult et al. 1996, Smith et al. 1997, Kawarabayasi et al. 1998, Klenk et al. 1997, Ng et al. 2000, Deppenmeier et al. 2002, Slesarev et al. 2002, Galagan et al. 2002, Cohen et al. 2003, Baliga et al. 2004, Falb et al. 2005). This was supplemented by BLASTP (Altschul et al. 1997) searches of each genome with a battery of two-component domains (domains used include receivers from CheY and OmpR, HisKA/HATPase and Hpt; see Table A1) from Methanosarcina acetivorans and E. coli K12 at IMG (http://img.jgi.doe.gov/pub/main.cgi), The Integrated Genome Resource (TIGR, http://www.tigr.org/tdb/) or the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/).
Table 1.
Strain | Genus | Gene name prefix | Temp opt. (°C) | Physiology | Genome size (Mb) | No. protein-coding genes |
Archaeoglobus fulgidus DSM 4304 | Archaeoglobales | Aful | 83 | Anaerobic, sulphate-reducing chemolitho- or chemorgano-autotroph | 2.18 | 2,456 |
Ferroplasma acidarmanus (incomplete) | Thermoplasmales | | 40 | Acidophile | 1.97 | 1,740 |
Haloarcula marismortui | Halobacteriales | Hma | 37 | Aerobic halophilic chemorganotroph | 4.3 | 4,242 |
Halobacterium sp. NRC-1 | Halobacteriales | Halo | 37 | Aerobic halophilic chemorganotroph | 2.57 | 2,656 |
Methanocaldococcus jannaschii DSM 2661 | Methanococcales | | 85 | Chemolithoautotroph, anaerobe, methanogen | 1.66 | 1,758 |
Methanococcus maripaludis S2 | Methanococcales | Mmar | 37 | Mesophilic anaerobe, hydrogenotrophic methanogen | 1.67 | 1,742 |
Methanopyrus kandleri AV19 | Methanopyrales | | 110 | Chemolithoautotroph, anaerobe, methanogen | 1.69 | 1,691 |
Methanococcoides burtonii DSM 6242 (incomplete) | Methanosarcinales | Mbur | ~0 | Anaerobe, psychrotolerant methanogen | ~2.6 | ~2,872 |
Methanosarcina acetivorans C2A | Methanosarcinales | Mace | 37 | Chemoorganoheterotroph, anaerobe, N2-fixing, methanogen | 5.75 | 4,540 |
Methanosarcina barkeri str. fusaro | Methanosarcinales | Mbar | 37 | Chemoorganoheterotroph, anaerobe, methanogen | 4.87 | 3,380 |
Methanosarcina mazei Go1 | Methanosarcinales | Mmaz | 37 | Chemoorganoheterotroph, anaerobe, N2-fixing, methanogen | 4.1 | 3,371 |
Methanospirillum hungatei JF-1 | Methanomicrobiales | Mhun | 37 | Chemoorganoheterotroph, anaerobe, methanogen | 3.53 | 3,356 |
Methanothermobacter thermautotrophicus str. Delta H | Methanobacteriales | Mthe | 65 | Chemolithoautotroph, anaerobe, N2-fixing methanogen | 1.75 | 1,914 |
Natronomonas pharaonis | Halobacteriales | Npha | 20 | Haloalkaliphile | 2.75 | 2,843 |
Picrophilus torridus DSM 9790 | Thermoplasmales | | 50 | Extreme acidophile | 1.58 | 1,582 |
Pyrococcus abyssi GE5 | Thermococcales | PAB | 96 | Anaerobic heterotroph | 1.76 | 1,769 |
Pyrococcus furiosus DSM 3638 | Thermococcales | | 96 | Anaerobic heterotroph | 1.91 | 2,115 |
Pyrococcus horikoshii OT3 | Thermococcales | PH | 98 | Anaerobic heterotroph | 1.74 | 2,110 |
Thermococcus kodakaraensis | Thermococcales | Thermoco | 102 | Anaerobic heterotroph | 2.09 | 2,358 |
Thermoplasma acidophilum DSM 1728 | Thermoplasmales | | 59 | Fac. anaerobe, chemorganotroph, thermoacidophile | 1.56 | 1,482 |
Thermoplasma volcanium GSS1 | Thermoplasmales | | 60 | Fac. anaerobe, chemorganotroph, thermoacidophile | 1.55 | 1,499 |
Bioinformatic analysis
Putative two-component domains were initially assigned in ORFs by Pfam batch analysis (http://www.sanger.ac.uk/Software/Pfam/; Bateman et al. 2004). Domains were recorded for each two-component gene only if they were scored as “Pfam’s trusted match thresholds.” Domain assignments were checked and modified using the more extensive domain assignments at InterPro (http://www.ebi.ac.uk/interpro/). The results were used to classify the putative two-component proteins by domain organization using a nomenclature adapted from Ohmori et al. (2001), with each group subdivided by the organization of the identified signalling domains. The cartoon style diagrams that present the domain organization of the deduced sequences were constructed from these data, with the sizes of the domains roughly in proportion to each other. For clarity, each gene name begins with a four character acronym (except for Haloarcula marismortui, Pyrococcus abyssi GE5 and Pyrococcus horikoshii OT3, see Table 1) followed by either the locus tag that can be found at IMG or HaloLex (http://img.jgi.doe.gov/pub/main.cgi and http://www.halolex.mpg.de/) or the gene object identifier if the sequencing or annotation is incomplete.
Orthologous groups were determined from the orthology information available at IMG (http://img.jgi.doe.gov/pub/main.cgi), which is based on the bidirectional best hits from BLASTP searches of the polypeptides of each organism against those of every other organism. This definition is not completely accurate, but it provides a useful approximation, as it is not always possible to know whether the polypeptides arose from a single gene present in the last common ancestor (orthologues) or from a gene duplication within a genome (paralogues). Alignments for phylogenetic analysis were performed by TCoffee (Notredame et al. 2000), accessed at the Centre National de la Recherche Scientifique website (http://igs-server.cnrs-mrs.fr/Tcoffee/tcoffee_cgi/index.cgi), and ClustalW alignments (Thompson et al. 1994) were performed at the European Bioinformatics Institute (http://www.ebi.ac.uk/clustalw/). Representatives from three bacterial phyla were included in the alignments (chosen by having the best match to one of the archaeal ORFs, either as an orthologue or by BLASTP at IMG). Phylogenetic analysis by neighbor-joining (250 bootstrap replicates) was performed using MEGA version 3.0 (Kumar et al. 2004) and by maximum-likelihood (Felsenstein 1996) using Molphy, accessed at the Institut Pasteur biological software website (http://bioweb.pasteur.fr/intro-uk.html#phylo).
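The bidirectional-best-hit criterion used for the orthology assignments can be illustrated with a minimal sketch. It is not the IMG implementation: the function names, the (query, subject, bit score) tuple layout and the gene identifiers below are hypothetical and chosen only for illustration.

```python
def best_hits(hits):
    """Return {query: subject}, keeping the highest-scoring subject per query.

    `hits` is an iterable of (query, subject, bitscore) tuples, an assumed
    layout standing in for parsed BLASTP output.
    """
    best = {}
    for query, subject, bitscore in hits:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
a_vs_b = [("A1", "B1", 250.0), ("A1", "B2", 90.0),
          ("A2", "B2", 120.0), ("A3", "B1", 200.0)]
b_vs_a = [("B1", "A1", 245.0), ("B1", "A3", 198.0), ("B2", "A2", 118.0)]

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, A3 finds B1 as its best hit, but B1's best hit is A1, so A3 is left unpaired. This is the kind of case in which a paralogue could otherwise be mistaken for an orthologue.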
Closely linked two-component genes and probable operons that contain two or more two-component genes were identified from the chromosome map images at IMG (http://img.jgi.doe.gov/pub/main.cgi), TIGR (http://www.tigr.org/tdb/) and the biology of extremophiles website (http://www-archbac.u-psud.fr/homepage.html).
Results and discussion
The structural classification of potential two-component proteins is shown in Table 2. No two-component encoding gene could be found in the Crenarchaeota or Nanoarchaeota (data not shown). Sensor domains are drawn as ellipses, and two-component (HisKA, HATPase_c and response regulator) and output domains are drawn as rectangles. Parentheses followed by figures indicate the number of similar domains that may be found in the proteins listed in each subclass.
Table 2.
Histidine kinases
Different types of histidine kinases (HK) are listed in Table 2A. Histidine kinases contain two domains: a dimerization and phosphoacceptor domain (HisKA or HisKA_2) and a HATPase_c domain (Grebe and Stock 1999). HisKA and HisKA_2 are part of a His kinase A phosphoacceptor domain superfamily that also includes HWE_HK and HisKA_3 (Karniol and Vierstra 2004; Pfam accession CL0025).
HKI
Histidine kinase Is are HKs containing HisKA and HATPase domains. There may be other domains in some of these examples which are not currently recognized. The HKI ORFs vary greatly in size, ranging from 175 to 592 amino acids in length.
HKII
Histidine kinase IIs are HKs containing sensor GAF and PAS/PAC domains. The GAF domains (cGMP phosphodiesterase, adenylyl cyclases, bacterial transcription factors FhlA) are associated with small molecule binding, in particular cAMP and cGMP (Aravind and Ponting 1997, Ho et al. 2000, Anantharaman et al. 2001). The GAF domain is usually found in combination with a PAS (Drosophila period clock protein, vertebrate aryl hydrocarbon receptor nuclear translocator and Drosophila single-minded protein) or PAC (PAS-associated C-terminal motif) domain, or both. One class of PAS domains is known to bind cofactors such as heme and FAD (Bibikov et al. 2000, Sardiwal et al. 2005). Sensing of light, oxygen or redox potential by PAS domains requires cofactors, whereas sensing signals such as voltage, xenobiotics and nitrogen availability does not (Ponting and Aravind 1997, Gilles-Gonzalez and Gonzalez 2004). The PAC domains are proposed to contribute to the PAS domain fold. The shared feature of GAF and PAS/PAC domains is the binding of a diverse set of regulatory small molecules that often remain unidentified; all three domains are common signal transduction system components (Anantharaman and Aravind 2001, Zhulin et al. 2003). There is one example containing Cache and one containing SBP_bac_3 (bacterial extracellular solute-binding proteins, family 3). The Cache domain is a signalling domain found in animal calcium channel subunits and it is thought to form an extracellular or periplasmic ligand sensor (Anantharaman and Aravind 2001). SBP_bac_3 is involved in active transport of solutes across the cytoplasmic membrane and in the initiation of signal transduction pathways (Tam and Saier 1994). This is by far the largest subgroup of ORFs, containing 161 out of the total of 489 (33% of the total).
HKIII
Histidine kinase IIIs are HKs that possess a HAMP “linker” (histidine kinase, adenylyl cyclase, methyl-accepting chemotaxis protein and phosphatase) domain. The HAMP domain is usually associated with the transmission of a signal across a membrane from periplasmic ligand-binding domains (Aravind and Ponting 1999, Appleman and Stewart 2003, Zhu and Inouye 2004). Eight examples of HKIIIs have an N-terminal putative periplasmic signalling CHASE4 domain and four have an N-terminal periplasmic signalling Cache domain (Anantharaman and Aravind 2001, Zhulin et al. 2003). These domains are positioned next to the HAMP domain, presumably for efficient transfer of the signal.
HKVI
Histidine kinase VIs are the CheA-like chemotaxis signalling proteins that contain an N-terminal Hpt (histidine phosphotransfer) and Hkd (histidine kinase dimerization) domain and a C-terminal CheW domain. Some contain one or two P2 domains between the Hpt and Hkd. The Hpt domain is involved in mediating phosphotransfer from one receiver domain to another (Hoch 2000). Hkd (H-kinase-dim) is the dimerization domain of CheA and CheW that interacts with methyl-accepting chemotaxis proteins (MCPs), relaying signals to CheY, and thereby affecting flagellar rotation (West et al. 1995). The P2 domain is involved in enhancing the interaction of CheY with the HK (Jahreis et al. 2004, Stewart and van Bruggen 2004). Thermococcus kodakaraensis has two open reading frames, separated by a frame shift mutation, that together probably encode a CheA-like protein. All of the HKVI genes discussed are located close to other genes that could be involved with signal transduction and are probably transcribed as single operons (see Table A2).
HATPase_c
These contain no dimerization or phosphoacceptor domains currently recognized at InterPro.
His_KA
There are five groupings that contain His_KA without a discernable HATPase_c domain.
Response regulators
Response regulators (RR) are listed in Table 2B. These contain a characteristic receiver (RR/T_reg) domain, which is about 120 amino acids long and contains a conserved aspartate residue about halfway along the molecule that accepts a phosphate group from an HK.
RRI
Response regulator Is are simple orphan (no other domain detected) RRs, representing the second largest group of two-component ORFs (24% of the total).
RRIII
Response regulator IIIs contain an RR fused to a potential DNA-binding domain. Such regulators are found only in H. marismortui. There are only three examples, which contain either the HTH_10 or the DUF24 domain (PF04967 and PF01638). These are the only RRs that are possibly transcriptional regulators, but there may be other currently unidentified DNA-binding domains in other RRs or hybrid kinases.
RRIV
Response regulator IVs contain an N-terminal RR fused to output or signal domains. There are 16 examples of CheB fused to the RR. The CheB domain is related to methylesterase and is likely to be concerned with chemotaxis (West et al. 1995). There is one example of two RRs fused to a glycosyl transferase domain in M. thermautotrophicus (Pfam accession number: PF00353). The glycosyl transferase domain is involved in transferring sugar moieties from a donor to recipient molecules. Many Methanospirillum hungatei ORFs are fused with PAS/PAC or GAF domains, or both; however, as the annotation is incomplete, some of these ORFs may turn out to be part of hybrid kinases.
Hybrid kinases
Hybrid kinases (HY) are shown in Table 2C. They are defined as containing both HK and RR domains. The nomenclature is based on the position and number of RRs with respect to the HK. There is an incomplete HYI in A. fulgidus that has PAC/PAS and GAF sensor domains, but no discernible HATPase domain.
HYI
Hybrid kinase Is have a single RR N-terminal to the HK.
HYII
Hybrid kinase IIs have a single RR C-terminal to the HK. There is only one example in M. acetivorans.
HYIII
Hybrid kinase IIIs have two RRs either N- or C-terminal to the HK.
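The positional rules for the three hybrid kinase classes can be written as a short sketch. This is a simplification made under stated assumptions, not the procedure used in the study: the domain labels ("RR", "HisKA", "HATPase_c"), the function name and the choice to treat either HK domain as marking the kinase region are all hypothetical.

```python
HK_DOMAINS = {"HisKA", "HATPase_c"}  # assumed labels for the kinase core


def classify_hybrid(domains):
    """Classify a protein from its domain labels, ordered N- to C-terminus.

    Returns "HYI", "HYII" or "HYIII" following the positional rules in the
    text, None if the protein is not a hybrid kinase, and "unclassified"
    for arrangements the text does not describe.
    """
    hk_pos = [i for i, d in enumerate(domains) if d in HK_DOMAINS]
    rr_pos = [i for i, d in enumerate(domains) if d == "RR"]
    if not hk_pos or not rr_pos:
        return None  # a hybrid kinase needs both HK and RR domains
    n_terminal = all(i < min(hk_pos) for i in rr_pos)
    c_terminal = all(i > max(hk_pos) for i in rr_pos)
    if len(rr_pos) == 2 and (n_terminal or c_terminal):
        return "HYIII"  # two RRs on one side of the HK
    if len(rr_pos) == 1 and n_terminal:
        return "HYI"    # single RR N-terminal to the HK
    if len(rr_pos) == 1 and c_terminal:
        return "HYII"   # single RR C-terminal to the HK
    return "unclassified"


print(classify_hybrid(["RR", "PAS", "HisKA", "HATPase_c"]))  # HYI
print(classify_hybrid(["HisKA", "HATPase_c", "RR"]))         # HYII
print(classify_hybrid(["RR", "RR", "HisKA", "HATPase_c"]))   # HYIII
```

Sensor domains such as PAS are ignored by the sketch, because the classes described here depend only on where the RRs sit relative to the HK.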
Distribution of putative two-component ORFs
The total number of ORFs within each class of two-component proteins, for each species of Euryarchaeota, is shown in Table 3. No two-component ORFs were found in the four Crenarchaeota species or N. equitans (data not shown), nor in M. jannaschii, M. kandleri, P. furiosus or any member of the Thermoplasmales. The other three members of the Thermococcales (P. abyssi, P. horikoshii and Thermococcus kodakaraensis) each have only three two-component ORFs, representing 0.13 to 0.17% of the protein-coding capacity of the genome (the T. kodakaraensis HKVI, Tkod_610170420/30, has a frame shift that could be a sequencing error and has been counted as one ORF). Methanococcus maripaludis and Halobacterium also have small numbers, six (0.34%) and 17 (0.64%), respectively. Archaeoglobus fulgidus, M. thermautotrophicus and the four Methanosarcinales species have comparatively large numbers of two-component ORFs, from 23 to 67, representing 1.03 to 1.48% of the coding capacity of the completely annotated genomes. Haloarcula marismortui has the largest number of two-component encoding genes among the complete annotations (82), which represents 1.93% of the total protein-coding capacity of the genome. Methanospirillum hungatei appears to have the largest number of two-component genes overall at 87, though the annotation of the genome is incomplete (so no percentage is given in Table 3). The HKs form half to two-thirds of the two-component ORFs for each species (except Pyrococcus sp. and Methanospirillum hungatei). Putative DNA-binding domains were detected only as part of the RRs in H. marismortui.
Table 3.
Species | HKI | HKII | HKIII | HKVI | HK other | HK total | RRI | RRIII | RRIV | RR total | HY | Total 2-C | % 2-C |
Archaeoglobus fulgidus | 2 | 10 | 1 | 1 | 6 | 20 | 9 | 0 | 1 | 10 | 1 | 31 | 1.26 |
Haloarcula marismortui | 6 | 23 | 2 | 2 | 6 | 39 | 10 | 3 | 8 | 21 | 22 | 82 | 1.93 |
Halobacterium sp. NRC-1 | 1 | 7 | 0 | 1 | 1 | 10 | 4 | 0 | 1 | 5 | 2 | 17 | 0.64 |
M. thermautotrophicus | 5 | 7 | 0 | 0 | 0 | 12 | 3 | 0 | 5 | 8 | 3 | 23 | 1.2 |
Methanococcus maripaludis | 0 | 0 | 2 | 1 | 0 | 3 | 2 | 0 | 1 | 3 | 0 | 6 | 0.34 |
Methanococcoides burtonii | 4 | 17 | 3 | 1 | 5 | 30 | 9 | 0 | 1 | 10 | 5 | 45 | incomp |
Methanosarcina acetivorans | 1 | 38 | 4 | 2 | 3 | 48 | 13 | 0 | 2 | 15 | 4 | 67 | 1.48 |
Methanosarcina barkeri | 1 | 21 | 0 | 1 | 1 | 24 | 10 | 0 | 1 | 11 | 2 | 35 | 1.03 |
Methanosarcina mazei | 1 | 20 | 4 | 2 | 1 | 28 | 12 | 0 | 2 | 14 | 3 | 45 | 1.33 |
Methanospirillum hungatei | 3 | 1 | 1 | 3 | 1 | 9 | 36 | 0 | 12 | 48 | 30 | 87 | incomp |
Natronomonas pharaonis | 0 | 17 | 1 | 1 | 3 | 22 | 6 | 0 | 3 | 9 | 11 | 42 | incomp |
Pyrococcus abyssi | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 2 | 0 | 3 | 0.17 |
Pyrococcus horikoshii | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 2 | 0 | 3 | 0.14 |
Thermococcus kodakaraensis | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 2 | 0 | 3 | 0.13 |
Total | | | | | | | | | | | | 489 | |
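The "% 2-C" column in Table 3 appears to be the total number of two-component ORFs divided by the number of protein-coding genes given in Table 1, expressed as a percentage. The following sketch reproduces three of the tabulated values under that assumption; the function name is hypothetical.

```python
def percent_two_component(n_two_component, n_protein_coding):
    """Share of protein-coding genes that are two-component ORFs, in %."""
    return round(100.0 * n_two_component / n_protein_coding, 2)


# (total 2-C ORFs from Table 3, protein-coding genes from Table 1)
genomes = {
    "Archaeoglobus fulgidus": (31, 2456),
    "Haloarcula marismortui": (82, 4242),
    "Methanococcus maripaludis": (6, 1742),
}

percentages = {
    species: percent_two_component(n_2c, n_genes)
    for species, (n_2c, n_genes) in genomes.items()
}
for species, pct in percentages.items():
    print(f"{species}: {pct}%")
```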
The PAS/PAC and GAF sensory domains are found in 293 of the 489 putative proteins surveyed. These sensory domains are absent in the Pyrococcus sp. A total of 18 ORFs were found that contain the HAMP domain that would in most cases be involved in transferring signals from sensor domains detecting information outside the cell.
Orthologous groups
Potential orthologous groups are shown in Table 4. These results are based on the bidirectional best hits from BLASTPs at IMG. The identification of orthologous groups at IMG may not be correct in all cases as some groupings may include ORFs that are due to gene duplication, hence a paralogue (in a different organism) rather than an orthologue. It is, nevertheless, a useful tool for assigning putative orthologous groups when no functional information is available. The groups have been named with a three letter acronym for ease of reference (Table 4). In arr18/19, RRIV-CheB, the grouping was modified from the information at IMG based on the phylogenetic analysis presented in Figure 4 (see below). There are many orthologous groups that contain two or three members, particularly within the Methanosarcinales. The more interesting groups are those that have more members or that have members in different genera. These will be discussed below, particularly those that are part of taxis operons (ahk40–43, arr7, arr10, arr18 and arr19).
Table 4.
Gene name | Classification | Orthologue number |
MaceMA3962, MbarA_0679, MmazMM0948 | HKI | ahk1 |
AfulAF208 | HKI | ahk2 |
Mbur_401643110 | HATPase | |
MaceMA1878, MbarA_3037 | HKII | ahk3 |
MaceMA1628, MmazMM0518 | HKII | ahk4 |
MaceMA2890, MbarA_2935, MmazMM3295, MtheMTH174 | HKII | ahk5 |
Mhun_401800550 | HYI PASPAC | |
MaceMA1646, MbarA_3036 | HKII | ahk6 |
MaceMA0203, MbarA_1944 | HKII | ahk7 |
MmazMM2515, MbarA_3447, MaceMA1470 | HKII | ahk8 |
MaceMA0552, MmazMM0168 | HKII | ahk9 |
MaceMA2294, Mbur_401657970 | HKII | ahk10 |
MbarA_2362, MmazMM2990 | HKII | ahk11 |
MaceMA0619, MmazMM1777 | HKII | ahk12 |
MaceMA1704, MmazMM2990 | HKII | ahk13 |
MaceMA0620, MmazMM1781, MbarA_1535 | HKII | ahk14 |
MaceMA1149, MmazMM2178 | HKII | ahk15 |
MaceMA0490, MmazMM1671, MbarA_2876 | HKII | ahk16 |
MaceMA1645, MbarA_2680, MmazMM2748, Mhun_401774840 | HKII | ahk17 |
MbarA_1687, MmazMM1931, Mhun_401789380, HmarrnAC0789 | HKII | ahk18 |
HaloVNG1234C, HmarrnAC3482 | HKII | ahk19 |
AfulAF0770, MmazMM2518, MaceMA3405 | HKII | ahk20 |
MaceMA4026, MmazMM0889 | HKII | ahk21 |
MaceMA2294, Mbur_401657970, HaloVNG1374G, Mhun_401803800, AfulAF0450, NphaNP1154A | HKII | ahk22 |
HmarrnB0156 | HKIII | |
AfulAF2109, Mhun_401800710, HmarrnAC1626 | HKII | ahk23 |
MaceMA3368, Mbur_401643080, MaceMM2773, MbarA_3250 | HKII | ahk24 |
MaceMA1627, MmazMM0172, MbarA_0404 | HKII | ahk25 |
MaceMA1991, MtheMTH468, MmazMM1915 | HKII | ahk26 |
MbarA_1666, Mhun_401774840 | HKII | ahk27 |
MmazMM2276, MaceMA1274 | HKII | ahk28 |
MmazMM1093, MbarA_3538 | HKII | ahk29 |
MaceMA0759, | HKII | ahk30 |
Mhun_401779810 | HYI | |
MaceMA1844, MtheMTH823 | HKII | ahk31 |
HmarMMP1303, | HKII | ahk32 |
Mbur_401650390 | HKIII | |
(NphaNP1640A) | HKII | ahk33 |
HmarrnAC2616 | HATPasePAS | |
(NphaNP1804A) | HKII | ahk34 |
HmarrnAC0130 | HATPasePAS | |
HmarrnB0133, NphaNP3536A | HKII | ahk35 |
MaceMA2553, MmazMM3099 | HKIII | ahk36 |
MaceMA2555, MmazMM3101 | HKIII | ahk37 |
AfulAF1721, Mhun_401794250 | HKIII CHASE | ahk38 |
MaceMA1739, MmazMM2629 | HKIII CHASE | ahk39 |
MaceMA3066, MmazMM0328, AfulAF1040, Mbur_401647520 | HKVI | ahk40 |
MaceMA0014, MmazMM1325, MbarA_0984 | HKVI | ahk41 |
PAB1332, PH0484, Tkod_610170420/30, MmarMMP0927 | HKVI | ahk42 |
HaloVNG0971G, HmarrnAC2205, NphaNP2172A, Mhun_401793120 | HKVI | ahk43 |
MaceMA0777, MmazMM1931 | HATPase+ | ahk44 |
Mhun_401789380 | HKI | |
HmarrnAC0789 | HKII | |
MaceMA0863, | HATPaseGAF | ahk45 |
Mbur_401650390 | HKII | |
HaloVNG2036G, HmarrnAC1118 (NphaNP2906A 50%) | RRI | arr1 |
MmarMMP1304, Mbur_401640890, Mhun_401785170 | RRI | arr2 |
MaceMA4671, MmazMM2880 | RRI | arr3 |
MbarA_2510, MmazMM2953 | RRI | arr4 |
MmazMM2954, Mbur_401641840 | RRI | arr5 |
MtheMTH447, Mbur_401660760, Mhun_401806750 | RRI | arr6 |
HaloVNG0974G, AfulAF1042, MmarMMP0933, Mbur_401647540, MaceMA3068, MmazMM0330, PAB1330, PH0482, HmarrnAC2194, Tkod_610170400, Mhun_401793100, NphaNP2102A | RRI | arr7 |
Mbur_401660760, MaceMA1469, MbarA_3448, MmazMM2516, MtheMTH1764 (PACGAF), Mhun_401793830 (PAS) | RRI | arr8 |
MaceMA1268, MbarA_3248, HmarrnAC0411, AfulAF1473, MtheMTH445, Mhun_401778650, HmarrnAC0536, (NphaNP0140A) | RRI | arr9 |
MaceMA0016, MbarA_0986, MmazMM1327 | RRI | arr10 |
MaceMA2861, MbarA_2388, MmazMM0049, HmarrnAC0339 (PAS) | RRI | arr11 |
MaceMA4376, MbarA_1051, MmazMM1068, AfulAF2419, Mbur_401644620, Mhun_401789540 | RRI | arr12 |
MaceMA1366, MbarA_2896, MmazMM2351, Mbur_401657490, Mhun_401803780 | RRI | arr13 |
MaceMA2445, MbarA_3109, MmazMM3007, Mbur_401658280, Mhun_401776210 | RRI > 200aa | arr14 |
HmarrnAC3308, Mhun_401796600 | RRI > 200aa | arr15 |
MaceMA0018, MbarA_0988, MmazMM1328 | RRI | arr16 |
HmapNG7223, | RRIII-HTH | arr17 |
Mhun_401799920 | HYIHATPase | |
MaceMA0015, MbarA_0985, MmazMM1326 | RR CheB | arr18 |
PAB1331, PH0483, Tkod_610170410, HaloVNG0973G, HmarrnAC2204, MaceMA3067, MmazMM0329, Mbur_401647530, AfulAF1041, MmarMMP0926, NphaNP2174A, Mhun_401793110 | RR CheB | arr19 |
MtheMTH1607 | RR IV | arr20 |
AfulAF0448 | His_KAPACPASGAF | |
MtheMTH440, | RRIVPAS | arr21 |
Mbur_401642690, Mhun_401789580 | HYI PASPAC | |
HmarrnAC1142 | RRIVPAS | arr22 |
Mhun_401803140 | HYI PASPAC | |
Mhun_401784480, Mbur_401644630 | HYI | ahy1 |
MaceMA1267, MbarA_3247, MtheMTH446, Mhun_401795550 | HYI PASPAC | ahy2 |
AfulAF1472 | HYIHisKA | |
HmarrnAC3379, | HYI PASGAF | ahy3 |
HaloVNG1175G | HKIIPASGAF | |
HmarrnAC2533 | HYI PASGAF | ahy4 |
HaloVNG0916G, (NphaNP1912A) | HKIIPASGAF | |
HmapNG7156, HaloVNG2334 | HYI PASGAF | ahy5 |
MaceMA2256, MmazMM2881 | HYIII | ahy6 |
MaceMA4377, MbarA_1052 | HYIII | ahy7 |
HmarrnAC0475 | HYI | ahy8 |
Mhun_401787170 | HisKA | |
Mhun_401801890, | HYI | ahy9 |
MtheMTH444 | HKI | |
HmarrnAC2044, Mhun_401801140 | HYIPASPAC | ahy10 |
Mhun_401774520, HaloVNG5037G, Mbur_401644900 | HYIPASPAC | ahy11 |
HmarrnAC3050, Mhun_401777310 | HYIPASPAC | ahy12 |
Phylogenetic analysis
Figures 1 to 5 show the neighbor-joining analyses, and supplementary Figures A1 to A5 the maximum-likelihood analyses, of alignments produced by TCoffee. Phylogenetic analysis of the same ORFs was also performed on alignments made by ClustalW (data not shown), but the results were not found to differ significantly. Figure 1 contains the ORFs from the two HKII groups, ahk5 and ahk22, with the three closest bacterial ORFs (to MaceMA2890) from Cyanobacteria, Firmicutes and Proteobacteria. Figure 2 is composed of ORFs from the four HKVI "CheA-like" groups, ahk40–43, and the three closest bacterial ORFs (to MaceMA0014) from Thermotogae, Firmicutes and Proteobacteria. Figure 3 is the analysis of a number of RRI orphans, arr7 and arr10 (from putative taxis operons), arr12 and arr14 (> 200 amino acids), with bacterial ORFs from Thermotogae, Firmicutes and Proteobacteria (closest to MaceMA3068). Figure 4 shows the results for the two RRIV CheB orthologous groups from taxis operons, arr18 and arr19, with bacterial representatives from Thermotogae, Proteobacteria and Actinobacteria (closest to MaceMA0015). Figure 5 shows results for ahy2 and bacterial representatives from Cyanobacteria, Actinobacteria and Proteobacteria.
Linked genes
Genes that are located close to each other on the genome and transcribed in the same orientation are shown in Table A2. Most of these are likely to be part of operons. This provides clues to some cognate pairs of HKs, HYs and RRs. All putative HKVI encoding genes are located with other “chemotaxis” genes in “chemotaxis operons,” including two such operons for M. acetivorans and M. mazei. Included in these “chemotaxis operons” are the orthologous groups, ahk40–43, arr7, arr10 and arr18 and arr19 (Table 4).
Conclusions
Distribution of two-component ORFs
Ten species, representing four genera, have at least 17 putative two-component ORFs. Some of these two-component ORFs are quite sophisticated in structure, including the multiple sensor HKIIs, CheA-like HKVIs and the hybrid kinases. The results presented here show that a number of euryarchaeal species have an extensive array of two-component sensory ORFs. These proteins may sense a number of different internal signals by means of PAS/PAC domains and their associated cofactors (Ponting and Aravind 1997, Gilles-Gonzalez and Gonzalez 2004). In addition, the potential to sense other small molecules (particularly cNMPs) via the GAF domains (Aravind and Ponting 1997, Ho et al. 2000, Anantharaman et al. 2001) and extracellular signals, by the CHASE4 and Cache putative sensory domains, via HAMP domains (Anantharaman and Aravind 2001, Zhulin et al. 2003) shows that these organisms (in particular H. marismortui, Natronomonas pharaonis, Methanospirillum hungatei and the Methanosarcinales) possess sophisticated and complex sensory networks. As yet, none of these putative two-component genes has a functional name, so functions can be assigned only by similarity. DNA-binding RRs that regulate gene expression are common in bacteria (Ashby 2004, Galperin 2005). However, only three RRs have been identified with putative DNA-binding domains, all in H. marismortui. If regular indiscriminate HGT were taking place, one would expect to see more DNA-binding RRs in archaeal sequences. Presumably the large number of orphan RRs are involved in regulation of cellular activity by interacting directly with other proteins. Transcriptional control is probably maintained by the many DNA-binding domains that have been identified as part of one-component systems in archaea (Ulrich et al. 2005). In these systems the DNA-binding output domain is linked directly to a sensor domain without any phosphotransfer.
Of the species that have the most two-component genes, H. marismortui and Natronomonas pharaonis are halophilic and the Methanosarcinales and Methanospirillum hungatei are mesophiles. The mesophiles coexist with a large and diverse population of bacteria, giving ample opportunity for HGT, whereas the opportunity for HGT in the halophilic organisms would be more restricted. This raises the question of how the distribution of two-component genes seen in the Euryarchaeota arose. Was it through HGT exclusively, or by vertical transfer from a common ancestral euryarchaeal organism coupled with gene duplications?
Phylogeny and inheritance of two-component ORFs
Phylogenetic analysis of five different sets of orthologous ORFs, chosen because they are found in most of the species that contain two-component ORFs (Figures 1–5), produced trees that closely match the published phylogenies for these organisms (Matte-Tailliez et al. 2002, Brochier et al. 2004, Bapteste et al. 2005).
For ahk5 and ahk22, shown in Figure 1, the phylogeny of each group agrees with the current phylogeny of these organisms, and the position of the three bacterial examples indicates that the two groups may have arisen through separate HGT events: into an ancestral euryarchaeal species for ahk5 and, possibly, into an ancestral methanogen for ahk22.
The results for the CheA-like HKVI ORFs are shown in Figure 2. Ahk40, ahk42 and ahk43 (except Mhun_401793120) cluster together and probably represent vertical inheritance from a single HGT event into an ancestral euryarchaeal species (one bacterial ORF from T. maritima giving the best match). Ahk41 appears to be a separate group, found in the Methanosarcinales, that clusters on its own and seems to be more closely associated with the Firmicutes and Proteobacterial examples, presumably representing a separate HGT event. The three Methanospirillum hungatei ORFs seem to be due to separate HGT events (Mhun_401793120 probably should not be in ahk43), and Mhun_401784470 and Mhun_401776240 are probably true paralogues.
Figure 3 shows the results for four orphan RR groups. The two groups associated with putative taxis operons, arr7 and arr10, group closely together; however, arr10, which is found only in the Methanosarcinales, is probably due to an HGT event into a direct ancestor of this group. The other two orthologous groups, arr12 and arr14, are quite separate from arr7 and arr10 and probably arose from separate HGT events into the ancestors of methanogens (AfulAF2419 appears to be a distant member of arr12).
Figure 4 shows the results for the two RRIV-CheB orthologous groups associated with taxis. The arr18 orthologous group, found in the Methanosarcinales, groups separately from arr19, being closer to two of the bacterial ORFs. Therefore, arr18 appears to be the result of a separate HGT event into an ancestor of the Methanosarcinales, whereas arr19 appears to be the result of an HGT event into an ancestor of the Euryarchaeota.
The phylogeny for ahy2 (the largest hybrid kinase orthologous group) shows that its members probably arose from more than one HGT event. The combined results for the orthologous groups found in potential taxis operons are shown in Table A2.
The operon that contains HKVI (ahk40/42/43), RRI (arr7) and RRIV-CheB (arr19) appears to have arisen from an HGT event that transferred the whole operon into an ancestor of the Euryarchaeota. In contrast, the taxis operon containing HKVI (ahk41), RRI (arr10) and RRIV-CheB (arr18) appears to have arisen from a separate HGT event that transferred the whole operon into a direct ancestor of the Methanosarcinales.
The results presented here suggest that HGT has taken place from bacterial species both into ancestral Euryarchaeota and, more recently, into the methanogens. However, the large numbers of two-component genes in the mesophilic methanogens and the Halobacteriales probably reflect their well-known metabolic flexibility (Bapteste et al. 2005, Falb et al. 2005), which in turn demands more regulation of cellular activity in a changing environment, rather than an increased potential for HGT from bacteria. Most of the two-component ORFs that can be observed in these groups of organisms are probably derived from paralogous gene duplication events; the number of two-component ORFs observed would be driven by the requirement to control cellular activity as the organisms evolve. A limited number of HGT events could be sufficient to account for the diversity of phosphotransfer and sensory domains.
Any function of two-component ORFs is inferred by homology to known bacterial genes (e.g. HKVI and chemotaxis) and awaits in situ or in vitro studies, or both. This highlights the importance of interfacing between bioinformaticians and biochemists to plan experiments in an informed way, particularly where orthologues are identified and found in more than one genus and hence may play central roles in cellular regulation.
Acknowledgments
Mark Ashby was supported by New Initiative Funding from the University of the West Indies. The author wishes to thank John Allen, Elke Dittmann, Conrad Mullineaux and Ruth-Sarah Rose for critical reading of this manuscript.
Appendix
Table A1.
Domain | Species | Gene | Amino acid region |
CheY Receiver | E. coli | NP_416396.1 | |
| M. ace | MA0016 | |
| M. ace | MA3068 | |
OmpR Receiver | E. coli | NP_417864.1 | aa 6–124 |
Histidine kinase | E. coli | NP_417863.1 | aa 234–439 |
| M. ace | MA0490 (HisKA_2) | aa 639–847 |
Hpt | E. coli | NP_415513.1 | aa 815–896 |
| M. ace | NP_614988 | aa 5–106 |
Table A2.
Assigned gene no. | Classification | Orthologue number | Additional information |
Archaeoglobus fulgidus | |||
AF448 | His_KA PACPASGAF | ||
AF449 | RRI-CheY | ||
AF450 | HKIII + 1 PASPAC | ||
AF1042 | RRI-CheY | arr7 | AF1055 mcp |
AF1041 | RRIV-CheB | arr18 | AF1044 CheW |
AF1040 | HKVI | ahk40 | AF1039 CheC |
AF1472 | HYI-HisKA | ahy2 | |
AF1473 | RRI-CheY | arr9 | |
AF2419 | RRI-CheY | arr12 | |
AF2420 | His_KA PACPASGAF | ||
Halobacterium | |||
VNG0974G | RRI-CheY | arr7 | VNG0976G CheW |
VNG0973G | RRIV-CheB | arr19 | VNG0970G CheC1 VNG0971G |
HKVI | ahk43 | VNG0967G CheD | |
VNG0966G CheR | HKIII | ahk22 | |
VNG1374G | HATPase HAMP | ||
VNG1375G | |||
VNG2036C | RRI-CheY | arr1 | |
VNG2037C | HKII | ||
Haloarcula marismortui | |||
rrnAC0410 | HKII + 2PASPAC | arr9 | |
rrnAC0411 | RRI-CheY | ||
rrnAC0412 | HYI | ||
rrnAC0413 | HKII + 1PASPAC | ||
rrnAC0456 | HATPase_c+ gyra | Top6B | |
rrnAC0457 | HATPase_c | Top6A | |
rrnAC2204 | RRIV-CheB | arr19 | rrnAC2206 CheR |
rrnAC2205 | HKVICheW-CheA | ahk43 | |
rrnAC2692 | HYI + PASPACGAF | ||
rrnAC2694 | HATPase_c + PAS | ||
Methanothermobacter thermautotrophicus | |||
MTH444 | HKI | ahy9 | |
MTH445 | RRI-CheY | arr9 | |
MTH446 | HYI + (PASPAC)2 | ahy2 | |
MTH447 | RRI-CheY | arr6 | |
MTH457 | RRIV-PACPAS | ||
MTH459 | HKI | ||
MTH548 | RRIV-G_transf | ||
MTH549 | RRI-CheY | ||
MTH901 | HYI | ||
MTH902 | HYI-PASPAC | ||
Methanococcus maripaludis | |||
MMP0926 | RRI-CheB | arr19 | MMP0925 CheW |
MMP0927 | HKVI | ahk42 | MMP0928 CheD |
MMP0933 | RRI-CheY | arr7 | MMP0929 mcp |
MMP1303 | HKIV | ahk32 | |
MMP1304 | RRI-CheY | arr2 | |
Methanosarcina acetivorans | |||
MA0014 | HKVI | ahk41 | |
MA0015 | RRI-CheB | arr19 | |
MA0016 | RRI-CheY | arr10 | |
MA0018 | RRI-CheY | arr16 | MA0019 mcp |
MA0020 CheW | |||
MA0551 | HKII + PASPAC | ||
MA0552 | HKIII | ahk9 | |
MA0619 | HKIII | ahk12 | |
MA0620 | HKIII | ahk14 | |
MA0758 | HKIII | ||
MA0759 | HKIII | ahk30 | |
MA1267 | HYI + PASPAC | ahy2 | |
MA1268 | RRI-CheY | arr9 | |
MA1269 | RRI-CheY | ||
MA1270 | HKII + PASPAC | ||
MA1468 | RRI-CheY | ||
MA1469 | RRI-CheY | arr8 | |
MA1470 | HKIII | ahk8 | |
MA1627 | HKIII | ahk25 | |
MA1628 | HKII + PASPAC | ahk4 | |
MA1645 | HKIII | ahk17 | |
MA1646 | HKII + PASPAC | ahk6 | |
MA2012 | RRI-CheY | ||
MA2013 | HYII + Hpt | ||
MA3066 | HKVI | ahk40 | MA3063 CheR |
MA3068 | RRI-CheY | arr7 | MA3064 CheD |
MA3067 | RRI-CheB | arr18 | MA3065 CheC |
MA3070 CheW | |||
MA3368 | HKIII | ahk24 | |
MA3370 | HKII + PASPAC | ||
MA4376 | RRI-CheY | arr12 | |
MA4377 | HYIII CHSE4HP | ahy7 | |
Methanosarcina barkeri | |||
MbarA_0984 | HKVI | ahk41 | MbarA_0983 CheR |
MbarA_0985 | RRIV-CheB | arr19 | |
MbarA_0986 | RRI-CheY | arr10 | |
MbarA_0988 | RRI-CheY | MbarA_0989 mcp | |
Opp orientation to above | MbarA_0990 CheW | ||
MbarA_1051 | RRI-CheY | arr12 | |
MbarA_1052 | HYIII | ahy7 | |
MbarA_3036 | HKII | ||
MbarA_3037 | HKII | ||
MbarA_3247 | HYIPASPAC | ||
MbarA_3248 | RRI-CheY | ||
MbarA_3250 | HKII | ||
MbarA_3447 | HKII | ||
MbarA_3448 | |||
Methanosarcina mazei | |||
MM0168 | HKII | ahk9 | |
MM0169 | HKIII | ||
MM0328 | HKVI | ahk40 | MM3025 CheR |
MM0329 | RRI-CheB | arr18 | MM0326 CheD |
MM0330 | RRI-CheY | arr7 | MM0327 CheC |
MM0332 CheW | |||
MM0333 mcp | |||
MM1325 | HKVI | ahk41 | MM1323 CheC |
MM1326 | RRI-CheY | arr19 | MM1324 |
MM1327 | arr10 | CheB | |
MM1328 opp orientation to above | RRI-CheY | arr16 | MM1329 mcp |
MM1330 CheW | |||
MM2275 | HKIII | ahk28 | |
MM2276 | HKIII | ||
MM2277 | HKIII | ||
MM2515 | HKIII | ahk8 | |
MM2516 | RRI-CheY | arr8 | |
MM2880 | RRI-CheY | arr3 | |
MM2881 | HYIII | ahy6 | |
MM2953 | RRI-CheY | arr4 | |
MM2954 | RRI-CheY | arr5 | |
MM2955 | HKVIPASPACCach | ||
MM3205 | HYIII | ||
MM3206 | RRI-CheY | ||
Pyrococcus abyssi | |||
PAB1330 | RRI-CheY | arr7 | PAB1329 CheR |
PAB1331 | RRIV-CheB | arr19 | PAB1333 CheC |
PAB1332 | HKVI | ahk42 | PAB1334 CheC |
PAB1335 CheW | |||
PAB1336 mcp | |||
Pyrococcus horikoshii | |||
PH0482 | RRI-CheY | arr7 | PH0481 CheB |
PH0483 | RRIV-CheB | arr19 | |
PH0484 | HKVI | ahk42 |
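Table A2 lists ORFs with neighbouring locus tags together. As a rough illustration only, the Python sketch below flags runs of consecutively numbered locus tags as candidate clusters of two-component genes. The adjacency rule, the `max_gap` parameter and the function names are assumptions made for this example and are not the procedure used in this study; the locus tags are the Archaeoglobus fulgidus entries from Table A2.

```python
import re

# Locus tags copied from the Archaeoglobus fulgidus rows of Table A2.
LOCUS_TAGS = [
    "AF448", "AF449", "AF450",
    "AF1040", "AF1041", "AF1042",
    "AF1472", "AF1473",
    "AF2419", "AF2420",
]

# Assumed tag layout: alphabetic prefix followed by a numeric index.
TAG_PATTERN = re.compile(r"^([A-Za-z_]+)(\d+)$")


def split_tag(tag):
    """Split a locus tag into its prefix and integer index."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognised locus tag: {tag!r}")
    return match.group(1), int(match.group(2))


def candidate_clusters(tags, max_gap=1):
    """Group tags whose indices differ by at most max_gap.

    max_gap is a hypothetical cutoff chosen for illustration; numeric
    adjacency is only a proxy for chromosomal proximity.
    """
    parsed = sorted(split_tag(t) + (t,) for t in tags)
    clusters = []
    for prefix, index, tag in parsed:
        last = clusters[-1] if clusters else None
        if last and last[-1][0] == prefix and index - last[-1][1] <= max_gap:
            last.append((prefix, index, tag))
        else:
            clusters.append([(prefix, index, tag)])
    # Keep only runs containing two or more two-component genes.
    return [[t for _, _, t in c] for c in clusters if len(c) >= 2]


if __name__ == "__main__":
    for cluster in candidate_clusters(LOCUS_TAGS):
        print(" ".join(cluster))
```

The sketch ignores gene orientation and intergenic distance, both of which matter for a real operon prediction (Table A2 notes genes in the opposite orientation for some clusters). Tags that do not end in digits, such as the Halobacterium identifiers, would need a different pattern.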
References
- Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
- Alves R., Savageau M.A. Comparative analysis of prototype two-component systems with either bifunctional or monofunctional sensors: differences in molecular structure and physiological function. Mol. Microbiol. 2003;48:25–51. doi: 10.1046/j.1365-2958.2003.03344.x.
- Anantharaman V., Aravind L. The CHASE domain: a predicted ligand-binding module in plant cytokinin receptors and other eukaryotic and bacterial receptors. Trends Biochem. Sci. 2001;26:579–582. doi: 10.1016/s0968-0004(01)01968-5.
- Anantharaman V., Koonin E.V., Aravind L. Regulatory potential, phyletic distribution and evolution of ancient, intracellular small-molecule-binding domains. J. Mol. Biol. 2001;307:1271–1292. doi: 10.1006/jmbi.2001.4508.
- Appleman J.A., Stewart V. Mutational analysis of a conserved signal-transducing element: the HAMP linker of the Escherichia coli nitrate sensor NarX. J. Bacteriol. 2003;185:89–97. doi: 10.1128/JB.185.1.89-97.2003.
- Aravind L., Ponting C.P. The GAF domain: an evolutionary link between diverse phototransducing proteins. Trends Biochem. Sci. 1997;22:458–459. doi: 10.1016/s0968-0004(97)01148-1.
- Aravind L., Ponting C.P. The cytoplasmic helical linker domain of receptor histidine kinase and methyl-accepting proteins is common to many prokaryotic signalling proteins. FEMS Microbiol. Lett. 1999;176:111–116. doi: 10.1111/j.1574-6968.1999.tb13650.x.
- Ashby M.K. Survey of the number of two-component response regulator genes in the complete and annotated genome sequences of prokaryotes. FEMS Microbiol. Lett. 2004;231:277–281. doi: 10.1016/S0378-1097(04)00004-7.
- Baliga N.S., Bonneau R., Facciotti M.T., et al. Genome sequence of Haloarcula marismortui: a halophilic archaeon from the Dead Sea. Genome Res. 2004;14:2221–2234. doi: 10.1101/gr.2700304.
- Bapteste E., Brochier C., Boucher Y. Higher-level classification of the Archaea: evolution of methanogenesis and methanogens. Archaea. 2005;1:353–363. doi: 10.1155/2005/859728.
- Bateman A., Coin L., Durbin R., et al. The Pfam protein families database. Nucleic Acids Res. 2004;32:D138–D141. doi: 10.1093/nar/gkh121.
- Bibikov S.I., Barnes L.A., Gitin Y., Parkinson J.S. Domain organisation and flavin adenine dinucleotide-binding determinants in the aerotaxis signal transducer Aer of Escherichia coli. Proc. Natl. Acad. Sci. USA. 2000;97:5830–5835. doi: 10.1073/pnas.100118697.
- Boucher Y., Douady C.J., Papke R.T., Walsh D.A., Boudreau M.E.R., Nesbø C.L., Case R.J., Doolittle W.F. Lateral gene transfer and the origins of prokaryotic groups. Annu. Rev. Genet. 2003;37:283–328. doi: 10.1146/annurev.genet.37.050503.084247.
- Brochier C., Forterre P., Gribaldo S. Archaeal phylogeny based on proteins of the transcription and translation machineries: tackling the Methanopyrus kandleri paradox. Genome Biol. 2004;5:R17. doi: 10.1186/gb-2004-5-3-r17.
- Bult C.J., White O., Olsen G.J., Zhou L., Fleischmann R.D., Sutton G.G., Blake J.A., Fitzgerald L.M., Clayton R.A., Gocayne J.D. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science. 1996;273:1058–1073. doi: 10.1126/science.273.5278.1058.
- Cohen G.N., Barbe V., Flament D., et al. An integrated analysis of the genome of the hyperthermophilic archaeon Pyrococcus abyssi. Mol. Microbiol. 2003;47:1495–1512. doi: 10.1046/j.1365-2958.2003.03381.x.
- Deppenmeier U., Johann A., Hartsch T., et al. The genome of Methanosarcina mazei: evidence for lateral gene transfer between bacteria and archaea. J. Mol. Microbiol. Biotechnol. 2002;4:453–461.
- Falb M., Pfeiffer F., Palm P., Rodewald K., Hickmann V., Tittor J., Oesterhelt D. Living with two extremes: conclusions from the genome sequence of Natronomonas pharaonis. Genome Res. 2005;15:1336–1343. doi: 10.1101/gr.3952905.
- Felsenstein J. Inferring phylogenies from protein sequences by parsimony, distance, and likelihood methods. Methods Enzymol. 1996;266:418–427. doi: 10.1016/s0076-6879(96)66026-1.
- Forterre P., Brochier C., Philippe H. Evolution of the Archaea. Theor. Popul. Biol. 2002;61:409–422. doi: 10.1006/tpbi.2002.1592.
- Galagan J.E., Nusbaum C., Roy A., et al. The genome of M. acetivorans reveals extensive metabolic and physiological diversity. Genome Res. 2002;12:532–542. doi: 10.1101/gr.223902.
- Galperin M.Y. A census of membrane-bound and intracellular signal transduction proteins in bacteria: bacterial IQ, extroverts and introverts. BMC Microbiol. 2005;5:35. doi: 10.1186/1471-2180-5-35.
- Galperin M.Y., Nikolskaya A.N., Koonin E.V. Novel domains of the prokaryotic two-component signal transduction systems. FEMS Microbiol. Lett. 2001;203:11–21. doi: 10.1111/j.1574-6968.2001.tb10814.x.
- Gilles-Gonzalez M.-A., Gonzalez G. Signal transduction by heme-containing PAS-domain proteins. J. Appl. Physiol. 2004;96:774–783. doi: 10.1152/japplphysiol.00941.2003.
- Grebe T.W., Stock J.B. The histidine protein kinase superfamily. Adv. Microb. Physiol. 1999;41:139–227. doi: 10.1016/s0065-2911(08)60167-8.
- Hellingwerf K.J. Bacterial observations: a rudimentary form of intelligence? Trends Microbiol. 2005;13:152–158. doi: 10.1016/j.tim.2005.02.001.
- Ho Y.-S., Burden L., Hurley J.H. Structure of the GAF domain, a ubiquitous signaling motif and a new class of cyclic GMP receptor. EMBO J. 2000;19:5288–5299. doi: 10.1093/emboj/19.20.5288.
- Hoch J.A. Two-component and phosphorelay signal transduction. Curr. Opin. Microbiol. 2000;3:165–170. doi: 10.1016/s1369-5274(00)00070-9.
- Jahreis K., Morrison T.B., Garzón A., Parkinson J.S. Chemotactic signaling by an Escherichia coli CheA mutant that lacks the binding domain for phosphoacceptor partners. J. Bacteriol. 2004;186:2664–2672. doi: 10.1128/JB.186.9.2664-2672.2004.
- Karniol B., Vierstra R.D. The HWE histidine kinases, a new family of bacterial two-component sensor kinases with potentially diverse roles in environmental signaling. J. Bacteriol. 2004;186:445–453. doi: 10.1128/JB.186.2.445-453.2004.
- Kawarabayasi Y., Sawada M., Horikawa H., et al. Complete sequence and gene organisation of the genome of a hyper-thermophilic archaebacterium, Pyrococcus horikoshii OT3. DNA Res. 1998;5:55–76. doi: 10.1093/dnares/5.2.55.
- Klenk H.-P., Clayton R.A., Tomb J.-F., et al. The complete sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature. 1997;390:364–370. doi: 10.1038/37052.
- Koonin E.V. Horizontal gene transfer: the path to maturity. Mol. Microbiol. 2003;50:725–727. doi: 10.1046/j.1365-2958.2003.03808.x.
- Koretke K.K., Lupas A.N., Warren P.V., Rosenberg M., Brown J.R. Evolution of two-component signal transduction. Mol. Biol. Evol. 2000;17:1956–1970. doi: 10.1093/oxfordjournals.molbev.a026297.
- Kumar S., Tamura K., Nei M. MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief. Bioinform. 2004;5:150–163. doi: 10.1093/bib/5.2.150.
- Kurland C.G., Canback B., Berg O.G. Horizontal gene transfer: A critical view. Proc. Natl. Acad. Sci. USA. 2003;100:9658–9662. doi: 10.1073/pnas.1632870100.
- Lawrence J.G., Hendrickson H. Lateral gene transfer: when will adolescence end? Mol. Microbiol. 2003;50:739–749. doi: 10.1046/j.1365-2958.2003.03778.x.
- Makarova K.S., Koonin E.V. Comparative genomics of archaea: how much have we learned in six years, and what’s next? Genome Biol. 2003;4:115. doi: 10.1186/gb-2003-4-8-115.
- Makarova K.S., Koonin E.V. Evolutionary and functional genomics of the Archaea. Curr. Opin. Microbiol. 2005;8:586–594. doi: 10.1016/j.mib.2005.08.003.
- Matte-Tailliez O., Brochier C., Forterre P., Philippe H. Archaeal phylogeny based on ribosomal proteins. Mol. Biol. Evol. 2002;19:631–639. doi: 10.1093/oxfordjournals.molbev.a004122.
- Nelson K.E., Clayton R.A., Gill S.R., et al. Evidence for lateral gene transfer between archaea and bacteria from genome sequence of Thermotoga maritima. Nature. 1999;399:323–329. doi: 10.1038/20601.
- Ng W.V., Kennedy S.P., Mahairas G.G., et al. Genome sequence of Halobacterium species NRC-1. Proc. Natl. Acad. Sci. USA. 2000;97:12176–12181. doi: 10.1073/pnas.190337797.
- Notredame C., Higgins D.G., Heringa J. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000;302:205–217. doi: 10.1006/jmbi.2000.4042.
- Ochman H., Lawrence J.G., Groisman E.A. Lateral gene transfer and the nature of bacterial innovation. Nature. 2000;405:299–304. doi: 10.1038/35012500.
- Ohmori M., Ikeuchi M., Sato N., et al. Characterization of genes encoding multi-domain proteins in the genome of the filamentous nitrogen-fixing cyanobacterium Anabaena sp. strain PCC 7120. DNA Res. 2001;8:271–284. doi: 10.1093/dnares/8.6.271.
- Ponting C.P., Aravind L. PAS: a multifunctional domain family comes to light. Curr. Biol. 1997;7:R674–R677. doi: 10.1016/s0960-9822(06)00352-6.
- Sardiwal S., Kendall S.L., Movahedzadeh F., Rison S.C., Stoker N.G., Djordjevic S. A GAF domain in the hypoxia/NO-inducible Mycobacterium tuberculosis DosS protein binds haem. J. Mol. Biol. 2005;353:929–936. doi: 10.1016/j.jmb.2005.09.011.
- Slesarev A.I., Mezhevaya K.V., Makarova K.S., et al. The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens. Proc. Natl. Acad. Sci. USA. 2002;99:4644–4649. doi: 10.1073/pnas.032671499.
- Smith D.R., Doucette-Stamm L.A., Deloughery C., et al. Complete genome sequence of Methanobacterium thermoautotrophicum ΔH: Functional analysis and comparative genomics. J. Bacteriol. 1997;179:7135–7155. doi: 10.1128/jb.179.22.7135-7155.1997.
- Stewart R.C., van Bruggen R. Association and dissociation kinetics for CheY interacting with the P2 domain of CheA. J. Mol. Biol. 2004;336:287–301. doi: 10.1016/j.jmb.2003.11.059.
- Stock A.M., Robinson V.L., Goudreau P.N. Two-component signal transduction. Annu. Rev. Biochem. 2000;69:183–215. doi: 10.1146/annurev.biochem.69.1.183.
- Tam R., Saier M.H. Structural, functional, and evolutionary relationships among extracellular solute-binding receptors of bacteria. Microbiol. Rev. 1993;57:320–346. doi: 10.1128/mr.57.2.320-346.1993.
- Thompson J.D., Higgins D.G., Gibson T.J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22:4673–4680. doi: 10.1093/nar/22.22.4673.
- Ulrich L.E., Koonin E.V., Zhulin I.B. One-component systems dominate signal transduction in prokaryotes. Trends Microbiol. 2005;13:52–56. doi: 10.1016/j.tim.2004.12.006.
- West A.H., Martinez-Hackert E., Stock A.M. Crystal structure of the catalytic domain of the chemotaxis receptor methylesterase, CheB. J. Mol. Biol. 1995;250:276–290. doi: 10.1006/jmbi.1995.0376.
- Zhu Y., Inouye M. The HAMP linker in histidine kinase dimeric receptors is critical for symmetric transmembrane signal transduction. J. Biol. Chem. 2004;279:48152–48158. doi: 10.1074/jbc.M401024200.
- Zhulin I.B., Nikolskaya A.N., Galperin M.Y. Common extracellular sensory domains in transmembrane receptors for diverse signal transduction pathways in bacteria and archaea. J. Bacteriol. 2003;185:285–294. doi: 10.1128/JB.185.1.285-294.2003.