Abstract
Helicobacter pylori (H. pylori) has a unique ability to survive in extremely acidic environments and to colonize the gastric mucosa. It can cause diverse gastric diseases such as peptic ulcers, chronic gastritis, mucosa-associated lymphoid tissue (MALT) lymphoma, and gastric cancer. Based on genomic research of H. pylori, over 1600 genes have been functionally identified so far. However, H. pylori possesses some genes that remain uncharacterized because: (i) the gene sequences are quite new; (ii) the functions of the genes have not been characterized in any other bacterial system; and (iii) sometimes a protein that is assigned to a known protein family on the basis of sequence homology shows functional ambiguity, which raises questions about the function of the protein produced in H. pylori. Thus, many genes still need to be characterized biologically or biochemically to understand the whole picture of gene function in the bacterium. In this regard, knowledge of the 3D structure of a protein, especially an unknown or hypothetical protein, is frequently useful for elucidating the structure-function relationship of the uncharacterized gene product. That is, a structural comparison with known proteins provides valuable information to help predict the cellular functions of hypothetical proteins. Here, we show the 3D structures of some hypothetical proteins determined by NMR spectroscopy and X-ray crystallography as part of the structural genomics of H. pylori. In addition, we show some successful approaches to elucidating the function of unknown proteins based on their structural information.
Keywords: Helicobacter pylori, structural genomics, NMR, X-ray, unknown protein, hypothetical protein, structural homology
1. H. pylori as a Pathogen
Helicobacter pylori is one of the pathogens involved in various gastric diseases such as peptic ulcers, chronic gastritis, mucosa-associated lymphoid tissue lymphoma, and gastric cancer [1–3]. Infection with H. pylori is associated with an increased risk of gastric adenocarcinoma and has attracted attention as a cofactor in the pathogenesis of this malignancy [4]. Moreover, the risk of developing cancer is related to the physiologic and histologic changes induced in the stomach by an H. pylori infection [5]. Despite a general decline in its incidence, gastric cancer remains the fourth most common cancer and the second leading cause of cancer-related deaths worldwide [6]. However, most H. pylori infections do not cause cancer. The sporadic distribution of disease caused by H. pylori appears to depend on host-related factors, including the host genetics that control the inflammatory response, the age at which the H. pylori infection was acquired, poor nutrition, food storage, and the pattern of food consumption [7–9].
In addition, bacterial factors associated with the risk of gastric cancer have been emphasized, and molecular and cell biology approaches aimed at understanding the interaction between H. pylori and transforming epithelial cells have been carried out. Since H. pylori is a highly heterogeneous bacterial species, both genotypically and phenotypically, and is highly adapted for survival in the gastric niche, it is not easy to identify the major bacterial factors that are directly associated with etiopathogenesis [10,11]. Based on current knowledge, several virulence factors are regarded as possible bacterial factors: genes within the cag (cytotoxin-associated antigen) pathogenicity island, including the gene encoding the CagA protein, polymorphic variation in the VacA vacuolating exotoxin, and the blood group antigen-binding adhesins BabA and SabA [6,10,12]. A duodenal ulcer-promoting gene (dupA), located in the “plasticity region” of the H. pylori genome, was reported as a potential virulence marker [10,13]. Other bacterial factors such as peptidoglycan, lipopolysaccharide (LPS), γ-glutamyl transpeptidase (GGT), and the protease HtrA may be linked to pathogenicity [14].
Although a huge amount of biological data on H. pylori has been accumulated, enzymes or proteins of unknown function still make up more than a third of the open reading frames (ORFs) of H. pylori. An unknown protein can be defined as a protein whose function has not yet been characterized, and a hypothetical protein as a protein that is predicted to exist in an organism although its existence has not been shown experimentally. Therefore, in a broad sense, hypothetical proteins can be counted among the unknown proteins. To completely understand the pathogenic mechanism of H. pylori, it is very important to elucidate the functions of these unknown proteins. Filling in the “missing parts list” is accordingly one of the greatest challenges for post-genomic biology, and a tremendous opportunity to discover new biological and pathogenic machinery in H. pylori.
2. H. pylori Genomic Sequence
The sequencing of H. pylori genomes began in 1997 with strain 26695 [15], which was isolated from an English patient with chronic gastritis. The chromosome of strain 26695 is circular and composed of 1.67 mega base pairs (Table 1). The average G+C content is approximately 38.9% and the genome has 1590 open reading frames (ORFs) that are possibly protein-coding loci [1], together with the RNA-coding genes (2 copies each of the 16S rRNA and 23S rRNA genes, and 36 tRNA genes). A subsequent analysis of the same genome suggested that strain 26695 contains fewer ORFs [16].
Table 1.
Organism | Gene | Size (Mb) | GC% | Protein (unknown) | Type | Project |
---|---|---|---|---|---|---|
Helicobacter pylori | 1480 | 1.57 | 38.9 | 1405 (476) | chr a | Gyeongsang National University College of Medicine and 21c Frontier Human Genome Functional Research Project Helicobacter pylori 52 genome sequencing project |
Helicobacter pylori 2017 | 1647 | 1.55 | 39.3 | 1593 (525) | chr | Pathogen Biology Laboratory, University of Hyderabad Helicobacter pylori 2017 genome sequencing project |
Helicobacter pylori 2018 | 1655 | 1.56 | 39.3 | 1603 (459) | chr | Pathogen Biology Laboratory, University of Hyderabad Helicobacter pylori 2018 genome sequencing project |
Helicobacter pylori 26695 | 1627 | 1.67 | 38.9 | 1573 (1301) | chr | TIGR (The Institute for Genomic Research) Causes gastric inflammation and peptic ulcer disease |
Helicobacter pylori 35A | 1560 | 1.57 | 38.9 | 1470 (362) | chr | Baylor College of Medicine Reference genome for the Human Microbiome Project |
Helicobacter pylori 51 | 1495 | 1.59 | 38.8 | 1415 (386) | chr | Gyeongsang National University College of Medicine and 21c Frontier Human Functional Genome Research Project Bacterium isolated from duodenal ulcer patient |
Helicobacter pylori 83 | 1656 | 1.62 | 38.7 | 1609 (445) | chr | Baylor College of Medicine Reference genome for the Human Microbiome Project |
Helicobacter pylori 908 | 1646 | 1.55 | 39.3 | 1595 (444) | chr | University of Hyderabad, India Helicobacter pylori 908 genome sequencing project |
Helicobacter pylori B38 | 1571 | 1.58 | 39.2 | 1382 (643) | chr | Institut Pasteur Causes peptic ulcers |
Helicobacter pylori B8 | 1744 | 1.67 | 38.8 | 1702 (736) | chr | CeBitec, Bielefeld University Helicobacter pylori B8 genome sequencing project |
Helicobacter pylori B8 | 5 | 0.01 | 35.9 | 5 (3) | plsm b | |
Helicobacter pylori Cuz20 | 1606 | 1.64 | 38.9 | 1564 (538) | chr | Dept. of Molec. Microbiology, Washington University Medical School, Saint Louis Helicobacter pylori Cuz20 genome sequencing project |
Helicobacter pylori F16 | 1543 | 1.58 | 38.9 | 1500 (494) | chr | The University of Tokyo Helicobacter pylori F16 genome sequencing project |
Helicobacter pylori F30 | 1522 | 1.57 | 38.8 | 1479 (470) | chr | The University of Tokyo Helicobacter pylori F30 genome sequencing project |
Helicobacter pylori F30 | 5 | 0.01 | 34.1 | 5 (1) | plsm | |
Helicobacter pylori F32 | 1533 | 1.58 | 38.9 | 1490 (485) | chr | The University of Tokyo Helicobacter pylori F32 genome sequencing project |
Helicobacter pylori F32 | 1 | 0 | 36.7 | 1 (0) | plsm | |
Helicobacter pylori F57 | 1563 | 1.61 | 38.7 | 1520 (498) | chr | The University of Tokyo Helicobacter pylori F57 genome sequencing project |
Helicobacter pylori G27 | 1570 | 1.65 | 38.9 | 1493 (470) | chr | University of Oregon Strain used extensively in H. pylori research |
Helicobacter pylori G27 | 11 | 0.01 | 34.9 | 11 (5) | plsm | |
Helicobacter pylori Gambia94/24 | 1646 | 1.71 | 39.1 | 1604 (611) | chr | Berg lab, Washington University Medical School Helicobacter pylori Gambia94/24 genome sequencing project |
Helicobacter pylori Gambia94/24 | 1 | 0 | 37.4 | 1 (1) | plsm | |
Helicobacter pylori HPAG1 | 1573 | 1.60 | 39.1 | 1531 (515) | chr | Washington University (WashU) Isolated from a Swedish patient with chronic atrophic gastritis |
Helicobacter pylori HPAG1 | 8 | 0.01 | 36.4 | 8 (5) | plsm | |
Helicobacter pylori India7 | 1638 | 1.68 | 38.9 | 1600 (561) | chr | Berg lab, Washington University Medical School Helicobacter pylori Ind7 genome sequencing project |
Helicobacter pylori J99 | 1534 | 1.64 | 39.2 | 1488 (560) | chr | Astrazeneca-Boston Causes gastric inflammation and peptic ulcer disease |
Helicobacter pylori Lithuania75 | 1588 | 1.62 | 38.8 | 1546 (522) | chr | Berg lab, Washington University Medical School Helicobacter pylori Lit75 genome sequencing project |
Helicobacter pylori Lithuania75 | 19 | 0.02 | 33.7 | 19 (12) | plsm | |
Helicobacter pylori P12 | 1624 | 1.67 | 38.8 | 1568 (450) | chr | Max von Pettenkofer-Institut für Hygiene und Medizinische Mikrobiologie, Ludwig-Maximilians-Universität München Clinical isolate |
Helicobacter pylori P12 | 10 | 0.01 | 35.1 | 10 (2) | plsm | |
Helicobacter pylori PeCan4 | 1597 | 1.63 | 38.9 | 1555 (529) | chr | Dept. of Molec. Microbiology, Washington University Medical School, Saint Louis Helicobacter pylori PeCan4 genome sequencing project |
Helicobacter pylori PeCan4 | 8 | 0.01 | 32.9 | 8 (0) | plsm | |
Helicobacter pylori Puno120 | 1567 | 1.62 | 38.9 | 1525 (518) | chr | Washington University Medical School Helicobacter pylori Puno120 genome sequencing |
Helicobacter pylori Puno120 | 15 | 0.01 | 35.8 | 15 (13) | plsm | |
Helicobacter pylori Puno135 | 1615 | 1.65 | 38.8 | 1573 (532) | chr | Washington University Medical School Genome sequence of Helicobacter pylori strain Puno135 |
Helicobacter pylori SJM180 | 1623 | 1.66 | 38.9 | 1581 (558) | chr | Dept. of Molec. Microbiology, Washington University Medical School, Saint Louis Helicobacter pylori SJM180 genome sequencing project |
Helicobacter pylori SNT49 | 1557 | 1.61 | 39 | 1515 (495) | phage | Washington University Medical School Genome sequence of Helicobacter pylori SNT49 |
Helicobacter pylori SNT49 | 4 | 0 | 37.4 | 4 (3) | plsm | |
Helicobacter pylori Sat464 | 1544 | 1.56 | 39.1 | 1502 (504) | chr | Dept. Molec. Microbiology, Washington University Medical School in Saint Louis Helicobacter pylori Sat464 genome sequencing project |
Helicobacter pylori Sat464 | 6 | 0.01 | 33.5 | 6 (4) | plsm | |
Helicobacter pylori Shi470 | 1647 | 1.61 | 38.9 | 1568 (593) | chr | Washington University Medical School Clinical isolate from the Amazon River region |
Helicobacter pylori SouthAfrica7 | 1585 | 1.65 | 38.4 | 1543 (555) | chr | Berg lab, Washington University Medical School Helicobacter pylori SouthAfrica7 genome sequencing project |
Helicobacter pylori SouthAfrica7 | 29 | 0.03 | 33.7 | 29 (19) | plsm | |
Helicobacter pylori v225d | 1625 | 1.59 | 39 | 1541 (555) | chr | The Pathosystems Resource Integration Center (PATRIC) Helicobacter pylori v225 genome sequencing |
Helicobacter pylori v225d | 9 | 0.01 | 32.9 | 9 (7) | plsm | |
Helicobacter pylori B45 | 27 | 0.02 | 37.3 | 27 (26) | chr S/C c | Karolinska Institute Helicobacter pylori B45 genome sequencing project |
Helicobacter pylori 98-10 | 1566 | 1.57 | 38.8 | 1527 (1527) | S/C | Vanderbilt University School of Medicine Gastric cancer strain |
Helicobacter pylori B128 | 1770 | 1.65 | 38.8 | 1731 (1731) | S/C | Vanderbilt University School of Medicine Gastric ulcer strain |
Helicobacter pylori HPKX_438_AG0C1 | 2939 | 1.82 | 39.5 | 2898 (1564) | S/C | Washington University Medical School Clinical isolate |
Helicobacter pylori HPKX_438_CA4C1 | 3962 | 1.57 | 39.2 | 3925 (1548) | S/C | Washington University Medical School Isolate from a patient with gastric carcinoma |
Total | 59,776 | - | - | 57,872 (23,261) | - | - |
a chr: Chromosome; b plsm: Plasmid; c S/C: Scaffolds or Contigs.
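The "Protein (unknown)" column of Table 1 can be converted into the fraction of proteins without an assigned function. The short sketch below does this for the table total and for a few rows chosen only for illustration; the dictionary and function names are hypothetical, and the counts are copied from Table 1 as printed.

```python
# Protein counts copied from Table 1 as (total proteins, proteins of unknown function).
# The selection of rows is illustrative only.
TABLE1_ROWS = {
    "26695": (1573, 1301),
    "J99": (1488, 560),
    "G27 (chromosome)": (1493, 470),
    "Total": (57_872, 23_261),
}


def unknown_fraction(total: int, unknown: int) -> float:
    """Return the fraction of annotated proteins whose function is unknown."""
    if total <= 0:
        raise ValueError("total must be positive")
    return unknown / total


for label, (total, unknown) in TABLE1_ROWS.items():
    print(f"{label}: {100 * unknown_fraction(total, unknown):.1f}% unknown")
```

For the table total, the fraction comes out at roughly 40%, which is in line with the statement that proteins of unknown function make up more than a third of the ORFs.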
Ongoing studies have found genes that were missed in previous analyses, as in the case of SecE. The general secretion machinery, which functions in the secretion of outer membrane proteins to extracellular environments, is widely present in bacteria [18]. From the first annotation results, it was thought that strain 26695 had only a partial general secretion machinery because it lacked SecE [15]. A new small open reading frame between nusG and rpmG (HP1203–HP1204) was later found in the genome sequence using an ab initio server, GeneMark, Glimmer, and BlastX [19]. Its product shows high sequence homology and structural similarity to the SecE protein of related bacteria, implying that strain 26695 has a complete general secretion machinery. In addition, small RNA genes are universally present in bacteria [20]. The tmRNA gene (ssrA), which encodes a functional RNA molecule and a small peptide involved in the quality control of translation, has been found in H. pylori [21]. The H. pylori genome also contains an sRNA gene encoding the RNA component of RNase P and the 4.5S RNA gene, which is involved in secretion [22,23].
In 2008, a rarely captured event in the evolution of H. pylori was characterized: the adaptation of the bacterium within a single host during disease progression, and the impact of this adaptation on an intriguing but poorly characterized interaction between the bacterium and gastric epithelial stem cells [24]. H. pylori HPKX_438_AG0C1 and HPKX_438_CA4C1 were isolated, in a population-based endoscopy study, from a single patient who progressed from chronic atrophic gastritis (ChAG) to adenocarcinoma. The ChAG-associated (Kx1) and cancer-associated (Kx2) genomes were analyzed to examine this adaptation. Microarrays have also given a comprehensive view of the genome diversity of the H. pylori pathogen. Combined with information on the origin of the hspA and glmM alleles, this analysis revealed that H. pylori infection may be acquired by more diverse routes than previously expected [25]. According to cluster analysis, isolates from family D belonged to three different strains, those from family L consisted of two strains, and those from family A were grouped into at least five strains. Strains from family D and family L differed by the presence or absence of 24 to 42 coding sequences (CDSs). In family A, one strain was difficult to define because of the small differences in gene profiles between neighboring branches.
In 2009, the complete genome sequence of H. pylori G27 was reported [26]. The G27 strain was originally isolated from an endoscopy patient in Italy [27]. The genome consists of a single circular chromosome of about 1.65 mega base pairs (Table 1) that is AT rich (61.6%), contains 1515 ORFs, and is similar in size and composition to the other published H. pylori genomes of strains 26695, J99, and HPAG1 [15,16,28]. The G27 strain contains 58 genes that are not found in 26695, J99, or HPAG1, as defined by a blastp hit. The majority of these G27-specific genes are predicted to encode hypothetical proteins [26].
In the same year, the genome sequences of two H. pylori strains were analyzed [29]. H. pylori strain 98-10 was isolated from a patient with gastric cancer and strain B128 was isolated from a patient with gastric ulcer disease. Strain 98-10 was most closely related to H. pylori strains of East Asian origin and strain B128 was most closely related to strains of European origin. Strain 98-10 contained multiple features characteristic of East Asian strains, including a type s1c vacA allele and a cagA allele encoding an EPIYA-D tyrosine phosphorylation motif.
Very recently, genome sequences of several additional strains were reported, accelerating H. pylori genomic and proteomic research [30–38]. Strain 908 is a close relative of J99 [39] and was isolated from an African patient living in France who suffered from duodenal ulcer disease [40]. The B8 genome consists of a chromosome of about 1.67 mega base pairs and a small plasmid of about 6000 base pairs carrying nine putative genes. Interestingly, 293 of the coding sequences of the B8 strain are strain-specific, coding mainly for hypothetical proteins with unknown functions [31]. Similarly, the P12 strain contains plasticity zones that encode type IV secretion systems and have the typical properties of genomic islands [32]. Another sequenced genome, that of the Shi470 strain from Shimaa village, was more Asian-like than European-like genome-wide, indicating Amerind ancestry. This strain contains two unique cagA virulence genes and a novel allele of gene hp0519, which encodes a host tissue interaction protein [33]. Because this bacterium has colonized the human stomach since early in human evolution and diverged with ancient human migrations, there are several H. pylori populations such as hpAfrica1, hpEurope, hspEAsia, and hspAmerind [41–43]. A member of one of these populations, the hspAmerind strain V225d, was cultured from a Venezuelan Piaroa Amerindian subject. The V225d strain is cag-positive, encoding a multifunctional effector protein that is injected into host cells by the cag type IV secretion system [34]. Two strains, 2017 and 2018, are chronological subclones of strain 908 and were cultured from the antrum and corpus, respectively. Comparative genomic analysis [35,37] showed that these two strains are almost identical and descended from the genome of strain 908 [30,36]. The B45 strain was isolated from a patient with gastric mucosa-associated lymphoid tissue (MALT) lymphoma and sequenced, and an integrated prophage in this strain was induced by UV irradiation [38].
The Comprehensive Microbial Resource (CMR) is a free tool that allows researchers to access all of the publicly available bacterial genome sequences completed to date [44] (Figure 1). Currently, it provides genomic sequences of three strains of Helicobacter pylori (26695, HPAG1, J99).
3. Structural Reports on H. pylori Proteins
As in the case of other genomic research, Structural Genomics Initiatives are mainly responsible for the determination of H. pylori protein structures. These initiatives, together with the structure determination of known proteins, have made enormous strides in elucidating the structures of unknown H. pylori proteins [15,16,24–26,28–38,45–47]. The available structural data have already led to the identification of potential new drug targets [48] and have been helpful in assigning functions to proteins whose functions were previously unknown [49,50].
The increase in structure determination for H. pylori was triggered by the sequencing of the H. pylori 52 and 26695 genomes [15,25,45,47]. The genome sequences and the corresponding protein structures have yielded many clues to help understand the pathogenesis of H. pylori. Approximately 14% of lyase structures have been determined, which is the largest proportion for any functional class (Table S1).
The sequencing of the genome led to a dramatic increase in the number of known structures for H. pylori proteins deposited in the Protein Data Bank (PDB) (Figure 2). The first H. pylori protein structure was determined in 2001 (PDB ID: 1G6O) [51]. In the following four years, 32 more structures were reported (Figure 2). After the genome sequences of several H. pylori strains became publicly available, the number of determined structures increased sharply after 2005.
Usually, protein solubility is one of the main bottlenecks in structure determination [53]. In the case of H. pylori, methods have already been developed that remedied this problem, such as the development of customized expression strategies for H. pylori proteins in Escherichia coli [54]. The increase in determined structures is also due to the development of improved methods for high-throughput X-ray crystallography. However, the major driving force for this increase was the availability of genome-wide sequence data in the early 2000s.
As of 14 February 2012, the PDB contained 79,356 structures, of which 279 (0.35%) are structures of H. pylori proteins. Of these, 28 are of unknown function, which represents about 10.0% of the determined H. pylori structures (Table 2).
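The two proportions quoted above follow directly from the three counts in the text. A minimal sketch of the arithmetic is given below; the constant and function names are hypothetical and only the counts come from the text.

```python
# Counts quoted in the text (PDB snapshot of 14 February 2012).
PDB_TOTAL = 79_356          # all structures in the PDB
HP_STRUCTURES = 279         # structures of H. pylori proteins
HP_UNKNOWN_FUNCTION = 28    # H. pylori structures of unknown function


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


hp_share = percent(HP_STRUCTURES, PDB_TOTAL)
unknown_share = percent(HP_UNKNOWN_FUNCTION, HP_STRUCTURES)

print(f"H. pylori share of the PDB: {hp_share:.2f}%")
print(f"Unknown-function share of H. pylori structures: {unknown_share:.2f}%")
```

The first ratio is about 0.35% and the second is roughly 10%, so structures of unknown function are a small but non-negligible part of the H. pylori entries.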
Table 2.
PDB ID | Chain ID | AA | MW (Da) | Macromolecule Name | Classification | SCOP Fold | Exp. Method |
---|---|---|---|---|---|---|---|
1MW7 | A | 240 | 27161.20 | Hypothetical protein HP0162 | SG a, unknown function | YebC-like | X-ray |
1S2X | A | 206 | 23998.70 | Cag-Z | Unknown function | STAT-like | X-ray |
1Z8M | A | 88 | 10394.30 | Conserved hypothetical protein HP0894 | SG, unknown function | RelE-like | NMR |
1ZHC | A | 76 | 9130.38 | hypothetical protein HP1242 | Unknown function | | NMR |
1ZKE | A, B, C, D, E, F | 83 | 56798.00 | Hypothetical protein HP1531 | SG, unknown function | ROP-like | X-ray |
2ATZ | A | 180 | 22049.45 | Predicted coding region HP0184 | SG, unknown function | Prim-pol domain | X-ray |
2BO3 | A | 94 | 11101.70 | Hypothetical protein HP0242 | SG, unknown function | HP0242-like | X-ray |
2EVV | A, B, C, D | 207 | 95692.83 | hypothetical protein HP0218 | SG, unknown function | | X-ray |
2F6S | A, B | 201 | 47249.90 | cell filamentation protein, putative | SG, unknown function | Fic-like | X-ray |
2G3V | A, B, C, D | 208 | 104975.36 | CAG pathogenicity island protein 13 | Unknown function | | X-ray |
2GTS | A | 86 | 10626.50 | hypothetical protein HP0062 | SG, unknown function | Ferritin-like | X-ray |
2H9Z | A | 86 | 10205.80 | Hypothetical protein HP0495 | SG, unknown function | Ferredoxin-like | NMR |
2I9I | A | 254 | 29526.70 | Hypothetical protein | SG, unknown function | Anticodon-binding domain-like | X-ray |
2JOQ | A | 91 | 10673.20 | Hypothetical protein HP0495 | SG, unknown function | Ferredoxin-like | NMR |
2K0Z | A | 110 | 12948.60 | Uncharacterized protein HP1203 | SG, unknown function | | NMR |
2K6P | A | 92 | 10472.30 | Uncharacterized protein HP1423 | Unknown function | | NMR |
2OTR | A | 98 | 11502.60 | Hypothetical protein HP0892 | SG, unknown function | | NMR |
2OUF | A | 94 | 11148.60 | Hypothetical protein | SG, unknown function | | X-ray |
2UVP | A, B, C, D | 186 | 87079.82 | HOBA, HP1230 | Unknown function | | X-ray |
2XRH | A | 100 | 11635.31 | HP0721 | Unknown function | | X-ray |
3BGH | A, B | 236 | 55233.49 | Putative neuraminyllactose-binding hemagglutinin homolog | SG, unknown function | | X-ray |
3CWX | A, B, C | 176 | 62332.80 | protein CagD | Unknown function | | X-ray |
3CWY | A | 176 | 20841.15 | protein CagD | Unknown function | | X-ray |
3F42 | A, B | 99 | 22671.87 | protein HP0035 | SG, unknown function | | X-ray |
3FX7 | A, B | 94 | 23207.80 | Uncharacterized protein, HP0062 | Unknown function | | X-ray |
3KWL | A | 514 | 60116.00 | Uncharacterized protein | Unknown function | | X-ray |
3MLG | A, B | 189 | 43924.40 | Uncharacterized protein | Unknown function | | X-ray |
3MLI | A, B, C, D | 100 | 47758.96 | Uncharacterized protein HP0242 | Unknown function | | X-ray |
a SG: Structural genomics.
A complete list of H. pylori protein structures deposited in the PDB is given in the Supporting Information Table S1. The predominant method used to determine these structures was X-ray crystallography, which accounts for 261 of the H. pylori structures determined to date (Figure 2). A further 18 were elucidated by solution-state NMR spectroscopy. Most structures are of individual proteins, although many are bound to small-molecule ligands such as substrate analogues, and only 11 protein-DNA complexes have been determined (Figure 3, Table S1).
4. Unknown Proteins in H. pylori and Estimation of Their Function
The most typical approach to predicting the function of an unknown protein is to use sequence similarity to find a similar protein of known function [56]. Based on this similarity, a predictor transfers the known function to the query protein. In practice, the functions of enzymes tend to be conserved if they share more than 40%–50% sequence identity. The sequence-based approach is reasonable; however, approximately 50% of the unknown proteins from a newly sequenced genome cannot be assigned a function using sequence-similarity approaches alone [57] (Figure 1). The low efficiency of the sequence-similarity search may be partly caused by gene sequences that are quite new and by genes that have not yet been characterized in other bacterial systems. To overcome this weakness, several so-called “similarity-free” methods have been tried [57]. These methods, developed in bioinformatics, use the physicochemical properties and secondary structure of proteins, and there have been successful cases of characterizing function or structure [58–60]. However, they still need to be improved, since similarity-free methods depend to a certain extent on similarity.
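The identity criterion mentioned above can be illustrated with a short sketch. It assumes that a pairwise alignment has already been produced by some external tool and that identity is counted over columns where neither sequence has a gap; other definitions of the denominator exist. The function names, the threshold constant, and the toy sequences are all hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Lower bound of the 40%-50% range quoted in the text (used here as an assumption).
IDENTITY_THRESHOLD = 40.0


def function_likely_conserved(aligned_a: str, aligned_b: str,
                              threshold: float = IDENTITY_THRESHOLD) -> bool:
    """Heuristic only: True when identity exceeds the chosen threshold."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Toy alignment with made-up sequences ("-" marks a gap).
query = "MKT-AYIAKQR"
hit = "MKTLAYLAKQ-"
print(f"{percent_identity(query, hit):.1f}% identity,",
      "transfer plausible" if function_likely_conserved(query, hit) else "no transfer")
```

A check of this kind is only a first filter. As the text notes, roughly half of the unknown proteins in a new genome fall below any useful similarity level, which is where structure-based comparison becomes relevant.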
Another approach to identifying function is to use 3D structures. This approach often succeeds in cases where sequence-based methods fail, probably because in many cases evolution retains the folding pattern long after the sequence similarity has become undetectable. Structural similarity searches use the global fold of the protein [61–64] or detect the functionally important regions of the protein [65–69]. Since structures diverge more slowly than sequences, a sequence comparison may be less sensitive than a structure comparison [70]. However, structural comparison still reports false positives and, like sequence-similarity searches, needs to be improved to avoid overestimating statistical significance [70]. This means that experimental confirmation is still required for the exact assignment of function to an unknown protein.
Some examples of functional elucidation of unknown proteins from H. pylori are provided below. For estimation, we generally conducted four steps: (i) structure determination; (ii) sequence homology search using PSI-BLAST [71]; (iii) structural homology search using the web server DALI [62]; and (iv) experimental confirmation of the function.
4.1. HP0894–HP0895: Toxin-Antitoxin System in H. pylori
The high-quality NMR structure of HP0894 was reported [72]. The HP0894 structure (PDB ID: 1Z8M) has two α-helices, two 3₁₀-helices, and four β-strands (α-α-3₁₀-β-3₁₀-β-β-β). The β-strands form a four-stranded anti-parallel β-sheet (Figure 4). A BLAST conserved domain search [73] showed that HP0894 contains the conserved domain DUF332 (Domain of Unknown Function), which is equivalent to COG 3041 in the National Center for Biotechnology Information Database of Clusters of Orthologous Groups. In the Pfam database [74], however, HP0894 belongs to the plasmid stabilization system protein family (PF05016). The sequence homology search thus gave only a hint of the function. A search for structural homologs with a Z score higher than 3.0 using the program DALI showed that HP0894 is structurally similar to Pyrococcus horikoshii archaeal RelE (PDB code: 1WMI, Z score = 7.8, pairwise RMSD = 2.8 Å), E. coli YoeB (PDB code: 2A6Q, Z score = 8.8, RMSD = 2.9 Å), and guanyloribonuclease (PDB code: 1RGE, Z score = 3.3, pairwise RMSD = 3.4 Å). These proteins are all ribonucleases, have a similar number of residues to HP0894 (around 90), share a similar β-sheet topology with HP0894, and have a comparable location for two of their helices (Figure 4). As expected, they have no detectable sequence homology with HP0894 in PSI-BLAST searches and Blast2 (pairwise comparison) analyses. The structural homology search suggested that HP0894 may have ribonuclease activity and may be the toxin of a toxin-antitoxin (TA) system, like RelE [75]. Generally, in a TA system, toxin expression induces arrest of cell growth, whereas the antitoxin neutralizes the toxin by a direct protein-protein interaction [76]. Both proteins of the toxin-antitoxin system are encoded within a single operon, with the toxin gene usually located directly downstream of the antitoxin gene [77]. Thus, we hypothesized that: (i) HP0894 is a toxin molecule in H. pylori; (ii) there should be an antitoxin molecule that interacts with HP0894; and (iii) if an antitoxin molecule exists, its gene should be near the hp0894 gene on the chromosome. Indeed, we found that HP0895 (hypothetical protein), which is encoded upstream of the hp0894 gene, is an antitoxin molecule [78].
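The selection of structural homologs described above can be sketched as a simple filter on the Z score. The example below does not parse real DALI output. It only encodes the three hits and the values quoted in the text, and the class and function names are hypothetical.

```python
from typing import List, NamedTuple


class StructuralHit(NamedTuple):
    name: str
    pdb_id: str
    z_score: float
    rmsd: float  # pairwise RMSD in angstroms


# Values as quoted in the text for the HP0894 structure (PDB ID: 1Z8M).
HITS = [
    StructuralHit("P. horikoshii archaeal RelE", "1WMI", 7.8, 2.8),
    StructuralHit("E. coli YoeB", "2A6Q", 8.8, 2.9),
    StructuralHit("Guanyloribonuclease", "1RGE", 3.3, 3.4),
]


def significant_hits(hits: List[StructuralHit], z_cutoff: float = 3.0) -> List[StructuralHit]:
    """Keep hits whose Z score exceeds the cutoff, sorted from best to worst."""
    kept = [h for h in hits if h.z_score > z_cutoff]
    return sorted(kept, key=lambda h: h.z_score, reverse=True)


for hit in significant_hits(HITS):
    print(f"{hit.pdb_id}  Z = {hit.z_score:.1f}  RMSD = {hit.rmsd:.1f} A  {hit.name}")
```

With the cutoff of 3.0 used in the text, all three ribonucleases are retained, with YoeB ranked first. Raising the cutoff would drop the weakest hit (1RGE), which shows how sensitive such a list is to the threshold chosen.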
Our experimental data [78] showed that HP0894 and HP0895 form a stable complex as a large multimer (hexamer; (HP0895)6 and (HP0894–HP0895)6), and that the inhibitory effect of HP0894 on E. coli cell growth was neutralized by HP0895. In bacteria, toxins function, or are supposed to function, by inhibiting translation through mRNA cleavage [79]. With an RNA retardation experiment, the in vitro RNase activity of HP0894 was confirmed, and HP0895 inhibited this RNase activity [78]. A primer extension experiment showed that HP0894-mediated mRNA cleavage occurred predominantly before adenine (A) or guanine (G) residues, and we suggested that -U:A- and -C:A- sequences are the most preferred cleavage sites [78]. The binding mode between HP0894 and HP0895 was studied in more detail using NMR and CD spectroscopy, which revealed the binding interface of HP0894 [78]. Interestingly, HP0316 (hypothetical protein), which has 85% sequence identity with HP0895 except for 30 residues at the C-terminal tail, did not bind to HP0894, suggesting that the non-conserved C-terminal tail of HP0895 may be responsible for binding HP0894 [78]. Indeed, with a synthesized C-terminal peptide of HP0895, the residue-specific interaction sites of HP0894 were identified (Figure 4). These results indicate that the HP0894–HP0895 TA system, especially through negative regulation of the HP0894 toxin by the HP0895 antitoxin, may be related to the status of H. pylori infection in the human gastric mucosa and to the survival of the bacterium in that locus.
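The suggested cleavage preference can be illustrated by scanning an RNA sequence for the two dinucleotides named above, with the cut placed between the two bases. This is a minimal sketch under that reading of "-U:A-" and "-C:A-". The function name and the example sequence are hypothetical, and the scan says nothing about how efficiently any given site is cut.

```python
from typing import List, Tuple

# Dinucleotides suggested in the text as preferred sites (cut between the two bases).
PREFERRED_SITES = ("UA", "CA")


def candidate_cleavage_sites(rna: str,
                             motifs: Tuple[str, ...] = PREFERRED_SITES) -> List[Tuple[int, str]]:
    """Return (cut_position, motif) pairs.

    cut_position is a 0-based index such that the cut lies between
    rna[cut_position - 1] and rna[cut_position].
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    for i in range(len(rna) - 1):
        dinucleotide = rna[i:i + 2]
        if dinucleotide in motifs:
            sites.append((i + 1, dinucleotide))
    return sites


example = "GGCAUUACGUAG"  # made-up sequence for illustration
for position, motif in candidate_cleavage_sites(example):
    print(f"cut before index {position}: {example[:position]} | {example[position:]} ({motif})")
```

Such a scan only lists candidate positions. The actual cleavage pattern of HP0894 was established experimentally by primer extension, as described in the text.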
Notably, HP0892 (hypothetical protein) and HP0894 share high sequence similarity (53% identity), and HP0892 is expected to be a paralog of HP0894. As expected, the structure of HP0892 is very similar to that of HP0894 [80] (Figure 5), and, like HP0894, HP0892 is structurally similar to archaeal RelE (aRelE) (Z score = 8.1, RMSD = 2.7 Å) and the YoeB toxin of E. coli (Z score = 9.6, RMSD = 2.9 Å). Based on the above study, HP0892 was speculated to be another toxin molecule. However, there is no protein comparable to the HP0895 antitoxin encoded near the hp0892 gene, either upstream or downstream. Thus, the function of HP0892 is still in question, which shows that structural homology alone does not always reveal the function of an unknown protein. According to gene comparison studies using DNA microarrays [81], the hp0892 gene is one of several H. pylori genes absent from a set of five cag pathogenicity island (PAI)-negative strains, while the hp0894 gene is not. The hp0892 gene may therefore represent a marker for the identification of virulent strains or a novel virulence factor. It is thus probable that the biological role of HP0892 is different from that of HP0894, aRelE, and YoeB, despite the sequence and/or structural similarities among them.
4.2. HP0315: Virulence-Associated Factor, Endoribonuclease
A virulence-associated protein, the product of a vap gene found in various organisms, may be insufficient for virulence in itself but is required for it. The vap genes are known to encode factors, or enzymes producing factors, that regulate the expression of true virulence genes, activate virulence factors by translational modification or processing of secretions, or are required for the activity of true virulence factors. Several vap genes (vapA, B, C, D, H and I) are known to exist in various organisms [82–84], but how the products of the vap genes are related to virulence remains unclear. H. pylori strain 26695 has only one type of virulence-associated protein, VapD. Two genes in this strain (HP0315 and HP0967) belong to vapD [85]. The exact biological role of the VapD protein has not yet been established, but several roles, such as toxin, acid tolerance, and plasmid stability, have been suggested [86–88]. Here, we summarize the elucidation of the probable function of HP0315 by structural and biochemical studies.
The structure of HP0315 consists of the following secondary structure elements: β1 (residues 1–8), α1 (residues 10–17), α1′ (residues 21–35), β2 (residues 38–41), β3 (residues 44–47), α2 (residues 53–66), α2′ (residues 68–73), β4 (residues 75–87) and α3 (residues 88–93). The monomer has a ferredoxin-like fold, with a β1-(α1-α1′)-β2-β3-(α2-α2′)-β4-α3 topology instead of the β-α-β-β-α-β topology of the canonical ferredoxin fold. The dimer of HP0315 is butterfly-shaped (PDB code: 3UI3, Figure 6). The β4 strand and the α3 helix associate with the adjacent monomer, forming the dimerization interface [89]. To our knowledge, this is the first structure of a VapD family member. A sequence homology search revealed that HP0315 is related to the CRISPR-associated protein Cas2, a novel family of endoribonucleases, suggesting that HP0315 may have ribonuclease activity. A structure-based search with DALI also yielded a high score for one of the Cas2 proteins, SSO1404 (PDB code: 2IVY), although the top-scoring hits were mainly hypothetical proteins of unknown function. In addition, the relationship between VapD and Cas2 proteins was supported by a genomic analysis [90].
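The comparison between the HP0315 topology and the canonical ferredoxin fold can be made explicit with a short script. The following is a minimal sketch, not part of the original analysis: it uses the secondary structure elements listed above, treats each primed helix (α1′, α2′) as a continuation of the preceding helix (an assumption made here purely for illustration), and checks whether the collapsed topology begins with the β-α-β-β-α-β core. The names `ELEMENTS`, `FERREDOXIN_CORE` and `collapsed_topology` are hypothetical, and ASCII labels ("b" for a strand, "a" for a helix) stand in for β and α.

```python
# Secondary structure elements of HP0315 as listed in the text:
# (label, first residue, last residue). "b" = beta strand, "a" = alpha helix.
# A trailing apostrophe marks a primed helix such as alpha-1'.
ELEMENTS = [
    ("b1", 1, 8), ("a1", 10, 17), ("a1'", 21, 35),
    ("b2", 38, 41), ("b3", 44, 47),
    ("a2", 53, 66), ("a2'", 68, 73),
    ("b4", 75, 87), ("a3", 88, 93),
]

# Canonical ferredoxin fold: beta-alpha-beta-beta-alpha-beta.
FERREDOXIN_CORE = "babbab"


def collapsed_topology(elements):
    """Return the order of element types, folding primed helices
    into the helix that precedes them (illustrative assumption)."""
    kinds = []
    for label, _first, _last in elements:
        if label.endswith("'"):
            continue  # treat a1' / a2' as part of a1 / a2
        kinds.append(label[0])
    return "".join(kinds)


topology = collapsed_topology(ELEMENTS)
has_core = topology.startswith(FERREDOXIN_CORE)
extra = topology[len(FERREDOXIN_CORE):] if has_core else ""

print(topology)   # babbaba
print(has_core)   # True
print(extra)      # a  (the additional C-terminal helix, alpha-3)
```

Under this simplification, HP0315 contains the full ferredoxin core followed by one additional C-terminal helix, which matches the description of a ferredoxin-like rather than a strict ferredoxin fold.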
The sequence analysis yielded another interesting result: the two genes HP0315 and HP0316 appear to form an operon, a functional unit of genomic DNA in which genes (here partially overlapping) are under the control of a single regulatory signal or promoter (gene coordinates: HP0315 330872–330588, HP0316 331245–330853, Figure 6). As described above, HP0316 has a sequence similarity of 88.9% with HP0895 [78], which might suggest that the HP0315–HP0316 system is analogous to the HP0894–HP0895 system. In other words, HP0315 might act as a toxin molecule like HP0894, although there is no sequence or structural similarity between them. However, HP0315 did not bind HP0316 and did not affect cell viability in in vivo toxicity experiments [89]. From the sequence/structure analysis and biochemical experiments, HP0315 was therefore speculated to be a ribonuclease but not a toxin, even though its gene arrangement is similar to that of a TA system [89]. The RNase activity of HP0315 was confirmed by primer extension and gel retardation experiments, which revealed purine-specific endoribonuclease activity [89].
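The extent of the overlap between the two genes follows directly from the coordinates quoted above. The sketch below is illustrative only and assumes that the coordinates are 1-based and inclusive, as is conventional in genome annotations; the `Gene` class and the `overlap_bp` function are hypothetical helpers, not part of any annotation tool.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    first: int  # coordinate quoted first in the text
    last: int   # coordinate quoted second in the text

    @property
    def span(self):
        """Return (low, high) genome coordinates regardless of strand."""
        return (min(self.first, self.last), max(self.first, self.last))

    @property
    def length_bp(self):
        low, high = self.span
        return high - low + 1  # inclusive coordinates (assumption)


def overlap_bp(a, b):
    """Number of base pairs shared by two genes (0 if they do not overlap)."""
    low = max(a.span[0], b.span[0])
    high = min(a.span[1], b.span[1])
    return max(0, high - low + 1)


# Coordinates as given in the text (both run from high to low).
hp0315 = Gene("HP0315", 330872, 330588)
hp0316 = Gene("HP0316", 331245, 330853)

print(hp0315.length_bp)            # 285
print(hp0316.length_bp)            # 393
print(overlap_bp(hp0315, hp0316))  # 20
```

With these assumptions the two open reading frames share 20 bp, and both are listed from the higher to the lower coordinate, which is consistent with their being transcribed in the same direction.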
In conclusion, HP0315, a member of the VapD family, is structurally similar to the Cas2 family and has a gene arrangement similar to that of a TA system; however, it does not belong to either of them and resembles an evolutionary intermediate. The exact function of HP0315 has not yet been determined. However, considering its relationship with Cas2 and TA systems, as well as its endoribonuclease activity, HP0315 may be involved in cell maintenance, in a defense mechanism against invasion, or possibly in both, like Cas2 and/or a TA system.
4.3. Others: HP0062, HP0495, HP0827, HP1242, HP1423
The 3D structure of the hypothetical protein HP0062 (PDB code: 3FX7) was solved at 1.65 Å resolution [91]. HP0062 is a small protein of 86 amino acids, but it exists as a dimer. The HP0062 monomer folds into a hairpin structure in which two α-helices (the N- and C-helix) are connected by a short loop (Figure 7A), and the N-helix displays a modified leucine zipper. The protomers dimerize in an antiparallel arrangement in which the N- and C-helices of one protomer pack against the N- and C-helices of the second protomer, forming a four-helix bundle. The two protomers in the asymmetric unit of the orthorhombic crystal are similar, and the topologically equivalent Cα atoms superimpose with an RMSD of 0.79 Å. The structure of HP0062 was also solved by another group, who reported the protein to be monomeric (unpublished, PDB code: 2GTS). Since our gel filtration chromatography revealed a dimeric state for HP0062, the biologically relevant form is believed to be a dimer [91]. The structural comparison indicated that HP0062 is similar to the coiled-coil segments of over 100 functionally unrelated proteins involved in various protein-protein interactions. Thus, the function of HP0062 is hard to infer directly from the structural information. Interestingly, HP0062 shares many characteristics with the ESAT-6 family of Gram-positive bacteria: a small dimer, a helix-hairpin-helix structure, no signal peptide but a WXG motif in the hairpin bend (WRD in HP0062), and a gene cluster that includes a protein with an FtsK/SpoIIIE domain [92]. On the other hand, HP0062 also shares characteristics with the TTS (Type Three Secretion) chaperones of Gram-negative bacteria: a small dimer, an acidic pI, an overall α-helical character and a carboxy-terminal amphipathic α-helix [93]. These observations hint that HP0062 may function as a transport chaperone and/or an adaptor protein that facilitates interactions with host receptor proteins.
HP0495 is an 86-residue hypothetical protein with a molecular weight of 10,192.7 Da. The atomic coordinates of the final structure have been deposited in the PDB (2H9Z). HP0495 has two α-helices and four β-strands, forming a ferredoxin-like fold, β1-α1-β2-β3-α2-β4 (Figure 7B). HP0495 is a completely uncharacterized protein, since its sequence homology is restricted to proteins of unknown function from several bacteria [94,95]. Widely conserved unknown proteins such as HP0495 merit the highest priority for functional characterization because they have the greatest potential payoff in new biological knowledge. In such a case, the structure and the structural homology data become more important and may provide a clue to the function. Unfortunately, a structural homology search using DALI indicated that HP0495 has structural homology with a wide variety of proteins [94], probably because the ferredoxin-like fold of HP0495 is abundant among known structures. Twenty proteins had a DALI Z-score higher than 5.0, including the NikR protein from Pyrococcus horikoshii (nickel responsive repressor; PDB code: 2BJ9, RMSD = 2.9 Å), LrpA from Thermus thermophilus (transcriptional regulator; PDB code: 1RIS, RMSD = 2.9 Å), S6 protein from Archaeoglobus fulgidus (ribosomal protein; PDB code: 1Y7P, RMSD = 2.9 Å), and a hypothetical YbeD protein from E. coli (unknown; PDB code: 1RWU, RMSD = 3.6 Å). The structural comparison therefore did not give a clear answer. However, the function of HP0495 seems to be related to nucleic acid interaction, since its homologues are mainly nucleic acid binding proteins and HP0495 possesses positive surface charges (Figure 7B).
HP0827 is classified as a putative single-stranded (ss)-DNA binding protein 12RNP2 precursor protein. The solution structure of HP0827 (PDB code: 2KI2) has a ferredoxin-like fold, β1-α1-β2-β3-α2-β4 [96]. The four β-strands are arranged in a right-handed twist and form an antiparallel β-sheet that packs against the two α-helices (Figure 7C). This protein contains one RRM (RNA Recognition Motif) comprising two ribonucleoprotein motifs (RNP1, Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr, and RNP2, Ile/Val/Leu-Phe/Tyr-Ile/Val/Leu-X-Asn-Leu). Since the RRM is an abundant component of protein structures, the presence of an RRM alone cannot reveal the exact function of HP0827. Indeed, a total of 6056 RRM motifs can be found in 3541 different proteins in the Pfam database [97]. We could not elucidate the biological function of HP0827 on a structural basis, although the structure may provide information on the putative RNA binding site. Further biological studies are required in this case.
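The two RNP consensus motifs quoted above translate naturally into regular expressions over one-letter amino acid codes. The following is a minimal sketch, assuming that "X" in the consensus denotes any residue; the function name `find_rnp_motifs` is hypothetical, and the test sequence is an invented toy sequence, not the sequence of HP0827.

```python
import re

# RNP1: Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr
RNP1 = re.compile(r"[KR]G[FY][GA][FY][VIL].[FY]")
# RNP2: Ile/Val/Leu-Phe/Tyr-Ile/Val/Leu-X-Asn-Leu
RNP2 = re.compile(r"[IVL][FY][IVL].NL")


def find_rnp_motifs(sequence):
    """Return (motif name, 1-based start position, matched residues)
    for every RNP1/RNP2 consensus match, ordered by position."""
    seq = sequence.upper()
    hits = []
    for name, pattern in (("RNP1", RNP1), ("RNP2", RNP2)):
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return sorted(hits, key=lambda hit: hit[1])


# Invented toy sequence (NOT HP0827): an RNP2-like hexamer near the
# N-terminus and an RNP1-like octamer further downstream.
TOY_SEQUENCE = (
    "MSKN" + "IYVGNL"
    + "PYDATEEDLKELFSQFGEVESVKIITDRETGRS"
    + "KGFGFVEF" + "EDAEKAIEALNG"
)

for name, position, residues in find_rnp_motifs(TOY_SEQUENCE):
    print(name, position, residues)
```

A match to both patterns only indicates that a sequence is compatible with an RRM; as noted above, the motif is too widespread for such a match to identify a specific function.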
The HP1242 gene encodes a 76-residue conserved hypothetical protein with a molecular weight of 9111 Da. HP1242 adopts a fully helical structure composed of three α-helices [98], corresponding to residues 6–14 (αI), 18–38 (αII), and 43–75 (αIII). The overall structure of HP1242 represents a coiled-coil-like conformation (Figure 7D). Based on sequence homology, HP1242 is classified into the DUF (Domain of Unknown Function) 465 family. Members of this family are found in several bacterial proteins, and also in the heavy chains of eukaryotic myosin and kinesin, which are predicted to form coiled-coil structures. HP1242 has structural homology with a variety of proteins, including the rop protein (transcription regulation), an arfaptin 2 fragment (signaling protein), and a sensory rhodopsin II fragment (membrane protein complex) [99]. This result indicates that the function of HP1242 cannot be inferred from structural comparison alone.
We also determined the solution structure of HP1423, another hypothetical protein, which has 84 amino acid residues. According to the Pfam database, HP1423 belongs to the S4 (PF01479) superfamily. The S4 domain is a small domain of 60–65 amino acid residues that probably mediates binding to RNA [100]. The structure of HP1423 is composed of five β-strands and three α-helices [101], and its topology can be described as α1-α2-β2-β1-β3-β4-α3-β5 (Figure 7E). Notably, the region extending from α1 through β3 forms a distinct structural motif, the so-called αL motif, named for the two α-helices and the loop between β2 and β3, which forms an L-shaped meander (Figure 7E). This structural motif is highly conserved between different families within the S4 (PF01479) superfamily and may be important for interaction with RNA [100]. The surface of the αL motif of HP1423 has a strong concentration of positive charge, and the loop between β4 and α3 exposes another positively charged side chain, that of K67, which raises the possibility that HP1423 is an RNA binding protein (Figure 7E). The DALI result also showed that HP1423 is structurally similar to proteins that belong to the S4 superfamily, which includes the Hsp15 protein (PDB code: 1DM9-B), ribosomal small subunit pseudouridine synthase A (PDB code: 1VIO-A), and 30S ribosomal protein S4 (PDB code: 1FJG-D). All these homologues contain the αL motif. However, the distribution of positively charged residues on the protein surface differs somewhat between the homologous proteins [101], suggesting that HP1423 may bind to RNA through the αL motif in a manner similar, but not identical, to that of the S4 RNA binding proteins.
5. Proteins of Known Function with Different Characteristics
Bioinformatics tools have advanced remarkably, providing biologists with valuable information for functional elucidation. Nevertheless, predicting protein function from sequence and structure remains a difficult problem, because homologous proteins often have different functions. In addition, a protein that is classified as a known protein on the basis of sequence homology often shows some functional ambiguity when the composition of its operon is quite different from that of the known system. Moreover, some proteins that are considered to be well characterized may have additional functions beyond their listed function [102]. In this regard, it is still worth investigating the cellular and biological functions of known proteins from a newly sequenced genome. Here, we present two examples of well-defined proteins that have different characteristics compared with their homologues.
Copper metabolism by copper chaperones has been studied extensively in both eukaryotes and bacteria. In the Gram-positive bacterium Enterococcus hirae, the cop operon encodes four proteins. Two of them, CopA and CopB, are integral membrane P-type ATPases that transport Cu(I) into cells under Cu(I)-limiting conditions and eliminate Cu(I) under conditions of high Cu(I) levels, respectively [103,104]. The imported copper ions are transferred from CopA to the CopZ chaperone [105–107], and CopY, a gene repressor, is released from the cop operon promoter when Cu(I) is delivered to it by CopZ (Figure 8A). In the Gram-negative bacterium H. pylori, copper homeostasis seems to be maintained by only two proteins, CopA and CopP (HP1073). The H. pylori cop operon (Figure 8A) is included in a novel stress-responsive operon (sro), which encodes the flagellar motor switch protein CheY, the putative methyltransferase Hsm, the cell division protein FtsH, the putative phosphatidyltransferase Ptr, the heavy metal-binding proteins CopA and CopP, and an open reading frame of unknown function [108]. CopA is a member of the bacterial copper ion ATPase family, and CopP, which is homologous to E. hirae CopZ, is a putative copper binding regulatory protein of 66 amino acids [104,108]. CopA of H. pylori was identified as a Cu(II) export ATPase [109], which shows that its biological role is more similar to that of E. hirae CopB than to that of E. hirae CopA [110]. Moreover, the CopP gene resides immediately downstream of the CopA gene in H. pylori, whereas the E. hirae CopZ gene resides upstream of the CopA gene. Therefore, the organization of the cop operon seems to have been modified during evolution in each bacterium.
Generally, CopZ proteins share a conserved βαββαβ structure with a similar metal binding region. Interestingly, HpCopP adopts a βαββα fold that lacks the C-terminal β-strand [111]. The overall topologies of the secondary structural components are very similar between the CopZs and HpCopP, although some variations appear in the loop regions (Figure 8). The relationship between the unusual fold and the copper specificity was evaluated [111]. We showed that HpCopP is not well suited to Cu(II) binding, since its fold stability decreased in the presence of Cu(II) ion, suggesting that the structure of HpCopP is optimized for the transfer of toxic Cu(I). The absence of the C-terminal β-strand may decrease the conformational stability of loop I, which includes the CXXC motif (Cu binding motif), and this probably contributes to disulfide bond formation between the two cysteine residues in the presence of Cu(II) ion. These findings should be helpful in evaluating the copper metabolism involving HpCopA and HpCopP in H. pylori.
Acyl carrier protein (ACP) found in bacteria is a monofunctional protein, that is, a type II enzyme in fatty acid biosynthesis. All the ACPs are decorated by acyl carrier protein synthase (ACPS) with fatty acids, which are covalently attached as thioesters to the 4′-phosphopantetheine prosthetic group at highly conserved Ser 36 [112]. Fatty acid binding has little influence on ACP conformation under physiological conditions [113], but it stabilizes ACP against denaturation at alkaline pH [114].
H. pylori ACP (HP0559) is composed of 78 amino acids with a pI value of 3.9, and its primary structure is similar to those of homologous ACPs. Like other ACPs, HpACP forms a helical bundle structure through hydrophobic contacts between the helices (Figure 9). However, we found that HpACP behaves unusually at neutral pH [115]: it exists in a partially unfolded state, which is a unique characteristic of HpACP (Figure 9). By comparison, the overall helical structure of E. coli ACP was maintained at pH 7 [116], whereas Vibrio harveyi ACP exhibited a random coil-like conformation at pH 7 [117].
The pH-dependent conformational change of a protein from H. pylori is a very interesting feature, considering that the environment of the stomach has a low pH. A few studies have shown a relationship between the mutation of various residues and pH-dependent structural stability. The mutation of Val 43 to Ile in E. coli ACP increases its stability against pH-induced expansion in electrophoretic systems, concomitantly inducing more compact folding [118]. The F50A and I54A mutants of V. harveyi ACP are incapable of adopting a native conformation and show an increased hydrodynamic radius at neutral pH [117]. In addition, a few basic residues scattered near the N- and C-termini, for example, His 75 of E. coli ACP, are necessary for ACP to maintain a native conformation at neutral pH [119]. Through our structural analysis, we found that several hydrophilic residues (Glu 47, Asn 75, and Lys 76) play an important role in structural stability. Therefore, we suggest that, unlike other ACPs, the helical bundle of H. pylori ACP is maintained not only by hydrophobic interactions but also by hydrophilic interactions, and that these interactions may be weakened by elevation of the pH because the exchange rate of the protons attached to the side chain nitrogens of Asn and Lys may increase [115].
6. Concluding Remarks
Large-scale genomic sequencing has been yielding many protein sequences that cannot be annotated, and structural genomics projects are yielding many protein structures of unknown function. Unknown proteins represent up to about half of the proteins in prokaryotic genomes, and much more than this in higher plants and animals [120]. In typical bacterial genomes, including that of H. pylori, 30–40% of the encoded proteins have no clearly known function [121]. Thus, a major issue in genomic studies is to narrow the gap between the wealth of sequences (and/or structures) and their functional characterization, since subsequent experimental investigation is costly and time-consuming [122]. Indeed, only 54% of E. coli gene products have been experimentally investigated so far [123]. Therefore, more robust bioinformatic methods or approaches may be necessary to overcome this situation. Here, we presented several cases in which the function of unknown H. pylori proteins was successfully elucidated on the basis of their structural information, which supports the potential of structural comparison for functional identification. It is hoped that structural comparison can at least act as a guide to the possible function, even though not every structure can reveal the actual function.
Acknowledgements
This study was supported by National Research Foundation of Korea (NRF) grants funded by the Korean government (MEST) (Grant numbers 20110001207 and 2012R1A2A1A01003569). This study was also supported by a grant from the Korea Healthcare technology R&D Project, Ministry for Health, Welfare & Family Affairs, Republic of Korea (Grant number: A092006). This research was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2011-0011603).
References
- 1. Rothenbacher D., Brenner H. Burden of Helicobacter pylori and H. pylori-related diseases in developed countries: Recent developments and future implications. Microbes Infect. 2003;5:693–703. doi: 10.1016/s1286-4579(03)00111-4.
- 2. Wotherspoon A.C., Doglioni C., Diss T.C., Pan L., Moschini A., de Boni M., Isaacson P.G. Regression of primary low-grade B-cell gastric lymphoma of mucosa-associated lymphoid tissue type after eradication of Helicobacter pylori. Lancet. 1993;342:575–577. doi: 10.1016/0140-6736(93)91409-f.
- 3. Peek R.M., Jr, Blaser M.J. Helicobacter pylori and gastrointestinal tract adenocarcinomas. Nat. Rev. Cancer. 2002;2:28–37. doi: 10.1038/nrc703.
- 4. Parsonnet J., Friedman G.D., Vandersteen D.P., Chang Y., Vogelman J.H., Orentreich N., Sibley R.K. Helicobacter pylori infection and the risk of gastric carcinoma. N. Engl. J. Med. 1991;325:1127–1131. doi: 10.1056/NEJM199110173251603.
- 5. Ferreira A.C., Isomoto H., Moriyama M., Fujioka T., Machado J.C., Yamaoka Y. Helicobacter and gastric malignancies. Helicobacter. 2008;13:28–34. doi: 10.1111/j.1523-5378.2008.00633.x.
- 6. Yamaoka Y. Mechanisms of disease: Helicobacter pylori virulence factors. Nat. Rev. Gastroenterol. Hepatol. 2010;7:629–641. doi: 10.1038/nrgastro.2010.154.
- 7. El-Omar E.M. Role of host genes in sporadic gastric cancer. Best Pract. Res. Clin. Gastroenterol. 2006;20:675–686. doi: 10.1016/j.bpg.2006.04.006.
- 8. Graham D.Y. Helicobacter pylori infection in the pathogenesis of duodenal ulcer and gastric cancer: A model. Gastroenterology. 1997;113:1983–1991. doi: 10.1016/s0016-5085(97)70019-2.
- 9. Graham D.Y., Lu H., Yamaoka Y. African, Asian or Indian enigma, the East Asian Helicobacter pylori: Facts or medical myths. J. Dig. Dis. 2009;10:77–84. doi: 10.1111/j.1751-2980.2009.00368.x.
- 10. Wen S., Moss S.F. Helicobacter pylori virulence factors in gastric carcinogenesis. Cancer Lett. 2009;282:1–8. doi: 10.1016/j.canlet.2008.11.016.
- 11. Blaser M.J., Atherton J.C. Helicobacter pylori persistence: Biology and disease. J. Clin. Invest. 2004;113:321–333. doi: 10.1172/JCI20925.
- 12. Mahdavi J., Sonden B., Hurtig M., Olfat F.O., Forsberg L., Roche N., Angstrom J., Larsson T., Teneberg S., Karlsson K.A., et al. Helicobacter pylori SabA adhesin in persistent infection and chronic inflammation. Science. 2002;297:573–578. doi: 10.1126/science.1069076.
- 13. Lu H., Hsu P.I., Graham D.Y., Yamaoka Y. Duodenal ulcer promoting gene of Helicobacter pylori. Gastroenterology. 2005;128:833–848. doi: 10.1053/j.gastro.2005.01.009.
- 14. Backert S., Clyne M. Pathogenesis of Helicobacter pylori infection. Helicobacter. 2011;1:19–25. doi: 10.1111/j.1523-5378.2011.00876.x.
- 15. Tomb J.F., White O., Kerlavage A.R., Clayton R.A., Sutton G.G., Fleischmann R.D., Ketchum K.A., Klenk H.P., Gill S., Dougherty B.A., et al. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature. 1997;388:539–547. doi: 10.1038/41483.
- 16. Alm R.A., Ling L.S., Moir D.T., King B.L., Brown E.D., Doig P.C., Smith D.R., Noonan B., Guild B.C., deJonge B.L., et al. Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature. 1999;397:176–180. doi: 10.1038/16495.
- 17. NCBI genome database. [accessed on 31 May 2012]. Available online: http://www.ncbi.nlm.nih.gov/genome.
- 18. Bieker K.L., Silhavy T.J. The genetics of protein secretion in E. coli. Trends Genet. 1990;6:329–334. doi: 10.1016/0168-9525(90)90254-4.
- 19. Medigue C., Wong B.C., Lin M.C., Bocs S., Danchin A. The secE gene of Helicobacter pylori. J. Bacteriol. 2002;184:2837–2840. doi: 10.1128/JB.184.10.2837-2840.2002.
- 20. Wassarman K.M., Repoila F., Rosenow C., Storz G., Gottesman S. Identification of novel small RNAs using comparative genomics and microarrays. Genes Dev. 2001;15:1637–1651. doi: 10.1101/gad.901001.
- 21. Dong Q., Zhang L., Goh K.L., Forman D., O’Rourke J., Harris A., Mitchell H. Identification and characterisation of ssrA in members of the Helicobacter genus. Antonie Van Leeuwenhoek. 2007;92:301–307. doi: 10.1007/s10482-007-9152-8.
- 22. Kazantsev A.V., Pace N.R. Bacterial RNase P: A new view of an ancient enzyme. Nat. Rev. Microbiol. 2006;4:729–740. doi: 10.1038/nrmicro1491.
- 23. Vogel J., Bartels V., Tang T.H., Churakov G., Slagter-Jager J.G., Huttenhofer A., Wagner E.G. RNomics in Escherichia coli detects new sRNA species and indicates parallel transcriptional output in bacteria. Nucleic Acids Res. 2003;31:6435–6443. doi: 10.1093/nar/gkg867.
- 24. Giannakis M., Chen S.L., Karam S.M., Engstrand L., Gordon J.I. Helicobacter pylori evolution during progression from chronic atrophic gastritis to gastric cancer and its impact on gastric stem cells. Proc. Natl. Acad. Sci. USA. 2008;105:4358–4363. doi: 10.1073/pnas.0800668105.
- 25. Raymond J., Thiberge J.M., Kalach N., Bergeret M., Dupont C., Labigne A., Dauga C. Using macro-arrays to study routes of infection of Helicobacter pylori in three families. PLoS One. 2008;3 doi: 10.1371/journal.pone.0002259.
- 26. Baltrus D.A., Amieva M.R., Covacci A., Lowe T.M., Merrell D.S., Ottemann K.M., Stein M., Salama N.R., Guillemin K. The complete genome sequence of Helicobacter pylori strain G27. J. Bacteriol. 2009;191:447–448. doi: 10.1128/JB.01416-08.
- 27. Covacci A., Censini S., Bugnoli M., Petracca R., Burroni D., Macchia G., Massone A., Papini E., Xiang Z., Figura N., et al. Molecular characterization of the 128-kDa immunodominant antigen of Helicobacter pylori associated with cytotoxicity and duodenal ulcer. Proc. Natl. Acad. Sci. USA. 1993;90:5791–5795. doi: 10.1073/pnas.90.12.5791.
- 28. Oh J.D., Kling-Backhed H., Giannakis M., Xu J., Fulton R.S., Fulton L.A., Cordum H.S., Wang C., Elliott G., Edwards J., et al. The complete genome sequence of a chronic atrophic gastritis Helicobacter pylori strain: Evolution during disease progression. Proc. Natl. Acad. Sci. USA. 2006;103:9999–10004. doi: 10.1073/pnas.0603784103.
- 29. McClain M.S., Shaffer C.L., Israel D.A., Peek R.M., Jr., Cover T.L. Genome sequence analysis of Helicobacter pylori strains associated with gastric ulceration and gastric cancer. BMC Genomics. 2009;10 doi: 10.1186/1471-2164-10-3.
- 30. Devi S.H., Taylor T.D., Avasthi T.S., Kondo S., Suzuki Y., Megraud F., Ahmed N. Genome of Helicobacter pylori strain 908. J. Bacteriol. 2010;192:6488–6489. doi: 10.1128/JB.01110-10.
- 31. Farnbacher M., Jahns T., Willrodt D., Daniel R., Haas R., Goesmann A., Kurtz S., Rieder G. Sequencing, annotation, and comparative genome analysis of the gerbil-adapted Helicobacter pylori strain B8. BMC Genomics. 2010;11 doi: 10.1186/1471-2164-11-335.
- 32. Fischer W., Windhager L., Rohrer S., Zeiller M., Karnholz A., Hoffmann R., Zimmer R., Haas R. Strain-specific genes of Helicobacter pylori: Genome evolution driven by a novel type IV secretion system and genomic island transfer. Nucleic Acids Res. 2010;38:6089–6101. doi: 10.1093/nar/gkq378.
- 33. Kersulyte D., Kalia A., Gilman R.H., Mendez M., Herrera P., Cabrera L., Velapatino B., Balqui J., Paredes Puente de la Vega F., Rodriguez Ulloa C.A., et al. Helicobacter pylori from Peruvian amerindians: Traces of human migrations in strains from remote Amazon, and genome sequence of an Amerind strain. PLoS One. 2010;5 doi: 10.1371/journal.pone.0015076.
- 34. Mane S.P., Dominguez-Bello M.G., Blaser M.J., Sobral B.W., Hontecillas R., Skoneczka J., Mohapatra S.K., Crasta O.R., Evans C., Modise T., et al. Host-interactive genes in Amerindian Helicobacter pylori diverge from their Old World homologs and mediate inflammatory responses. J. Bacteriol. 2010;192:3078–3092. doi: 10.1128/JB.00063-10.
- 35. Thiberge J.M., Boursaux-Eude C., Lehours P., Dillies M.A., Creno S., Coppee J.Y., Rouy Z., Lajus A., Ma L., Burucoa C., et al. From array-based hybridization of Helicobacter pylori isolates to the complete genome sequence of an isolate associated with MALT lymphoma. BMC Genomics. 2010;11 doi: 10.1186/1471-2164-11-368.
- 36. Avasthi T.S., Devi S.H., Taylor T.D., Kumar N., Baddam R., Kondo S., Suzuki Y., Lamouliatte H., Megraud F., Ahmed N. Genomes of two chronological isolates (Helicobacter pylori 2017 and 2018) of the West African Helicobacter pylori strain 908 obtained from a single patient. J. Bacteriol. 2011;193:3385–3386. doi: 10.1128/JB.05006-11.
- 37. Furuta Y., Kawai M., Yahara K., Takahashi N., Handa N., Tsuru T., Oshima K., Yoshida M., Azuma T., Hattori M., et al. Birth and death of genes linked to chromosomal inversion. Proc. Natl. Acad. Sci. USA. 2011;108:1501–1506. doi: 10.1073/pnas.1012579108.
- 38. Lehours P., Vale F.F., Bjursell M.K., Melefors O., Advani R., Glavas S., Guegueniat J., Gontier E., Lacomme S., Alves Matos A., et al. Genome sequencing reveals a phage in Helicobacter pylori. MBio. 2011;2 doi: 10.1128/mBio.00239-11.
- 39.Alvi A., Devi S.M., Ahmed I., Hussain M.A., Rizwan M., Lamouliatte H., Megraud F., Ahmed N. Microevolution of Helicobacter pylori type IV secretion systems in an ulcer disease patient over a ten-year period. J. Clin. Microbiol. 2007;45:4039–4043. doi: 10.1128/JCM.01631-07. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Prouzet-Mauleon V., Hussain M.A., Lamouliatte H., Kauser F., Megraud F., Ahmed N. Pathogen evolution in vivo: Genome dynamics of two isolates obtained 9 years apart from a duodenal ulcer patient infected with a single Helicobacter pylori strain. J. Clin. Microbiol. 2005;43:4237–4241. doi: 10.1128/JCM.43.8.4237-4241.2005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Linz B., Balloux F., Moodley Y., Manica A., Liu H., Roumagnac P., Falush D., Stamer C., Prugnolle F., van der Merwe S.W., et al. An African origin for the intimate association between humans and Helicobacter pylori. Nature. 2007;445:915–918. doi: 10.1038/nature05562. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Falush D., Wirth T., Linz B., Pritchard J.K., Stephens M., Kidd M., Blaser M.J., Graham D.Y., Vacher S., Perez-Perez G.I., et al. Science. 2003;299:1582–1585. doi: 10.1126/science.1080857. [DOI] [PubMed] [Google Scholar]
- 43.Wirth T., Wang X., Linz B., Novick R.P., Lum J.K., Blaser M., Morelli G., Falush D., Achtman M. Distinguishing human ethnic groups by means of sequences from Helicobacter pylori: Lessons from Ladakh. Proc. Natl. Acad. Sci. USA. 2004;101:4746–4751. doi: 10.1073/pnas.0306629101. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Peterson J.D., Umayam L.A., Dickinson T.M., Hickey E.K., White O. The comprehensive microbial resource. Nucleic Acids Res. 2001;29:123–125. doi: 10.1093/nar/29.1.123. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Marais A., Mendz G.L., Hazell S.L., Megraud F. Metabolism and genetics of Helicobacter pylori: The genome era. Microbiol. Mol. Biol. Rev. 1999;63:642–674. doi: 10.1128/mmbr.63.3.642-674.1999. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Merrell D.S., Thompson L.J., Kim C.C., Mitchell H., Tompkins L.S., Lee A., Falkow S. Growth phase-dependent response of Helicobacter pylori to iron starvation. Infect. Immun. 2003;71:6510–6525. doi: 10.1128/IAI.71.11.6510-6525.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Wen Y., Marcus E.A., Matrubutham U., Gleeson M.A., Scott D.R., Sachs G. Acid-adaptive genes of Helicobacter pylori. Infect. Immun. 2003;71:5921–5939. doi: 10.1128/IAI.71.10.5921-5939.2003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Cremades N., Velazquez-Campoy A., Martinez-Julvez M., Neira J.L., Perez-Dorado I., Hermoso J., Jimenez P., Lanas A., Hoffman P.S., Sancho J. Discovery of specific flavodoxin inhibitors as potential therapeutic agents against Helicobacter pylori infection. ACS Chem. Biol. 2009;4:928–938. doi: 10.1021/cb900166q.
- 49.Han K.D., Matsuura A., Ahn H.C., Kwon A.R., Min Y.H., Park H.J., Won H.S., Park S.J., Kim D.Y., Lee B.J. Functional identification of toxin-antitoxin molecules from Helicobacter pylori 26695 and structural elucidation of the molecular interactions. J. Biol. Chem. 2011;286:4842–4853. doi: 10.1074/jbc.M109.097840.
- 50.Han K.D., Park S.J., Jang S.B., Son W.S., Lee B.J. Solution structure of conserved hypothetical protein HP0894 from Helicobacter pylori. Proteins. 2005;61:1114–1116. doi: 10.1002/prot.20691.
- 51.Yeo H.J., Savvides S.N., Herr A.B., Lanka E., Waksman G. Crystal structure of the hexameric traffic ATPase of the Helicobacter pylori type IV secretion system. Mol. Cell. 2000;6:1461–1472. doi: 10.1016/s1097-2765(00)00142-8.
- 52.Protein Data Bank. [accessed on 1 March 2012]. Available online: http://www.rcsb.org.
- 53.Goulding C.W., Perry L.J. Protein production in Escherichia coli for structural studies by X-ray crystallography. J. Struct. Biol. 2003;142:133–143. doi: 10.1016/s1047-8477(03)00044-3.
- 54.Cussac V., Ferrero R.L., Labigne A. Expression of Helicobacter pylori urease genes in Escherichia coli grown under nitrogen-limiting conditions. J. Bacteriol. 1992;174:2466–2473. doi: 10.1128/jb.174.8.2466-2473.1992.
- 55.Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., Ferrin T.E. UCSF Chimera—A visualization system for exploratory research and analysis. J. Comput. Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084.
- 56.Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
- 57.Kannan S., Hauth A.M., Burger G. Function prediction of hypothetical proteins without sequence similarity to proteins of known function. Protein Pept. Lett. 2008;15:1107–1116. doi: 10.2174/092986608786071085.
- 58.Chou K.C., Shen H.B. Cell-PLoc: A package of Web servers for predicting subcellular localization of proteins in various organisms. Nat. Protoc. 2008;3:153–162. doi: 10.1038/nprot.2007.494.
- 59.Shen H.B., Chou K.C. EzyPred: A top-down approach for predicting enzyme functional classes and subclasses. Biochem. Biophys. Res. Commun. 2007;364:53–59. doi: 10.1016/j.bbrc.2007.09.098.
- 60.Dobson P.D., Cai Y.D., Stapley B.J., Doig A.J. Prediction of protein function in the absence of significant sequence similarity. Curr. Med. Chem. 2004;11:2135–2142. doi: 10.2174/0929867043364702.
- 61.Dundas J., Ouyang Z., Tseng J., Binkowski A., Turpaz Y., Liang J. CASTp: Computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues. Nucleic Acids Res. 2006;34:W116–W118. doi: 10.1093/nar/gkl282.
- 62.Holm L., Kääriäinen S., Rosenström P., Schenkel A. Searching protein structure databases with DaliLite v.3. Bioinformatics. 2008;24:2780–2781. doi: 10.1093/bioinformatics/btn507.
- 63.Holm L., Rosenström P. Dali server: Conservation mapping in 3D. Nucleic Acids Res. 2010;38:W545–W549. doi: 10.1093/nar/gkq366.
- 64.Kawabata T., Nishikawa K. Protein structure comparison using the Markov transition model of evolution. Proteins. 2000;41:108–122.
- 65.Nimrod G., Schushan M., Steinberg D.M., Ben-Tal N. Detection of functionally important regions in “hypothetical proteins” of known structure. Structure. 2008;16:1755–1763. doi: 10.1016/j.str.2008.10.017.
- 66.Aloy P., Querol E., Aviles F.X., Sternberg M.J. Automated structure-based prediction of functional sites in proteins: Applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking. J. Mol. Biol. 2001;311:395–408. doi: 10.1006/jmbi.2001.4870.
- 67.Ondrechen M.J., Clifton J.G., Ringe D. THEMATICS: A simple computational predictor of enzyme function from structure. Proc. Natl. Acad. Sci. USA. 2001;98:12473–12478. doi: 10.1073/pnas.211436698.
- 68.Pazos F., Sternberg M.J. Automated prediction of protein function and detection of functional sites from structure. Proc. Natl. Acad. Sci. USA. 2004;101:14754–14759. doi: 10.1073/pnas.0404569101.
- 69.Pettit F.K., Bare E., Tsai A., Bowie J.U. HotPatch: A statistical approach to finding biologically relevant features on protein surfaces. J. Mol. Biol. 2007;369:863–879. doi: 10.1016/j.jmb.2007.03.036.
- 70.Sierk M.L., Pearson W.R. Sensitivity and selectivity in protein structure comparison. Protein Sci. 2004;13:773–785. doi: 10.1110/ps.03328504.
- 71.Altschul S.F., Gish W. Local alignment statistics. Methods Enzymol. 1996;266:460–480. doi: 10.1016/s0076-6879(96)66029-7.
- 72.Han K.D., Park S.J., Jang S.B., Son W.S., Lee B.J. Solution structure of conserved hypothetical protein HP0894 from Helicobacter pylori. Proteins. 2005;61:1114–1116. doi: 10.1002/prot.20691.
- 73.Marchler-Bauer A., Bryant S.H. CD-Search: Protein domain annotations on the fly. Nucleic Acids Res. 2004;32:W327–W331. doi: 10.1093/nar/gkh454.
- 74.Bateman A., Birney E., Cerruti L., et al. The Pfam protein families database. Nucleic Acids Res. 2002;30:276–280. doi: 10.1093/nar/30.1.276.
- 75.Takagi H., Kakuta Y., Okada T., Yao M., Tanaka I., Kimura M. Crystal structure of archaeal toxin-antitoxin RelE-RelB complex with implications for toxin activity and antitoxin effects. Nat. Struct. Mol. Biol. 2005;12:327–331. doi: 10.1038/nsmb911.
- 76.Gerdes K., Christensen S.K., Løbner-Olesen A. Prokaryotic toxin-antitoxin stress response loci. Nat. Rev. Microbiol. 2005;3:371–382. doi: 10.1038/nrmicro1147.
- 77.Wilson D.N., Nierhaus K.H. RelBE or not to be. Nat. Struct. Mol. Biol. 2005;12:282–284. doi: 10.1038/nsmb0405-282.
- 78.Han K.D., Matsuura A., Ahn H.C., Kwon A.R., Min Y.H., Park H.J., Won H.S., Park S.J., Kim D.Y., Lee B.J. Functional identification of toxin-antitoxin molecules from Helicobacter pylori 26695 and structural elucidation of the molecular interactions. J. Biol. Chem. 2011;286:4842–4853. doi: 10.1074/jbc.M109.097840.
- 79.Kamada K., Hanaoka F., Burley S.K. Crystal structure of the MazE/MazF complex: Molecular bases of antidote-toxin recognition. Mol. Cell. 2003;11:875–884. doi: 10.1016/s1097-2765(03)00097-2.
- 80.Han K.D., Park S.J., Jang S.B., Lee B.J. Solution structure of conserved hypothetical protein HP0892 from Helicobacter pylori. Proteins. 2008;70:599–602. doi: 10.1002/prot.21701.
- 81.Terry C.E., McGinnis L.M., Madigan K.C., Cao P., Cover T.L., Liechti G.W., Peek R.M., Jr., Forsyth M.H. Genomic comparison of cag pathogenicity island (PAI)-positive and -negative Helicobacter pylori strains: Identification of novel markers for cag PAI-positive strains. Infect. Immun. 2005;73:3794–3798. doi: 10.1128/IAI.73.6.3794-3798.2005.
- 82.Cheetham B.F., Tattersall D.B., Bloomfield G.A., Rood J.I., Katz M.E. Identification of a gene encoding a bacteriophage-related integrase in a vap region of the Dichelobacter nodosus genome. Gene. 1995;162:53–58. doi: 10.1016/0378-1119(95)00315-w.
- 83.Katz M.E., Strugnell R.A., Rood J.I. Molecular characterization of a genomic region associated with virulence in Dichelobacter nodosus. Infect. Immun. 1992;60:4586–4592. doi: 10.1128/iai.60.11.4586-4592.1992.
- 84.Takai S., Hines S.A., Sekizaki T., Nicholson V.M., Alperin D.A., Osaki M., Takamatsu D., Nakamura M., Suzuki K., Ogino N., et al. DNA sequence and comparison of virulence plasmids from Rhodococcus equi ATCC 33701 and 103. Infect. Immun. 2000;68:6840–6847. doi: 10.1128/iai.68.12.6840-6847.2000.
- 85.Tomb J., White O., Kerlavage A.R., Clayton R.A., Sutton G.G., Fleischmann R.D., Ketchum K.A., Klenk H.P., Gill S., Dougherty B.A., et al. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature. 1997;388:539–547. doi: 10.1038/41483.
- 86.Katz M.E., Strugnell R.A., Rood J.I. Molecular characterization of a genomic region associated with virulence in Dichelobacter nodosus. Infect. Immun. 1992;60:4586–4592. doi: 10.1128/iai.60.11.4586-4592.1992.
- 87.Benoit S., Benachour A., Taouji S., Auffray Y., Hartke A. Induction of vap genes encoded by the virulence plasmid of Rhodococcus equi during acid tolerance response. Res. Microbiol. 2001;152:439–449. doi: 10.1016/s0923-2508(01)01217-7.
- 88.Galli D.M., LeBlanc D.J. Characterization of pVT736-1, a rolling-circle plasmid from the gram-negative bacterium Actinobacillus actinomycetemcomitans. Plasmid. 1994;31:148–157. doi: 10.1006/plas.1994.1016.
- 89.Kwon A.R., Kim J.H., Park S.J., Lee K.Y., Min Y.H., Im H., Lee I., Lee K.Y., Lee B.J. Structural and biochemical characterization of HP0315 from Helicobacter pylori as a VapD protein with an endoribonuclease activity. Nucleic Acids Res. 2012;40:4216–4228. doi: 10.1093/nar/gkr1305.
- 90.Makarova K.S., Grishin N.V., Shabalina S.A., Wolf Y.I., Koonin E.V. A putative RNA-interference-based immune system in prokaryotes: Computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action. Biol. Direct. 2006;1:7. doi: 10.1186/1745-6150-1-7.
- 91.Jang S.B., Kwon A.R., Son W.S., Park S.J., Lee B.J. Crystal structure of hypothetical protein HP0062 (O24902_HELPY) from Helicobacter pylori at 1.65 Å resolution. J. Biochem. 2009;146:535–540. doi: 10.1093/jb/mvp098.
- 92.Pallen M.J. The ESAT-6/WXG100 superfamily—And a new Gram-positive secretion system? Trends Microbiol. 2002;10:209–212. doi: 10.1016/s0966-842x(02)02345-4.
- 93.Plano G.V., Day J.B., Ferracci F. Type III export: New uses for an old pathway. Mol. Microbiol. 2001;40:284–293. doi: 10.1046/j.1365-2958.2001.02354.x.
- 94.Seo M.D., Park S.J., Kim H.J., Lee B.J. Solution structure of hypothetical protein, HP0495 (Y495_HELPY) from Helicobacter pylori. Proteins. 2007;67:1189–1192. doi: 10.1002/prot.21346.
- 95.Seo M.D., Park S.J., Kim H.J., Seok S.H., Lee B.J. Backbone 1H, 15N, and 13C resonance assignment and secondary structure prediction of HP0495 from Helicobacter pylori. J. Biochem. Mol. Biol. 2007;40:839–843. doi: 10.5483/bmbrep.2007.40.5.839.
- 96.Jang S.B., Ma C., Lee J.Y., Kim J.H., Park S.J., Kwon A.R., Lee B.J. NMR solution structure of HP0827 (O25501_HELPY) from Helicobacter pylori: Model of the possible RNA-binding site. J. Biochem. 2009;146:667–674. doi: 10.1093/jb/mvp105.
- 97.Bateman A., Birney E., Cerruti L., Durbin R., Etwiller L., Eddy S.R., Griffiths-Jones S., Howe K.L., Marshall M., Sonnhammer E.L. The Pfam protein families database. Nucleic Acids Res. 2002;30:276–280. doi: 10.1093/nar/30.1.276.
- 98.Kang S.J., Park S.J., Jung S.J., Lee B.J. Backbone 1H, 15N, and 13C resonance assignment of HP1242 from Helicobacter pylori. J. Biochem. Mol. Biol. 2005;38:591–594. doi: 10.5483/bmbrep.2005.38.5.591.
- 99.Kang S.J., Park S.J., Jung S.J., Lee B.J. Solution structure of HP1242 from Helicobacter pylori. Proteins. 2005;61:1111–1113. doi: 10.1002/prot.20654.
- 100.Aravind L., Koonin E.V. Novel predicted RNA-binding domains associated with the translation machinery. J. Mol. Evol. 1999;48:291–302. doi: 10.1007/pl00006472.
- 101.Kim J.H., Park S.J., Lee K.Y., Son W.S., Sohn N.Y., Kwon A.R., Lee B.J. Solution structure of hypothetical protein HP1423 (Y1423_HELPY) reveals the presence of alphaL motif related to RNA binding. Proteins. 2009;75:252–257. doi: 10.1002/prot.22335.
- 102.Copley S.D. Enzymes with extra talents: Moonlighting functions and catalytic promiscuity. Curr. Opin. Chem. Biol. 2003;7:265–272. doi: 10.1016/s1367-5931(03)00032-2.
- 103.Odermatt A., Suter H., Krapf R., Solioz M. Primary structure of two P-type ATPases involved in copper homeostasis in Enterococcus hirae. J. Biol. Chem. 1993;268:12775–12779.
- 104.Odermatt A., Solioz M. Two trans-acting metalloregulatory proteins controlling expression of the copper-ATPases of Enterococcus hirae. J. Biol. Chem. 1995;270:4349–4354. doi: 10.1074/jbc.270.9.4349.
- 105.Wunderli-Ye H., Solioz M. Effects of promoter mutations on the in vivo regulation of the cop operon of Enterococcus hirae by copper(I) and copper(II). Biochem. Biophys. Res. Commun. 1999;259:443–449. doi: 10.1006/bbrc.1999.0807.
- 106.Pufahl R.A., Singer C.P., Peariso K.L., Lin S., Schmidt P.J., Fahrni C.J., Culotta V.C., Penner-Hahn J.E., O’Halloran T.V. Metal ion chaperone function of the soluble Cu(I) receptor Atx1. Science. 1997;278:853–856. doi: 10.1126/science.278.5339.853.
- 107.Banci L., Bertini I., Ciofi-Baffoni S., Del Conte R., Gonnelli L. Understanding copper trafficking in bacteria: Interaction between the copper transport protein CopZ and the N-terminal domain of the copper ATPase CopA from Bacillus subtilis. Biochemistry. 2003;42:1939–1949. doi: 10.1021/bi027096p.
- 108.Beier D., Spohn G., Rappuoli R., Scarlato V. Identification and characterization of an operon of Helicobacter pylori that is involved in motility and stress adaptation. J. Bacteriol. 1997;179:4676–4683. doi: 10.1128/jb.179.15.4676-4683.1997.
- 109.Bayle D., Wangler S., Weitzenegger T., Steinhilber W., Volz J., Przybylski M., Schafer K.P., Sachs G., Melchers K. Properties of the P-type ATPases encoded by the copAP operons of Helicobacter pylori and Helicobacter felis. J. Bacteriol. 1998;180:317–329. doi: 10.1128/jb.180.2.317-329.1998.
- 110.Solioz M., Stoyanov J.V. Copper homeostasis in Enterococcus hirae. FEMS Microbiol. Rev. 2003;27:183–195. doi: 10.1016/S0168-6445(03)00053-6.
- 111.Park S.J., Jung Y.S., Kim J.S., Seo M.D., Lee B.J. Structural insight into the distinct properties of copper transport by the Helicobacter pylori CopP protein. Proteins. 2008;71:1007–1019. doi: 10.1002/prot.21957.
- 112.Vanden Boom T., Cronan J.E., Jr. Genetics and regulation of bacterial lipid metabolism. Annu. Rev. Microbiol. 1989;43:317–343. doi: 10.1146/annurev.mi.43.100189.001533.
- 113.Jones P.J., Holak T.A., Prestegard J.H. Structural comparison of acyl carrier protein in acylated and sulfhydryl forms by two-dimensional 1H NMR spectroscopy. Biochemistry. 1987;26:3493–3500. doi: 10.1021/bi00386a037.
- 114.Cronan J.E., Jr. Molecular properties of short chain acyl thioesters of acyl carrier protein. J. Biol. Chem. 1982;257:5013–5017.
- 115.Park S.J., Kim J.S., Son W.S., Lee B.J. pH-induced conformational transition of H. pylori acyl carrier protein: Insight into the unfolding of local structure. J. Biochem. 2004;135:337–346. doi: 10.1093/jb/mvh041.
- 116.Schulz H. On the structure-function relationship of acyl carrier protein of Escherichia coli. J. Biol. Chem. 1975;250:2299–2304.
- 117.Flaman A.S., Chen J.M., Van Iderstine S.C., Byers D.M. Site-directed mutagenesis of acyl carrier protein (ACP) reveals amino acid residues involved in ACP structure and acyl-ACP synthetase activity. J. Biol. Chem. 2001;276:35934–35939. doi: 10.1074/jbc.M101849200.
- 118.Keating D.H., Cronan J.E., Jr. An isoleucine to valine substitution in Escherichia coli acyl carrier protein results in a functional protein of decreased molecular radius at elevated pH. J. Biol. Chem. 1996;271:15905–15910. doi: 10.1074/jbc.271.27.15905.
- 119.Keating M.-M., Gong H., Byers D.M. Identification of a key residue in the conformational stability of acyl carrier protein. Biochim. Biophys. Acta. 2002;1601:208–214. doi: 10.1016/s1570-9639(02)00470-3.
- 120.Hanson A.D., Pribat A., Waller J.C., de Crécy-Lagard V. “Unknown” proteins and “orphan” enzymes: The missing half of the engineering parts list—and how to find it. Biochem. J. 2010;425:1–11. doi: 10.1042/BJ20091328.
- 121.Galperin M.Y., Koonin E.V. “Conserved hypothetical” proteins: Prioritization of targets for experimental study. Nucleic Acids Res. 2004;32:5452–5463. doi: 10.1093/nar/gkh885.
- 122.Galperin M.Y., Koonin E.V. From complete genome sequence to “complete” understanding? Trends Biotechnol. 2010;28:398–406. doi: 10.1016/j.tibtech.2010.05.006.
- 123.Frishman D. Protein annotation at genomic scale: The current status. Chem. Rev. 2007;107:3448–3466. doi: 10.1021/cr068303k.