The crystal structure of pyrimidine/thiamin biosynthesis precursor-like domain-containing protein CAE31940 from proteobacterium Bordetella bronchiseptica RB50, and evolutionary insight into the NMT1/THI5 family

Jacek Bajor; Karolina L Tkaczuk; Maksymilian Chruszcz; Hutton Chapman; Olga Kagan; Alexei Savchenko; Wladek Minor

doi:10.1007/s10969-014-9180-3

. Author manuscript; available in PMC: 2015 Jun 8.

Published in final edited form as: J Struct Funct Genomics. 2014 Jun 8;15(2):73–81. doi: 10.1007/s10969-014-9180-3

The crystal structure of pyrimidine/thiamin biosynthesis precursor-like domain-containing protein CAE31940 from proteobacterium Bordetella bronchiseptica RB50, and evolutionary insight into the NMT1/THI5 family

Jacek Bajor ¹, Karolina L Tkaczuk ², Maksymilian Chruszcz ³, Hutton Chapman ⁴, Olga Kagan ⁵, Alexei Savchenko ⁶, Wladek Minor ⁷

PMCID: PMC4115252 NIHMSID: NIHMS603328 PMID: 24908050

Abstract

We report a 2.0 Å structure of the CAE31940 protein, a proteobacterial NMT1/THI5-like domain-containing protein. We also discuss the primary and tertiary structure similarity with its homologs. The highly conserved FGGXMP motif was identified in CAE31940, which corresponds to the GCCCX motif located in the vicinity of the active center characteristic for THi5-like proteins found in yeast. This suggests that the FGGXMP motif may be a unique hallmark of proteobacterial NMT1/THI5-like proteins.

Keywords: Bordetella bronchiseptica RB50, NMT1/THI5-like domain-containing protein, Crystal structure, MCSG

Introduction

Thiamin (vitamin B1) consists of two components: the pyrimidine moiety (4-amino-5-hydroxymethyl-2-methylpyrimidine) and the thiazole moiety (5-(2-hydroxyethyl)-4-methylthiazole). The two moieties are produced by two separate biosynthetic processes, which are then covalently linked to yield thiamin phosphate [1, 2]. This process is well studied in prokaryotes but is still poorly understood in eukaryotes. Thiamin synthesis has been studied to some degree in yeast; in Saccharomyces cerevisiae the thi5 gene product THi5 is responsible for the synthesis of 4-amino-5-(hydroxymethyl)-2-methylpyrimidine phosphate in yeast [3–5]. THi5 appears to be conserved in eukaryotes with thiamin biosynthetic pathways [3–5].

THi5 belongs to a large superfamily known as the NMT1/THI5-like domain proteins (PFam entry PF09084, comprising 7,204 sequences). However, the majority of members of the NMT1/THI5-like superfamily are found in eubacteria, especially Proteobacteria (4,295 sequences in 1,354 species). While there is some structural information for the superfamily—for example, a homolog in Bacillus halodurans—the family is very diverse in sequence and few structural representatives are known, particularly among proteobacteria. Presented here are the structure and analyses of the two-domain protein CAE31940 from Bordetella bronchiseptica RB50 containing pyrimidine/thiamin biosynthesis precursor-like domain, which shed new light on potential proteins taking part in thiamin biosynthesis in this organism.

Materials and methods

Cloning, expression and purification

Selenomethionine (Se-Met) substituted CAE31940 protein was produced using standard MSCG protocols as described by Zhang et al. [6]. Briefly, gene BB1442 from Bordetella bronchiseptica RB50 was cloned into a p15TV LIC plasmid using ligation independent cloning [7–9]. The gene was overexpressed in E. coli BL21-CodonPlus(DE3)-RIPL cells in Se-Met-containing LB media at 37.0 °C until the optical density at 600 nm reached 1.2. Then the cells were induced by isopropyl-β-D-1-thiogalactopyranoside, incubated at 20.0 °C overnight, and pelleted by centrifugation. Harvested cells were sonicated in lysis buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5 % glycerol, and 5 mM imidazole), the lysed cells were spun down for 15 min at 16,000 RPM and the supernatant was applied to a nickel chelate affinity resin (Ni–NTA, Qiagen). The resin was washed with wash buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5 % glycerol, and 30 mM imidazole) and the protein was eluted using elution buffer (300 mM NaCl, 50 mM HEPES pH 7.5, 5 % glycerol, and 250 mM imidazole). The N-terminal polyhistidine tag (His-Tag) was removed by digestion with recombinant TEV protease and the digested protein was passed through a second affinity column. The flow through was dialyzed against a solution containing 300 mM NaCl, 10 mM HEPES pH 7.5 and 1 mMTCEP. Purified protein was concentrated to 36 mg/mL and flash-frozen in liquid nitrogen.

Crystallization

Crystals of CAE31940 used for data collection were grown by the sitting drop vapor diffusion method. The well solution consisted of 0.2 M ammonium acetate, 30 % w/v PEG4000, and 0.1 M tri-sodium citrate at pH 5.6. Crystals were grown at 293 K and formed after 1 week of incubation. Immediately after harvesting, crystals were transferred into cryoprotectant solution (Paratone-N) without mother liquor, washed twice in the solution and flash cooled in liquid nitrogen.

Data collection and processing

Data were collected at 100 K at the 19-ID beamline (ADSC Q315 detector) of the Structural Biology Center [10] at the Advanced Photon Source (Argonne National Laboratory, Argonne, Illinois, USA). The beamline was controlled by HKL-3,000 [11]. Diffraction data were processed with HKL-2,000 [11]. Data collection, structure determination, and refinement statistics are summarized in Table 1.

Table 1.

Crystallographic parameters, and data collection and refinement statistics

Organism	Bordetella bronchiseptica PDB code: 3QSL
Crystal
Space group	P2₁2₁2₁
Unit cell	a = 60.9 Å b = 65.0 Å, c = 160.7 Å α = β = γ = 90°
Solvent content (%)	44
Data collection
Diffraction protocol	SAD
Wavelength (Å)	0.9792
Resolution (Å)	50.00–2.00
Highest resolution shell (Å)	2.05–2.00
Unique reflections	42946
Redundancy	7.8 (6.4)
Completeness (%)	97.4 (80.8)
I/σ (I)	30.7 (2.2)
R_merge (%)	8.3 (71.0)
Refinement
R_work (%)	18.8 (32.3)
R_free (%)	23.1 (33.1)
RMSD for bond length (Å)	0.017
RMSD for bond angles (°)	1.6
Number of protein atoms	4,652
Number of water molecules	223
Ramachandran plot
Most favored regions (%)	98.1
Additional allowed regions (%)	1.9

Open in a new tab

Data for the highest resolution shell are given in parentheses

Ramachandran statistics were calculated with MOLPROBITY

R_merge was calculated for merged Friedel pairs

Structure solution and refinement

The structure of the Se-Met-substituted protein was solved using single-wavelength anomalous diffraction (SAD), and an initial model was built with HKL-3000. HKL-3000 is integrated with SHELXC/D/E [12], MLPHARE, DM, ARP/wARP, CCP4 [13], SOLVE, and RESOLVE [14]. The resulting model was further refined with REFMAC5 [15], and COOT [16]. MOLPROBITY [17] and ADIT [18] were used for structure validation. The coordinates and experimental structure factors were deposited to PDB with accession code 3QSL.

Bioinformatics analyses

Sequence homology searches were performed with PSI-BLAST [19], and structural homology searches were done with HHpred [20, 21] with amino acid sequence of CAE31940 as a seed and FATCAT Database Searches using the crystal structure of CAE31940 with PDB ID 3QSL. Structures were superposed with SSM [22]. The evolutionary history was inferred using the neighbor-joining method [23]. The bootstrap consensus tree inferred from 500 replicates [24] is taken to represent the evolutionary history of the taxa analyzed [24]. Branches corresponding to partitions reproduced in less than 50 % of bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) are shown next to the branches [24]. The evolutionary distances were computed using the Poisson correction method [25] and are in units of the number of amino acid substitutions per site. All positions containing gaps and missing data were eliminated from the dataset (complete deletion option). There were a total of 237 positions in the final dataset. Phylogenetic analyses were conducted with MEGA5 [26].

Homology modeling of Thi5 from Saccharomyces cerevisiae

The “FRankenstein’s Monster” method (in short, best fragment assembly mode) was used to construct a homology model of Thi5 from Saccharomyces cerevisiae S288c using the CAE31940 structure as a template. The method comprises cycles of local realignments in uncertain regions, building of alternative models and their evaluation, realignments in poorly scored regions, and merging of the best scoring fragments [27, 28]. PROQ [28] and MetaMQAP were used for the evaluation of models [29], which allowed the prediction of the deviations of individual residues in the homology model from their counterparts in the template structure.

Results and discussion

Overall structure

The two domain structure of the CAE31940 protein from Bordetella bronchiseptica RB50 was solved at a resolution of 2.0 Å. The protein crystallized in the orthorhombic P2₁2₁2₁ space group with two polypeptide chains in the asymmetric unit, which had the following unit cell dimensions: a = 60.9 Å, b = 65.0 Å, and c = 160.7 Å (all data collection and structure refinement parameters and statistics are shown in Table 1). The protein consists of 346 residues. Due to the lack of well-defined electron density, 27 amino acids at the N-terminus of chain A and 19 at the N-terminus of chain B could not be modeled. The final C-terminal residue of each chain is also missing in the solved structure.

The tertiary structure is composed of two β-sheets consisting of 5 β-strands each. The first one includes β strands 2↑-1↑-3↑-10↓-4↑ (the strands are numbered by their order in the primary sequence) and is flanked by 13 α-helices. The other β-sheet is composed of β-strands 7↑-6↑-8↑-5↓-9↑ and is surrounded by 6 α-helices. The overall structure of the protein is shown in Fig. 1.

Fig. 1 — Structure of CAE31940 from *Bordetella bronchiseptica* (PDB code: 3QSL) shown in cross-eyed stereo. Structure is colored by primary sequence, from *dark blue* at the N-terminus to red at the C-terminus. The N- and C-termini, as well as the secondary structure elements, are labeled

According to the predictions of the PISA server, the CAE31940 protein is monomeric in solution.

Evolutionary relatives and characteristics of CAE31940

Sequential and structural comparisons

Standard sequence searches with PSI-BLAST against a non-redundant protein sequence database found no homolog of CAE31940 with a known structural model. These searches, however, clearly showed that the most similar relatives (by sequence) of CAE31940 are NMT1/THI5-like domain-containing proteins. Thus subsequent structural homology searches were performed.

In order to find the most similar structural homologs of CAE31940, HHpred [21] and FATCAT [30] searches of the most up-to-date PDB database available (pdb70-Dec11) were performed. Their results show that CAE31940 structurally resembles eight proteins with an HHpred probability of 100 %: the alkanesulfonate-binding protein SsuA from Xanthomonas axonopodis (PDB code: 3KSX), periplasmic aliphatic sulphonate binding protein SsuA from Escherichia coli (PDB code: 2X26), ThiY periplasmic N-formyl-4-amino-5-aminomethyl-2-methylpyrimidine binding protein from Bacillus halodurans (PDB code: 3IX1), DSZB C27S mutant in complex with 2′-hydroxybiphenyl-2-sulfinic acid from Rhodococcus sp. (PDB code: 2DE3), periplasmic nitrate-binding protein NrtA from Synechocystis PCC 6803 (PDB code: 2G29), protein from Gloeobacter violaceus (PDB code: 2XQ7), ABC transporter (BDI_1369) from Parabacteroides distasonis (PDB code 3HN0), and bicarbonate transport protein CmpA from Synechocystis sp. PCC 6803 (PDB code: 2I49). Three of them (3KSX, 2X26, and 3IX1) belong to the NMT1/THI5-like family according to the PFam [31] annotations in the PDB database, and the remaining five are not assigned to any family. All of the identified structural homologs share comparatively low sequence identity with CAE31940, and according to the authors’ annotations for each, all act on different substrates.

Of the 5 best-scoring hits in the structural homology searches, the protein with the most similar sequence to CAE31940 is ThiY, as shown on Fig. 2, and also has the lowest RMSD resulting from structural superposition (Table 2). A recent paper by Bale and co-workers [3] discusses two structural homologs, ThiY (HMP binding protein; PDB code: 3IX1) and Thi5 (HMP-P synthase). (Though the crystal structure of Thi5 is not yet available, the authors created a homology model and discussed the similarities of the two proteins.) CAE31940 shares 24 and 26 % sequence identity and 4 and 46 % sequence similarity with ThiY and Thi5 respectively. This corresponds to BLAST expectation values of 1 × 10⁻⁴ and 3 × 10⁻¹³ respectively. This is rather high sequence similarity, and considering that the structure of ThiY is one of the most structurally similar proteins to CAE31940, ThiY superimposes on CAE31940 with an RMSD of 2.5 Å. It is also confirmed by the results of FATCAT structural homolog searches (using FATCAT Database Search utility) where ThiY ranks as the best hit for CAE31940 (FATCAT raw score is 603.22) and RMSD of 2.7 Å (as presented in Table 2). This may suggest that all three proteins are orthologs and originate from the same ancestor, which is further supported by the evolutionary analysis presented below.

Table 2.

Structurally-determined homologs of the CAE31940 protein

Name of the organism	PDB code	Protein	HHpred search
Name of the organism	PDB code	Protein	P value	E value	Sequence identity (%)	RMSD (Å)
Xantomonas axonopodis pv.citri	3KSX	SsuA	2.30E−40	5.80E−36	25.2	2.7
Escherichia coli	2X26	SsuA	2.90E−36	1.10E−40	24.7	3.1
Bacillus halodurans C-125	3IX1	ThiY	2.70E−35	1.10E−39	24.2	2.6
Rhodococcus sp.	2DE3	DszB	4.70E−37	1.90E−41	22.9	3.2
Synechocystis sp.	2G29	NrtA	1.30E−30	4.90E−35	25.6	2.6
Candida albicans	2X7Q	New family	1.00E−28	3.90E−33	54.5	2.6
Parabacteroides distasonis	3HN0	Nitrate transport	4.30E−29	1.70E−33	23.8	3.3
Synechocystis sp.	2I49	CmpA	5.10E−27	2.00E−31	26.5	2.8

Name of the organism	PDB code	Protein	FATCAT results
Name of the organism	PDB code	Protein	Score	P value	Chain RMSD (Å)	Opt RMSD (Å)
Bacillus halodurans C-125	3IX1	ThiY	603.22	0.00E+00	2.7	3.0
Xantomonas axonopodis pv.citri	3E4R	SsuA	556.21	0.00E+00	2.9	3.0
Synechocystis sp	2G29	NrtA	545.82	1.11E−16	2.9	3.0
Synechocystis sp	2I48	CmpA	527.35	2.11E−16	3.0	3.1
Archaeoglobus fulgidus	1ZBM	AF1704	377.69	3.80E−12	3.5	3.1
Thermus thermophilus HB8	2CZL	MqnD	381.37	5.50E−12	3.9	3.2
Escherichia coli	2X26	SsuA	523.72	2.84E−11	4.1	3.2
Streptomyces coelicolor	2NXO	SCO4506	344.48	6.67E−11	3.4	3.1

Open in a new tab

The FATCAT probability cutoff was less than 5.00 × 10⁻²

Structural alignment of the collected structures shows that they superimpose poorly (apart from their central β-sheet): even after removing the most variable regions (β5-β9 and α6-α11) (Fig. 3b), the pairwise RMSD of the most structurally similar proteins is 2.55 Å (as shown in Table 2; Fig. 3). However, the motif FGGXMP in CAE31940 corresponds to the previously identified GCCCX motif in Thi5 from Saccharomyces cerevisiae [3] (conserved in the THI5 family of enzymes) when CAE31940 and the homology model of Thi5 are super-imposed the aforementioned motifs are similar in both configuration and in their location in the 3-D structure (Fig. 4). Sequence analysis also shows that the FGGXMP motif is conserved in all proteobacterial sequences related to CAE31940. Therefore this motif may be a hallmark of proteobacterial members of NMT1/THI5-like domain-containing proteins. The exact role of GCCCX motif in THI5 family of enzymes is not well described. Bale et al. [3] in their work mention that it is conserved in this family and may behave like related cysteine rich sequences (e.g., CX3CX2C) and bind [4Fe-4S] clusters involved in electron transfer or radical reactions.

Fig. 3 — Superposition of the most structurally similar homologs. a and b present C^α superposition of CAE31940 with its most structurally similar homologs. In A: CAE31940 PDB ID: 3QSL (*dark blue*), with ThiY (PDB ID 3IX1; *green*), and CA3427 (PDB ID 2X7Q; *violet*); in B: CAE31940 (PDB ID: 3QSL; *dark blue*) with DSZB (PDB ID 2DE3; *orange*) and SsuA (PDB ID 2X26; *cyan*). c, d shows only the superposition of domain I of respective structures, which gives a better view of the degree of structural similarity in the superposition of single domains from all of the known structures

Fig. 4 — Superposition of CAE31940 with ThiY and the homology model of Thi5. The close-up is a rotated (as shown by the *arrow*) image along the *vertical axis* for better presentation of the detailed position of the conserved amino acids. The conserved residues involved in pyrimidine ring binding in ThiY are shown in stick representation and labels are marked with *asterisks* for Thi5 (model, *green*) and ThiY (crystal structure, *magenta*), and residues corresponding to this cluster in CAE31940 (crystal structure, *navy blue*) are marked with *hashes* (#). The conserved motifs GCCCXC and FGGXMP are highlighted and labeled. FAMP stands for N-formyl-4- amino-5-(aminomethyl)-2-methylpyrimidine, the ligand present in ThiY as reported by Bale et al. [3]

Evolutionary insight into NMT1/THI5 family

The cladogram presented in Fig. 5 and the multiple sequence alignment (MSA) in Fig. 2 clearly shows three well divided groups (marked in different colors on the MSA). Group A (blue group in Fig. 2) is composed of mainly α/β proteobacterial proteins including the target protein CAE31940, characterized by the FGGX(M/N)P motif. Group B (black group in Fig. 2) contains Thi5-like fungal proteins, with a GCCCX motif instead of the aforementioned FGGXMP motif, as well as ThiY (PDB code: 3IX1). As described previously by Bale and co-workers [3], ThiY is closely related to Thi5. Group C (green group in Fig. 2) contains structural homologs (mainly SsuA) of CAE31940, which act on the sulfonate group in proteins. Group C proteins lack both of the sequential motifs present in groups A and B.

As expected, the Thi5 and ThiY proteins fall into the same group (group B) in the cladogram, together with other fungal homologs of Thi5. The conserved Candida albicans CA3427 gene product (PDB code: 2X7Q) groups together with them. CA3427, as described by the authors of its structure [32], represents a new family of proteins exhibiting a generic periplasmic binding protein structural fold. Unlike Thi5 and ThiY, it contains neither the conserved motif nor conserved residues responsible for pyrimidine ring binding. However, it is a fungal protein, like most of the others in group B.

Almost all of the homologs of known structure are more remotely related (i.e. in groups B and C) to CAE31940 than the more similar homologs identified in group A (Fig. 5.), which were annotated as NMT1/THI5-like domain-containing proteins (blue group in Fig. 2). That can partially be explained by the source organisms of the proteins grouping with CAE31940—they are all proteobacteria, while almost all of the other homologs of known structure come from other phyla. The only exception is a protein from Xanthomonas axonopodis (PDB code: 3KSX) from γ-proteobacteria which falls into a separate branch (group B) in Fig. 5. The X. axonopodis protein belongs to the SsuA family and contains neither of the conserved sequential motifs identified. Group C forms a well-separated branch with high statistical support which is far apart from NMT1/THI5 group A. This indicated that the described structure of CAE31940 is the first crystal structure of a NMT1/THI5-like domain-containing protein with the FGGXMP motif conserved in proteobacteria. The function of this domain is still unknown, but due to its structural and sequence similarity to the pyrimidine/thiamin biosynthesis precursor proteins Thi5 and ThiY, it might serve a similar enzymatic function.

Protein data bank accession code

Coordinates and structure factors of CAE31940 have been deposited in the RCSB Protein Data Bank with accession code 3QSL.

Acknowledgments

This research was funded with Federal funds from the National Institute of Health under grants U54-GM074942 and U54-GM094585. The results shown in this report are derived from work performed at Argonne National Laboratory, at the Structural Biology Center of the Advanced Photon Source. Argonne is operated by University of Chicago Argonne, LLC, for the U.S. Department of Energy, Office of Biological and Environmental Research under contract DE-AC02-06CH11357. We thank Dr. Matthew D. Zimmerman for critically reading the manuscript.

Contributor Information

Jacek Bajor, Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA. Midwest Center for Structural Genomics, USA, URL: http://www.mcsg.anl.gov/.

Karolina L. Tkaczuk, Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA. Midwest Center for Structural Genomics, USA, URL: http://www.mcsg.anl.gov/

Maksymilian Chruszcz, Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA. Midwest Center for Structural Genomics, USA, URL: http://www.mcsg.anl.gov/.

Hutton Chapman, Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA. Midwest Center for Structural Genomics, USA, URL: http://www.mcsg.anl.gov/.

Olga Kagan, Midwest Center for Structural Genomics, USA, URL: http://www.mcsg.anl.gov/. Banting and Best Department of Medical Research, University of Toronto, 112 College Street, Toronto, ON M5G 1L6, Canada.

Alexei Savchenko, Midwest Center for Structural Genomics, USA, URL: http://www.mcsg.anl.gov/. Banting and Best Department of Medical Research, University of Toronto, 112 College Street, Toronto, ON M5G 1L6, Canada.

Wladek Minor, Email: wladek@iwonka.med.virginia.edu, Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA. Midwest Center for Structural Genomics, USA, URL: http://www.mcsg.anl.gov/.

References

1.Zurlinden A, Schweingruber ME. Cloning, nucleotide sequence, and regulation of Schizosaccharomyces pombe thi4, a thiamine biosynthetic gene. J Bacteriol. 1994;176:6631–6635. doi: 10.1128/jb.176.21.6631-6635.1994. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Begley TP, Chatterjee A, Hanes JW, Hazra A, Ealick SE. Cofactor biosynthesis–still yielding fascinating new biological chemistry. Curr Opin Chem Biol. 2008;12(2):118–125. doi: 10.1016/j.cbpa.2008.02.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Bale S, Rajashankar KR, Perry K, Begley TP, Ealick SE. HMP binding protein ThiY and HMP-P synthase THI5 are structural homologues. Biochemistry. 2010;49(41):8929–8936. doi: 10.1021/bi101209t. [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Maundrell K. nmt1 of fission yeast. A highly transcribed gene completely repressed by thiamine. J Biol Chem. 1990;265(19):10857–10864. [PubMed] [Google Scholar]
5.Wightman R, Meacock PA. The THI5 gene family of Saccharomyces cerevisiae: distribution of homologues among the hemiascomycetes and functional redundancy in the aerobic biosynthesis of thiamin from pyridoxine. Microbiology. 2003;149(Pt 6):1447–1460. doi: 10.1099/mic.0.26194-0. [DOI] [PubMed] [Google Scholar]
6.Zhang RG, Skarina T, Katz JE, Beasley S, Khachatryan A, Vyas S, Arrowsmith CH, Clarke S, Edwards A, Joachimiak A, et al. Structure of thermo toga maritima stationary phase survival protein SurE: a novel acid phosphatase. Structure. 2001;9(11):1095–1106. doi: 10.1016/s0969-2126(01)00675-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Eschenfeldt WH, Lucy S, Millard CS, Joachimiak A, Mark ID. A family of LIC vectors for high-throughput cloning and purification of proteins. Methods Mol Biol. 2009;498:105–115. doi: 10.1007/978-1-59745-196-3_7. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Aslanidis C, de Jong PJ. Ligation-independent cloning of PCR products (LIC-PCR) Nucleic Acids Res. 1990;18(20):6069–6074. doi: 10.1093/nar/18.20.6069. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Haun RS, Serventi IM, Moss J. Rapid, reliable ligation-independent cloning of PCR products using modified plasmid vectors. Biotechniques. 1992;13(4):515–518. [PubMed] [Google Scholar]
10.Rosenbaum G, Alkire RW, Evans G, Rotella FJ, Lazarski K, Zhang RG, Ginell SL, Duke N, Naday I, Lazarz J, et al. The Structural Biology Center 19ID undulator beamline: facility specifications and protein crystallographic results. J Synchrotron Radiat. 2006;13(Pt 1):30–45. doi: 10.1107/S0909049505036721. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Minor W, Cymborowski M, Otwinowski Z, Chruszcz M. HKL-3000: the integration of data reduction and structure solution—from diffraction images to an initial model in minutes. Acta Crystallogr D Biol Crystallogr. 2006;62:859–866. doi: 10.1107/S0907444906019949. [DOI] [PubMed] [Google Scholar]
12.Sheldrick GM. A short history of SHELX. Acta Crystallogr A. 2008;64:112–122. doi: 10.1107/S0108767307043930. [DOI] [PubMed] [Google Scholar]
13.Bjellqvist B, Basse B, Olsen E, Celis JE. Reference points for comparisons of 2-dimensional maps of proteins from different human cell-types defined in a Ph scale where isoelectric points correlate with polypeptide compositions. Electrophoresis. 1994;15(3–4):529–539. doi: 10.1002/elps.1150150171. [DOI] [PubMed] [Google Scholar]
14.Terwilliger T. SOLVE and RESOLVE: automated structure solution, density modification, and model building. J Synchrotron Radiat. 2004;11:49–52. doi: 10.1107/s0909049503023938. [DOI] [PubMed] [Google Scholar]
15.Murshudov GN, Skubak P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, Winn MD, Long F, Vagin AA. REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr. 2011;67(Pt 4):355–367. doi: 10.1107/S0907444911001314. [DOI] [PMC free article] [PubMed] [Google Scholar]
16.Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60:2126–2132. doi: 10.1107/S0907444904019158. [DOI] [PubMed] [Google Scholar]
17.Chen VB, Arendall WB, III, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, Murray LW, Richardson JS, Richardson DC. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D. 2010;66:12–21. doi: 10.1107/S0907444909042073. [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Yang HW, Guranovic V, Dutta S, Feng ZK, Berman HM, Westbrook JD. Automated and accurate deposition of structures solved by X-ray diffraction to the Protein Data Bank. Acta Crystallogr D Biol Crystallogr. 2004;60:1833–1839. doi: 10.1107/S0907444904019419. [DOI] [PubMed] [Google Scholar]
19.Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402. doi: 10.1093/nar/25.17.3389. [DOI] [PMC free article] [PubMed] [Google Scholar]
20.Soding J. Protein homology detection by HMM–HMM comparison. Bioinformatics. 2005;21(7):951–960. doi: 10.1093/bioinformatics/bti125. [DOI] [PubMed] [Google Scholar]
21.Soding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005;33(Web Server issue):W244–W248. doi: 10.1093/nar/gki408. [DOI] [PMC free article] [PubMed] [Google Scholar]
22.Krissinel E, Henrick K. Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D Biol Crystallogr. 2004;60(Pt 12 Pt 1):2256–2268. doi: 10.1107/S0907444904026460. [DOI] [PubMed] [Google Scholar]
23.Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4(4):406–425. doi: 10.1093/oxfordjournals.molbev.a040454. [DOI] [PubMed] [Google Scholar]
24.Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–791. doi: 10.1111/j.1558-5646.1985.tb00420.x. [DOI] [PubMed] [Google Scholar]
25.Zuckerkandl E, Pauling L. Evolutionary divergence and convergence in proteins. Academic Press; New York: 1965. [Google Scholar]
26.Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–2739. doi: 10.1093/molbev/msr121. [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Kosinski J, Cymerman IA, Feder M, Kurowski MA, Sasin JM, Bujnicki JM. A “FRankenstein’s monster” approach to comparative modeling: merging the finest fragments of Fold-Recognition models and iterative model refinement aided by 3D structure evaluation. Proteins. 2003;53(Suppl 6):369–379. doi: 10.1002/prot.10545. [DOI] [PubMed] [Google Scholar]
28.Wallner B, Elofsson A. Prediction of global and local model quality in CASP7 using Pcons and ProQ. Proteins. 2007;69(Suppl 8):184–193. doi: 10.1002/prot.21774. [DOI] [PubMed] [Google Scholar]
29.Pawlowski M, Gajda MJ, Matlak R, Bujnicki JM. Meta-MQAP: a meta-server for the quality assessment of protein models. BMC Bioinformatics. 2008;9:403. doi: 10.1186/1471-2105-9-403. [DOI] [PMC free article] [PubMed] [Google Scholar]
30.Li Z, Ye Y, Godzik A. Flexible structural neighbourhood: a database of proteins structural similarities and alignments. Nucleic Acids Res. 2006;34:D277–D280. doi: 10.1093/nar/gkj124. [DOI] [PMC free article] [PubMed] [Google Scholar]
31.Sonnhammer EL, Eddy SR, Durbin R. Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins. 1997;28(3):405–420. doi: 10.1002/(sici)1097-0134(199707)28:3<405::aid-prot10>3.0.co;2-l. [DOI] [PubMed] [Google Scholar]
32.Santini S, Claverie JM, Mouz N, Rousselle T, Maza C, Monchois V, Abergel C. The conserved Candida albicans CA3427 gene product defines a new family of proteins exhibiting the generic periplasmic binding protein structural fold. PLoS ONE. 2011;6(4):e18528. doi: 10.1371/journal.pone.0018528. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R1] 1.Zurlinden A, Schweingruber ME. Cloning, nucleotide sequence, and regulation of Schizosaccharomyces pombe thi4, a thiamine biosynthetic gene. J Bacteriol. 1994;176:6631–6635. doi: 10.1128/jb.176.21.6631-6635.1994. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] 2.Begley TP, Chatterjee A, Hanes JW, Hazra A, Ealick SE. Cofactor biosynthesis–still yielding fascinating new biological chemistry. Curr Opin Chem Biol. 2008;12(2):118–125. doi: 10.1016/j.cbpa.2008.02.006. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R3] 3.Bale S, Rajashankar KR, Perry K, Begley TP, Ealick SE. HMP binding protein ThiY and HMP-P synthase THI5 are structural homologues. Biochemistry. 2010;49(41):8929–8936. doi: 10.1021/bi101209t. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R4] 4.Maundrell K. nmt1 of fission yeast. A highly transcribed gene completely repressed by thiamine. J Biol Chem. 1990;265(19):10857–10864. [PubMed] [Google Scholar]

[R5] 5.Wightman R, Meacock PA. The THI5 gene family of Saccharomyces cerevisiae: distribution of homologues among the hemiascomycetes and functional redundancy in the aerobic biosynthesis of thiamin from pyridoxine. Microbiology. 2003;149(Pt 6):1447–1460. doi: 10.1099/mic.0.26194-0. [DOI] [PubMed] [Google Scholar]

[R6] 6.Zhang RG, Skarina T, Katz JE, Beasley S, Khachatryan A, Vyas S, Arrowsmith CH, Clarke S, Edwards A, Joachimiak A, et al. Structure of thermo toga maritima stationary phase survival protein SurE: a novel acid phosphatase. Structure. 2001;9(11):1095–1106. doi: 10.1016/s0969-2126(01)00675-x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] 7.Eschenfeldt WH, Lucy S, Millard CS, Joachimiak A, Mark ID. A family of LIC vectors for high-throughput cloning and purification of proteins. Methods Mol Biol. 2009;498:105–115. doi: 10.1007/978-1-59745-196-3_7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R8] 8.Aslanidis C, de Jong PJ. Ligation-independent cloning of PCR products (LIC-PCR) Nucleic Acids Res. 1990;18(20):6069–6074. doi: 10.1093/nar/18.20.6069. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] 9.Haun RS, Serventi IM, Moss J. Rapid, reliable ligation-independent cloning of PCR products using modified plasmid vectors. Biotechniques. 1992;13(4):515–518. [PubMed] [Google Scholar]

[R10] 10.Rosenbaum G, Alkire RW, Evans G, Rotella FJ, Lazarski K, Zhang RG, Ginell SL, Duke N, Naday I, Lazarz J, et al. The Structural Biology Center 19ID undulator beamline: facility specifications and protein crystallographic results. J Synchrotron Radiat. 2006;13(Pt 1):30–45. doi: 10.1107/S0909049505036721. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] 11.Minor W, Cymborowski M, Otwinowski Z, Chruszcz M. HKL-3000: the integration of data reduction and structure solution—from diffraction images to an initial model in minutes. Acta Crystallogr D Biol Crystallogr. 2006;62:859–866. doi: 10.1107/S0907444906019949. [DOI] [PubMed] [Google Scholar]

[R12] 12.Sheldrick GM. A short history of SHELX. Acta Crystallogr A. 2008;64:112–122. doi: 10.1107/S0108767307043930. [DOI] [PubMed] [Google Scholar]

[R13] 13.Bjellqvist B, Basse B, Olsen E, Celis JE. Reference points for comparisons of 2-dimensional maps of proteins from different human cell-types defined in a Ph scale where isoelectric points correlate with polypeptide compositions. Electrophoresis. 1994;15(3–4):529–539. doi: 10.1002/elps.1150150171. [DOI] [PubMed] [Google Scholar]

[R14] 14.Terwilliger T. SOLVE and RESOLVE: automated structure solution, density modification, and model building. J Synchrotron Radiat. 2004;11:49–52. doi: 10.1107/s0909049503023938. [DOI] [PubMed] [Google Scholar]

[R15] 15.Murshudov GN, Skubak P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, Winn MD, Long F, Vagin AA. REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr. 2011;67(Pt 4):355–367. doi: 10.1107/S0907444911001314. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R16] 16.Emsley P, Cowtan K. Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr. 2004;60:2126–2132. doi: 10.1107/S0907444904019158. [DOI] [PubMed] [Google Scholar]

[R17] 17.Chen VB, Arendall WB, III, Headd JJ, Keedy DA, Immormino RM, Kapral GJ, Murray LW, Richardson JS, Richardson DC. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D. 2010;66:12–21. doi: 10.1107/S0907444909042073. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] 18.Yang HW, Guranovic V, Dutta S, Feng ZK, Berman HM, Westbrook JD. Automated and accurate deposition of structures solved by X-ray diffraction to the Protein Data Bank. Acta Crystallogr D Biol Crystallogr. 2004;60:1833–1839. doi: 10.1107/S0907444904019419. [DOI] [PubMed] [Google Scholar]

[R19] 19.Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402. doi: 10.1093/nar/25.17.3389. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R20] 20.Soding J. Protein homology detection by HMM–HMM comparison. Bioinformatics. 2005;21(7):951–960. doi: 10.1093/bioinformatics/bti125. [DOI] [PubMed] [Google Scholar]

[R21] 21.Soding J, Biegert A, Lupas AN. The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res. 2005;33(Web Server issue):W244–W248. doi: 10.1093/nar/gki408. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R22] 22.Krissinel E, Henrick K. Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D Biol Crystallogr. 2004;60(Pt 12 Pt 1):2256–2268. doi: 10.1107/S0907444904026460. [DOI] [PubMed] [Google Scholar]

[R23] 23.Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4(4):406–425. doi: 10.1093/oxfordjournals.molbev.a040454. [DOI] [PubMed] [Google Scholar]

[R24] 24.Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution. 1985;39:783–791. doi: 10.1111/j.1558-5646.1985.tb00420.x. [DOI] [PubMed] [Google Scholar]

[R25] 25.Zuckerkandl E, Pauling L. Evolutionary divergence and convergence in proteins. Academic Press; New York: 1965. [Google Scholar]

[R26] 26.Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–2739. doi: 10.1093/molbev/msr121. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R27] 27.Kosinski J, Cymerman IA, Feder M, Kurowski MA, Sasin JM, Bujnicki JM. A “FRankenstein’s monster” approach to comparative modeling: merging the finest fragments of Fold-Recognition models and iterative model refinement aided by 3D structure evaluation. Proteins. 2003;53(Suppl 6):369–379. doi: 10.1002/prot.10545. [DOI] [PubMed] [Google Scholar]

[R28] 28.Wallner B, Elofsson A. Prediction of global and local model quality in CASP7 using Pcons and ProQ. Proteins. 2007;69(Suppl 8):184–193. doi: 10.1002/prot.21774. [DOI] [PubMed] [Google Scholar]

[R29] 29.Pawlowski M, Gajda MJ, Matlak R, Bujnicki JM. Meta-MQAP: a meta-server for the quality assessment of protein models. BMC Bioinformatics. 2008;9:403. doi: 10.1186/1471-2105-9-403. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R30] 30.Li Z, Ye Y, Godzik A. Flexible structural neighbourhood: a database of proteins structural similarities and alignments. Nucleic Acids Res. 2006;34:D277–D280. doi: 10.1093/nar/gkj124. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R31] 31.Sonnhammer EL, Eddy SR, Durbin R. Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins. 1997;28(3):405–420. doi: 10.1002/(sici)1097-0134(199707)28:3<405::aid-prot10>3.0.co;2-l. [DOI] [PubMed] [Google Scholar]

[R32] 32.Santini S, Claverie JM, Mouz N, Rousselle T, Maza C, Monchois V, Abergel C. The conserved Candida albicans CA3427 gene product defines a new family of proteins exhibiting the generic periplasmic binding protein structural fold. PLoS ONE. 2011;6(4):e18528. doi: 10.1371/journal.pone.0018528. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

The crystal structure of pyrimidine/thiamin biosynthesis precursor-like domain-containing protein CAE31940 from proteobacterium Bordetella bronchiseptica RB50, and evolutionary insight into the NMT1/THI5 family

Jacek Bajor

Karolina L Tkaczuk

Maksymilian Chruszcz

Hutton Chapman

Olga Kagan

Alexei Savchenko

Wladek Minor

Abstract

Introduction