Author manuscript; available in PMC: 2009 Dec 11.
Published in final edited form as: Proteins. 2006 Jul 1;64(1):280–283. doi: 10.1002/prot.20910

Structure of Phage Protein BC1872 from Bacillus cereus, a Singleton With New Fold

R Zhang 1, G Joachimiak 1, S Jiang 1, A Cipriani 1, F Collart 1, A Joachimiak 1,*
PMCID: PMC2792010  NIHMSID: NIHMS143366  PMID: 16596646

Introduction

Phagelike and prophage elements appear in many sequenced bacterial genomes and may be associated with functionally important characteristics such as serotype conversion, pathogenesis, and phage immunity.1 Some prophage genes may have a profound effect on the host, altering properties such as the virulence of nonpathogenic hosts and increasing the virulence of pathogenic hosts.2 VSH-1, a phagelike element in Brachyspira spirochetes, is a recognized mechanism for gene transfer between B. hyodysenteriae cells.3 Phages of this type are found to enter long-term relationships with their hosts by integrating their genome into the host chromosome and to exist as autonomous linear prophages within the host genome.

The genome of Bacillus cereus strain ATCC 14579/DSM 31 codes for a probacteriophage, phBC6A51. The prophage region spans 61,395 base pairs and contains 75 ORFs, many of which are described as hypothetical proteins. It appears that the prophage provides some evolutionary advantage to B. cereus; therefore, studies of its proteins are important. Because a protein structure can provide useful functional information, determination of the structures of these proteins is of high priority.4 We have determined the crystal structure of BC1872, a prophage phBC6A51 protein from B. cereus, by using the single-wavelength anomalous diffraction (SAD) method. The protein forms a β/α/β three-layer sandwich structure and represents a new protein fold. The functional unit of BC1872 appears to be a homodimer, and this structure provides new information about the possible function of the protein.

Materials and Methods

Cloning, protein purification, and crystallization

The ORF of the B. cereus phage protein BC1872 was amplified from genomic DNA with KOD DNA polymerase by using conditions and reagents provided by the vendor (Novagen, Madison, WI). The gene was cloned into a pMCSG7 vector by using a modified ligation-independent cloning protocol.5, 6 This process generated an expression clone producing a fusion protein with an N-terminal His6 tag and a TEV protease recognition site (ENLYFQ↓S). The fusion protein was overproduced in an E. coli BL21 derivative harboring a plasmid encoding three rare tRNAs (Arg [AGG/AGA] and Ile [ATA]). A selenomethionine (SeMet) derivative of the expressed protein was prepared as described by Walsh et al.7 For purification, IPTG-induced bacterial cells were resuspended in binding buffer (500 mM NaCl, 5% glycerol, 20 mM HEPES, pH 8.0, 10 mM imidazole, and 10 mM β-mercaptoethanol). The cells were lysed by the addition of lysozyme to 1 mg/mL in the presence of a protease inhibitor cocktail (Sigma P8849; 0.25 mL/5 g cells), followed by sonication for 3 min. After clarification by centrifugation (30 min at 30,000 g) and passage through a 0.2-µm filter, the lysate was applied to a 5-mL HisTrap HP column (Amersham Biosciences) and purified on an AKTAexplorer system. The protein was eluted from the column with 50 mM HEPES, 500 mM NaCl, 250 mM imidazole, 10 mM 2-mercaptoethanol, and 5% glycerol. The purified protein was concentrated with simultaneous buffer exchange (20 mM HEPES, pH 7.5, 200 mM NaCl, 2 mM DTT) by using a Centricon Plus 20 concentrator with a 5-kDa cutoff (Millipore Inc., Bedford, MA). Before crystallization, any particulate matter was removed from the sample by passage through a 0.2-µm Ultrafree-MC centrifugal filtration device (Millipore).

Protein crystallization

The protein was crystallized by vapor diffusion in sitting drops by mixing 1 µL of the His6-tagged protein at 32.5 mg/mL with 1 µL of 0.1 M MES, 0.2 M Ca(OAc)2, pH 6.0, and 20% PEG 8000, and equilibrating at 298 K over a 135-µL reservoir of this solution. Crystals that appeared after 24 h were flash-frozen in liquid nitrogen, with crystallization buffer plus 28% sucrose as a cryoprotectant, before data collection. The crystals belong to the monoclinic space group P21 with unit cell dimensions of a = 37.00 Å, b = 66.34 Å, c = 53.97 Å, α = γ = 90.00°, β = 95.29° and diffract to 2.0 Å resolution.

Data collection

Diffraction data were collected at 100 K at the 19ID beam line of the Structural Biology Center at the Advanced Photon Source, Argonne National Laboratory. SAD data to 2.0 Å (peak: 0.9798 Å; edge: 0.9795 Å; and high-energy remote: 0.9465 Å) were collected from a single SeMet-labeled protein crystal (0.2 × 0.2 × 0.2 mm) by using the inverse-beam approach. The SAD data sets were collected with 8-s exposure/2.0°/frame (rotation of ω) at a crystal-to-detector distance of 225 mm. The total oscillation range covered 280° in ω (140° with φ = 0° and the other 140° with φ = 180°), which was 30° more than predicted by the strategy module within the HKL2000 suite.8 All data were processed and scaled with HKL2000 (Table I).
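As a quick arithmetic check on the collection parameters quoted above, the following Python sketch converts the three wavelengths to photon energies and counts the frames implied by a 280° sweep at 2.0° per frame. The constant hc ≈ 12.39842 keV·Å is a standard physical value, not a number from the manuscript, and the function names are illustrative only. The computed exposure time counts beam-on time alone and assumes no detector readout overhead.

```python
# hc in keV*Å; a standard physical constant, NOT taken from the manuscript.
HC_KEV_ANGSTROM = 12.39842


def wavelength_to_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Å to a photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def frame_count(total_range_deg, frame_width_deg):
    """Number of frames needed to cover a rotation range at a fixed width."""
    return round(total_range_deg / frame_width_deg)


# Wavelengths as quoted in the Data collection section.
wavelengths = {"peak": 0.9798, "edge": 0.9795, "remote": 0.9465}
energies = {name: wavelength_to_kev(w) for name, w in wavelengths.items()}

n_frames = frame_count(280.0, 2.0)   # 280° total, 2.0° per frame
exposure_s = n_frames * 8.0          # 8-s exposure per frame, beam-on time only

for name, energy in energies.items():
    print(f"{name}: {energy:.3f} keV")
print(f"frames: {n_frames}, total exposure: {exposure_s:.0f} s")
```

All three energies fall in the 12.6 to 13.1 keV range, consistent with data collection around the selenium K absorption edge used for SeMet phasing.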

TABLE I.

Summary of Crystal and SAD Data

SAD data collection
  Unit cell                   a = 37.00 Å, b = 66.34 Å, c = 53.97 Å, α = γ = 90.00°, β = 95.29°
  Space group                 P21
  MW, Da (residues)           12,466 (110)
  Mol (AU)                    2
  SeMet (AU)                  6
  Wavelength (Å)              0.9798
  Resolution range (Å)        50.00-2.00 (2.07-2.00)
  No. of unique reflections   34,558
  Completeness (%)            99.1
  Rmerge (%)                  9.8 (0.518)
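The cell parameters and asymmetric-unit contents in Table I allow a rough estimate of crystal packing density. The Python sketch below computes the monoclinic cell volume (V = a·b·c·sin β), the Matthews coefficient, and an approximate solvent fraction. Several inputs are assumptions rather than statements from the manuscript: that the tabulated molecular weight applies to each chain as crystallized, that P21 contributes two asymmetric units per cell, and that the conventional factor of 1.23 Å3/Da for a typical protein holds. The manuscript itself does not report these derived values.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (Å^3) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, mw_da, mol_per_au, au_per_cell):
    """Matthews coefficient V_M (Å^3/Da): cell volume per dalton of protein."""
    return cell_volume / (mw_da * mol_per_au * au_per_cell)


def solvent_fraction(v_m, protein_factor=1.23):
    """Approximate solvent fraction; 1.23 Å^3/Da is a conventional assumption."""
    return 1.0 - protein_factor / v_m


# Values from Table I; au_per_cell = 2 is assumed for space group P21.
volume = monoclinic_cell_volume(37.00, 66.34, 53.97, 95.29)
v_m = matthews_coefficient(volume, mw_da=12466, mol_per_au=2, au_per_cell=2)
solvent = solvent_fraction(v_m)

print(f"cell volume: {volume:.0f} Å^3")
print(f"Matthews coefficient: {v_m:.2f} Å^3/Da")
print(f"estimated solvent fraction: {solvent:.2f}")
```

Under these assumptions the estimate comes out near 2.6 Å3/Da, roughly half solvent, which is in the range commonly seen for protein crystals and is compatible with two molecules per asymmetric unit.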

Structure determination and refinement

The structure was determined by SAD phasing by using HKL2000_PH, SHELXC, SHELXD, SHELXE, MLPHARE, and SOLVE/RESOLVE9 and was refined to 2.00 Å with REFMAC10 against the peak data. The initial partial model was autotraced by using RESOLVE, and more residues were added by using ARP/wARP.11 Manual model building and adjustment of residues using COOT10 were required to complete the model. Several rounds of manual rebuilding in COOT and refinement in REFMAC were carried out until the model converged, with a final R of 22.3% and a free R of 26.4% (Table II). Electron density contoured at 1.0 σ is well connected for most of the main chain. Eight residues in subunit A and seven in subunit B from the N-terminal affinity tag are well ordered and visible in the electron density. The C-termini of both subunits are less well ordered: one residue in subunit A (102) and six residues in subunit B (97–102) could not be fitted into electron density. The stereochemistry of the structure was checked with PROCHECK12 and the Ramachandran plot. The main-chain torsion angles of all residues lie in the most favored or additional allowed regions (Table II).
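For readers less familiar with the refinement statistics quoted above, the conventional crystallographic R-factor is R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, and the free R is the same quantity evaluated on a test set of reflections excluded from refinement. The Python sketch below illustrates only this definition. The amplitudes and test-set flags are invented toy values, and the code is not REFMAC's implementation, which among other things scales Fcalc to Fobs before comparison.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, free_flags):
    """Return (R_work, R_free); free_flags marks test-set reflections."""
    work = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if not f]
    free = [(o, c) for o, c, f in zip(f_obs, f_calc, free_flags) if f]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Invented toy amplitudes; NOT data from the BC1872 structure.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 50.0, 45.0, 25.0]
free_flags = [False, False, True, False, False]  # hypothetical test set

r_work, r_free = work_and_free_r(f_obs, f_calc, free_flags)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the test-set reflections never influence the model, a free R a few points above the working R, as in the 22.3% and 26.4% reported here, is the usual pattern.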

TABLE II.

Crystallographic Statistics

Phasing
                          Centric                 Acentric                All
  Resolution range (Å)    FOM     Phasing power   FOM     Phasing power   Number   FOM     Phasing power
  30.0-2.1                0       1.857           0.2786  1.458           28616    0.2725  1.467
Density Modification      0.81
Refinement
  Resolution range (Å) 24.90-2.00
  No. of reflections 34,558
  Cutoff 0
  R-value (%) 22.3
  Free R-value (%) 26.4
  RMSD from ideal geometry
    Bond length (1–2) (Å) 0.008
    Angle (°) 1.3
    Dihedral (°) 23.2
    Improper (°) 0.83
  No. of atoms
    Protein 1690
    Water 102
  Mean B-factor (Å2) 33.60
  Ramachandran plot statistics (%)
    Residues in most favored regions 87.6
    Residues in additional allowed regions 12.4
    Residues in disallowed region 0.0

Data deposition

Atomic coordinates of phage protein BC1872 from B. cereus ATCC 14579 have been deposited into the Protein Data Bank (PDB) as 1R7L.

Discussion

The BC1872 subunit forms a three-layer β/α/β sandwich (Fig. 1). Three α-helices, α1, α2, and α3, are sandwiched on one side by a three-stranded antiparallel β-sheet (β3, β4, and β5) and on the other by a β-hairpin (β1 and β2). In the monomer, the N- and C-termini end up ~10 Å apart. Monomers associate to form a dimer that may be functionally relevant. The two small sheets combine to form a continuous six-stranded β-sheet. Additional interactions are provided by side chains of α2 and α3. At the interface between monomers, a well-defined hydrophobic core is formed by residues Ile41, Val48, Val55, Val57, Val68, Trp45, and Tyr66. Two finger-like protrusions project away from this dimerization unit. The fingers are formed by a β-hairpin (β1 and β2) and the loop between α1 and β1. In the dimer, the tips of the fingers are ~50 Å apart and are rich in residues with H-bonding potential (Glu18, Lys20, Lys26, Asp27, Arg29, Tyr30, Arg31), most of which are positively charged.

Fig. 1.


Structure of the BC1872 dimer. The N- and C-termini and secondary structure elements are labeled. One subunit is in red and the other is in blue.

Analysis of nucleic acid-binding templates using the ProFunc server13 revealed two motifs in BC1872. One motif (Asn7, Ser43, and Lys8) matches residues Asn26, Ser70, and Lys27 in the structure of ribosomal protein SSU/S20 (PDB ID = 1FJG) with an RMSD of 1.82 Å. Asn26 and Ser70 coordinate phosphate groups in an RNA hairpin, and Lys27 also binds to a phosphate group. The second motif (Arg4, Asn7, and Ala11) matches residues Arg144, Asn147, and Ala150 in human fos protein with an RMSD of 2.30 Å. This motif is involved in binding to bases and phosphate of B-DNA and may confer sequence specificity.

BC1872 has a unique sequence and is a true singleton; a BLAST14 search of all nonredundant GenBank CDS retrieved no sequence homologs for this protein. The SUPERFAMILY HMM program, which searches against a library of Hidden Markov Models (HMMs) derived from Structural Classification of Proteins Database (SCOP) superfamilies, found a single short motif between residues 34 and 92 that matches the characteristic motif 52317 of the class I glutamine amidotransferase-like superfamily under the flavodoxin-like SCOP fold.15,16 The flavodoxin-like fold consists of three layers and a five-stranded parallel β-sheet.17 However, BC1872 has no structural and most likely no functional relationship with glutamine amidotransferase.

The BC1872 protein also produced no matches in InterProScan, which searches for sequence motifs in a database of protein families, domains, and functional sites.18 However, the Secondary Structure Matching (SSM) program retrieved three structural matches with a root-mean-square deviation (RMSD) of <2.10 Å.19 The NMR structure of the zinc-ribbon domain within translation initiation factor 2 subunit beta had three secondary structure elements (SSEs) of the eight total elements in chain B that superimposed with an RMSD of 1.68 Å between the Cα atoms of the matched residues and a Z-score of 5.7. The NMR structure of the third dsRNA binding domain from Drosophila melanogaster maternal effect protein (staufen) in complex with an RNA hairpin (PDB ID = 1EKZ and 1DI2) had five matching SSEs and an RMSD of 1.65 Å. The solution structure of the DNA-binding (GCC-box binding) domain of ATERF1 protein (PDB ID = 3GCC) from Arabidopsis thaliana had four matching SSEs and an RMSD of 2.07 Å. It is of interest that all structural matches involve nucleic acid-binding proteins. On further analysis, however, these proteins matched <50% of the secondary structures of the BC1872 protein, leading us to believe that our structure represents a novel fold. Additional analysis using the DALI server confirmed this assertion.
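The RMSD values quoted above summarize how closely matched Cα atoms coincide once two structures are superimposed. As a minimal illustration of the quantity itself, the Python sketch below evaluates RMSD = sqrt(Σ d_i² / N) over paired coordinates. It assumes the two coordinate sets are already optimally superimposed and paired one to one; SSM performs its own matching and superposition, which this sketch does not attempt to reproduce. The coordinates are invented toy values, not atoms from BC1872 or any PDB entry.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as input) between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and atoms are paired
    by index; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Invented toy C-alpha positions in Å (hypothetical, for illustration only).
query_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
target_ca = [(0.0, 1.0, 0.0), (3.8, 0.0, 2.0), (7.6, 0.0, 2.0)]

toy_rmsd = rmsd(query_ca, target_ca)
print(f"RMSD over {len(query_ca)} matched atoms: {toy_rmsd:.2f} Å")
```

Because the value depends on how many atoms are matched, an RMSD is most informative when read together with the number of matched elements, as the text does by reporting matching SSEs alongside each RMSD.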

We attempted to dock BC1872 to B-DNA and found that the protein could indeed form a complex with a DNA duplex by using the major groove. Figure 2 shows that the fingers of BC1872 can interact with the edges of bases and the phosphate groups of B-DNA. Our data suggest that BC1872 may bind nucleic acid; however, its detailed function needs further investigation.

Fig. 2.


Docking of the BC1872 dimer to B-DNA. BC1872 was manually docked with B-DNA from the structure of the TraR/DNA complex (PDB ID = 1L3L).

Acknowledgments

The authors thank all members of the Structural Biology Center at Argonne National Laboratory for their help in conducting experiments.

Grant sponsor: National Institutes of Health; Grant number: GM62414; Grant sponsor: U.S. Department of Energy, Office of Biological and Environmental Research; Grant number: W-31-109-Eng-38.

Footnotes

The submitted manuscript has been created by the University of Chicago as Operator of Argonne National Laboratory (“Argonne”) under Contract No. W-31-109-ENG-38 with the U.S. Department of Energy. The U.S. Government retains for itself, and others acting on its behalf, a paid-up, nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.

REFERENCES

1. Mehta P, Casjens S, Krishnaswamy S. Analysis of the lambdoid prophage element e14 in the E. coli K-12 genome. BMC Microbiol. 2004;4:4. doi: 10.1186/1471-2180-4-4.
2. Gaidelyte A, Jaatinen ST, Daugelavicius R, Bamford JKH, Bamford DH. The linear double-stranded DNA of phage Bam35 enters lysogenic host cells, but the late phage functions are suppressed. J Bacteriol. 2005;187:3521–3527. doi: 10.1128/JB.187.10.3521-3527.2005.
3. Stanton TB, Thompson MG, Humphrey SB, Zuerner RL. Detection of bacteriophage VSH-1 svp38 gene in Brachyspira spirochetes. FEMS Microbiol Lett. 2003;224:225–229. doi: 10.1016/S0378-1097(03)00438-5.
4. Casjens S. Prophages and bacterial genomics: what have we learned so far? Mol Microbiol. 2003;49:277–300. doi: 10.1046/j.1365-2958.2003.03580.x.
5. Stols L, Gu M, Dieckman L, Raffen R, Collart FR, Donnelley MI. A new vector for high-throughput, ligation-independent cloning encoding a tobacco etch virus protease cleavage site. Protein Expr Purif. 2002;25:8–15. doi: 10.1006/prep.2001.1603.
6. Dieckman L, Gu M, Stols L, Donnelley MI, Collart FR. High throughput methods for gene cloning and expression. Protein Expr Purif. 2002;25:1–7. doi: 10.1006/prep.2001.1602.
7. Walsh MA, Dementieva I, Evans G, Sanishvili R, Joachimiak A. Taking MAD to the extreme: ultrafast protein structure determination. Acta Crystallogr D Biol Crystallogr. 1999;55:1168–1173. doi: 10.1107/s0907444999003698.
8. Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, Jiang JS, Kuszewski J, Nilges M, Pannu NS, Read RJ, Rice LM, Simonson T, Warren GL. Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr. 1998;54:905–921. doi: 10.1107/s0907444998003254.
9. Terwilliger TC. SOLVE and RESOLVE: automated structure solution and density modification. Methods Enzymol. 2003;374:22–37. doi: 10.1016/S0076-6879(03)74002-6.
10. The CCP4 suite: programs for protein crystallography. Acta Crystallogr D Biol Crystallogr. 1994;50:760–763. doi: 10.1107/S0907444994003112.
11. Morris RJ, Perrakis A, Lamzin VS. ARP/wARP and automatic interpretation of protein electron density maps. Methods Enzymol. 2003;374:229–244. doi: 10.1016/S0076-6879(03)74011-7.
12. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr. 1993;26:283–291.
13. Laskowski RA, Watson JD, Thornton JM. ProFunc: a server for predicting protein function from 3D structure. Nucleic Acids Res. 2005;33:W89–W93. doi: 10.1093/nar/gki414.
14. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
15. Gough J, Karplus K, Hughey R, Chothia C. Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J Mol Biol. 2001;313:903–919. doi: 10.1006/jmbi.2001.5080.
16. Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol. 1995;247:536–540. doi: 10.1006/jmbi.1995.0159.
17. Madera M, Vogel C, Kummerfeld SK, Chothia C, Gough J. The SUPERFAMILY database in 2004: additions and improvements. Nucleic Acids Res. 2004;32:D235–D239. doi: 10.1093/nar/gkh117.
18. Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Barrell D, Bateman A, Binns D, Biswas M, Bradley P, Bork P, Bucher P, Copley RR, Courcelle E, Das U, Durbin R, Falquet L, Fleischmann W, Griffiths-Jones S, Haft D, Harte N, Hulo N, Kahn D, Kanapin A, Krestyaninova M, Lopez R, Letunic I, Lonsdale D, Silventoinen V, Orchard SE, Pagni M, Peyruc D, Ponting CP, Selengut JD, Servant F, Sigrist CJA, Vaughan R, Zdobnov EM. The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res. 2003;31:315–318. doi: 10.1093/nar/gkg046.
19. Krissinel E, Henrick K. Protein structure comparison in 3D based on secondary structure matching (SSM) followed by Cα alignment, scored by a new structural similarity function. In: Kungl AJ, Kungl PJ, editors. Proceedings of the 5th International Conference on Molecular Structural Biology; September 3–7; Vienna. p. 88.
