Author manuscript; available in PMC: 2012 Feb 15.
Published in final edited form as: Arch Biochem Biophys. 2010 Dec 6;506(2):150–156. doi: 10.1016/j.abb.2010.12.001

Inaugural structure from the DUF3349 superfamily of proteins, Mycobacterium tuberculosis Rv0543c

Garry W Buchko a,*, Isabelle Phan b, Peter J Myler b,c, Thomas C Terwilliger d, Chang-Yub Kim d,*
PMCID: PMC3035944  NIHMSID: NIHMS257784  PMID: 21144816

Abstract

The first structure for a member of the DUF3349 (PF11829) family of proteins, Rv0543c from Mycobacterium tuberculosis, has been determined using NMR-based methods, and some of its biophysical properties have been characterized. Rv0543c is a 100-residue, 11.3 kDa protein that both size exclusion chromatography and NMR spectroscopy show to be a monomer in solution. The structure of the protein consists of a bundle of five α-helices, α1 (M1 - Y16), α2 (P21 - C33), α3 (S37 - G52), α4 (G58 - H65) and α5 (S72 - G87), held together by a largely conserved group of hydrophobic amino acid side chains. Heteronuclear steady-state {1H}-15N NOE, T1, and T2 values are similar throughout the sequence, indicating that the backbones of the five helices are in a single motional regime. Circular dichroism spectroscopy indicates that Rv0543c unfolds irreversibly upon heating, with an estimated melting temperature of 62.5°C. While the biological function of Rv0543c is still unknown, the presence of DUF3349 proteins predominantly in Mycobacterium and Rhodococcus bacterial species suggests that Rv0543c may have a biological function unique to these bacteria and, consequently, may prove to be an attractive drug target to combat tuberculosis.

Keywords: tuberculosis, circular dichroism, structural genomics, protein dynamics

Introduction

Approximately 1.6 million people die annually from infection with Mycobacterium tuberculosis, the aetiological agent responsible for the chronic infectious disease tuberculosis (TB) [1]. Via inhalation of aerosolized bacilli as droplet nuclei exhaled by coughing infected individuals [2], almost 10 million people become newly infected annually [1]. Indeed, it is estimated that one-third of the human population would have a positive skin test for the disease [3]. While TB is held in check by effective public health care systems in the Western world, it is endemic in 22 “high-burden” nations [4]. Due to the evolution of multi-drug and extremely drug-resistant M. tuberculosis strains, along with the high prevalence of human immunodeficiency virus infections [5–7], there is a fear that the industrialized world may lose its ability to control this disease [4]. Consequently, it is of fundamental importance to develop a new generation of intervention strategies against TB [8].

The current TB control strategy employed by the World Health Organization (WHO) is based on Directly Observed Treatment and Short-course drug therapy (DOTS) [9]. In addition to saving the lives of infected individuals, early diagnosis and treatment of the disease interrupts transmission. A major focus in the search for new anti-TB drug therapies is a molecular understanding of the M. tuberculosis gene products and the interaction of various metabolic pathways in the microenvironment in the host that allow M. tuberculosis to survive [4]. One highly conserved gene product present in M. tuberculosis and other Mycobacterium and Rhodococcus species is a small protein of 100 amino acid residues, Rv0543c. This protein contains a “Domain of Unknown Function” that falls into the DUF3349 (PF11829) superfamily of proteins. To date there is no known function for this protein and a structure has not been solved for any of the DUF3349 family members. To obtain further insights into the biological function of Rv0543c and the other proteins in the DUF3349 superfamily, the solution structure for Rv0543c was determined using NMR-based methods, its thermostability assayed using circular dichroism spectroscopy, and the dynamics of the protein explored by NMR spectroscopy.

Materials and methods

Cloning, expression, and purification

The DNA coding sequence for the M. tuberculosis Rv0543c gene was cloned into a modified pET28b vector (Novagen, Madison, WI) upstream of the NdeI site such that the expressed protein contained an N-terminal poly-histidine tag (MGSSHHHHHHSSGLVPRGSH-). This recombinant vector was then transformed into the host Escherichia coli bacterial strain BL21PRO (Clontech, Palo Alto, CA). Uniformly 15N- and 15N-, 13C-labeled Rv0543c was obtained by growing the transformed cells (37°C) in minimal medium (Miller) containing 15NH4Cl (1 mg/mL) and D-[13C6]glucose (2.0 mg/mL) supplemented with FeCl3 (50 μg/mL) and the antibiotics kanamycin (34 μg/mL) and spectinomycin (100 μg/mL). Once the cells reached an OD600 reading of ~0.8, they were cooled to 25°C and protein expression was induced with isopropyl β-D-1-thiogalactopyranoside (0.026 μg/mL). Cells were harvested 4 – 6 h later and frozen at −80°C. Following established protocols [10,11], the frozen pellet was later thawed, the cells were passed three times through a French press, and the cell lysate was applied to a Ni-NTA affinity column (Qiagen, Valencia, CA). The fraction containing the target protein was exchanged by dialysis into thrombin cleavage buffer (150 mM NaCl, 20 mM Tris-HCl, pH 8.4) and concentrated to ~2 mL, and the N-terminal poly-histidine tag was removed by overnight incubation with thrombin at room temperature. Using a flow rate of 1.0 mL/min, the reaction solution was then loaded on a Superdex75 HiLoad 10/30 column (GE Healthcare, Piscataway, NJ) to simultaneously purify the protein and exchange it into NMR buffer (300 mM NaCl, 20 mM Tris-HCl, 1.0 mM dithiothreitol, pH 7.1). The band containing Rv0543c (retention time = 81 min) was collected and the volume reduced (Amicon Centriprep-10) to generate NMR samples in the 1 – 2 mM range (Lowry analysis). SDS-PAGE analysis of the final NMR samples showed the protein to be more than ~95% pure.

Optical spectroscopy

Circular dichroism data were collected on an Aviv Model 410 spectropolarimeter (Lakewood, NJ) calibrated with an aqueous solution of ammonium d-(+)camphorsulfonate. Measurements used a 0.06 mM Rv0543c sample in NMR buffer in a quartz cell of 0.1 cm path length. A thermal denaturation curve was obtained by recording and plotting the ellipticity at 220 nm in 2.0°C intervals from 10 to 80°C. A melting temperature, Tm, was estimated from the thermal denaturation curve by taking the first derivative of the plot using the Aviv software [12]. Steady-state wavelength spectra for Rv0543c were recorded in 0.5 nm increments between 200 and 260 nm at 25°C, 80°C, and 25°C (post-heating). Each reported steady-state wavelength spectrum was the result of averaging two consecutive scans with a bandwidth of 1.0 nm and a time constant of 1.0 s. These spectra were processed by subtracting a blank spectrum from the protein spectrum and then automatically line smoothing the data using the Aviv software.

NMR spectroscopy and resonance assignments

Varian 750- and 600-Inova spectrometers equipped with triple resonance probes and pulse field gradients were used to collect the NMR data required for resonance assignments and structure determination. The NMR data, collected on 1 – 2 mM samples at 20°C, were processed with Felix2007 (Felix NMR, Inc., San Diego, CA) and analyzed with Sparky (v3.115) [13]. Chemical shifts were referenced to DSS (DSS = 0 ppm) using indirect methods [14].

The 1H, 13C, and 15N chemical shifts of the backbone and side chain resonances were obtained from standard two-dimensional 1H-15N HSQC, 1H-13C HSQC, HBCBCGCDHD, and HBCBCGCDCHE experiments and three-dimensional HNCACB, CBCA(CO)NH, HNCO, HCC-TOCSY-NNH and CC-TOCSY-NNH experiments using Varian Protein-pack pulse programs. Distance restraints were obtained from a suite of three-dimensional 13C- and 15N-edited NOESY-HSQC experiments using a mixing time of 80 ms. Deuterium-exchange studies were performed by lyophilizing a 15N-labeled NMR sample, re-dissolving it in 99.8% D2O, and collecting 1H-15N HSQC spectra 10, 20, and 60 minutes after the exchange. Steady-state {1H}-15N heteronuclear NOE values (NOE = Isat/Iunsat) were measured in triplicate from the ratios of 1H-15N HSQC cross peak volumes in spectra recorded in the presence (Isat) and absence (Iunsat) of three seconds of proton presaturation prior to the 15N excitation pulse [15–17]. Nitrogen-15 T1 and T2 values were measured using previously described experiments [15] with seven different time delays (s): 0.0, 0.1, 0.2, 0.3, 0.4, 0.6, and 0.8 (T1); 0.01, 0.03, 0.05, 0.07, 0.09, 0.11, and 0.13 (T2). Values for T1 and T2 and an estimate of their associated errors were obtained using the rate analysis function in Sparky (v3.115) [13] by fitting the measured peak heights to a two-parameter exponential function, I(t) = I0 exp(−t/T1,2). An overall rotational correlation time (τc) for Rv0543c was estimated in two ways: from backbone amide 15N T1/T1ρ ratios measured using a modified 1H-15N HSQC experiment to record 15N-edited one-dimensional spectra [18,19], and from the average of all the individual amide nitrogen-15 T1/T2 ratios. The steady-state {1H}-15N heteronuclear NOE, T1, and T2 data were collected at a 1H resonance frequency of 600 MHz (20°C).
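The two-parameter fit I(t) = I0 exp(−t/T1,2) described above can be illustrated with a minimal sketch. It is not the Sparky rate analysis routine: it assumes a simple linear least-squares fit of ln(I) against t, and the peak heights are synthetic, noise-free values generated for a hypothetical T1 of 0.595 s. Only the T1 delay list is taken from the text; the function name and all other values are illustrative.

```python
import math

def fit_exponential_decay(delays, heights):
    """Fit I(t) = I0 * exp(-t / T) by linear least squares on ln(I) versus t.

    Returns (I0, T). All peak heights must be positive.
    """
    n = len(delays)
    logs = [math.log(h) for h in heights]
    mean_t = sum(delays) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays, logs))
    slope = sxy / sxx                    # slope equals -1 / T
    intercept = mean_y - slope * mean_t  # intercept equals ln(I0)
    return math.exp(intercept), -1.0 / slope

# T1 relaxation delays (s) listed in the text.
t1_delays = [0.0, 0.1, 0.2, 0.3, 0.4, 0.6, 0.8]

# Synthetic peak heights (hypothetical I0 = 100, T1 = 0.595 s), not measured data.
synthetic_heights = [100.0 * math.exp(-t / 0.595) for t in t1_delays]

i0_fit, t1_fit = fit_exponential_decay(t1_delays, synthetic_heights)
print(f"I0 = {i0_fit:.1f}, T1 = {t1_fit * 1000:.0f} ms")
```

With real, noisy peak heights a non-linear fit that weights each point by its uncertainty would usually be preferred, since the log transform inflates the influence of the weak, late-delay points.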

Structure calculations

Structure calculations were performed iteratively using CYANA (v 2.1) [20], the 1H, 13C, and 15N chemical shift assignments, and the peak-picked NOESY data as initial experimental inputs. Sixty-six dihedral angle restraints for both Phi (Φ) and Psi (Ψ) were introduced on the basis of the elements of secondary structure identified in the early structural ensembles and TALOS calculations [21]. Near the end of the iterative process 64 hydrogen bond restraints (1.8 – 2.0 Å and 2.7 – 3.0 Å for the NH–O and N-O distances, respectively) were introduced into the structure calculations on the basis of proximity in early structure calculations and the observation of slowly exchanging amides in the deuterium exchange experiment.

The final ensemble of 20 CYANA-derived structures was then refined with explicit water [22] with CNS (version 1.1) using force constants of 500 and 2000 kcal for the NOE and dihedral restraints, respectively. For the water refinement calculations the upper boundary of the CYANA distance restraints was increased by 1% and the lower bound was set to the van der Waals limit. The fifteen water-refined structures with the fewest and smallest unfavorable contacts (best clash scores) were then used to calculate a mean structure and average RMSD values to the mean structure using the structured region of the protein (residues M1 - G87). Structural quality was assessed using the Protein Structure Validation Suite (PSVS, v1.3) [23]. The structural statistics are summarized in Table 1.
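The "RMSD to the mean structure" statistic mentioned above can be sketched as follows. This is an illustration only, not the procedure used by the validation software: it assumes the models have already been superimposed on the ordered residues, and the two-atom toy ensemble and all function names are hypothetical.

```python
import math

def mean_structure(ensemble):
    """Average each atom position across all models (models must be pre-superimposed)."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate lists."""
    squared = sum(
        (a[k] - b[k]) ** 2 for a, b in zip(coords_a, coords_b) for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))

def rmsd_to_mean(ensemble):
    """Return (average, standard deviation) of the per-model RMSD to the mean structure."""
    mean = mean_structure(ensemble)
    values = [rmsd(model, mean) for model in ensemble]
    avg = sum(values) / len(values)
    var = sum((v - avg) ** 2 for v in values) / len(values)
    return avg, math.sqrt(var)

# Toy ensemble (hypothetical): two models of two atoms, offset by 2 Å along y.
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],
]
avg_rmsd, sd_rmsd = rmsd_to_mean(toy_ensemble)
print(f"RMSD to mean = {avg_rmsd:.2f} ± {sd_rmsd:.2f} Å")
```

In practice the same calculation would be run separately on the backbone atoms, the heavy atoms, and all atoms of the ordered residues to obtain the three values reported in Table 1.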

Table 1.

Summary of the structural statistics for Rv0543c (a)

Restraints for Structure Calculations
  Total NOEs                                    1118
  Intraresidue NOEs                             364
  Sequential (i, i + 1) NOEs                    304
  Medium-range (i, i + j; 1 < j ≤ 4) NOEs       209
  Long-range (i, i + j; j > 4) NOEs             241
  Phi (Φ) angle restraints                      66
  Psi (Ψ) angle restraints                      66
  Hydrogen bond restraints (b)                  64

Structure Calculations
  Number of structures calculated               100
  Number of structures used in ensemble         15

Structures with Restraint Violations
  Distance restraint violations > 0.04 Å        0
  Dihedral restraint violations > 1°            0

RMSD to Mean (Å), Ordered Residues 1–87
  Backbone N-Cα-C=O atoms                       0.64 ± 0.08
  Heavy atoms                                   1.14 ± 0.09
  All atoms                                     1.31 ± 0.05

Ramachandran Plot Summary from Procheck, Ordered Residues (c)
  Most favored regions                          92.3%
  Additionally allowed regions                  7.5%
  Generously allowed regions                    0.2%
  Disallowed regions                            0.0%

Global Quality Scores, Ordered Residues (c)     Z-score (Raw)
  Procheck (all)                                −0.41 (−0.07)
  Procheck (Φ, Ψ)                               0.90 (0.14)
  MolProbity clash score                        −2.70 (24.60)

(a) All statistics are for the 15-structure ensemble deposited in the Protein Data Bank (2KVC) using the residues containing the structured central core (1–87).

(b) Obtained from 32 amide cross peaks still present 60 minutes after the start of the deuterium exchange experiment: I8-Y16, G19, V28-C33, L36, V41-A45, E47-M49, V76, E77, E79-L83.

(c) Calculated for the ordered residues, 1–87, using the PSVS program.

Protein structure accession numbers

The atomic coordinates for the final ensemble of 15 structures for M. tuberculosis Rv0543c have been deposited in the Research Collaboratory for Structural Bioinformatics (RCSB) under PDB code 2KVC. The chemical shift assignments have been deposited with the Biological Magnetic Resonance Data Bank (BMRB) under accession number 16774. Note that the three non-native residues at the N-terminus, GSH-, are numbered sequentially starting with G1 in the RCSB and BMRB depositions. Here, however, the three non-native residues are numbered sequentially with an asterisk (G1*-H3*) and the first native residue, M4 in the RCSB and BMRB depositions, is labeled as M1.
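The numbering convention just described amounts to a fixed offset of three residues between the depositions and the text. A minimal sketch of the conversion is given below; the function names are hypothetical and are not part of any PDB or BMRB tool.

```python
TAG_REMNANT = "GSH"          # non-native residues remaining at the N-terminus
OFFSET = len(TAG_REMNANT)    # 3: deposited residue 4 is native residue 1

def deposited_to_paper(res_num):
    """Map a residue number in the deposited files to the label used in the text.

    Non-native residues are returned with an asterisk (for example 'G1*');
    native residues are returned as their native sequence number (as a string).
    """
    if res_num < 1:
        raise ValueError("deposited numbering starts at 1")
    if res_num <= OFFSET:
        return f"{TAG_REMNANT[res_num - 1]}{res_num}*"
    return str(res_num - OFFSET)

def paper_to_deposited(native_num):
    """Map a native residue number used in the text to the deposited numbering."""
    if native_num < 1:
        raise ValueError("native numbering starts at 1")
    return native_num + OFFSET

print(deposited_to_paper(1), deposited_to_paper(3), deposited_to_paper(4))
print(paper_to_deposited(1))
```

Applying the offset consistently matters when comparing residue ranges quoted in the text with coordinates or chemical shifts taken directly from the deposited files.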

Bioinformatics analyses

A search for structures similar to Rv0543c in the RCSB Protein Data Bank was performed using the DALI server (http://ekhidna.biocenter.helsinki.fi/dali_server/) [24]. The amino acid conservation pattern on the three-dimensional structure of Rv0543c was calculated using the ConSurf server (http://consurf.tau.ac.il/overview.html) [25]. A multiple sequence alignment of Rv0543c with ten closely related protein sequences from the DUF3349 (PF11829) superfamily, chosen from genera other than Mycobacterium and Rhodococcus, was performed using the program ClustalW2 [26]. Figures of structures were generated using PyMOL [27].

Results and discussion

Solution structure for Rv0543c

The elution time of Rv0543c on a size exclusion column, 81 min, and estimations of the rotational correlation time at 20°C, 9.3 ± 0.3 ns (modified 1H-15N HSQC experiment) and 8.2 ± 2.3 ns (T1/T2 ratios), were consistent with a monomeric 11.3 kDa protein [28]. The wide chemical shift dispersion in the nitrogen and especially the proton dimension of the 1H-15N HSQC spectrum of Rv0543c, shown in Figure 1, indicates that the protein adopts a stable folded conformation in solution. All but six of the 96 expected amide resonances were unambiguously assigned in the 1H-15N HSQC spectrum of Rv0543c. Amide cross peaks were not identified for D53-D55, V76, L90, and E98. On the basis of the amide assignments and extensive assignment of the 13Cα and side chain proton and carbon chemical shifts (BMRB ID 16774), an ensemble of structures was calculated (Figure 2A) that satisfied all the available experimental NMR data (NOEs, chemical shifts, deuterium exchange experiments, and TALOS calculations).

Figure 1.

Assigned 1H-15N HSQC spectrum of double-labeled Rv0543c collected at 20°C in NMR buffer (300 mM NaCl, 20 mM Tris-HCl, 1.0 mM DTT, pH 7.1) at a 1H resonance frequency of 750 MHz. Side chain resonances are indicated by solid horizontal lines and the two unassigned cross peaks are labeled with a question mark. Not shown are the ring amide resonances for W11 (9.481 and 131.7 ppm) and W88 (10.35 and 130.1 ppm).

Figure 2.

A) Superposition of the final ensemble of 15 structures calculated for Rv0543c. B) Ribbon representation of the structure in the ensemble closest to the average structure for Rv0543c color-coded using ConSurf. C) Secondary structure diagram of Rv0543c with the α-helices drawn as red rectangles with the residue number at the beginning and the end of each helix shown.

As summarized in Table 1, the final set of structure calculations included 1118 interproton distance restraints (NOE data), 64 hydrogen bond restraints (deuterium exchange data), and 132 dihedral angle restraints (TALOS calculations). While the final set of 20 CYANA derived structures contained no unfavorable steric clashes, refinement with explicit water introduced some unfavorable steric clashes into the structures despite extensive efforts adjusting the refinement parameters. In the end the 15 structures with the best clash scores following water refinement were chosen for the ensemble submitted to the RCSB PDB. Each member of this final ensemble agreed well with the experimental data with no upper limit violation greater than 0.04 Å and no torsion angle violation greater than 1°. The quality of the structure ensemble was also shown to be good using the PSVS validation software package [23]. The Ramachandran statistics for all the residues in the ensemble were overwhelmingly in acceptable space (92.3% of the (φ,ψ) pairs for Rv0543c were found in the most favored regions and 7.5% within additionally allowed regions) and all the structure quality Z-scores were acceptable (greater than minus five) including the final MolProbity clash score of −2.70.
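As a quick consistency check, the restraint and Ramachandran figures quoted above can be tallied against the category values listed in Table 1. The sketch below uses only numbers from the table; the variable names are illustrative.

```python
# NOE distance restraints by category, as listed in Table 1.
noe_restraints = {
    "intraresidue": 364,
    "sequential (i, i + 1)": 304,
    "medium-range (1 < j <= 4)": 209,
    "long-range (j > 4)": 241,
}
total_noes = sum(noe_restraints.values())

# Dihedral angle restraints: phi and psi are listed separately in Table 1.
dihedral_restraints = {"phi": 66, "psi": 66}
total_dihedral = sum(dihedral_restraints.values())

# Ramachandran percentages: most favored, additionally allowed,
# generously allowed, disallowed.
ramachandran_percent = [92.3, 7.5, 0.2, 0.0]
ramachandran_total = round(sum(ramachandran_percent), 1)

# Share of the NOE restraints that are long-range.
long_range_fraction = noe_restraints["long-range (j > 4)"] / total_noes

print(total_noes, total_dihedral, ramachandran_total)
print(f"long-range NOEs: {100 * long_range_fraction:.1f}% of all NOE restraints")
```

The four NOE categories sum to the 1118 restraints quoted in the text, and the 66 Φ plus 66 Ψ restraints give the 132 dihedral angle restraints.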

The final set of 15 calculated structures in the ensemble converged well, as shown numerically by the statistics in Table 1 and visually by the superposition of the ordered residues in Figure 2A. The RMSD of the structured core of ordered residues (M1 - G87) in the ensemble to the mean structure is 0.64 Å for the backbone atoms (N-Cα-C=O) and 1.14 Å for all heavy atoms. The C-terminal region, W91 - A100, is unstructured and disordered, as is evident in Figure 2A. A cartoon representation of a single structure is shown in Figure 2B along with a schematic secondary structure diagram of Rv0543c in Figure 2C. The protein consists of a bundle of five α-helices, α1 (M1 - Y16), α2 (P21 - C33), α3 (S37 - G52), α4 (G58 - H65) and α5 (S72 - G87), held in place by a hydrophobic core of largely conserved amino acid side chains.

Optical spectroscopy

Circular dichroism spectroscopy was used to characterize the structure and stability of Rv0543c over a range of temperatures [29]. Figure 3A shows the steady-state wavelength spectra for Rv0543c at two temperatures. The solid blue line is the CD spectrum of a fresh, ~ 0.06 mM sample collected at 25°C in the same buffer used to collect NMR data for the structure calculations. The double minimum at ~220 and ~208 nm and projected maximum at a wavelength < 200 nm is characteristic of a structured protein with significant α-helical content [29, 30]. The red line in Figure 3A is the CD spectrum of the same sample at 80°C. Relative to the spectrum collected at 25°C, the double minimum has disappeared and is replaced by a single, more positive minimum at ~212 nm. These observations at 80°C indicate that Rv0543c is more unstructured at elevated temperatures, losing most of its α-helical character. However, the absence of an extrapolated negative minimum at ~198 nm and a positive maximum at ~218 nm indicates that Rv0543c is not entirely random coil at 80°C and still possesses spectral properties consistent with primarily β-sheet structure (single minimum). When the sample is cooled back to 25°C, the CD spectrum, indicated by the cyan line in Figure 3A, is almost identical to the CD spectrum collected at 80°C (red line). This indicates that the temperature-induced effects to the structure of Rv0543c are irreversible.

Figure 3.

A) Circular dichroism steady-state wavelength spectra of a single sample of Rv0543c (0.06 mM) in NMR buffer collected at 25°C (blue), 80°C (red), and 25°C post-heating (cyan). B) The CD thermal melt for Rv0543c obtained by measuring the ellipticity at 220 nm in 2.0°C intervals between 10 and 80°C. C) The first derivative of the curve in Figure 3B.

To assay the thermal stability of Rv0543c, the ellipticity at 220 nm was measured as a function of temperature between 10 and 80°C. Typically, the phase transition that accompanies denaturation of a structured protein can be detected by monitoring the increase in ellipticity at a specific wavelength with increasing temperature [10,11,31,32]. As shown in Figure 3B, a gradual increase in ellipticity at 220 nm is observed up to ~56°C, followed by a more rapid increase that tails off and plateaus at ~74°C. Because the steady-state wavelength spectra in Figure 3A suggest that the protein is not entirely unstructured at 80°C, the end point of the temperature study in Figure 3B likely does not represent a fully unstructured protein but one that now contains significant β-strand character. Beta-strands are typically more robust than α-helices [33], and the structure that forms at high temperature does not refold into the native structure upon cooling to 25°C. While the irreversible nature of the transition means that the CD data for this transition may not be analyzed thermodynamically [32], a quantitative estimate of the Tm may be obtained by assuming a two-state model and taking the first derivative of the curve in Figure 3B [12]. The maximum of this first derivative, shown in Figure 3C, is at 62.5°C.
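The first-derivative estimate of Tm can be sketched as follows. This is not the Aviv software routine: it assumes a plain finite-difference derivative with a three-point parabolic refinement of the peak position, and it is run on a synthetic sigmoidal melt curve with hypothetical ellipticity values and a hypothetical midpoint of 62.5°C. Only the 2.0°C sampling between 10 and 80°C is taken from the text.

```python
import math

def first_derivative(temps, signal):
    """Finite-difference slopes between neighbouring points, reported at interval midpoints."""
    mids = [(temps[i] + temps[i + 1]) / 2 for i in range(len(temps) - 1)]
    slopes = [
        (signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i])
        for i in range(len(temps) - 1)
    ]
    return mids, slopes

def estimate_tm(temps, signal):
    """Temperature of the steepest increase in the signal.

    The peak of the derivative is refined with a parabola through the maximum
    and its two neighbours (assumes evenly spaced temperatures).
    """
    mids, slopes = first_derivative(temps, signal)
    k = max(range(len(slopes)), key=lambda i: slopes[i])
    if k == 0 or k == len(slopes) - 1:
        return mids[k]  # peak at the edge: nothing to interpolate with
    y0, y1, y2 = slopes[k - 1], slopes[k], slopes[k + 1]
    denom = y0 - 2 * y1 + y2
    if denom == 0:
        return mids[k]
    step = mids[k + 1] - mids[k]
    return mids[k] + 0.5 * step * (y0 - y2) / denom

# Temperatures sampled every 2.0 °C from 10 to 80 °C, as in the text.
temps = [10 + 2 * i for i in range(36)]

# Synthetic two-state melt (hypothetical values): ellipticity rises from -20 to -8
# with a midpoint at 62.5 °C.
synthetic_melt = [-20.0 + 12.0 / (1.0 + math.exp(-(t - 62.5) / 2.0)) for t in temps]

tm_estimate = estimate_tm(temps, synthetic_melt)
print(f"Estimated Tm = {tm_estimate:.1f} °C")
```

Without the interpolation step the estimate is limited to the midpoints of the 2.0°C sampling grid, which is why some form of smoothing or interpolation is usually applied before reading off the maximum.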

Backbone dynamics

The solution structure calculations for Rv0543c show that the protein consists of five α-helices arranged in an ordered bundle. Because protein dynamics often plays a role in the binding and catalytic properties of proteins at, and around, active sites [34], motion unique to any one of the five α-helices may have mechanistic consequences. Information regarding the dynamics along a protein’s backbone can be extracted from the relaxation properties of the amide resonances obtained from heteronuclear steady-state {1H}-15N NOE, T1, and T2 measurements [35,36], because internal motion affects the rate at which excited nuclei exchange energy with their local environment and relax. Figure 4 is a plot of the backbone {1H}-15N heteronuclear NOE, T1, and T2 values measured for Rv0543c. The average values for the heteronuclear steady-state {1H}-15N NOE, T1, and T2 are 0.77 ± 0.11, 595 ± 108 ms, and 77 ± 21 ms, respectively. There is no evidence that the motion of any one helix differs significantly from that of the others, as most of the relaxation values are relatively uniform throughout the sequence. These observations suggest that the protein has restricted flexibility on the sub-ns timescale [35], with the only significant exception towards the end of the C-terminus, in an unstructured region of the protein where more rapid motion may be expected.
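The average T1 and T2 values above can be related to the rotational correlation time with a short worked example. The text does not state which expression was used, so the sketch assumes the commonly used approximation τc ≈ √(6·T1/T2 − 7) / (4π·νN), which holds for slowly tumbling molecules with negligible internal motion; νN is the 15N resonance frequency, taken here as roughly 0.1013 times the 1H frequency. The function name is hypothetical.

```python
import math

# Approximate ratio of the 15N to 1H resonance frequencies.
N15_TO_H1_FREQUENCY_RATIO = 0.1013

def tau_c_from_t1_t2(t1_s, t2_s, proton_mhz):
    """Estimate the rotational correlation time (s) from 15N T1 and T2 (s).

    Assumed approximation: tau_c = sqrt(6 * T1 / T2 - 7) / (4 * pi * nu_N),
    valid for slow tumbling and negligible internal motion.
    """
    nu_n_hz = proton_mhz * 1e6 * N15_TO_H1_FREQUENCY_RATIO
    return math.sqrt(6.0 * t1_s / t2_s - 7.0) / (4.0 * math.pi * nu_n_hz)

# Average values reported in the text: T1 = 595 ms, T2 = 77 ms at 600 MHz (1H).
tau_c = tau_c_from_t1_t2(0.595, 0.077, 600.0)
print(f"tau_c = {tau_c * 1e9:.1f} ns")
```

Under these assumptions the averages give a value close to the 8.2 ns quoted for the T1/T2 ratios, although the reported estimate was obtained from the average of the individual per-residue ratios rather than from the ratio of the averages.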

Figure 4.

(A) Backbone {1H}-15N heteronuclear NOE (blue), (B) T1 (orange), and (C) T2 (purple) values for Rv0543c collected at a 1H resonance frequency of 600 MHz (20°C). Proline residues in the sequence are indicated by red asterisks. The location of experimentally observed α-helices in the structure is shown in the schematic above the graph.

Insights into biological function

To identify a possible biochemical function for Rv0543c based on its structure, the RCSB Protein Data Bank was searched for structures with similarities to Rv0543c using the DALI search engine [24]. The best Z-scores identified by the search were only slightly above 4.0, indicating a low structural similarity to any other protein in the RCSB-PDB. The structures with the best Z-scores were all DNA binding proteins, Cre recombinases. Indeed, the 30 structures with the highest Z-scores were Cre recombinases (1CRX-A, Z = 4.4; 1XO0, Z = 4.1) bound to DNA [37,38]. Figure 5 shows an overlay of the structure of Rv0543c with the PDB structure that resulted in the best RMSD (5.9 Å), 1NZB-E [39], using the web program SuperPose [40]. The overlaid region of 1NZB (S38 - G128) contains four α-helices and that of Rv0543c (M1 - G90) contains five. It can be seen from Figure 5 that one of the α-helices, labeled α2 in both molecules, superimposes fairly well. Helices α1, α3, and α4 occupy only crudely similar space, with α5 from Rv0543c overlaying in part with the C-terminal end of α1 from 1NZB.

Figure 5.

Superposition of the structure of Rv0543c (M1 - G90, blue) on the structure of E. coli phage P1 Cre recombinase 1NZB-E (S38 - G128, wheat) [39] using the program SuperPose [40]. The five α-helices for Rv0543c are labeled in black following the scheme in Figure 2 and the four α-helices shown for Cre recombinase are labeled sequentially in wheat.

If the biological role of Rv0543c involved binding DNA, then, as observed for the Cre recombinases and other DNA binding proteins, Rv0543c should also have a positively charged surface that interacts with the negatively charged phosphodiester backbone of DNA [37–39]. Figure 6A and 6B illustrate the electrostatic surface potential on the solvent-accessible surface of Rv0543c. The most significant feature is a negatively charged surface (red) on one face (Figure 6A) with only a few small pockets of positively charged surface (blue), the opposite of what would be expected if Rv0543c were a DNA binding protein. Furthermore, ConSurf analysis [25] of the structure of Rv0543c shows that the solvent-accessible surface of the negatively charged areas is more conserved than that of the positively charged areas (Figure 6C and 6D), suggesting that the negatively charged surface may have more functional significance than the positively charged pockets.

Figure 6.

A + B) PyMOL generated maps [27] of the electrostatic potentials at the solvent-accessible surface of Rv0543c. The orientation in (A) is similar to the orientation shown in Figure 2B with (B) a 180° rotation about the vertical axis. C + D) ConSurf generated maps [25] of the conserved residues on the solvent-accessible surface of Rv0543c. The orientation of C and D is identical to A and B, respectively. Highly conserved residues are colored magenta and pink, poorly conserved residues are colored cyan, and residues with insufficient information are colored yellow.

Figure 7 contains a ClustalW2 [26] sequence alignment of Rv0543c with ten proteins in the DUF3349 superfamily that do not belong to the genera Mycobacterium or Rhodococcus. The residues identified as conserved by ClustalW2 and ConSurf are marked at the bottom of the alignment and, in general, the two methods identify similar conserved regions of the protein. The ClustalW2 alignment identifies 11 amino acid residues that are identical in all the sequences: eight hydrophobic, one hydrophilic, one negatively charged, and one positively charged. Twelve conserved substitutions are also identified by ClustalW2: ten hydrophobic and two negatively charged. The ClustalW2 identification of only one conserved positively charged residue in the DUF3349 family further argues against the possibility that these proteins are involved in a DNA binding function. Instead, most of the amino acid conservation in the DUF3349 family involves hydrophobic residues, and these residues are likely responsible for driving the five α-helices to adopt their tertiary fold. This conservation of the interior of the protein is better shown in Figure 2B, a ConSurf color-coded ribbon representation of the structure of Rv0543c, where a substantial number of the highly conserved residues (purple and pink) are either in the interior (much of α2) or pointing towards the interior (C-terminal end of α1, approximately half of α4 and α5) of the structure.
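The tally of identical residues by chemical class described above can be illustrated with a small sketch. It is not ClustalW2 output: the three short aligned sequences are hypothetical, and the residue classes simply follow the color scheme given in the Figure 7 caption.

```python
from collections import Counter

# Residue classes as defined in the Figure 7 caption.
RESIDUE_CLASSES = {
    "hydrophobic": set("AVFPMILW"),
    "acidic": set("DE"),
    "basic": set("KR"),
    "hydrophilic": set("STYHNCGQ"),
}

def identical_columns(alignment):
    """Return 0-based indices of columns where every sequence has the same non-gap residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    columns = []
    for i in range(length):
        residues = {seq[i] for seq in alignment}
        if len(residues) == 1 and "-" not in residues:
            columns.append(i)
    return columns

def classify(residue):
    """Return the class name of a one-letter residue code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")

# Hypothetical toy alignment (not the sequences in Figure 7).
toy_alignment = ["MLDKA-W", "MIDKAGW", "MLDRASW"]

identical = identical_columns(toy_alignment)
class_counts = Counter(classify(toy_alignment[0][i]) for i in identical)
print(identical, dict(class_counts))
```

Run on the real alignment, the same tally would reproduce the per-class counts of identical residues quoted in the text, provided the sequences are supplied exactly as aligned.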

Figure 7.

Sequence alignment of Rv0543c with 10 other closely related sequences from genera other than Mycobacterium and Rhodococcus using the program ClustalW2 [26]. The proteins, in order, are: Rv0543c, JNB_16594, nfa52480, FRAAL1463, Gbro_1053, CMS_2894, Krad_0190, KRH_23040, NCg10440, ZP_053675541.1. The residues are color coded as follows: cyan = hydrophobic and small (AVFPMILW), red = acidic (D, E), purple = basic (K, R), yellow = hydrophilic (STYHNCGQ). Shown below the alignment are the positions of residues that are highly conserved according to ClustalW2 (* = identical, : = conserved substitutions, . = semi-conserved substitutions) and ConSurf [25] (scores of 9 = asterisk, 8 = caret) analysis of the sequences. The ClustalW2 analysis used only the amino acid sequences of Rv0543c and the ten other proteins listed in Figure 7, while the ConSurf analysis used the complete RCSB-PDB and SWISS-PROT databases.

Conclusions

Rv0543c is a monomeric bundle of five α-helices rigidly held together by a largely conserved group of hydrophobic amino acid side chains. With a melting temperature of 62.5°C, the protein is somewhat resistant to thermal denaturation. However, upon thermal denaturation the protein is unable to refold into its native conformation. A DALI search of the protein structures in the RCSB-PDB returned hits with low Z-scores (≤ 4.4), and the hits with Z-scores above 4.0 were all Cre recombinases. However, neither the electrostatic surface potential nor the conserved surface identified by ConSurf on the solvent-accessible surface of Rv0543c is consistent with a DNA-binding function for the protein. Consequently, the fold observed for Rv0543c may be unique to Rv0543c and the other members of the DUF3349 superfamily. Associated with this unique fold may be a unique function that still needs to be discovered, perhaps a function associated with the menaquinone-biosynthetic pathway, since the Rv0543c gene is co-localized with genes from this pathway on the chromosome. When a function is found, the structure for Rv0543c presented here will assist the molecular understanding of its mechanism and, if the function turns out to be vital for M. tuberculosis survival in the microenvironment of the host, potentially speed up the conception and development of new therapies to treat and control the spread of tuberculosis.

  • First structure for a protein from the DUF3349 superfamily.

  • Novel fold composed of a bundle of five α-helices.

  • Irreversibly unfolds upon heating with Tm of 62.5°C.

  • The DUF3349 proteins may be good drug targets because they are found predominantly in the Mycobacterium and Rhodococcus bacterial species.

Acknowledgments

The structure of Rv0543c was a community request made to the Seattle Structural Genomics Center for Infectious Disease (SSGCID) and was given the internal identification code MytuD.17712.a. The research was funded by the National Institute of Allergy and Infectious Diseases under Federal Contract No. HHSN272200700057C and performed primarily at the W.R. Wiley Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the U.S. Department of Energy’s Office of Biological and Environmental Research program and located at Pacific Northwest National Laboratory (PNNL). Battelle operates PNNL for the U.S. Department of Energy.

References

  • 1. Global Tuberculosis Control: A Short Update to the 2009 Report. World Health Organization; Geneva: 2009.
  • 2. Kaplan G, Post FA, Moreira AL, Wainwright H, Kreiswirth BN, Tanverdi M, Mathema B, Ramaswamy SV, Walther G, Steyn LM, Barry CE, III, Bekker LG. Infect Immun. 2003;71:7099–7108. doi: 10.1128/IAI.71.12.7099-7108.2003.
  • 3. Enarson DA. In: Mycobacteria and TB. Kaufmann SHE, Hahn H, editors. Karger; Basel: 2003. pp. 1–16.
  • 4. Russell DG, Barry CE, III, Flynn JL. Science. 2010;328:852–856. doi: 10.1126/science.1184784.
  • 5. Mitchison DA. Eur Respir J. 2005;25:376–379. doi: 10.1183/09031936.05.00075704.
  • 6. Basu S, Friedland GH, Medlock J, Andrews JR, Shah NS, Gandhi NR, Moll A, Moodley P, Sturm AW, Galvani AP. Proc Natl Acad Sci USA. 2009;106:7672–7677. doi: 10.1073/pnas.0812472106.
  • 7. Kamholz SL. J Assoc Acad Minor Phys. 2002;13:53–56.
  • 8. Palomino JC, Ramos DF, da Silva PA. Curr Med Chem. 2009;16:1898–1904. doi: 10.2174/092986709788186066.
  • 9. Dye C, Williams BG. Science. 2010;328:856–861. doi: 10.1126/science.1185449.
  • 10. Buchko GW, Kim CY, Terwilliger TC, Kennedy MA. J Bact. 2006;188:5993–6001. doi: 10.1128/JB.00460-06.
  • 11. Buchko GW, Kim CY, Terwilliger TC, Myler PJ. Tuberculosis. 2010;90:245–251. doi: 10.1016/j.tube.2010.04.002.
  • 12. Greenfield NJ. Nat Prot. 2006;1:2527–2535. doi: 10.1038/nprot.2006.204.
  • 13. Goddard TD, Kneller DG. Sparky 3. University of California; San Francisco.
  • 14. Wishart DS, Bigam CG, Yao J, Abildgaard F, Dyson HJ, Oldfield E, Markley JL, Sykes BD. J Biomol NMR. 1995;6:135–140. doi: 10.1007/BF00211777.
  • 15. Farrow NA, Muhandiram DR, Singer AU, Pascal SM, Kay LE, Gish G, Shoelson SE, Pawson T, Forman-Kay JD. Biochemistry. 1994;33:5984–6003. doi: 10.1021/bi00185a040.
  • 16. Skelton NJ, Palmer AG, III, Akke M, Kordel J, Rance M, Chazin WJ. J Magn Res B. 1993;102:253–264.
  • 17. Buchko GW, Daughdrill GW, De Lorimier R, Rao S, Isern NG, Lingbeck J, Taylor JS, Wold MS, Gochin LD, Spicer LD, Lowry DF, Kennedy MA. Biochemistry. 1999;38:15116–15128. doi: 10.1021/bi991755p.
  • 18. Szyperski T, Yeh D, Sukumaran D, Moseley H, Montelione G. Proc Natl Acad Sci USA. 2002;99:8009–8014. doi: 10.1073/pnas.122224599.
  • 19. Buchko GW, Tarasevich BJ, Bekhazi J, Snead ML, Shaw WJ. Biochemistry. 2008;47:13215–13222. doi: 10.1021/bi8018288.
  • 20. Guntert P. Meth Mol Biol. 2004;278:353–378. doi: 10.1385/1-59259-809-9:353.
  • 21. Cornilescu G, Delaglio F, Bax A. J Biomol NMR. 1999;13:289–301. doi: 10.1023/a:1008392405740.
  • 22. Linge JP, Nilges M. J Biomol NMR. 1999;13:51–59. doi: 10.1023/a:1008365802830.
  • 23. Bhattacharya A, Tejero R, Montelione G. Proteins. 2007;66:778–795. doi: 10.1002/prot.21165.
  • 24. Holm L, Rosenstrom P. Nucleic Acids Res. 2010;38:W545–W549. doi: 10.1093/nar/gkq366.
  • 25. Landau M, Mayrose I, Rosenberg Y, Glaser F, Martz E, Pupko T, Ben-Tal N. Nucleic Acids Res. 2005;33:W299–W302. doi: 10.1093/nar/gki370.
  • 26. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. Bioinformatics. 2007;23:2947–2948. doi: 10.1093/bioinformatics/btm404.
  • 27. DeLano WL. The PyMOL molecular graphics system. DeLano Scientific LLC; Palo Alto, CA, USA: 2008.
  • 28. Palczewska M, Groves P, Ambrus A, Kaleta A, Kover K, Batta G, Kuznicki J. Eur J Biochem. 2001;268:6229–6237. doi: 10.1046/j.0014-2956.2001.02575.x.
  • 29. Woody RW. Studies of theoretical circular dichroism of polypeptides: Contributions of β-turns. John Wiley & Sons; New York: 1974.
  • 30. Holzwarth GM, Doty P. J Amer Chem Soc. 1965;87:218–228. doi: 10.1021/ja01080a015.
  • 31. Buchko GW, Hess NJ, Bandaru V, Wallace SS, Kennedy MA. Biochemistry. 2000;39:12441–12449. doi: 10.1021/bi001377k.
  • 32. Karantzeni I, Ruiz C, Liu C-C, LiCata VJ. Biochem J. 2003;374:785–792. doi: 10.1042/BJ20030323.
  • 33. Creighton TE. Proteins: structure and molecular properties. W.H. Freeman & Company; New York: 1993.
  • 34. Ishima R, Torchia DA. Nat Struct Biol. 2000;7:740–743. doi: 10.1038/78963.
  • 35. Freedberg DI, Ishima R, Jacob J, Wang YX, Kustanovich I, Louis JM, Torchia DA. Protein Sci. 2002;11:221–232. doi: 10.1110/ps.33202.
  • 36. Metcalfe EE, Zamoon J, Thomas DD, Veglia G. Biophys J. 2004;87:1205–1214. doi: 10.1529/biophysj.103.038844.
  • 37. Guo F, Gopaul DN, Van Duyne GD. Nature. 1997;389:40–46. doi: 10.1038/37925.
  • 38. Ghosh K, Lau CK, Guo F, Segall AM, Van Duyne GD. J Biol Chem. 2005;280:8290–8299. doi: 10.1074/jbc.M411668200.
  • 39. Ennifar E, Meyer JEW, Buchholz F, Stewart AF, Suck D. Nucleic Acids Res. 2003;31:5449–5460. doi: 10.1093/nar/gkg732.
  • 40. Maiti R, van Domselaar GH, Zhang H, Wishart DS. Nucleic Acids Res. 2004;32:W590–W594. doi: 10.1093/nar/gkh477.
