Acta Crystallographica Section F: Structural Biology Communications. 2022 Aug 30;78(Pt 9):338–346. doi: 10.1107/S2053230X22008457

Structure of the hypothetical protein TTHA1873 from Thermus thermophilus

I Yuvaraj a,, Santosh Kumar Chaudhary a,§, J Jeyakanthan b, K Sekar a,*
Editor: M J van Raaijc
PMCID: PMC9435673  PMID: 36048084

The crystal structure of the hypothetical protein TTHA1873 from T. thermophilus has been determined using X-ray crystallography. At high concentrations, the protein showed visible agglutination of red blood cells; its structural properties and thermal stability are discussed.

Keywords: SAD phasing, hypothetical proteins, metalloproteins, jelly-roll topology, Thermus thermophilus, TTHA1873, calcium-binding proteins

Abstract

The crystal structure of an uncharacterized hypothetical protein, TTHA1873 from Thermus thermophilus, has been determined by X-ray crystallography to a resolution of 1.78 Å using the single-wavelength anomalous dispersion method. The protein crystallized as a dimer in two space groups: P43212 and P6122. Structural analysis of the hypothetical protein revealed that the overall fold of TTHA1873 has a β-sandwich jelly-roll topology with nine β-strands. TTHA1873 is a dimeric metal-binding protein that binds to two Ca2+ ions per chain, with one on the surface and the other stabilizing the dimeric interface of the two chains. A structural homology search indicates that the protein has moderate structural similarity to one domain of cell-surface proteins or agglutinin receptor proteins. Red blood cells showed visible agglutination at high concentrations of the hypothetical protein.

1. Introduction

Genome sequencing has increased the number of novel genes with unknown functions. In newly sequenced genomes, only 70% of the proteins can be assigned precise functions with reasonable accuracy, whereas the remaining 30% have unknown or hypothetical functions (Bork, 2000). These hypothetical proteins are predicted to be expressed from an open reading frame (ORF). They have no experimental evidence for function and represent a significant fraction of the proteomes of both prokaryotes and eukaryotes (Ijaq et al., 2015). Proteins that are conserved among organisms from several phylogenetic lineages, but which have no functional validation, are termed conserved hypothetical proteins (Galperin & Koonin, 2004). Conversely, some hypothetical proteins are specific to one organism, in which they perform a specialized role. Structural and functional studies of these hypothetical proteins are essential to find novel folds or functions (León et al., 2007). Functional characterization of hypothetical proteins is essential in order to fully understand fundamental or specific phenomena in diverse biological processes. The three-dimensional structure of a protein is crucial for a molecular understanding of its function, which may not be evident solely from sequence analysis (Ebihara et al., 2006).

Several different computational methods exist to predict the structure and function of hypothetical proteins; however, the major obstacle to using these methods is a lack of sequence similarity to other proteins in the database (Sivashankari & Shanmughavel, 2006). Deducing the three-dimensional structures of proteins provides direct and indirect clues to their molecular function, which can be utilized in structural genomics. Further analyses of the structural features of proteins might identify unique attributes that contribute to their role (Eisenstein et al., 2000).

This study used the single-wavelength anomalous dispersion (SAD) method to solve the three-dimensional structure of a hypothetical protein, TTHA1873, from Thermus thermophilus HB8 at 1.78 Å resolution using diffraction data collected at a home source (Cu Kα X-rays). TTHA1873 from T. thermophilus encodes a 176-residue protein of unknown function with a molecular weight of 18 468 Da. T. thermophilus is a Gram-negative bacterium that thrives at a temperature of 65°C. Proteins from thermophilic organisms are known for their high thermal stability, co-solvent compatibility and increased resistance to denaturing chemical agents (van den Burg, 2003). The T. thermophilus genome has 2218 genes, of which 66% (1482 genes) have been assigned functions and the remaining 34% (736 genes) are unknown or hypothetical (Henne et al., 2004). Crystallographic analysis revealed that TTHA1873 is a metal-binding protein consisting of nine β-strands, which together form a β-sandwich structure and are organized into a jelly-roll fold, a topology that is present in carbohydrate-binding proteins.

2. Materials and methods

2.1. Cloning and protein expression and purification

The gene encoding TTHA1873 was cloned and expressed and the protein was purified according to the protocol described previously with slight modifications (Chaudhary et al., 2013). The vector containing the gene for TTHA1873 was transformed into Escherichia coli BL21(DE3)-RIL cells. The transformed colonies were grown at 310 K in 2 l LB medium (tryptone, yeast extract and sodium chloride) containing 100 µg ml−1 ampicillin. The cells were induced with 0.1 mM isopropyl β-D-1-thiogalactopyranoside at an OD of 0.6. The cells were harvested 4 h post-induction, resuspended in lysis buffer (20 mM Tris–HCl pH 7.5, 50 mM NaCl) and lysed using a sonicator. The cell lysate was heated for 15 min at 343 K. After the heat treatment, the cell lysate was centrifuged at 15 000g for 1 h at 277 K. The supernatant solution was desalted using a Sephadex G25 column (GE Healthcare). The desalted protein solution was passed through a Sepharose Q anion-exchange column (GE Healthcare) pre-equilibrated with 20 mM Tris–HCl buffer pH 7.5. The protein was eluted using a linear gradient of 0–0.5 M NaCl in 20 mM Tris–HCl buffer pH 7.5. Expression of the protein was confirmed using 12% SDS–PAGE. The fractions containing TTHA1873 were concentrated and passed through a HiLoad 16/600 Superdex 200 pg gel-filtration column (GE Healthcare) pre-equilibrated with 20 mM HEPES pH 7.5, 150 mM NaCl. Fractions containing the pure protein were pooled together and concentrated, and the purity of the protein was checked using SDS–PAGE. An LC-MS experiment confirmed the molecular mass of the protein.

2.2. Crystallization

The purified TTHA1873 was screened for crystallization using the commercially available Crystal Screen, Crystal Screen 2 and Index screens from Hampton Research. The microbatch-under-oil method was used for all crystallization screening at 20°C. 8 ml of silicone and paraffin oil in a 3:1 ratio was used for each crystallization plate. Crystals were obtained using 1 µl buffer (0.1 M Tris pH 8.5, 3.0 M NaCl) and 1 µl protein solution (∼45 mg ml−1) at 277 K. Diffraction-quality crystals appeared after 60 days. The crystals were soaked for 40 min with mercury(II) potassium iodide (HgI4K2) for SAD data collection at the home source. Diffraction-quality crystals of TTHA1873 were also obtained in Index screen condition No. 29 in a crystallization drop consisting of 1 µl protein (∼45 mg ml−1) and 1 µl 60%(v/v) Tacsimate pH 7.0 at 277 K after 90 days. The protein crystals were soaked with 1 µl crystallization condition and 1 µl of a solution of sugars (glucose, ribose and galactose each at a concentration of 100 mM) and data were collected.

2.3. Data collection, structure solution and refinement

Data were collected from a crystal of the hypothetical protein at 100 K at the home source with Cu Kα radiation using a MAR345 image-plate detector mounted on a Rigaku UltraX-18 rotating-anode X-ray generator at the Molecular Biophysics Unit, Indian Institute of Science, Bangalore. The crystal diffracted to 1.78 Å resolution and the data were indexed and processed in space group P43212 using iMosflm (Battye et al., 2011). Initially, we tried to solve the structure using the molecular-replacement method. The diffraction data were subjected to automated molecular-replacement procedures such as MrBUMP (Keegan & Winn, 2008), the BALBES server (Long et al., 2008) and the marathon MR approach (Hatti et al., 2016). However, none of these provided a useful structure solution. Thus, the SAD method was used for structure solution: the crystal was soaked with HgI4K2 for 40 min and 600 frames were collected at 0.5° oscillation.
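The data-collection geometry described above can be illustrated with a short calculation. The sketch below applies Bragg's law to the reported 1.78 Å resolution limit at the Cu Kα wavelength listed in Table 1 and sums the rotation covered by the 600 frames of the derivative data set. The function names are hypothetical, and treating the 600 frames as one contiguous sweep is an assumption that the text does not state.

```python
import math

CU_KALPHA_WAVELENGTH = 1.54  # Å, home-source wavelength listed in Table 1


def bragg_angle_deg(d_spacing: float, wavelength: float) -> float:
    """Return the Bragg angle theta (degrees) from lambda = 2 d sin(theta)."""
    return math.degrees(math.asin(wavelength / (2.0 * d_spacing)))


def total_rotation_deg(n_frames: int, oscillation_deg: float) -> float:
    """Total rotation covered by a sweep, assuming contiguous frames."""
    return n_frames * oscillation_deg


# Resolution limit of the native data set and the derivative sweep parameters
theta_native = bragg_angle_deg(1.78, CU_KALPHA_WAVELENGTH)
sweep = total_rotation_deg(600, 0.5)

print(f"Bragg angle at 1.78 Å: {theta_native:.2f}° (2θ = {2 * theta_native:.2f}°)")
print(f"Rotation covered by the derivative sweep: {sweep:.0f}°")
```

A sweep of this size is consistent with the high multiplicity reported for the derivative data set in Table 1, which is generally helpful when measuring small anomalous differences.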

The soaked crystal diffracted to 2.03 Å resolution at the home source. Iodide ions were located using an anomalous peak search and the substructure was built using CRANK2 (Skubák & Pannu, 2013) from the CCP4 package. Based on the phase information from the I atoms, the entire structure was built manually using Coot (Emsley et al., 2010) and using the automated model-building program Buccaneer (Cowtan, 2006), with several rounds of refinement using REFMAC (Murshudov et al., 2011). Another data set was collected on the BM14 beamline at the European Synchrotron Radiation Facility (ESRF), Grenoble, France; the crystal diffracted to a resolution of 2.17 Å. The dimer interface was calculated using the PISA server (Krissinel, 2015). Diffraction data and structure-refinement statistics are given in Table 1.
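For readers unfamiliar with the merging statistic reported in Table 1, the conventional definition is Rmerge = Σ_hkl Σ_i |I_i(hkl) − ⟨I(hkl)⟩| / Σ_hkl Σ_i I_i(hkl). The sketch below implements this definition on a toy set of intensities. The data are invented purely for illustration and are not taken from the TTHA1873 data sets, and the exact conventions used by the processing software (for example, how singly measured reflections are treated) are assumptions.

```python
from typing import Dict, List, Tuple

Reflection = Tuple[int, int, int]  # Miller indices (h, k, l)


def r_merge(observations: Dict[Reflection, List[float]]) -> float:
    """Conventional Rmerge over symmetry-equivalent intensity measurements."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            # A single observation carries no agreement information (assumed convention)
            continue
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0.0:
        raise ValueError("no multiply measured reflections with non-zero intensity")
    return numerator / denominator


# Hypothetical intensities, for illustration only
toy_data = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 0): [50.0, 50.0],
}
toy_r_merge = r_merge(toy_data)
print(f"Rmerge = {100 * toy_r_merge:.1f}%")
```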

Table 1. X-ray diffraction data and structure-refinement statistics.

| PDB code | 7wrk | 7wwn | 7wwo |
|---|---|---|---|
| Wavelength (Å) | 1.54 | 1.54 | 1.07 |
| Resolution limits (Å) | 42.13–1.78 (1.82–1.78) | 52.02–2.09 (2.09–2.05) | 70.91–2.17 (2.23–2.17) |
| a, b, c (Å) | 42.18, 42.18, 156.86 | 42.57, 42.57, 156.05 | 90.42, 90.42, 141.71 |
| Space group | P43212 | P43212 | P6122 |
| Rmerge (%) | 11.3 (20.4) | 12.5 (38.8) | 18.8 (30.68) |
| No. of unique reflections | 14566 | 8938 | 18855 |
| No. of molecules in asymmetric unit | 1 | 1 | 2 |
| Completeness (%) | 100 (100) | 99.9 (81.4) | 100 (100) |
| Multiplicity | 14.2 (13.6) | 22.6 (15.7) | 20.1 (10.8) |
| Mean I/σ(I) | 12.6 (10) | 22.6 (3.4) | 10.6 (2.7) |
| Rcryst/Rfree (%) | 16/19 | 20/26 | 18/21 |
| R.m.s.d., bond lengths (Å) | 0.0128 | 0.011 | 0.0102 |
| R.m.s.d., bond angles (°) | 1.66 | 1.704 | 1.712 |
| Average B factors (Å²) | | | |
| Protein | 13.8 | 11.51 | 44.01 |
| Water | 21.22 | 31.94 | 46.65 |
| Ions | 8.72 | 13.45 | 34.25 |
| Ligands | 25.81 | | |
| Ramachandran statistics (%) | | | |
| Favored region | 96.84 | 95.48 | 95.86 |
| Allowed region | 1.9 | 3.23 | 2.87 |
| Outliers | 1.27 | 1.27 | 1.27 |
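The unit-cell parameters and the number of molecules in the asymmetric unit in Table 1 can be cross-checked with a Matthews coefficient estimate. The paper does not report this calculation, so the following is only an illustrative sketch: it assumes the sequence-derived mass of 18 468 Da from the Introduction, the standard multiplicities of the general position (8 for P43212, 12 for P6122) and the usual approximation that the solvent fraction is 1 − 1.23/V_M. The function names are hypothetical.

```python
import math

PROTEIN_MASS_DA = 18468.0  # sequence-derived mass quoted in the Introduction


def cell_volume(a: float, b: float, c: float, gamma_deg: float = 90.0) -> float:
    """Unit-cell volume in Å^3; valid only when alpha = beta = 90°."""
    return a * b * c * math.sin(math.radians(gamma_deg))


def matthews_coefficient(volume: float, n_symops: int, n_mol_asu: int,
                         mass_da: float = PROTEIN_MASS_DA) -> float:
    """V_M in Å^3 per Da: cell volume divided by total protein mass in the cell."""
    return volume / (n_symops * n_mol_asu * mass_da)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent fraction using the common 1.23/V_M estimate."""
    return 1.0 - 1.23 / v_m


# 7wrk: tetragonal P43212 (8 general positions), one molecule per asymmetric unit
vm_tetragonal = matthews_coefficient(cell_volume(42.18, 42.18, 156.86), 8, 1)
# 7wwo: hexagonal P6122 (12 general positions, gamma = 120°), two molecules
vm_hexagonal = matthews_coefficient(cell_volume(90.42, 90.42, 141.71, 120.0), 12, 2)

for label, v_m in (("7wrk", vm_tetragonal), ("7wwo", vm_hexagonal)):
    print(f"{label}: V_M = {v_m:.2f} Å^3/Da, solvent ≈ {100 * solvent_fraction(v_m):.0f}%")
```

Under these assumptions both values fall within the range commonly observed for protein crystals, which is consistent with the molecule counts listed in the table.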

2.4. Sequence and structural comparison

Sequence comparison was performed using BLAST (Altschul et al., 1990). Structural alignment to find close homologs of TTHA1873 was performed using DALI (Holm & Rosenström, 2010). PROCHECK (Laskowski et al., 1993) and RAMPAGE (Lovell et al., 2003) were used to generate the Ramachandran maps for the structures. Molecular representations were made using PyMOL (DeLano, 2002). Sequence searches of Pfam (Mistry et al., 2021) and the Conserved Domain Database (Marchler-Bauer et al., 2011) were carried out to identify protein families and domains.

2.5. Preliminary hemagglutination assay

A preliminary hemagglutination assay of TTHA1873 was performed according to the protocol described in Sivaji et al. (2017). Using the serial dilution method, the hemagglutination assay of TTHA1873 was performed in a U-bottom 96-well plate. Protein at a concentration of 45 mg ml−1 was used for serial dilution in phosphate-buffered saline (PBS). 100 µl protein solution was serially diluted with 50 µl red blood cells (RBC) and 40 µl PBS pH 7.4 and incubated for 1 h at room temperature. The agglutination was examined visually after incubation.
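The concentrations sampled in a serial dilution of this kind can be tabulated with a few lines of code. The sketch below starts from the stated stock concentration of 45 mg ml−1; the two-fold dilution factor and the number of wells are assumptions made for illustration, since neither is specified in the protocol above.

```python
from typing import List


def serial_dilution(start_conc: float, dilution_factor: float, n_wells: int) -> List[float]:
    """Concentrations along a dilution series; well 0 holds the undiluted stock."""
    return [start_conc / dilution_factor ** i for i in range(n_wells)]


STOCK_MG_PER_ML = 45.0   # stated in the text
ASSUMED_FACTOR = 2.0     # assumption: two-fold steps
ASSUMED_WELLS = 8        # assumption: eight wells in the series

concentrations = serial_dilution(STOCK_MG_PER_ML, ASSUMED_FACTOR, ASSUMED_WELLS)
for well, conc in enumerate(concentrations, start=1):
    print(f"well {well}: {conc:.3f} mg/ml")
```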

3. Results

3.1. SAD phasing using HgI4K2

A crystal soaked with HgI4K2 diffracted to a resolution of 2.03 Å at the home source. Iodide ions were located using an anomalous peak search and the substructure was built using CRANK2 (Skubák & Pannu, 2013) from the CCP4 package. Based on the phase information from the I atoms, the entire structure was built manually using Coot (Emsley et al., 2010) and using the automated model-building program Buccaneer (Cowtan, 2006), with several rounds of refinement using REFMAC (Murshudov et al., 2011). The structure of the hypothetical protein complexed with HgI4K2 (PubChem CID 24542) was solved in space group P43212.

Experimental phasing from heavy-atom derivative crystals is a promising method to solve unknown structures; however, choosing the correct heavy-atom derivative is a trial-and-error process. The Heavy-Atom Database System (HATODAS and HATODAS II; Sugahara et al., 2005, 2009) has been developed by analyzing the binding motifs of heavy atoms in protein structures. Mercury binds to most amino acids, preferably to cysteine and histidine residues. According to the HATODAS II database, 46 structures have been reported in which an Hg atom binds to an aspartic acid, and it interacts with a serine residue in another 20 structures (Sugahara et al., 2009). In the TTHA1873 structure mercury is bound to Asp97 and Ser110 at distances of 2.5 and 2.9 Å, respectively. Analysis of the binding of HgI4K2 to TTHA1873 will help to prioritize heavy atoms for phasing experiments.

HgI4K2 is a coordination compound in which mercury is either tricoordinated or tetracoordinated. HgI4K2 occupies five sites in the protein and the Hg atoms are numbered Hg1–Hg5 (Figs. 1 and 2c). The first mercury ion is coordinated by four iodide ions (Fig. 1a) and is held by three residues: Gln114, Leu116 and Arg126. An iodide bound to Hg1 forms weak electrostatic interactions with Leu116 and Arg126 (Fig. 1a). Hg2 is coordinated to three iodide ions bound near residues 96–103 in the protein (Fig. 1b). It interacts with the side-chain carboxylate of Asp97 at a distance of 2.5 Å and is held in place by the Pro-Asp-X-X-X-Gly-Pro motif. The Hg3 atom is coordinated to four iodide ions bound near a salt-bridge network formed by Gln124–Asn70–Asp156 (Fig. 1c). The iodide-coordinated Hg3 interacts with Thr158 and Pro165 at distances of 3.0 and 3.6 Å, respectively, whereas the iodide-coordinated Hg4 is present in the loop formed by Gly62, Pro63, Ala132 and Pro133 (Fig. 1d). Hg5 is bound to Ser110 at a distance of 2.9 Å (Fig. 1e). The interactions of the iodide-coordinated mercury compounds in the TTHA1873 structure suggest that HgI4K2 may be a useful choice of heavy atom for proteins rich in prolines and negatively charged residues.

Figure 1.

The interactions of HgI4K2 with TTHA1873. Hg atoms are represented in white and I atoms are in purple. (a) The first Hg atom, coordinated by four I atoms, interacts with Gln114, Leu116 and Arg126. (b) The second Hg atom, coordinated by three I atoms, interacts with Pro96 and Asp97; the second Hg atom is bound to Asp97 at a distance of 2.5 Å. (c) The third Hg atom, coordinated by four I atoms, interacts with Asn70, Gln124, Asp156, Thr158 and Pro165. (d) The fourth Hg atom is coordinated by four I atoms and interacts with Gly62, Pro63, Ala132 and Pro133. (e) The fifth Hg atom is bound to Ser110 at a distance of 2.9 Å.

Figure 2.

(a) The structure of TTHA1873. The protein structure consists of nine β-strands labeled β1–β9 forming a β-sandwich structure. The blue spheres represent the Ca atoms bound to TTHA1873. (b) The dimeric structure of TTHA1873 shown as a cartoon diagram. The spheres represent calcium ions. (c) The structure of TTHA1873 bound to HgI4K2. The spheres represent HgI4K2 (red spheres, I atoms; white spheres, Hg atoms). (d) A topology diagram of TTHA1873. Red arrows connected by loops represent β-strands, and secondary-structure elements are labeled. (e) Sequence alignment of the closest homologs of the hypothetical protein TTHA1873. The homologs were identified by PSI-BLAST (Altschul et al., 1997) and were aligned using Clustal Omega (Madeira et al., 2019).

3.2. Structure of TTHA1873

The dimeric structure of the hypothetical protein was solved in space groups P43212 and P6122. The TTHA1873 structure consists of nine β-strands arranged to form a β-sandwich structure (Figs. 2a and 2b). It has the topological arrangement β1–β2–β7–β4–β9–β8–β3–β6–β5 (Fig. 2d), in which β1–β2, β7–β4, β8–β9 and β5–β6 form antiparallel β-sheets. A small 3₁₀-helix is formed by Gly98, Pro99 and Phe100. The dimeric structure of TTHA1873 contains four calcium ions, two in each chain (Fig. 2b). Electrostatic surface potential analysis of TTHA1873 using PBEQ-Solver (Jo et al., 2008) shows that the molecular surface is predominantly negatively charged (Figs. 3a and 3b). The interface was calculated using the PDBePISA server. Since most of the surface residues of TTHA1873 are negatively charged, dimer formation would be expected to be electrostatically unfavorable; however, the calcium ions might play an essential role in forming the dimer.
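The strand arrangement quoted above can be encoded as a simple list to check that the strand pairs named in the text are indeed neighbors in that arrangement. This is a minimal sketch: reading the stated arrangement as a linear order of spatially adjacent strands is an assumption, and the variable and function names are hypothetical.

```python
from typing import FrozenSet, List, Set

# Strand order as given in the text: β1–β2–β7–β4–β9–β8–β3–β6–β5
STRAND_ORDER: List[int] = [1, 2, 7, 4, 9, 8, 3, 6, 5]

# Strand pairs that the text describes as forming antiparallel β-sheets
NAMED_PAIRS = [(1, 2), (7, 4), (8, 9), (5, 6)]


def adjacent_pairs(order: List[int]) -> Set[FrozenSet[int]]:
    """Unordered pairs of strands that are consecutive in the given order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


neighbours = adjacent_pairs(STRAND_ORDER)
named_pairs_adjacent = all(frozenset(pair) in neighbours for pair in NAMED_PAIRS)

print(f"Neighboring strand pairs: {sorted(tuple(sorted(p)) for p in neighbours)}")
print(f"All named pairs are neighbors: {named_pairs_adjacent}")
```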

Figure 3.

Electrostatic surface potential of TTHA1873 created by PBEQ-Solver (red, negative; blue, positive; white, uncharged). (a) The molecular surface of TTHA1873 shows more negatively charged residues on the surface. (b) The molecular surface of dimeric TTHA1873 shows patches of negatively and positively charged residues on the surface.

3.3. Calcium ion interaction analysis

The structure of TTHA1873 contains four calcium ions: two in each chain. Calcium I interacts with Ile24, Ser25, Glu29, Asp40, Glu170 and Thr171. Specifically, calcium I interacts with Ile24 and Ser25 of one chain and Glu29, Asp40, Glu170 and Thr171 of the other chain, and vice versa for calcium III, thus holding the two chains together at the dimeric interface (Fig. 4a). Calcium II and calcium IV interact with the side-chain carboxylates of Asp95 and Asp97 and the main-chain carbonyls of Gly98, Gly101 and Ala103 in both chains. These two calcium ions (calcium II and calcium IV) interact with negatively charged residues forming a loop on the surface that may play an essential role in the function of the protein (Fig. 4b).

Figure 4.

The Ca-atom interactions of TTHA1873. Ca atoms are represented in green; chain A is shown as cyan sticks and chain B is in magenta. (a) The first calcium ion interacts with Ile24 and Ser25 of chain A and Glu29, Asp40, Glu170 and Thr171 of chain B. (b) The second calcium ion interacts with Asp95, Asp97, Gly98, Gly101 and Ala103.

3.4. Structural basis of the thermal stability of TTHA1873

Proteins from T. thermophilus are known for their thermostability, and several studies have been performed to understand its molecular basis. An increase in the thermal stability of a protein may be due to the presence of a larger number of hydrogen bonds, ion pairs, metal ions, prolines or branched amino acids in the loops (Sterner & Liebl, 2001; Kumar et al., 2000; Chakravarty & Varadarajan, 2002; Manjunath et al., 2013). The structure of TTHA1873 was analyzed to understand the factors behind its higher thermal stability. A total of five ion pairs (Fig. 5a) are present in the structure (Supplementary Table S2); ion-pair networks are formed by Asp47–Arg142–Asp89, Glu53–Arg55, Glu93–Arg138 and Glu91–Lys86. Arg142 is a critical residue in maintaining the ion-pair network Asp47–Arg142–Asp89. Specifically, the side-chain carboxylate of Asp47 forms an ion pair with the guanidinium group of Arg142, which also interacts with the side-chain carboxylate of Asp89, together forming an ion-pair network. Higher frequencies of prolines and branched amino acids are also observed in the loops. All of the 13 prolines and the 11 branched amino acids in TTHA1873 are in loops. These prolines and branched amino acids might contribute to its thermal stability. The dimeric interface is formed by 37 residues, predominantly in β1 and β9 of both chains. Further, only 5% of residues are involved in forming the interface. A total of 23 hydrogen bonds at the interface are formed by Leu22, Ser25, Ser26, Gly28, Glu29, Ser38, Thr39, Asp40, Arg55, Pro165, Thr167, Phe169, Glu170, Thr171 and Thr173. Moreover, the dimeric interface is cemented by the presence of calcium ions, further enhancing its thermostability (Fig. 5b).
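The count of five ion pairs follows from the listed networks once the three-residue network is read with Arg142 bridging Asp47 and Asp89, as the text describes. The sketch below makes this bookkeeping explicit. It is only an illustration: the residue classification by three-letter code and the function names are assumptions, and no distance criterion is applied because no coordinates are used.

```python
from typing import List, Tuple

ACIDIC = {"Asp", "Glu"}
BASIC = {"Arg", "Lys"}

# Networks from the text; Arg142 is placed between the two aspartates it bridges
NETWORKS: List[List[str]] = [
    ["Asp47", "Arg142", "Asp89"],
    ["Glu53", "Arg55"],
    ["Glu93", "Arg138"],
    ["Glu91", "Lys86"],
]


def is_ion_pair(res_a: str, res_b: str) -> bool:
    """True when one residue is acidic and the other basic (by residue name only)."""
    types = {res_a[:3], res_b[:3]}
    return bool(types & ACIDIC) and bool(types & BASIC)


def pairs_in_network(network: List[str]) -> List[Tuple[str, str]]:
    """Ion pairs between consecutive residues of a network."""
    return [(a, b) for a, b in zip(network, network[1:]) if is_ion_pair(a, b)]


all_pairs = [pair for network in NETWORKS for pair in pairs_in_network(network)]
for acid_or_base, partner in all_pairs:
    print(f"{acid_or_base} - {partner}")
print(f"Total ion pairs: {len(all_pairs)}")
```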

Figure 5.

Structural features of TTHA1873 showing ion-pair and dimeric interface-interacting residues. (a) TTHA1873 structure showing ion-pair networks. (b) Calcium-bound interface residues of dimeric TTHA1873.

3.5. Sequence and structural homolog search for possible function

A PSI-BLAST (Altschul et al., 1997) search with the sequence of the hypothetical protein revealed only two DUF11 (domain of unknown function) domain-containing protein sequences with full-length homology, WP_201350985.1 from T. thermophilus and WP_176758238.1 from T. arciformis, with sequence identities of 96.02% and 92.05% and E-values below 1 × 10⁻¹⁰³ (Fig. 2e). No structural or functional information is available for this protein family. The homologous sequences from T. arciformis are annotated as hypothetical conserved-repeat domain-containing proteins. None of the sequence homologs have been functionally characterized. The DUF11 domain is present in multiple copies in several archaeal proteins. A search for domains/motifs using Pfam and the Conserved Domain Database suggested that the protein is related to carbohydrate-binding domains and CRISPR-associated proteins.

A search for structural homologs was performed using the DALI server (Holm & Rosenström, 2010). The proteins most structurally similar to TTHA1873 are Streptococcus gordonii agglutinin receptor protein (PDB entry 2wza, Z-score = 14.6, pairwise backbone r.m.s.d. of 3.0 Å over 141 Cα atoms; Forsgren et al., 2010) and S. pneumoniae major pilin, a cell-wall surface anchor family protein (PDB entry 2y1v, Z-score = 12.8, pairwise backbone r.m.s.d. of 2.5 Å over 131 Cα atoms; El Mortaji et al., 2012), both of which belong to the antigen I/II family of proteins (Supplementary Fig. S3). The second family of structurally similar proteins are lipoxygenases (PDB entry 5fx8, Z-score = 12.1, pairwise backbone r.m.s.d. of 3.0 Å over 125 Cα atoms; Chen et al., 2016). The third protein family similar to TTHA1873 are transglutaminases (PDB entry 2q3z, Z-score = 10.6, pairwise backbone r.m.s.d. of 2.6 Å over 101 Cα atoms; Pinkas et al., 2007). The sequence identity with all known structures is below 12%. Even though the sequence identity with other proteins is very low, the backbone structure is moderately similar to those of the cell-surface proteins; however, the structural homologs are considerably longer than TTHA1873 (Supplementary Fig. S3). The structural homology searches suggest that TTHA1873 could be an agglutination protein or a cell-surface protein with potential sugar-binding ability. TTHA1873 showed visible agglutination of RBCs only at high protein concentrations (Figs. 6a and 6b).
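The DALI hits listed above can be collected into a small data structure to rank them and to estimate how much of TTHA1873 each alignment covers. In the sketch below the Z-scores, r.m.s.d. values and aligned-residue counts are taken from the text, whereas the class name, the field names and the use of the full 176-residue sequence length as the denominator for coverage are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

QUERY_LENGTH = 176  # residues in TTHA1873, from the Introduction


@dataclass(frozen=True)
class DaliHit:
    pdb_id: str
    description: str
    z_score: float
    rmsd: float      # Å
    n_aligned: int   # aligned Cα positions


HITS: List[DaliHit] = [
    DaliHit("2q3z", "transglutaminase", 10.6, 2.6, 101),
    DaliHit("2wza", "S. gordonii agglutinin receptor protein", 14.6, 3.0, 141),
    DaliHit("5fx8", "lipoxygenase", 12.1, 3.0, 125),
    DaliHit("2y1v", "S. pneumoniae major pilin", 12.8, 2.5, 131),
]


def coverage(hit: DaliHit, query_length: int = QUERY_LENGTH) -> float:
    """Fraction of the query sequence covered by the structural alignment."""
    return hit.n_aligned / query_length


# Rank hits from most to least significant by Z-score
ranked = sorted(HITS, key=lambda hit: hit.z_score, reverse=True)
for hit in ranked:
    print(f"{hit.pdb_id}  Z = {hit.z_score:4.1f}  r.m.s.d. = {hit.rmsd:.1f} Å  "
          f"coverage = {100 * coverage(hit):.0f}%  {hit.description}")
```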

Figure 6.

(a, b) Preliminary hemagglutination assay of TTHA1873. Incubation of serially diluted TTHA1873 with RBC in PBS. Red-colored wells indicate positive hemagglutination, whereas red blood clots show the absence of hemagglutination.

4. Discussion

The annotation of hypothetical proteins aids in the discovery of novel structures and novel functions, which allow them to be classified into protein pathways and cascades (Shahbaaz et al., 2013; Mohan & Venugopal, 2012). The annotation of proteins and their functions should allow a comprehensive understanding of the physiology of the organisms. However, most genome-wide functional annotations have been obtained using in silico methods (Meier et al., 2013). It is well documented that only 50% of the proteins discovered in genome projects have reliable, functional annotations, while the remainder have unknown, uncertain or incorrect functional annotations (Gerlt et al., 2015; Schnoes et al., 2009). It has clearly been shown that closely related proteins with high sequence similarity may have different functionalities, and unrelated proteins may have similar functions. Thus, in vitro experiments are necessary to establish the exact roles of structurally characterized protein molecules.

A comparison of the previously available structures was performed using DALI, which suggests that the protein may have carbohydrate-binding properties. Structure-based protein function prediction using graph convolutional networks (DeepFRI; Gligorijević et al., 2021) also predicted the protein to be a carbohydrate-binding protein. The jelly-roll topology of the protein also suggests a role in carbohydrate binding. Furthermore, a hemagglutination assay shows visible agglutination of RBCs at higher protein concentrations. Crystal-soaking experiments with simple sugars such as glucose, ribose and galactose did not show any electron density for the sugar molecules. However, the affinity of the protein towards more complex carbohydrates will be tested in the future.

A total of five ion-pair interactions contribute to thermal stability and are further reinforced by the presence of a larger number of prolines, branched amino acids in loops and dimeric interface stabilization by metal ions. Additionally, the coordination compound (HgI4K2) is held by the Pro-Asp-X-X-X-Gly-Pro and Ser-Ala-Asp-Val sequence motifs. The interaction of mercury with Asp97 and Ser110 suggests that HgI4K2 may be a useful choice of heavy atom for proteins that are rich in prolines and negatively charged residues.

5. Conclusion

The reported heavy-atom derivative structure (PDB entry 7wwn) represents the first in which mercury is tetrahedrally coordinated by iodine. This complex structure reveals interesting new structural features that may be helpful in the choice of heavy atoms to obtain phases for unknown structures. TTHA1873 is a calcium-bound metalloprotein that shows similarity to cell-surface proteins. The protein showed visible agglutination of red blood cells at higher concentrations. Analysis of the structural data may provide a framework to deduce and establish the molecular function of the TTHA1873 protein.

6. Related literature

The following references are cited in the supporting information for this article: Arias & DuBois (2017), Geigenmüller et al. (2002) and Madeira et al. (2022).

Supplementary Material

PDB reference: TTHA1873, 7wrk

PDB reference: 7wwo

Supporting information including Supplementary Figures and Tables. DOI: 10.1107/S2053230X22008457/va5048sup1.pdf

Acknowledgments

The authors are grateful for the facilities offered by the Centre of Excellence in Structural Biology and Biocomputing, funded by the Department of Biotechnology (DBT), Government of India, and the Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India. We sincerely thank the X-ray and mass-spectrometry facilities at the Molecular Biophysics Unit, Indian Institute of Science, Bangalore, the BM14 beamline at the European Synchrotron Radiation Facility (ESRF), Grenoble, France, the XRD2 beamline at the Elettra Sincrotrone, Trieste, Italy and the BL44XU beamline at SPring-8, RIKEN, Japan. The authors declare no conflicts of interest.

References

  1. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). J. Mol. Biol. 215, 403–410.
  2. Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Nucleic Acids Res. 25, 3389–3402.
  3. Arias, C. & DuBois, R. (2017). Viruses, 9, 15.
  4. Battye, T. G. G., Kontogiannis, L., Johnson, O., Powell, H. R. & Leslie, A. G. W. (2011). Acta Cryst. D67, 271–281.
  5. Bork, P. (2000). Genome Res. 10, 398–400.
  6. Burg, B. van den (2003). Curr. Opin. Microbiol. 6, 213–218.
  7. Chakravarty, S. & Varadarajan, R. (2002). Biochemistry, 41, 8152–8161.
  8. Chaudhary, S. K., Jeyakanthan, J. & Sekar, K. (2013). Acta Cryst. F69, 118–121.
  9. Chen, Y., Wennman, A., Karkehabadi, S., Engström, Å. & Oliw, E. H. (2016). J. Lipid Res. 57, 1574–1588.
  10. Cowtan, K. (2006). Acta Cryst. D62, 1002–1011.
  11. DeLano, W. L. (2002). PyMOL. http://www.pymol.org.
  12. Ebihara, A., Yao, M., Masui, R., Tanaka, I., Yokoyama, S. & Kuramitsu, S. (2006). Protein Sci. 15, 1494–1499.
  13. Eisenstein, E., Gilliland, G. L., Herzberg, O., Moult, J., Orban, J., Poljak, R. J., Banerjei, L., Richardson, D. & Howard, A. J. (2000). Curr. Opin. Biotechnol. 11, 25–30.
  14. El Mortaji, L., Contreras-Martel, C., Moschioni, M., Ferlenghi, I., Manzano, C., Vernet, T., Dessen, A. & Di Guilmi, A. (2012). Biochem. J. 441, 833–843.
  15. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. (2010). Acta Cryst. D66, 486–501.
  16. Forsgren, N., Lamont, R. J. & Persson, K. (2010). J. Mol. Biol. 397, 740–751.
  17. Galperin, M. Y. & Koonin, E. V. (2004). Nucleic Acids Res. 32, 5452–5463.
  18. Geigenmüller, U., Ginzton, N. H. & Matsui, S. M. (2002). J. Gen. Virol. 83, 1691–1695.
  19. Gerlt, J. A., Bouvier, J. T., Davidson, D. B., Imker, H. J., Sadkhin, B., Slater, D. R. & Whalen, K. L. (2015). Biochim. Biophys. Acta, 1854, 1019–1037.
  20. Gligorijević, V., Renfrew, P. D., Kosciolek, T., Leman, J. K., Berenberg, D., Vatanen, T., Chandler, C., Taylor, B. C., Fisk, I. M., Vlamakis, H., Xavier, R. J., Knight, R., Cho, K. & Bonneau, R. (2021). Nat. Commun. 12, 3168.
  21. Hatti, K., Gulati, A., Srinivasan, N. & Murthy, M. R. N. (2016). Acta Cryst. D72, 1081–1089.
  22. Henne, A., Brüggemann, H., Raasch, C., Wiezer, A., Hartsch, T., Liesegang, H., Johann, A., Lienard, T., Gohl, O., Martinez-Arias, R., Jacobi, C., Starkuviene, V., Schlenczeck, S., Dencker, S., Huber, R., Klenk, H. P., Kramer, W., Merkl, R., Gottschalk, G. & Fritz, H. J. (2004). Nat. Biotechnol. 22, 547–553.
  23. Holm, L. & Rosenström, P. (2010). Nucleic Acids Res. 38, W545–W549.
  24. Ijaq, J., Chandrasekharan, M., Poddar, R., Bethi, N. & Sundararajan, V. S. (2015). Front. Genet. 6, 119.
  25. Jo, S., Vargyas, M., Vasko-Szedlar, J., Roux, B. & Im, W. (2008). Nucleic Acids Res. 36, W270–W275.
  26. Keegan, R. M. & Winn, M. D. (2008). Acta Cryst. D64, 119–124.
  27. Krissinel, E. (2015). Nucleic Acids Res. 43, W314–W319.
  28. Kumar, S., Tsai, C. J. & Nussinov, R. (2000). Protein Eng. Des. Sel. 13, 179–191.
  29. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). J. Appl. Cryst. 26, 283–291.
  30. León, E., Yee, A., Ortíz, A. R., Santoro, J., Rico, M. & Jiménez, M. A. (2007). Protein Sci. 16, 2278–2286.
  31. Long, F., Vagin, A. A., Young, P. & Murshudov, G. N. (2008). Acta Cryst. D64, 125–132.
  32. Lovell, S. C., Davis, I. W., Arendall, W. B., de Bakker, P. I. W., Word, J. M., Prisant, M. G., Richardson, J. S. & Richardson, D. C. (2003). Proteins, 50, 437–450.
  33. Madeira, F., Park, Y. M., Lee, J., Buso, N., Gur, T., Madhusoodanan, N., Basutkar, P., Tivey, A. R. N., Potter, S. C., Finn, R. D. & Lopez, R. (2019). Nucleic Acids Res. 47, W636–W641.
  34. Madeira, F., Pearce, M., Tivey, A. R. N., Basutkar, P., Lee, J., Edbali, O., Madhusoodanan, N., Kolesnikov, A. & Lopez, R. (2022). Nucleic Acids Res. 50, W276–W279.
  35. Manjunath, K., Kanaujia, S. P., Kanagaraj, S., Jeyakanthan, J. & Sekar, K. (2013). Int. J. Biol. Macromol. 53, 7–19.
  36. Marchler-Bauer, A., Lu, S., Anderson, J. B., Chitsaz, F., Derbyshire, M. K., DeWeese-Scott, C., Fong, J. H., Geer, L. Y., Geer, R. C., Gonzales, N. R., Gwadz, M., Hurwitz, D. I., Jackson, J. D., Ke, Z., Lanczycki, C. J., Lu, F., Marchler, G. H., Mullokandov, M., Omelchenko, M. V., Robertson, C. L., Song, J. S., Thanki, N., Yamashita, R. A., Zhang, D., Zhang, N., Zheng, C. & Bryant, S. H. (2011). Nucleic Acids Res. 39, D225–D229.
  37. Meier, M., Sit, R. V. & Quake, S. R. (2013). Proc. Natl Acad. Sci. USA, 110, 477–482.
  38. Mistry, J., Chuguransky, S., Williams, L., Qureshi, M., Salazar, G. A., Sonnhammer, E. L. L., Tosatto, S. C. E., Paladin, L., Raj, S., Richardson, L. J., Finn, R. D. & Bateman, A. (2021). Nucleic Acids Res. 49, D412–D419.
  39. Mohan, R. & Venugopal, S. (2012). Bioinformation, 8, 722–728.
  40. Murshudov, G. N., Skubák, P., Lebedev, A. A., Pannu, N. S., Steiner, R. A., Nicholls, R. A., Winn, M. D., Long, F. & Vagin, A. A. (2011). Acta Cryst. D67, 355–367.
  41. Pinkas, D. M., Strop, P., Brunger, A. T. & Khosla, C. (2007). PLoS Biol. 5, e327.
  42. Schnoes, A. M., Brown, S. D., Dodevski, I. & Babbitt, P. C. (2009). PLoS Comput. Biol. 5, e1000605.
  43. Shahbaaz, M., Hassan, M. I. & Ahmad, F. (2013). PLoS One, 8, e84263.
  44. Sivaji, N., Abhinav, K. V. & Vijayan, M. (2017). Acta Cryst. F73, 300–304.
  45. Sivashankari, S. & Shanmughavel, P. (2006). Bioinformation, 1, 335–338.
  46. Skubák, P. & Pannu, N. S. (2013). Nat. Commun. 4, 2777.
  47. Sterner, R. & Liebl, W. (2001). Crit. Rev. Biochem. Mol. Biol. 36, 39–106.
  48. Sugahara, M., Asada, Y., Ayama, H., Ukawa, H., Taka, H. & Kunishima, N. (2005). Acta Cryst. D61, 1302–1305.
  49. Sugahara, M., Asada, Y., Shimada, H., Taka, H. & Kunishima, N. (2009). J. Appl. Cryst. 42, 540–544.

Articles from Acta Crystallographica. Section F, Structural Biology Communications are provided here courtesy of International Union of Crystallography
