Bioinformatics and Biology Insights. 2023 Apr 2;17:11779322231164828. doi: 10.1177/11779322231164828

Comparative In Silico Analysis and Functional Characterization of TANK-Binding Kinase 1–Binding Protein 1

Humaira Aziz Sawal 1, Shagufta Nighat 1, Tanzeela Safdar 1, Laiba Anees 1
PMCID: PMC10074619  PMID: 37032976

Abstract

Protein modelling plays a vital role in the drug discovery process. TANK-binding kinase 1–binding protein 1 (TBK1-binding protein 1) is an adapter protein encoded by the TBKBP1 gene in Homo sapiens. It is found in the lungs, small intestine, leukocytes, heart, placenta, muscle, kidney, and brain, and at lower levels in the thymus. It has several protein-binding sites through which TBK1 and IKBKE bind, supporting immunomodulatory, antiproliferative, and antiviral innate immune functions that involve the release of different types of interferons. Our study predicts a comparative model of its 3-dimensional (3D) structure using different bioinformatics tools, which should be helpful for future studies. The reactivity and stability of the protein were evaluated physicochemically with ProtParam, its domains were determined with Pfam, and its secondary structure was predicted with SOPMA. Robetta (an ab initio approach), I-TASSER, and AlphaFold were used for 3D structure prediction, and the models were validated using the SAVESv6.0 (PROCHECK) server. The best 3D structure of TBK1-binding protein 1 was predicted using Robetta. Having unveiled the 3D structure of this novel protein, we conclude that the structure will help to identify roles of the protein beyond antiviral innate immunity. By introducing torsional changes into the 3D structure, researchers may be able to determine whether the protein is involved in any disease; according to previous studies, it has not been associated with any disease.

Keywords: AlphaFold, I-TASSER, Robetta, TBK1-binding protein 1

Introduction

Predicting the 3-dimensional (3D) structure of a protein from its amino acid sequence has been a major challenge in computational biophysics for decades. Recent studies have also addressed the inverse problem, in which an amino acid sequence is designed to fold into a specified 3D structure. This has drawn attention to the rational engineering of proteins, which is useful in biotechnology and the pharmaceutical sciences. Methods for predicting and designing protein structures have advanced greatly. New algorithms for designing protein folds and protein-protein interactions have been used to engineer novel assemblies and to design fluorescent proteins with improved properties, as well as signalling proteins with improved therapeutic potential. 1

Proteins are important components of life, and understanding their 3D structure allows their physical, biological, and biochemical properties to be understood more efficiently. About 100 000 3D structures of proteins have been determined to date,2-5 but this is a very small fraction of the enormous number of known protein sequences.6,7

TBK1-binding protein 1 is present in Homo sapiens and has a sequence of 615 amino acids, with the primary accession number A7MCY6 in UniProt. 8 It is considered a major regulator of type I interferon, and it interacts with different proteins such as NAP1, TANK, and SINTBAD. It is called an adapter protein because it has several binding sites through which it interacts with other proteins such as TBK1 and IKBKE, which perform their functions in antiviral innate immunity. 9 It is present in the lungs, small intestine, leukocytes, heart, placenta, muscle, kidney, and brain, and at lower levels in the thymus. It is encoded by the TBKBP1 gene in Homo sapiens. 10 It mostly forms homodimers but sometimes forms heterodimers with NAP1. 11

In the past, determining the 3D structure of a single protein was laborious, taking from about a month to years, and relied on experimental techniques such as x-ray crystallography, which are expensive and time-consuming. To fill this gap, fast and accurate computational methods for 3D structure prediction are needed. Owing to rapid progress in structural bioinformatics, fast and free-of-cost techniques have been introduced, and notable servers such as I-TASSER, AlphaFold, and Robetta are easily accessible to everyone. 12 I-TASSER is an approach for predicting 3D protein structures and structure-based functions. It first identifies structural templates for the protein from the PDB (Protein Data Bank) by a multiple threading approach, and models are then constructed by iterative template-based fragment assembly simulations. 13

AlphaFold is an artificial intelligence system that helps in determining a protein’s 3D structure from its amino acid sequence. It has a competitive accuracy. 14

Robetta is an online server providing automated protein structure prediction and analysis. 15 Robetta was used for the determination of 3D structure, and the models were validated using the SAVESv6.0 (PROCHECK) server. 16

Methodology

Retrieval of protein sequences

The sequence of TANK-binding kinase 1–binding protein 1 was retrieved in FASTA format from the protein database of NCBI (National Center for Biotechnology Information) under accession NP_001381684.1. 17 The UniProt ID is A7MCY6. 8
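As an illustration of what a FASTA record looks like once retrieved, the following minimal Python sketch parses FASTA text into header and sequence pairs. The function name `parse_fasta` and the toy record are hypothetical and are not part of the original workflow; the toy sequence is not the TBK1-binding protein 1 sequence.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them back together.
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical toy record for illustration only (not the real sequence).
example = """>toy_protein illustrative record
MKTAYIAKQR
QISFVKSHFS
"""
records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

For the protein studied here, the parsed sequence would be expected to contain 615 residues, which is a quick sanity check on the download.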

Depiction of physiochemical properties

The ProtParam server from Expasy 18 was used to characterize the physiochemical properties of the protein sequence and to define the amino acid composition of the targeted protein. The isoelectric point (pI), the aliphatic index (AI), the grand average of hydropathy (GRAVY), the total number of negatively (−R) and positively (+R) charged residues, the extinction coefficients (EC), and the instability index (II) were calculated. Pfam analysis 19 was performed to characterize the protein in relation to its protein family.
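Two of the quantities listed above can be reproduced from a sequence with simple arithmetic. The sketch below is not ProtParam itself; it assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula, AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percentage of each residue. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percentages of Ala, Val, Ile, and Leu."""
    n = len(seq)

    def pct(aa):
        return 100.0 * seq.count(aa) / n

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))


def charged_counts(seq):
    """Return (negative, positive) counts as (Asp + Glu, Arg + Lys)."""
    return seq.count("D") + seq.count("E"), seq.count("R") + seq.count("K")


# Hypothetical toy sequence, used only to exercise the functions.
toy = "AVILDEKR"
print(round(gravy(toy), 4), round(aliphatic_index(toy), 2), charged_counts(toy))
```

Applied to the full 615-residue sequence, these functions should give values close to those reported by ProtParam, although small differences in rounding are possible.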

Determination of secondary structure

The secondary structural characteristics of the protein were determined using the SOPMA (Self-Optimized Prediction Method with Alignment) 20 method. It uses the amino acid sequence to characterize secondary structures such as alpha helices, beta sheets, beta turns, and random coils.

Building of protein model and its evaluation

The 3D structure of the targeted protein was not available in the PDB. Therefore, web tools such as Robetta, I-TASSER, and AlphaFold 21 were used to predict the 3D model of the protein. The SAVESv6.0 (PROCHECK) 22 server was used for model validation (Table 1).

Table 1.

Structure validation of models predicted through Robetta, AlphaFold, and I-TASSER.

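| Tool | PDB structure number | ERRAT | Ramachandran: most favourable region, % | Ramachandran: allowed region, % | Ramachandran: disallowed region, % |
|---|---|---|---|---|---|
| Robetta | 1 | 84.0604 | 90.8 | 8.6 | 0.6 |
| Robetta | 2 | 84.0871 | 87.4 | 10.6 | 2.0 |
| Robetta | 3 | 85.5025 | 88.8 | 9.6 | 1.6 |
| Robetta | 4 | 81.9835 | 86.6 | 12.2 | 1.2 |
| Robetta | 5 |  | 83.8 | 15 | 1.2 |
| AlphaFold | 1 | 94.6809 | 77.6 | 20.4 | 2.0 |
| I-TASSER | 1 | 75.6388 | 65.3 | 31.4 | 3.2 |
| I-TASSER | 2 | 69.0909 | 53.7 | 41.1 | 5.2 |
| I-TASSER | 3 | 73.4558 | 57.9 | 38.1 | 4.0 |
| I-TASSER | 4 | 69.9336 | 56.7 | 40.1 | 3.2 |
| I-TASSER | 5 | 80.2326 | 65.7 | 28.8 | 5.4 |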

Abbreviation: PDB, Protein Data Bank.
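The comparison in Table 1 can be sketched programmatically. The Python snippet below transcribes the table values and ranks the models by two criteria; the function names are hypothetical, the tie-breaking rule is an assumption, and the ERRAT value for Robetta model 5 is left as `None` because it is not given in the table.

```python
# (tool, model number, ERRAT, most favoured %, allowed %, disallowed %)
models = [
    ("Robetta", 1, 84.0604, 90.8, 8.6, 0.6),
    ("Robetta", 2, 84.0871, 87.4, 10.6, 2.0),
    ("Robetta", 3, 85.5025, 88.8, 9.6, 1.6),
    ("Robetta", 4, 81.9835, 86.6, 12.2, 1.2),
    ("Robetta", 5, None, 83.8, 15.0, 1.2),  # ERRAT missing in Table 1
    ("AlphaFold", 1, 94.6809, 77.6, 20.4, 2.0),
    ("I-TASSER", 1, 75.6388, 65.3, 31.4, 3.2),
    ("I-TASSER", 2, 69.0909, 53.7, 41.1, 5.2),
    ("I-TASSER", 3, 73.4558, 57.9, 38.1, 4.0),
    ("I-TASSER", 4, 69.9336, 56.7, 40.1, 3.2),
    ("I-TASSER", 5, 80.2326, 65.7, 28.8, 5.4),
]


def best_by_ramachandran(rows):
    """Highest most-favoured percentage; ties broken by fewer disallowed residues."""
    return max(rows, key=lambda r: (r[3], -r[5]))


def best_by_errat(rows):
    """Highest ERRAT score among models that have one."""
    scored = [r for r in rows if r[2] is not None]
    return max(scored, key=lambda r: r[2])


print("Best Ramachandran:", best_by_ramachandran(models)[:2])
print("Best ERRAT:", best_by_errat(models)[:2])
```

The two criteria need not agree: in Table 1 the highest proportion of residues in the most favourable region belongs to Robetta model 1, whereas the highest ERRAT value belongs to the AlphaFold model.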

Results

The sequence of TBK1-binding protein 1 was retrieved from the NCBI protein database; the protein is encoded by 9 exons and comprises 615 amino acids. Homologous sequences were searched with BLASTp against all PDB entries. The homology search showed that TBK1-binding protein 1 had >80% homology with Homo sapiens sequences.

Physiochemical properties

The amino acid sequence of the protein was analysed using Expasy’s ProtParam tool, and its physiochemical properties were calculated. The sequence consists of 615 amino acids, with a molecular weight of 67 702.10 Da. The isoelectric point informs estimates of protein solubility and electrophoretic separation. 23 The isoelectric point of the studied protein was 5.62, which is slightly acidic and moderately close to neutral pH, suggesting average protein solubility and electrophoretic separation. The AI is the relative volume occupied by aliphatic side chains in a protein, and it indicates how stable a protein is over a wide range of temperatures. 24 The AI of TBK1-binding protein 1 is 70.37. The stability of a protein is estimated by the II value; a value of less than 40 means that a protein is predicted to be stable. 25 The II value of our protein was computed to be 90.54, which classifies the protein as unstable. Proteins with low GRAVY indices interact more efficiently with water molecules, whereas proteins with a GRAVY score above 0 are more likely to be hydrophobic. 26 The GRAVY score of TBK1-binding protein 1 is −0.683, which indicates that it is hydrophilic.
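The interpretation rules used in this paragraph can be written down as a small Python sketch. The thresholds for II (40) and GRAVY (0) are those stated above; the pI rule (a pI below 7 implies a net negative charge at pH 7) is a general property added here as an assumption, and the function name `interpret` is hypothetical.

```python
def interpret(instability_index, gravy, pi):
    """Apply the threshold rules described in the text to ProtParam outputs."""
    return {
        # II below 40 is taken to indicate a stable protein.
        "stability": "stable" if instability_index < 40 else "unstable",
        # GRAVY above 0 suggests a hydrophobic protein, otherwise hydrophilic.
        "hydropathy": "hydrophobic" if gravy > 0 else "hydrophilic",
        # Assumption: a pI below 7 means a net negative charge at pH 7.
        "net_charge_at_pH7": "negative" if pi < 7 else "positive or neutral",
    }


# Values reported for TBK1-binding protein 1.
result = interpret(instability_index=90.54, gravy=-0.683, pi=5.62)
print(result)
```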

Functional categorization

The functional characterization of the protein was carried out by Pfam analysis. A Tbk1/Ikki-binding domain (TBD) was found in the sequence of this protein. The TBD is a 40-amino acid domain that is able to bind kinases. When the families and domains of a protein are known, its characteristics can be inferred more easily by comparison with other proteins that share similar domains and families.

Prediction of protein secondary structure

The protein’s secondary structure was predicted using SOPMA. According to this analysis, the protein is dominated by alpha helices and random coils, with smaller fractions of extended strands and beta turns. For the 615 amino acids, SOPMA reported 45.37% alpha helix, 43.41% random coil, 7.15% extended strand, and 4.07% beta turn.
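As a consistency check, the reported percentages can be converted back into approximate residue counts for the 615-residue sequence. In the Python sketch below, the 4.07% class is treated as SOPMA's beta-turn state, which is an assumption about the reported figure, and the variable names are hypothetical.

```python
SEQUENCE_LENGTH = 615

# Percentages reported by SOPMA; the "beta turn" label for the 4.07%
# class is an assumption about the source figure.
percentages = {
    "alpha helix": 45.37,
    "extended strand": 7.15,
    "beta turn": 4.07,
    "random coil": 43.41,
}

# Convert each percentage into the nearest whole number of residues.
counts = {
    state: round(SEQUENCE_LENGTH * pct / 100.0)
    for state, pct in percentages.items()
}

print(counts)
print("total percentage:", round(sum(percentages.values()), 2))
print("total residues:", sum(counts.values()))
```

The percentages sum to 100% and the rounded counts sum to 615, so the reported composition is internally consistent.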

Protein model building and validation

The 3D structure of the protein was modelled using Robetta (an ab initio approach), 15 I-TASSER, and AlphaFold (Figure 1A to C).

Figure 1.


Three-dimensional (3D) structure of the targeted protein predicted through (A) I-TASSER, (B) Robetta, and (C) AlphaFold.

The residues were categorized in the Ramachandran plot analysis according to the regions of the plot in which they fall. The red areas of the plot show the most favoured regions, whereas the yellow areas show the allowed regions. Ramachandran plots were generated by PROCHECK for the models developed using Robetta, I-TASSER, and AlphaFold. Using these Ramachandran plot calculations, the stereochemical quality of the predicted protein models was assessed following the refinement process. For the studied protein, more than 85% of the residues in the best-scoring model fall in the most favoured region, which indicates the accuracy and high quality of the modelled structure. 27
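A Ramachandran plot places each residue according to its backbone dihedral angles phi and psi. The Python sketch below shows how a single dihedral angle can be computed from four atomic positions; it is a generic geometric calculation, not the PROCHECK implementation, and the function names and test coordinates are hypothetical. For residue i, phi is the dihedral over C(i−1), N(i), CA(i), C(i), and psi is the dihedral over N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: a cis (0 degree) and a 90 degree arrangement.
print(dihedral((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)))
print(dihedral((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 1.0)))
```

Computing phi and psi in this way for every residue of a model and counting how many pairs fall in each region gives the percentages summarized in Table 1.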

The target protein model predicted through I-TASSER has 3.2% of residues in the disallowed region and an ERRAT value of 75.6388; it was validated through the Ramachandran plot using SAVESv6.0 (Figure 2A).

Figure 2.


The Ramachandran plot created by the PROCHECK server: (A) I-TASSER–predicted modelled protein, (B) Robetta-predicted modelled protein, and (C) AlphaFold-predicted modelled protein.

The model predicted through Robetta has 0.6% of residues in the disallowed region and an ERRAT value of 84.0604; it was also validated through the Ramachandran plot using SAVESv6.0 (Figure 2B). The 3D structure of the protein predicted through AlphaFold has 2.0% of residues in the disallowed region and an ERRAT value of 94.6809 (Figure 2C).

SAVESv6.0 was used to evaluate the ERRAT value and Ramachandran plot of all the models. The model with the best Ramachandran plot was the one predicted by Robetta.

Discussion

Proteins are important components of life, and knowledge of their structure facilitates understanding of their function. Structures of about 100 000 proteins have been determined experimentally so far, but this represents a small fraction of the billions of known protein sequences. The 3D structure prediction of a protein depends solely on its amino acid sequence. 28 Understanding a protein’s structure enables researchers to plan site-directed alterations with the goal of altering function.

In computational biology, information can be reproduced efficiently because of the available databases. Efficient bioinformatics tools and methodologies have also been developed to study the 3D structures of proteins and their functions, which is an applied approach. Our study focused on using various bioinformatics tools to build a comparative structure of TBK1-binding protein 1.

The gene encoding TBK1-binding protein 1 is highly expressed in mature natural killer T cells, and the protein interacts physically with the protein kinase TBK1. 29

TBK1-binding protein 1 forms complexes with different proteins such as NAP1 and TBK1 and performs various biological functions, mainly in antiviral innate immunity, autophagy, phosphorylation, control of viral infections, and signalling pathways. All of these functions are performed through the regulation of kinase activity. 30 Prediction of the 3D structure of TBK1-binding protein 1 through different online tools (I-TASSER, Robetta, and AlphaFold) provides insight into its structural and biological functions.

In this study, AlphaFold, Robetta, and I-TASSER were the preferred techniques for predicting the 3D structure of a structurally uncharacterized protein. Once the structures had been predicted, they were validated using SAVESv6.0. The best models were then shortlisted after further refinement. AlphaFold makes predictions about the protein’s underlying physical structure and assesses which predictions are realistic using internal confidence measures.

The application of bioinformatics is a practical way to study proteins in detail. Briefly, these advancements help scientists to understand important proteins. Despite the exciting achievements of bioinformatics in biotechnology, there is still a long way to go. Handling proteins is very challenging because of their complexity, and there is a critical need for effective bioinformatics tools that provide good coverage of plant proteins. To achieve this, advanced algorithm development is important to enable data mining, comparison, and analysis. Therefore, bioinformaticians and experts with a mathematical background could play an important role in bringing practical approaches and knowledge into bioinformatics, not only for the advancement of 3D structure determination but also for further expression analysis. The achievements of bioinformatics are not only beneficial to the field of biotechnology but will also contribute enormously to the future of humanity.

After unveiling the 3D structure of this novel protein, we concluded that the structure will help to identify roles of the protein beyond antiviral innate immunity. In summary, the best 3D structure of TBK1-binding protein 1 was predicted using Robetta.

Footnotes

Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

  • 1. Kuhlman B, Bradley P. Advances in protein structure prediction and design. Nat Rev Mol Cell Biol. 2019;20:681-697. doi: 10.1038/s41580-019
  • 2. Thompson MC, Yeates TO, Rodriguez JA. Advances in methods for atomic resolution macromolecular structure determination. F1000Res. 2020;9:F1000 Faculty Rev-667. doi: 10.12688/f1000research.25097.1
  • 3. Bai XC, McMullan G, Scheres SH. How cryo-EM is revolutionizing structural biology. Trends Biochem Sci. 2015;40:49-57. doi: 10.1016/j.tibs.2014.10.005
  • 4. Jaskolski M, Dauter Z, Wlodawer A. A brief history of macromolecular crystallography, illustrated by a family tree and its Nobel fruits. FEBS J. 2014;281:3985-4009. doi: 10.1111/febs.12796
  • 5. Wüthrich K. The way to NMR structures of proteins. Nat Struct Biol. 2001;8:923-925. doi: 10.1038/nsb1101-923
  • 6. wwPDB consortium. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Res. 2019;47:D520-D528. doi: 10.1093/nar/gky949
  • 7. Mitchell AL, Almeida A, Beracochea M, et al. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res. 2020;48:D570-D578.
  • 8. UniProt. https://www.uniprot.org
  • 9. Goncalves A, Bürckstümmer T, Dixit E, et al. Functional dissection of the TBK1 molecular network. PLoS ONE. 2011;6:e23971. doi: 10.1371/journal.pone.0023971
  • 10. Ryzhakov G, Randow F. SINTBAD, a novel component of innate antiviral immunity, shares a TBK1-binding domain with NAP1 and TANK. EMBO J. 2007;26:3180-3190. doi: 10.1038/sj.emboj.7601743
  • 11. Tooley AS, Kazyken D, Bodur C, Gonzalez IE, Fingar DC. The innate immune kinase TBK1 directly increases mTORC2 activity and downstream signaling to Akt. J Biol Chem. 2021;297:100942. doi: 10.1016/j.jbc.2021.100942
  • 12. Yang J, Zhang Y. I-TASSER server: new development for protein structure and function predictions. Nucleic Acids Res. 2015;43:W174-W181. doi: 10.1093/nar/gkv342
  • 13. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I-TASSER suite: protein structure and function prediction. Nat Methods. 2015;12:7-8. doi: 10.1038/nmeth.3213
  • 14. Varadi M, Anyango S, Deshpande M, et al. AlphaFold protein structure database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 2022;50:D439-D444. doi: 10.1093/nar/gkab1061
  • 15. Kim DE, Chivian D, Baker D. Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res. 2004;32:W526-W531. doi: 10.1093/nar/gkh468
  • 16. Kumar KP, Vignesh P, Shanthinie A, et al. In-silico analysis and functional characterization of Rhizoctonia solani effector proteins. Int J Plant Soil Sci. 2022;34:1110-1117. doi: 10.9734/ijpss/2022/v34i2231474
  • 17. National Library of Medicine. Protein database. https://www.ncbi.nlm.nih.gov/protein
  • 18. ExPASy – ProtParam tool. https://web.expasy.org/protparam/
  • 19. https://www.ebi.ac.uk/Tools/pfa/pfamscan/
  • 20. https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html
  • 21. http://robetta.bakerlab.org/
  • 22. https://saves.mbi.ucla.edu/
  • 23. Audain E, Ramos Y, Hermjakob H, Flower DR, Perez-Riverol Y. Accurate estimation of isoelectric point of protein and peptide based on amino acid sequences. Bioinformatics. 2016;32:821-827. doi: 10.1093/bioinformatics/btv674
  • 24. Sivakumar K, Balaji S. In silico characterization of antifreeze proteins using computational tools and servers. J Chem Sci. 2018;16:721-730. doi: 10.1016/j.jgeb.2018.08.004
  • 25. Gamage DG, Gunaratne A, Periyannan GR, Russell TG. Applicability of instability index for in vitro protein stability prediction. Protein Pept Lett. 2019;26:339-347. doi: 10.2174/0929866526666190228144219
  • 26. Magdeldin S, Yoshida Y, Li H, et al. Murine colon proteome and characterization of the protein pathways. Biodata Min. 2012;5:11. doi: 10.1186/1756-0381
  • 27. Gabler F, Nam SZ, Till S, et al. Protein sequence analysis using the MPI bioinformatics toolkit. Curr Protoc Bioinformatics. 2020;72:e108. doi: 10.1002/cpbi.108
  • 28. Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583-589. doi: 10.1038/s41586-021
  • 29. Seiler MP, Mathew R, Liszewski MK, et al. Elevated and sustained expression of the transcription factors Egr1 and Egr2 controls NKT lineage differentiation in response to TCR signaling. Nature Immunol. 2012;13:264-271. doi: 10.1038/ni.2230
  • 30. Runde AP, Mack RSJPB, Zhang J. The role of TBK1 in cancer pathogenesis and anticancer immunity. J Exp Clin Cancer Res. 2022;41:135. doi: 10.1186/s13046-022

Articles from Bioinformatics and Biology Insights are provided here courtesy of SAGE Publications
