Journal of Genetic Engineering & Biotechnology. 2018 Jul 7;16(2):731–737. doi: 10.1016/j.jgeb.2018.06.006

In silico structural homology modeling of nif A protein of rhizobial strains in selective legume plants

Sadam DV Satyanarayana a, MSR Krishna b, Pindi Pavan Kumar c, Sirisha Jeereddy d
PMCID: PMC6353771  PMID: 30733794

Abstract

Symbiosis is a complex, genetically regulated biological process that is highly specific to the plant species and microbial strains involved. Biological nitrogen fixation in legumes combines nodulation, directed by nod genes, with regulation by nif and fix genes. The three rhizobial strains considered here for in silico analysis of nif A (Rhizobium leguminosarum, Bradyrhizobium japonicum, and Mesorhizobium ciceri) proved in vitro to be the best N2-fixing isolates for ground nut, chick pea and soya bean among the isolates from 47 forest soil samples. An attempt has been made to understand the structural characteristics and variations of nif genes that may reveal the factors influencing nitrogen fixation. The primary, secondary and tertiary structures of the nif A protein were analyzed using multiple bioinformatics tools, including Chou-Fasman, GOR, ExPASy ProtParam and ProSA-web. According to the literature, homology modeling of the nif A protein has not yet been explored, which prompted the present work toward a better understanding of nif A structure and its influence on biological nitrogen fixation. The 3D structure of the nif A protein was predicted with three different software tools (Phyre2, SWISS-MODEL, Modeller) and validated, and the resulting model can be considered acceptable. However, further in silico studies are suggested to determine the specific factors responsible for nitrogen fixation in these three rhizobial strains.

Keywords: Nif A protein, In silico analysis, Nitrogen fixation, Bioinformatic tools, Homology modeling

1. Introduction

Plant-microbe interaction is an ever-expanding domain of ecosystem research. The steady discovery of novel species keeps broadening the scope and relevance of microorganisms. One well-established biological process in this domain is symbiosis. Symbiosis is a complex process whose functional specificity depends on the evolution of the partnering rhizobial strains [1], which is probably why understanding symbiosis remains a subject of continuing research.

Nitrogen fixation is facilitated and regulated by three important groups of genes, the nif, nod and fix genes, with the aid of rhizobia in the nodules of leguminous plants [2]. Nif genes are diverse and are usually found in nitrogen-fixing bacteria. Nitrogen fixation is a complex mechanism: no single gene accounts for the whole process. Instead, several nif genes have specific functions in nitrogen fixation, assimilation and regulation [3]. These nif genes are also found on symbiotic bacterial plasmids along with nod genes [4]. Nif genes code for proteins that are essential to fix and regulate nitrogen in legumes, nitrogenase being one of them [5]. In most diazotrophs, transcription of the nitrogen fixation (nif) genes is driven by an alternative RNA polymerase holoenzyme and also requires the nif A activator protein. Environmental effectors usually regulate the activity and synthesis of nif A. Oxygen and ammonia are the two major signals that regulate nitrogen fixation at the level of the nif genes [6].

Nif A plays a major role in transcriptional activation: in association with the RNA polymerase sigma factor RpoN, it controls the expression of the nitrogenase structural genes and of genes encoding accessory functions [7]. Bacterial conversion of nitrogen (N2) to ammonia (NH3) is an energetically expensive process and is very sensitive to oxygen (O2) [8]. To create a favorable environment within the nodule tissue, specialized plant cells act as oxygen barriers. Furthermore, the nodulin leghemoglobin keeps the oxygen concentration low by reversibly binding oxygen. In bacteria, transcription of nitrogen fixation genes is largely induced at low oxygen levels [9]. Under reducing, nitrogen-limiting conditions, NifA is released from NifL to activate transcription at nif promoters.

In vitro analysis showed that Rhizobium leguminosarum, Bradyrhizobium japonicum and Mesorhizobium ciceri gave the highest plant growth in ground nut, soya bean and chick pea, respectively, compared with the rest of the rhizobacteria. Biochemical tests of the respective root nodules showed elevated levels of nitrogen for all three strains [10]. Further molecular analysis of nif genes by polymerase chain reaction (PCR) showed a prominent nif A band, whereas the other nif genes gave faint or no bands (data not shown). Given the evidence from the in vitro studies (elevated levels of nitrogen and of the plant growth hormones ACC and IAA), we extended the research to understand the role of nif A genes in nitrogen fixation using an in silico model.

The availability of a structural model of a protein is one of the keys to understanding biological processes at the molecular level. However, very little is known about the structure and role of nif A proteins. Determining the 3D structure of a protein is a difficult and complex task. Two techniques are generally used, X-ray crystallography and nuclear magnetic resonance (NMR), both of which are time consuming and expensive [11]. A viable alternative is to predict the 3D structure in silico by homology modeling, followed by validation. Homology modeling is one of the best and most extensively used methods: the unknown protein sequence is aligned with known protein structures (templates) that share more than 35% similarity with it [12].

The Nif A protein sequences are of roughly similar length, between 519 (R. leguminosarum) and 605 (B. japonicum) amino acids, except that of M. ciceri, which has only 352 amino acids. In most rhizobia the nif A gene is subject to transcriptional regulation, although the mechanisms vary with the rhizobial strain. Nif A is a three-domain protein [13], with a central domain of about 220 amino acids that is sufficient by itself to activate transcription [14]. The function of the N-terminal domain of Nif A is unknown, and this domain is absent in M. ciceri. The C-terminal domain contains a helix–turn–helix motif that helps in binding to the upstream activator sequence (UAS) [15], [16]. Since the central domain plays a major role in activating the promoter region of nif A, the sequences of the central domains of B. japonicum, R. leguminosarum and M. ciceri were compared. To better explain the mechanism behind nitrogen fixation, we examined structural variation and conserved amino acid sequences in silico by modeling the primary, secondary and tertiary structures. The tertiary structures of a large number of nitrogenase proteins from different diazotrophs, particularly symbiotic ones, have not yet been resolved. Therefore, there is a need to model the tertiary structure of Nif A to further understand its transcriptional activity.

2. Materials and methods

2.1. Nif-A protein sequences of rhizobia

The nif A protein sequences of B. japonicum, R. leguminosarum and M. ciceri were retrieved from UniProt [17], a freely accessible resource of protein sequence and functional information (Table 1). The accession numbers were Q9AMY3 for B. japonicum, P09828 for R. leguminosarum and A0A165VD05 for M. ciceri.

Table 1.

The protein sequence retrieved from the UniProt.

| Gene name | Residues modeled (position in complete sequence) | UniProt ID | Organism |
|---|---|---|---|
| Nif A | 253–601 | Q9AMY3 | Bradyrhizobium japonicum |
| Nif A specific regulatory protein | 177–519 | P09828 | Rhizobium leguminosarum |
| Transcriptional regulator Nif A | 1–351 | A0A165VD05 | Mesorhizobium ciceri |
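
The retrieval step and the residue ranges in Table 1 can be illustrated with a short Python sketch. It is not the authors' procedure: the REST URL pattern is an assumption about the current UniProt service, and the helper names (`fasta_url`, `parse_fasta`, `fetch_sequence`, `region`) are hypothetical. Only the accession numbers and residue ranges come from Table 1.

```python
from urllib.request import urlopen

# Accessions and modeled ranges from Table 1 (1-based, inclusive positions).
MODELED_REGIONS = {
    "B. japonicum": ("Q9AMY3", 253, 601),
    "R. leguminosarum": ("P09828", 177, 519),
    "M. ciceri": ("A0A165VD05", 1, 351),
}


def fasta_url(accession):
    # Assumed URL pattern of the UniProt REST service for one FASTA record.
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Return the sequence of the first record in a FASTA-formatted string."""
    sequence_lines = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # a second record starts here; stop reading
            seen_header = True
            continue
        sequence_lines.append(line)
    return "".join(sequence_lines)


def fetch_sequence(accession):
    # Network access is needed here; the function is defined but not called.
    with urlopen(fasta_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))


def region(sequence, start, end):
    """Cut out residues start..end using 1-based inclusive numbering."""
    return sequence[start - 1:end]


def region_length(start, end):
    return end - start + 1
```

Calling `region(fetch_sequence("P09828"), 177, 519)` would then return the modeled part of the R. leguminosarum sequence, provided the service is reachable.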

2.2. Physico-chemical characteristics

Physical and chemical characteristics of the nif A protein, namely molecular weight, theoretical pI, amino acid composition, atomic composition, extinction coefficient, estimated half-life, instability index, aliphatic index, and grand average of hydropathicity (GRAVY), were computed with the ProtParam tool [18] (Table 2).

Table 2.

Physicochemical properties of Nif A protein (M. wt.: Molecular weight; pI: Isoelectric point; −R: Number of negative residues; +R: Number of positive residues; EC: Extinction coefficient at 280 nm; II: Instability index; AI: Aliphatic index; GRAVY: Grand Average Hydropathy).

| S.No | Organism | M. Wt. | Seq. length | pI | EC (all Cys pairs form cystine) | EC (all Cys reduced) | Half-life (h) | II | GRAVY | −R | +R | AI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | B. japonicum | 38383.33 | 353 | 9.30 | 18,825 | 18,450 | 20 | 45.32 | −0.085 | 39 | 48 | 95.92 |
| 2 | R. leguminosarum | 36926.28 | 343 | 8.96 | 10,470 | 9,970 | 20 | 30.65 | −0.121 | 38 | 45 | 95.10 |
| 3 | M. ciceri | 38765.72 | 352 | 9.13 | 17,460 | 16,960 | 30 | 37.40 | −0.261 | 43 | 52 | 90.43 |

2.3. Secondary structure predictions of Nif-A protein

The secondary structure of the nif A protein was predicted with the Chou-Fasman server [19] and GOR [20], and the results are tabulated in Table 4. The method predicts secondary structure from the relative frequencies of each amino acid in helices, sheets and turns, derived from protein structures solved by X-ray crystallography [21].

Table 3.

Percentage of amino acids present in nif A protein estimated by UniProt software.

| S.No | Amino acid | B. japonicum (diazoefficiens) | R. leguminosarum | M. ciceri |
|---|---|---|---|---|
| 1 | A (Ala) | 11.0% | 12.2% | 11.4% |
| 2 | R (Arg) | 8.2% | 7.9% | 8.0% |
| 3 | N (Asn) | 2.5% | 4.7% | 3.1% |
| 4 | D (Asp) | 2.8% | 5.2% | 4.3% |
| 5 | C (Cys) | 1.7% | 2.6% | 2.3% |
| 6 | Q (Gln) | 2.5% | 3.8% | 4.0% |
| 7 | E (Glu) | 8.2% | 5.8% | 8.0% |
| 8 | G (Gly) | 6.5% | 8.7% | 6.8% |
| 9 | H (His) | 0.8% | 1.2% | 1.4% |
| 10 | I (Ile) | 4.8% | 5.5% | 5.1% |
| 11 | L (Leu) | 11.0% | 11.4% | 10.5% |
| 12 | K (Lys) | 5.4% | 5.2% | 6.8% |
| 13 | M (Met) | 1.1% | 0.9% | 1.4% |
| 14 | F (Phe) | 3.7% | 3.8% | 3.7% |
| 15 | P (Pro) | 5.9% | 2.6% | 4.5% |
| 16 | S (Ser) | 8.5% | 5.8% | 4.3% |
| 17 | T (Thr) | 5.1% | 5.5% | 6.5% |
| 18 | W (Trp) | 0.6% | 0.3% | 0.6% |
| 19 | Y (Tyr) | 1.4% | 0.9% | 1.1% |
| 20 | V (Val) | 7.9% | 5.5% | 6.2% |

2.4. Nif-A protein model building and evaluation

The linear amino acid sequences of the nif A proteins of the three rhizobia were retrieved from the UniProt protein sequence database (http://www.uniprot.org) [17] (accession numbers Q9AMY3 for B. japonicum, P09828 for R. leguminosarum and A0A165VD05 for M. ciceri). To build the tertiary structures, templates were selected from the PDB (Protein Data Bank) [22] using the BLASTp algorithm [23]; the protein sequences most similar to the query sequence were selected as templates. The three-dimensional structures were modeled with three homology modeling programs: Phyre2 [24], SWISS-MODEL [25] and Modeller [26]. The constructed 3D models were energy minimized to reduce steric collisions and strain without significantly altering the overall structure. Energy computations and minimization were carried out with the GROMOS96 force field [27] as implemented in Swiss-PDB Viewer. After optimization, the 3D models were verified with the RAMPAGE [28] and ProSA programs. The ProSA-web server validates a modeled protein structure against available protein structures from the PDB on the basis of the Z-score. The RAMPAGE server validates a modeled 3D structure by means of the Ramachandran plot [29], solvent accessible area, etc.
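
Template selection by sequence similarity can be sketched as follows. This is a minimal illustration and not the BLASTp algorithm itself: it assumes the query and each candidate are already aligned (gaps written as "-"), counts identity over gap-free columns only, which is one of several conventions, and applies the 35% cutoff mentioned in the Introduction. The sequences and template names are hypothetical toy data.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent identical residues over the gap-free columns of a pairwise alignment."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == "-" or t == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def select_templates(aligned_query, aligned_candidates, threshold=35.0):
    """Keep candidates at or above the identity threshold, best first."""
    scored = [
        (name, percent_identity(aligned_query, seq))
        for name, seq in aligned_candidates.items()
    ]
    kept = [item for item in scored if item[1] >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment, for illustration only.
QUERY = "MKTAY-LLGA"
CANDIDATES = {
    "template_A": "MKSAYQLLGA",  # 8 of 9 compared columns identical
    "template_B": "MRS--QIVNA",  # 2 of 7 compared columns identical
}

selected = select_templates(QUERY, CANDIDATES)
```

With this toy input only `template_A` passes the cutoff; real template searches also weigh coverage and alignment significance, which this sketch ignores.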

3. Results and discussion

3.1. Predicted primary protein sequence characterization of nif A gene in B. japonicum, R. leguminosarum, M. ciceri

The nif A protein sequences of the selected rhizobia (B. japonicum, R. leguminosarum and M. ciceri) were retrieved from the UniProt database [17]. The unique IDs of Nif A for the three species considered for further analysis are given in Table 1. UniProt is a widely accepted, freely accessible database that provides researchers with high-quality, rich and accurate knowledge about specific proteins, together with wide-ranging cross-references and querying interfaces [30].

The primary structure was analyzed and different parameters were computed with the ExPASy ProtParam tool; the results are tabulated in Table 2 and Table 3 [31]. The average molecular weight of the Nif A proteins was 38025.11 Da. Although ProtParam computes the extinction coefficient at several wavelengths (276, 278, 279, 280 and 282 nm), 280 nm is favored because proteins absorb strongly there while other substances commonly present in protein solutions do not. The extinction coefficients of the Nif A proteins at 280 nm were 18,825, 10,470 and 17,460 M−1 cm−1 in B. japonicum, R. leguminosarum and M. ciceri, respectively, and depend on the content of Cys, Trp and Tyr (Table 3). The extinction coefficient of B. japonicum is comparatively high due to the high concentration of Tyr (1.4%). The computed protein concentration and extinction coefficients help in the quantitative study of protein-protein and protein-ligand interactions in solution [32].
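
The extinction coefficients in Table 2 can be reproduced with the additive formula documented for ProtParam, in which each Tyr contributes 1490, each Trp 5500 and each cystine 125 M−1 cm−1 at 280 nm. In the sketch below the residue counts are an assumption: they were inferred from the percentages in Table 3 and the sequence lengths in Table 2, not read from the sequences themselves.

```python
# Molar extinction contributions at 280 nm (M^-1 cm^-1) per residue or cystine.
EXT_TYR = 1490
EXT_TRP = 5500
EXT_CYSTINE = 125

# (Tyr, Trp, Cys) counts inferred from Table 3 percentages and Table 2 lengths.
# These counts are an assumption made for illustration.
RESIDUE_COUNTS = {
    "B. japonicum": (5, 2, 6),
    "R. leguminosarum": (3, 1, 9),
    "M. ciceri": (4, 2, 8),
}


def extinction_coefficient(n_tyr, n_trp, n_cys, cystines=True):
    """Extinction coefficient at 280 nm from Tyr, Trp and Cys counts."""
    value = n_tyr * EXT_TYR + n_trp * EXT_TRP
    if cystines:
        # Every complete pair of Cys residues is assumed to form one cystine.
        value += (n_cys // 2) * EXT_CYSTINE
    return value


for organism, (tyr, trp, cys) in RESIDUE_COUNTS.items():
    oxidized = extinction_coefficient(tyr, trp, cys, cystines=True)
    reduced = extinction_coefficient(tyr, trp, cys, cystines=False)
    print(f"{organism}: {oxidized} (cystines), {reduced} (reduced)")
```

Under these assumed counts the formula returns the values listed in Table 2 for both the cystine and the reduced case. Per residue, Trp contributes the most to the total.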

The instability index values for the Nif A proteins of B. japonicum, R. leguminosarum and M. ciceri were 45.32, 30.65 and 37.40, respectively. A protein with an instability index below 40 is predicted to be stable, and one above 40 may be unstable [33]. Therefore the nif A proteins of R. leguminosarum and M. ciceri are predicted to be stable, whereas that of B. japonicum may be unstable. The isoelectric point (pI) is the pH at which the net charge of the protein is zero although its surface is still charged; proteins are stable and compact at this pH. The computed pI values of B. japonicum, R. leguminosarum and M. ciceri were 9.30, 8.96 and 9.13, respectively, all above 7, indicating the alkaline nature of the nif A protein. The computed pI will be useful for developing buffer systems for purification of the recombinant proteins by isoelectric focusing [34]. The total numbers of negatively charged residues (Asp + Glu) and positively charged residues (Arg + Lys) are 39 and 48 in B. japonicum, 38 and 45 in R. leguminosarum, and 43 and 52 in M. ciceri, respectively. Since the negatively charged residues are fewer than the positively charged ones, the protein is inferred to be intracellular.
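
The summary figures quoted above follow directly from Table 2. The sketch below recomputes the mean molecular weight, applies the instability cutoff of 40, and takes the difference between positively and negatively charged residue counts. The dictionary layout and function names are hypothetical; the numbers are those of Table 2.

```python
# Values copied from Table 2: molecular weight (Da), instability index,
# negatively charged residues (Asp + Glu), positively charged residues (Arg + Lys).
TABLE2 = {
    "B. japonicum": {"mw": 38383.33, "ii": 45.32, "neg": 39, "pos": 48},
    "R. leguminosarum": {"mw": 36926.28, "ii": 30.65, "neg": 38, "pos": 45},
    "M. ciceri": {"mw": 38765.72, "ii": 37.40, "neg": 43, "pos": 52},
}


def mean_molecular_weight(table):
    weights = [entry["mw"] for entry in table.values()]
    return sum(weights) / len(weights)


def stability_class(instability_index, cutoff=40.0):
    """Classify by the instability index cutoff used in the text."""
    return "stable" if instability_index < cutoff else "unstable"


def charge_excess(entry):
    """Excess of positively over negatively charged residues."""
    return entry["pos"] - entry["neg"]


print(f"mean M. Wt.: {mean_molecular_weight(TABLE2):.2f} Da")
for organism, entry in TABLE2.items():
    print(organism, stability_class(entry["ii"]), charge_excess(entry))
```

The positive excess in all three proteins is consistent with the pI values above 7 reported in the same table.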

The half-life of the nif A protein sequences of B. japonicum and R. leguminosarum was found to be 30 h with all three domains, whereas it is 20 h in the absence of the amino-terminal domain. In M. ciceri, where the amino-terminal domain is absent, the half-life was expected to be shorter but, interestingly, was found to be 30 h. Based on this prediction, the B. japonicum and R. leguminosarum proteins are less stable without the amino-terminal domain.

The aliphatic index of a protein is defined as the relative volume occupied by aliphatic side chains (alanine, valine, isoleucine and leucine), which contribute to protein thermostability [35]. The aliphatic indices of the nitrogen-fixing protein sequences were 95.92, 95.10 and 90.43 for B. japonicum, R. leguminosarum and M. ciceri, respectively. These values indicate that the nif A proteins are stable over a wide range of temperatures [36]. The grand average of hydropathy (GRAVY) value of a peptide or protein is calculated as the sum of the hydropathy values of all its amino acids divided by the number of residues in the sequence [37]. The GRAVY indices of nif A were −0.085, −0.121 and −0.261 in B. japonicum, R. leguminosarum and M. ciceri, respectively. These low, negative values predict that the nif A proteins are hydrophilic and may therefore interact well with water.
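
Both indices are simple averages and can be written out in a few lines. The sketch below assumes the Kyte-Doolittle hydropathy scale for GRAVY and the commonly used aliphatic index formula, AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], with X in mole percent. Feeding the formula the rounded percentages of Table 3 is an approximation, so the result is close to but not identical with the value in Table 2.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Sum of hydropathy values divided by the number of residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(ala, val, ile, leu):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    return ala + 2.9 * val + 3.9 * (ile + leu)


# Rounded B. japonicum percentages from Table 3 (Ala, Val, Ile, Leu).
ai_b_japonicum = aliphatic_index(11.0, 7.9, 4.8, 11.0)
print(f"approximate aliphatic index: {ai_b_japonicum:.2f}")

# Toy dipeptide, for illustration only.
print(f"GRAVY of 'RK': {gravy('RK'):.1f}")
```

For B. japonicum the rounded inputs give about 95.5, compared with 95.92 in Table 2; the gap is within what rounding of the four percentages can produce.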

All the protein polypeptide chains are composed of 20 amino acids. Each amino acid has its own characteristics that contribute to the specific function of the protein. The proportions of polar, charged, aliphatic and aromatic residues vary with the location and function of the protein. Phosphorylation is a vital process through which signaling pathways function. Three amino acid residues, serine, threonine and tyrosine, are the ones most commonly phosphorylated, as they contain a hydroxyl group in their side chain and are thus capable of binding a phosphate group [38]. The content of all 20 amino acids was estimated using ProtParam: alanine was the most abundant (11.0, 12.2 and 11.4%), followed by leucine (11.0, 11.4 and 10.5%), and tryptophan was the least abundant (0.6, 0.3 and 0.6%) in B. japonicum, R. leguminosarum and M. ciceri, respectively (Table 3).
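
An amino acid composition such as the one in Table 3 is a per-residue count divided by the sequence length. The sketch below shows this on a hypothetical ten-residue peptide, together with the combined share of the hydroxyl-bearing residues Ser, Thr and Tyr discussed above. The function names and the toy peptide are illustrative and not taken from the study.

```python
from collections import Counter


def composition_percent(sequence):
    """Percentage of each amino acid present in the sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}


def hydroxyl_residue_percent(sequence):
    """Combined percentage of Ser, Thr and Tyr, the usual phosphate acceptors."""
    composition = composition_percent(sequence)
    return sum(composition.get(aa, 0.0) for aa in "STY")


# Hypothetical ten-residue peptide, for illustration only.
TOY_PEPTIDE = "MASLTAYLKA"

composition = composition_percent(TOY_PEPTIDE)
most_abundant = max(composition, key=composition.get)
```

In the toy peptide alanine is the most abundant residue, as it is in the three Nif A sequences.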

3.2. Prediction and characterization of nif A secondary structures of B. japonicum, R. leguminosarum and M. ciceri

The secondary structure of the Nif A proteins was predicted using the Chou-Fasman method [39] and GOR tools [40]. In the predicted secondary structures, alpha helices accounted for 43.6, 48.1 and 47.44% in B. japonicum, R. leguminosarum and M. ciceri, respectively, followed by random coils (41.36, 43.73 and 41.76%) and extended strands (15.58, 8.16 and 10.80%) (Table 4). Random coils have important functions in proteins, providing flexibility and allowing conformational changes such as those of enzymatic turnover (reference). The predominance of helices and coils suggests that the protein is compact and strongly bonded. From the globular structure and the coiled nature of the protein, the nif A protein is presumed to be present in the transmembrane region.

Table 4.

Prediction of secondary structure of nif A by Chou-Fasman method.

| Secondary structure state | B. japonicum (diazo) Length | B. japonicum (diazo) % | R. leguminosarum Length | R. leguminosarum % | M. ciceri Length | M. ciceri % |
|---|---|---|---|---|---|---|
| Alpha helix (Hh) | 152 | 43.6 | 165 | 48.10 | 167 | 47.44 |
| 310 helix (Gg) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Pi helix (Ii) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Beta bridge (Bb) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Extended strand (Ee) | 55 | 15.58 | 18 | 8.16 | 38 | 10.80 |
| Beta turn (Tt) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Bend region (Ss) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Random coil (Cc) | 146 | 41.36 | 150 | 43.73 | 147 | 41.76 |
| Ambiguous states | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Other states | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
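
The percentages in Table 4 are residue counts per state divided by the total number of residues. The sketch below recomputes them for the M. ciceri column (167 helix, 38 strand and 147 coil residues) and shows how counts could be taken from a per-residue prediction string. The lowercase h/e/c encoding is an assumption about the output format of the prediction server, and the function names are hypothetical.

```python
def count_states(state_string):
    """Count helix (h), strand (e) and coil (c) residues in a prediction string."""
    return {
        "helix": state_string.count("h"),
        "strand": state_string.count("e"),
        "coil": state_string.count("c"),
    }


def state_percentages(counts):
    """Convert per-state residue counts into percentages of the total."""
    total = sum(counts.values())
    return {state: round(100.0 * n / total, 2) for state, n in counts.items()}


# Residue counts from the M. ciceri column of Table 4.
M_CICERI_COUNTS = {"helix": 167, "strand": 38, "coil": 147}
m_ciceri_percent = state_percentages(M_CICERI_COUNTS)

# Hypothetical ten-residue prediction string, for illustration only.
toy_counts = count_states("hhhhhceeec")
```

For the M. ciceri counts this gives 47.44, 10.8 and 41.76%, matching the tabulated percentages.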

In most nitrogen-fixing bacteria, the NifA protein binds to an upstream activating sequence (UAS) and acts in association with the RNA polymerase sigma factor RpoN (σ54) to activate nif gene expression and, in rhizobia, the expression of several other symbiotic genes. The Nif A proteins of B. japonicum and R. leguminosarum are composed of three domains: an amino (N)-terminal domain of unknown function, a central catalytic domain, and a carboxy (C)-terminal DNA-binding domain. In M. ciceri the amino (N)-terminal domain is absent. Between the central and the DNA-binding domains, an interdomain linker region is conserved in all three rhizobial species. A few predictions have been made about the probable function of specific domains based on a comparison of the amino acid sequences of the Nif A proteins of the three rhizobial species. A comparatively low percentage of homology was identified in the N-terminal region of B. japonicum and R. leguminosarum. Very high sequence conservation was observed in the long central domain, which is proposed to be responsible for the interaction with the RNA polymerase and/or with σ54 (Fig. 1). A region of considerable homology was found close to the C-terminus, containing the helix-turn-helix motif characteristic of DNA-binding proteins.

Fig. 1. Nif A protein sequences of Bradyrhizobium japonicum (Q9AMY3), Rhizobium leguminosarum (P09828) and Mesorhizobium ciceri (A0A165VD05) by UniProt software.

Between Phenylalanine-465 and Alanine-480 there are 15 identical amino acids in the three NifA sequences, and there are five conserved cysteine residues at positions 310, 463, 475, 495 and 500. The role of the cysteines might be to bind a cofactor (covalently bound heme or a complex [Fex:SX]- cluster), which is essential for the Nif A activity of B. japonicum and R. leguminosarum. Proteins of this class also contain an additional invariant cysteine residue in the AAA+ domain. The presence of cysteine residues seems to correlate with the oxygen sensitivity of nif A proteins. This might suggest a model in which metal ion coordination to the cysteine residues controls the activity of these proteins in response to the redox status.
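
Conserved positions such as the cysteines above are the alignment columns in which every sequence carries the same residue. The sketch below finds them in a small hypothetical alignment; the sequences are toy data, and the column numbers it reports refer to that toy alignment, not to the Nif A positions quoted in the text.

```python
def conserved_columns(alignment, residue=None):
    """Return 1-based alignment columns where all sequences are identical.

    If `residue` is given, only columns conserved for that residue are reported.
    Columns containing a gap ("-") are never counted as conserved.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    columns = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        if len(column) != 1 or "-" in column:
            continue
        if residue is None or residue in column:
            columns.append(i + 1)
    return columns


# Hypothetical three-sequence alignment, for illustration only.
TOY_ALIGNMENT = [
    "MKCAF-CLLA",
    "MRCSFGCLIA",
    "M-CAFNCVLA",
]

conserved_cys = conserved_columns(TOY_ALIGNMENT, residue="C")
conserved_all = conserved_columns(TOY_ALIGNMENT)
```

Mapping alignment columns back to residue numbers in an individual sequence requires subtracting the gaps that precede each column, which this sketch leaves out.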

3.3. 3D modeling of nif A tertiary structure

No experimental structures are available for the nif A proteins considered here. Of the three domains of the nif A protein, the 3D structure was modeled for the central catalytic domain and the carboxy (C)-terminal DNA-binding domain. The three-dimensional structures were modeled with three homology modeling programs: Phyre2, SWISS-MODEL and Modeller. The φ and ψ distributions of the non-glycine, non-proline residues in the Ramachandran map are summarized in Table 5. A comparison of the results from the three tools showed that the models generated by Modeller were more acceptable than those from Phyre2 and SWISS-MODEL.

Table 5.

Ramachandran plot calculation using rampage server.

| Server | Ramachandran plot calculation | Bradyrhizobium japonicum | Rhizobium leguminosarum | Mesorhizobium ciceri |
|---|---|---|---|---|
| Phyre2 | Residues in favoured region | 85.3% | 92.1% | 93.4% |
| Phyre2 | Residues in allowed region | 8.5% | 6.5% | 4.6% |
| Phyre2 | Residues in outlier region | 6.2% | 1.4% | 2.0% |
| Swiss model | Residues in favoured region | 96.5% | 92.9% | 92.3% |
| Swiss model | Residues in allowed region | 3.1% | 6.3% | 6.5% |
| Swiss model | Residues in outlier region | 0.4% | 0.8% | 1.1% |
| Modeller | Residues in favoured region | 95.0% | 93% | 93.1% |
| Modeller | Residues in allowed region | 3.0% | 3.7% | 3.7% |
| Modeller | Residues in outlier region | 2.0% | 3.4% | 3.1% |

The stereochemical quality and accuracy of the predicted models were evaluated after refinement using Ramachandran map calculations computed with the RAMPAGE program. The assessment of the models generated by Modeller is shown in Fig. 2a–c. The main-chain parameters plotted are Ramachandran plot quality, peptide bond planarity, bad non-bonded interactions, main-chain hydrogen bond energy, C-alpha chirality and overall G-factor. In the Ramachandran plot analysis, residues were classified according to the region of the plot in which they fall. The three-dimensional models of Nif A from B. japonicum, R. leguminosarum and M. ciceri built with Modeller had 96.8, 93 and 93.1% of their residues in the favoured regions, respectively. The distributions of the main-chain bond lengths and bond angles were within the limits for these proteins. Such figures from the Ramachandran plot indicate good quality of the predicted models.
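
Each point in a Ramachandran plot is a pair of backbone dihedral angles: φ is defined by the atoms C(i−1), N(i), Cα(i), C(i), and ψ by N(i), Cα(i), C(i), N(i+1). The sketch below computes a dihedral angle from four atom coordinates with a standard vector formula. It is a generic illustration, not the RAMPAGE implementation, and the coordinates are hypothetical test points rather than atoms from the Nif A models.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _scale(a, k):
    return tuple(x * k for x in a)


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond p1 -> p2.
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    return dihedral(n, ca, c, n_next)
```

A validation server then compares each (φ, ψ) pair with empirically derived favoured, allowed and outlier regions; those region boundaries are not reproduced here.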

Fig. 2. Ramachandran plots for the three bacteria. (a) Bradyrhizobium japonicum, (b) Rhizobium leguminosarum, (c) Mesorhizobium ciceri.

The modeled structures of the Nif A proteins were also validated with another structure verification server, ProSA-web [41]. For the designed 3D proteins, standard bond angles of the three models were determined using ProSA-web, and the results are shown in Table 6. The predicted structures conformed well to the stereochemistry, indicating reasonably good quality. After model building, the energy-minimized structures were validated by their Z-scores using ProSA-web [42]. The Rampage-score provides an estimate of the absolute quality of a model by comparing it to reference structures of the same size that are present in the PDB and were solved by experimental techniques. The Z-score was used to estimate the 'degree of nativeness' of the predicted structure. The Z-scores of the energy-minimized models from Phyre2, SWISS-MODEL and Modeller were −7.88, −7.43 and −7.31 for B. japonicum; −8.26, −6.46 and −7.18 for R. leguminosarum; and −7.02, −6.18 and −6.92 for M. ciceri, respectively (Table 6). All three tools thus gave similar values.

Table 6.

Z-scores for overall model quality using Prosa-web.

| Accession No | Phyre2 | Swiss model | Modeller |
|---|---|---|---|
| Q9AMY3 | −7.88 | −7.43 | −7.31 |
| P09828 | −8.26 | −6.46 | −7.18 |
| A0A165VD05 | −7.02 | −6.18 | −6.92 |
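
How closely the three tools agree can be read from the spread of Z-scores within each row of Table 6. The sketch below computes the per-protein mean and the difference between the highest and lowest Z-score. The data structure and function names are hypothetical; the values are those of Table 6.

```python
# Z-scores from Table 6, keyed by UniProt accession and modeling tool.
Z_SCORES = {
    "Q9AMY3": {"Phyre2": -7.88, "Swiss model": -7.43, "Modeller": -7.31},
    "P09828": {"Phyre2": -8.26, "Swiss model": -6.46, "Modeller": -7.18},
    "A0A165VD05": {"Phyre2": -7.02, "Swiss model": -6.18, "Modeller": -6.92},
}


def mean_z(scores):
    return sum(scores.values()) / len(scores)


def z_spread(scores):
    """Difference between the highest and lowest Z-score across tools."""
    return max(scores.values()) - min(scores.values())


for accession, scores in Z_SCORES.items():
    print(f"{accession}: mean {mean_z(scores):.2f}, spread {z_spread(scores):.2f}")
```

The spread is smallest for Q9AMY3 (0.57) and largest for P09828 (1.80), so the agreement between tools is closest for the B. japonicum model.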

4. Conclusion

To achieve good results in biological nitrogen fixation, a deep understanding of the proteins involved at the structural level is essential. In silico studies provide an opportunity to carry out structural modeling and analysis of any protein. In the present study, the Nif A sequences of B. japonicum, R. leguminosarum and M. ciceri were selected to determine their physicochemical properties and the various levels of protein structure using in silico techniques. Primary structure analysis revealed that the Nif A proteins examined in the current study are hydrophilic in nature, and the presence of cysteine residues seems to correlate with the oxygen sensitivity of these proteins. The secondary structure analysis showed that in most of the sequences alpha helices dominated, followed by random coils, extended strands and beta turns. Tertiary structure predictions were made with three different homology modeling tools: Phyre2, SWISS-MODEL and Modeller. The models were validated with the protein structure checking tool RAMPAGE. Of the three tools, our results indicate that Modeller is an acceptable in silico tool for modeling the Nif A protein. We hope that our future studies on the quaternary structure of the Nif A protein will provide better insight into the exact or most probable molecular mechanisms involved in nitrogen fixation in these three rhizobial strains. One of the challenging research goals for the future is to elucidate the mechanism whereby the Nif A protein ultimately responds to the redox state of the cell.

Acknowledgement

No public or private financial support was received for this project, which was self-financed. The authors thank Dr. K. Srinivasulu, Head, Dept of Biotechnology, KL University, for his support of this research. The authors also express their gratitude to Dr. Vijaya Saradhi for his moral support and inspiration.

Conflict of interest

The author(s) certify that there is no conflict of interest with any financial/research/academic organization, with regards to the research work discussed in the manuscript.

Footnotes

Peer review under responsibility of National Research Center, Egypt.

References


Articles from Journal of Genetic Engineering & Biotechnology are provided here courtesy of Academy of Scientific Research and Technology, Egypt
