3 Biotech. 2020 Aug 8;10(9):383. doi: 10.1007/s13205-020-02375-2

Structural insight of two 4-Coumarate CoA ligase (4CL) isoforms in Leucaena suggests targeted genetic manipulations could lead to better lignin extractability from the pulp

Himanshu Shekhar 1, Gaurav Kant 1, Rahul Tripathi 1, Shivesh Sharma 1, Ashutosh Mani 1, N K Singh 1, Sameer Srivastava 1
PMCID: PMC7415054  PMID: 32802725

Abstract

4-Coumarate:coenzyme A ligase (4CL) is a key enzyme in the early steps of the monolignol biosynthetic pathway and is hypothesized to modulate the S and G monolignol content of the plant. Lignin removal is imperative for the paper industry, and a higher S/G ratio improves both the extractability of lignin and the economics of the pulping process. This background prompted us to predict the 3D structures of two 4CL isoforms from Leucaena leucocephala and to evaluate their substrate preferences. The 3D structures of the Ll4CL1 and Ll4CL2 proteins were built by homology modeling and further improved by loop refinement. Molecular docking studies suggested that the two isoforms differ in their substrate preferences. Ll4CL1 preferred sinapic acid (− 4.91 kcal/mole), ferulic acid (− 4.84 kcal/mole), hydroxyferulic acid (− 4.72 kcal/mole), and caffeic acid (− 4.71 kcal/mole), in decreasing order. Ll4CL2 preferred caffeic acid (− 6.56 kcal/mole, 4 H bonds), hydroxyferulic acid (− 6.56 kcal/mole, 3 H bonds), ferulic acid (− 6.32 kcal/mole), and sinapic acid (− 5.00 kcal/mole), in decreasing order. Further, active site residues were identified in both isoforms, and in silico mutation and docking analyses were performed. Our analysis suggested that ASP228, TYR262, and PRO326 in Ll4CL1 and SER165, LYS247, and PRO315 in Ll4CL2 are important for functional activity. Based on the differential substrate preferences of the two isoforms, and as a first step towards genetically modified Leucaena with the desired phenotype, we propose that over-expression of the Ll4CL1 gene and/or down-regulation of the Ll4CL2 gene could yield a higher S/G ratio and, in turn, better extractability of lignin.

Electronic supplementary material

The online version of this article (10.1007/s13205-020-02375-2) contains supplementary material, which is available to authorized users.

Keywords: Leucaena leucocephala, 4-Coumarate CoA ligase, High S/G ratio, In silico mutational analysis, Differential substrate preference

Introduction

Etymologically, the word ‘lignin’ originates from the Latin word ‘lignum’, meaning wood. Lignin is a non-sugar aromatic polymer produced by oxidative combinatorial coupling and is found in the secondary cell wall of plants, where its cross-linked macromolecular structure (MW > 10,000) fills the gaps. Lignocellulosic biomass contains ~ 30% lignin by weight and ~ 40% by energy (Beauchet et al. 2012). Lignin is a naturally hydrophobic phenolic macromolecule made up of three main phenylpropane units (monolignols), namely coniferyl alcohol (G), sinapyl alcohol (S), and minor amounts of p-coumaryl alcohol (H) (Beauchet et al. 2012). When these monolignols are incorporated into the lignin polymer, they are called guaiacyl (G), syringyl (S), and p-hydroxyphenyl (H) units, respectively (Vanholme et al. 2010). Gymnosperms are richer in G-unit lignins, with only some notable exceptions (Uzal et al. 2009), while the lignin of angiosperm dicots is composed of G- and S-units. Softwood, compression wood and grasses are reported to have higher levels of H-units (Boerjan et al. 2003).

4-Coumarate:CoA ligase (4CL) is a plant-specific phenylpropanoid pathway enzyme that catalyzes the formation of a hydroxycinnamate-CoA thioester, a precursor of lignin and other phenylpropanoids. 4CL catalyzes this reaction in two steps: first, hydroxycinnamate-AMP is formed, which then undergoes nucleophilic substitution of AMP by CoA (Hu et al. 2010). Multiple isoforms of 4CL with different substrate specificities are also reported to be expressed differentially in different tissues and at different developmental stages (Voo et al. 1995; Ehlting et al. 1999; Hamberger and Hahlbrock 2004).

Lignin is often an undesired product in pulp making due to its dark brown to black color. Removal of lignin from plant biomass is a costly process; hence, research efforts are now aimed at genetically engineered designer plants that either deposit less lignin or produce lignin amenable to chemical degradation (Weng et al. 2008; Stewart et al. 2009). An increase in the S/G ratio has been found to improve pulping economics (Weng et al. 2008; Li et al. 2008; Verma and Dwivedi 2014). As mentioned earlier, 4CL generally exists in multiple isoforms with different substrate specificities (Ehlting et al. 1999). A phenylpropanoid pathway with more than one 4CL isoform, each having a different substrate preference, could be hypothesized to modulate monolignol biosynthesis and, therefore, to influence the overall content of synthesized lignin (Kajita et al. 1996; Lee et al. 1997; Wagner et al. 2009; Xu et al. 2011). Theoretically, if one isoform of Leucaena (say Ll4CL1) with its specific substrate preference (e.g. sinapic acid) increases the synthesis of the S monolignol unit, while Ll4CL2, with a preference for another substrate (e.g. caffeic acid), increases the accumulation of the G monolignol, then manipulation of these two isoforms in a genetically modified Leucaena plant could result in a designer plant with the desired phenotype, i.e. lignin with a higher S/G ratio (Fig. 1).

Fig. 1.


Predicted flux change in monolignol biosynthesis for higher S/G ratio: Theoretically, the overall flux of monolignol biosynthesis could be directed towards higher S unit, if Ll4CL1 (with preference to Sinapic acid) is up-regulated and Ll4CL2 (with preference to caffeic acid) is down-regulated. It would result in overall lignin content with higher S unit and, therefore, better extractability

In this paper, we attempted to identify the preferred substrates of the two 4CL isoforms already reported in Leucaena leucocephala (Accession No. FJ205490 & FJ205491). In silico mutation of active site residues and molecular docking studies with different substrates identified the most important active site residues in both the isoforms. Our results predict that the manipulation of 4CL isoform in the desired way could lead to high S/G ratio. It could be the first step towards designer L. leucocephala with high S/G lignin phenotype. This could enable easy extractability of lignin from pulp and reduce the use of chemical bleach and energy, making the process more environmentally sustainable.

Materials and methods

Sequence retrieval and phylogenetic analysis

The protein sequences of Ll4CL1 (GenBank accession: FJ205490) and Ll4CL2 (GenBank accession: FJ205491) from L. leucocephala were obtained from the Universal Protein Resource (UniProt). These sequences were subjected to a Basic Local Alignment Search Tool (BLAST) search to retrieve similar protein sequences (Altschul et al. 1990). The structure coordinate files of similar proteins were obtained in Protein Data Bank (PDB) format and saved for template identification at later stages. Multiple Sequence Alignment (MSA) was performed using the ClustalW 2.0.10 program (Larkin et al. 2007), and Molecular Evolutionary Genetics Analysis (MEGA 5.2) was used for constructing and analyzing the phylogenetic tree (Tamura et al. 2011). The Neighbor-Joining (NJ) method was used to establish evolutionary relationships (Saitou and Nei 1987). The bootstrap consensus tree inferred from 10,000 replicates was taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50% of bootstrap replicates were collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (10,000 replicates) is shown next to the branches. The evolutionary distances were computed using the Poisson correction method and are in units of the number of amino acid substitutions per site.
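The Poisson correction mentioned above converts the observed proportion of differing sites, p, into an estimated number of substitutions per site, d = −ln(1 − p). The following is a minimal sketch of that formula, not the MEGA implementation. The function names and the two toy fragments are hypothetical, and gap positions are skipped pairwise here purely for simplicity.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def poisson_distance(p: float) -> float:
    """Poisson-corrected distance d = -ln(1 - p), in substitutions per site."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1.0 - p)


# Toy aligned fragments (hypothetical, not taken from Ll4CL1 or Ll4CL2).
toy_a = "MKTAYIAKQR"
toy_b = "MKSAYLAKQR"

p = p_distance(toy_a, toy_b)
d = poisson_distance(p)
print(f"p = {p:.3f}, d = {d:.3f}")
```

Because the correction is always at least as large as p, it matters most for distantly related sequences, where multiple substitutions at the same site are more likely.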

Multiple template homology modeling

The Position-Specific Iterative BLAST (PSI-BLAST) algorithm was used to search the entire PDB database for homologous sequences with experimentally solved 3D structures (Altschul et al. 1997). The proteins with PDB IDs 3A9U and 3TSY were chosen as templates. Using these templates, multi-template modeling was performed with MODELLER 9.15 (Šali and Blundell 1993). Multiple models were built and subsequently evaluated and refined on the basis of the number of residues in disallowed regions and the z-score (Marti-Renom et al. 2000).

Protein structure optimization, quality assessment and visualization

Five preliminary models generated by MODELLER were ranked on the basis of their Discrete Optimized Potential Energy (DOPE) scores and z-scores obtained from Protein Structural Analysis webserver (ProSA) (Wiederstein and Sippl 2007). The model with the lowest DOPE-score and least number of residues in the disallowed region of Ramachandran plot was selected for further study.

Loop refinement and molecular dynamics simulations

The MODLOOP server was used to remodel the loops in the selected model by satisfying spatial restraints without relying on any database of known protein structures (Fiser and Sali 2003). The remodeled loops were checked repeatedly with the PDBsum Generate server until no residues fell in the disallowed region. Once the final model was generated, molecular dynamics simulation was performed with GROMACS to check the stability of the modeled protein. The simulations were run for 20 or 50 ns with default parameters.

Active site prediction

Active sites in the refined model were identified using the 3DLigandsite server, which predicts ligand binding sites (Wass et al. 2010). It superimposes ligands bound to structures similar to the query onto the selected model to predict possible binding sites. It also provides insight into sequence conservation, but this information is not used in the current predictive process.

Molecular docking and mutational analysis of active site

Pre-docking steps were performed using AutoDock 4.2 tools to prepare the ligand and macromolecule files, followed by preparation of the grid parameter files (Truhlar 2009). The grid was positioned so that the active site residues identified by the 3DLigandsite server lay at its center. Flexible docking with AutoDock 4.2 was used to examine the interaction between the two 4CL isoforms and known substrates (caffeic acid, coumaric acid, ferulic acid, hydroxyferulic acid and sinapic acid) as well as small molecules such as ADP and ATP. The average number of grid points used was (100, 110, 100), with a grid spacing of 0.200 Å. The Lamarckian Genetic Algorithm (LGA) was used to obtain the .dlg output file, which was further analyzed using the analysis suite of AutoDock 4.2 tools. Conservative substitutions were made in both 4CL isoforms to identify the most significant residue(s) in the probable active site identified above, which comprises 14 residues for Ll4CL1 and 11 residues for Ll4CL2. For all mutations, the original residue was replaced with a similar type of amino acid, e.g. ASP228GLU in Ll4CL1 (aspartic acid at position 228 of Ll4CL1 replaced with glutamic acid). Residues whose mutation affected the binding energy to a greater extent were selected as the most probable and significant active site residues.
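To give a feel for the search volume implied by these grid settings, the sketch below converts the number of grid points and the spacing into box edge lengths. It assumes the usual AutoGrid convention that npts counts intervals, so each axis spans npts × spacing. The helper names, the grid center, and the test atom are hypothetical, since the grid center for the Ll4CL dockings is not stated.

```python
def grid_box_lengths(npts, spacing):
    """Edge lengths (in Å) spanned by the grid along x, y and z.

    Assumes each axis spans npts * spacing (npts intervals, npts + 1 points).
    """
    return tuple(n * spacing for n in npts)


def point_in_box(point, center, npts, spacing):
    """True if a point lies inside the grid box centered on `center`."""
    lengths = grid_box_lengths(npts, spacing)
    return all(
        abs(p - c) <= length / 2
        for p, c, length in zip(point, center, lengths)
    )


# Grid settings reported in the text.
NPTS = (100, 110, 100)
SPACING = 0.200  # Å

lengths = grid_box_lengths(NPTS, SPACING)
print("box edges (Å):", tuple(round(v, 3) for v in lengths))

# Hypothetical grid center and atom position, for illustration only.
center = (0.0, 0.0, 0.0)
atom = (5.0, -10.5, 9.0)
print("atom inside box:", point_in_box(atom, center, NPTS, SPACING))
```

Under this assumption the box measures roughly 20 × 22 × 20 Å, which is consistent with a grid focused on the predicted active site rather than on the whole protein.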

Results

Physico-chemical properties of Ll4CL isoforms and their evolutionary relationship

Ll4CL1 (Nucleotide Accession No. FJ205490) is a 542 amino acid protein with a molecular weight of 58.87 kDa and a pI of 5.56. Ll4CL2 (Nucleotide Accession No. FJ205491) is a 519 amino acid protein with a molecular weight of 56.67 kDa and a pI of 6.27. The sequences of both isoforms were subjected to a BLAST search using the protein PSI-BLAST program (with default parameters) against the non-redundant databases at the National Center for Biotechnology Information (NCBI) to retrieve similar protein sequences, which are listed in Supplementary Table 1. The BLASTp suite (PSI-BLAST threshold = 0.005) was used to identify the closest match for each isoform. Ll4CL1 shared 74% identity at a query cover of 98% with 4CL from Cajanus cajan (Accession No. XP_020225439.1), and Ll4CL2 shared 92% identity at a query cover of 99% with 4CL from Acacia koa (Accession No. ACI23349.1). MSA was performed with the ClustalW 2.0.10 program, and the evolutionary relationships were established using MEGA 5.2. The phylogenetic analysis involved 26 amino acid sequences for Ll4CL1 and 20 amino acid sequences for Ll4CL2. All positions containing gaps and missing data were eliminated, leaving a total of 387 positions and 82 positions in the final datasets for Ll4CL1 and Ll4CL2, respectively. The phylogenetic trees for Ll4CL1 and Ll4CL2 are shown in Fig. 2, which demonstrates that both isoforms group closely with dicotyledonous plant species. The phylogenetic tree for Ll4CL1 can be broadly subdivided into four clusters (A1–A4), and Ll4CL1 lies in clade A1.1. It shows the highest homology with 4CL1 from Sorbus aucuparia (ADF30254.1), which has 66% identity with Ll4CL1 at a query cover of 100%, and it is most distantly related to 4CL1 from Epimedium sagittatum (AIS92505.1), which is a eudicot. The phylogenetic tree for Ll4CL2 is divided into three major clades (B1–B3). Ll4CL2 lies in clade B3 and shows the highest homology with 4CL from Ruta graveolens (ABY60843.1 4), also a eudicot, with 75% identity at a query cover of 99%, and it is most distantly related to 4CL from Arabidopsis thaliana (ANM65052.1).

Fig. 2.


Phylogenetic analysis of Ll4CL1 and Ll4CL2 isoforms: The left panel shows the phylogenetic tree for Ll4CL1 and the right panel shows the phylogenetic tree for Ll4CL2, as obtained with MEGA software. The Neighbor-Joining (NJ) method was used to establish evolutionary relationships. The bootstrap consensus tree was inferred from 10,000 replicates. Ll4CL1 was found to be similar to Panicum virgatum 4CL1 and was grouped in cluster A1.1. Similarly, Ll4CL2 was grouped in cluster B3 along with 4CL2 of Ruta graveolens. Numeric values at branches represent bootstrap values. Alphanumerics represent GenBank accession numbers of 4CL isoforms

Homology modeling and structure validation

3A9UA (resolution 2.4 Å) and 3TSYA (resolution 3.1 Å) were selected as the closest proteins with experimentally determined 3D structures. The selected templates had 98% and 99% query coverage of Ll4CL1 and Ll4CL2, respectively. 3A9UA had 68% and 77% identity to Ll4CL1 and Ll4CL2, respectively. Similarly, 3TSYA had 65% and 70% identity to Ll4CL1 and Ll4CL2, respectively (Supplementary Table 1). The structural coordinates (.pdb files) of the templates were obtained from RCSB for further analysis. Using these templates, multi-template modeling was performed with MODELLER 9.15. Five models were generated for each isoform, and the best model was selected on the basis of the lowest DOPE score and/or lowest z-score; the number of residues in the disallowed region was also considered. Model 1 of Ll4CL1 and Model 1 of Ll4CL2 were selected for further studies (Table 1). Model 1 of Ll4CL1 had a DOPE score of − 43,268.863281 and a z-score of − 5.91, with 470 residues in allowed and 7 residues in disallowed regions. Similarly, Model 1 of Ll4CL2 had a DOPE score of − 42,158.964844 (Table 1) and a z-score of − 4.10, with 445 residues in allowed and 5 residues in disallowed regions. Loop refinement of the two models reduced the residues in disallowed regions from 1.5 to 0% in Ll4CL1 and from 1.1 to 0.2% in Ll4CL2 (Supplementary Table 2(a) and 2(b), respectively). The refined models were analyzed with PDBsum Generate, which gave Ramachandran plots showing 0% and 0.2% of residues in the disallowed regions, respectively (Fig. 3).
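The residue counts and percentages quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the percentage is taken over the sum of the allowed and disallowed residue counts given in the text; the function name is hypothetical.

```python
def percent_disallowed(n_allowed: int, n_disallowed: int) -> float:
    """Percentage of residues in disallowed Ramachandran regions.

    Assumes the denominator is n_allowed + n_disallowed.
    """
    total = n_allowed + n_disallowed
    if total == 0:
        raise ValueError("no residues counted")
    return 100.0 * n_disallowed / total


# (allowed, disallowed) residue counts for Model 1 of each isoform, from the text.
counts = {"Ll4CL1": (470, 7), "Ll4CL2": (445, 5)}

for isoform, (allowed, disallowed) in counts.items():
    pct = percent_disallowed(allowed, disallowed)
    print(f"{isoform}: {pct:.1f}% disallowed before loop refinement")
```

Under this assumption the counts give about 1.5% for Ll4CL1 and 1.1% for Ll4CL2, in line with the pre-refinement values reported for the two models.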

Table 1.

Model evaluation by Modeller 9.15 and selection of best models for Ll4CL1 and Ll4CL2: discrete optimized potential energy (unnormalized DOPE scores) obtained by modeller along with percent residues in most favored and disallowed regions are shown

| Isoform | Model no. | DOPE score (unnormalized) | Disallowed region (%) | Most favored region (%) | z-score (normalized DOPE score) |
| --- | --- | --- | --- | --- | --- |
| Ll4CL1 | 1 | − 43,268.863281 | 1.5 | 85.7 | − 5.91 |
| Ll4CL1 | 2 | − 43,557.132813 | 1.0 | 83.4 | − 5.73 |
| Ll4CL1 | 3 | − 42,920.531350 | 1.3 | 84.7 | − 5.73 |
| Ll4CL1 | 4 | − 43,326.582031 | 1.7 | 84.3 | − 5.84 |
| Ll4CL1 | 5 | − 43,436.945313 | 1.5 | 83.2 | − 5.72 |
| Ll4CL2 | 1 | − 42,158.964844 | 1.1 | 82.9 | − 4.10 |
| Ll4CL2 | 2 | − 41,771.527344 | 2.0 | 82.4 | − 4.35 |
| Ll4CL2 | 3 | − 42,357.011719 | 2.2 | 85.0 | − 3.8 |
| Ll4CL2 | 4 | − 41,500.941406 | 1.6 | 81.8 | − 3.7 |
| Ll4CL2 | 5 | − 41,579.187500 | 1.3 | 82.7 | − 3.55 |

Normalized DOPE scores (z-scores) are also shown for all the models
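Because model selection draws on several criteria at once, it can help to see which model each criterion would pick on its own. The sketch below does this for the five Ll4CL1 models using the values in Table 1. The `ModelScore` type and the `best_by` helper are hypothetical names for illustration and are not part of MODELLER or ProSA.

```python
from typing import Callable, List, NamedTuple


class ModelScore(NamedTuple):
    model_no: int
    dope: float            # unnormalized DOPE score
    disallowed_pct: float  # % residues in disallowed Ramachandran regions
    favored_pct: float     # % residues in most favored regions
    z_score: float         # normalized score reported in Table 1


# Ll4CL1 values copied from Table 1.
LL4CL1_MODELS: List[ModelScore] = [
    ModelScore(1, -43268.863281, 1.5, 85.7, -5.91),
    ModelScore(2, -43557.132813, 1.0, 83.4, -5.73),
    ModelScore(3, -42920.531350, 1.3, 84.7, -5.73),
    ModelScore(4, -43326.582031, 1.7, 84.3, -5.84),
    ModelScore(5, -43436.945313, 1.5, 83.2, -5.72),
]


def best_by(models: List[ModelScore], key: Callable[[ModelScore], float]) -> ModelScore:
    """Return the model with the smallest value of the given criterion."""
    return min(models, key=key)


lowest_dope = best_by(LL4CL1_MODELS, lambda m: m.dope)
lowest_z = best_by(LL4CL1_MODELS, lambda m: m.z_score)
fewest_disallowed = best_by(LL4CL1_MODELS, lambda m: m.disallowed_pct)
most_favored = best_by(LL4CL1_MODELS, lambda m: -m.favored_pct)

print("lowest DOPE:       model", lowest_dope.model_no)
print("lowest z-score:    model", lowest_z.model_no)
print("fewest disallowed: model", fewest_disallowed.model_no)
print("most favored:      model", most_favored.model_no)
```

For Ll4CL1 the criteria do not all agree: Model 1 has the lowest z-score and the largest most favored fraction, while Model 2 has the lowest unnormalized DOPE score and the fewest disallowed residues, so the final choice depends on how the criteria are weighed.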

Fig. 3.


Ramachandran plots of loop-refined models of Ll4CL isoforms: Ramachandran plots as obtained from the PDBsum server. The upper panel shows plots for the Ll4CL isoforms before loop refinement, and the lower panel shows plots after loop refinement. A visible change can be seen in the amino acid residues in the disallowed region before and after refinement. Residues in the disallowed region and their positions are highlighted in red

Active site prediction using 3DLigandsite server

The protein sequence was uploaded to the 3DLigandsite server, which identifies putative active site residues by superimposing ligands bound to structures similar to the query onto the selected model. On this basis, 14 residues for Ll4CL1 and 11 residues for Ll4CL2 were identified as active site residues (Table 2). Residues such as SER190/191 of Ll4CL1 and SER165 of Ll4CL2, PRO326 of Ll4CL1 and PRO315 of Ll4CL2, and ILE264 of Ll4CL1 and ILE252 of Ll4CL2 were identified at similar positions and could be common active site residues. MSA of Ll4CL1 with Ll4CL2 and At4CL2 revealed that residues PRO-315 and PHE-323 lie in the substrate-binding domain (SBD1) previously reported in At4CL isoforms (Supplementary Fig. 1(b)). Similarly, MSA of Ll4CL1 and Ll4CL2 with At4CL1 revealed that residues LYS-324, LEU-325, PRO-326, HIS-327 and ILE-329 lie in SBD1, as previously reported in At4CL isoforms (Supplementary Fig. 1(a)) (Ehlting et al. 2001). The AMP-binding domains BOX1 and BOX2 were conserved in both Ll4CL1 and Ll4CL2, consistent with previous reports for At4CL1, At4CL2, Pt4CL1, Ptr4CL4, Ptr4CL5 and Ptr4CL7 (Zhang et al. 2019; Ehlting et al. 2001).

Table 2.

Active site residues of two Ll4CL isoforms as predicted by 3DLigandsite server: (a) 14 active site residues for Ll4CL1 and (b) 11 active site residues for Ll4CL2 along with their contacts and average distances

Contact: number of ligands within 0.8 Å. Average distance: average separation of the residue from all the ligands within 0.8 Å.

| (a) Residue | (a) Amino acid | (a) Contact | (a) Average distance | (b) Residue | (b) Amino acid | (b) Contact | (b) Average distance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 190 | SER | 14 | 0.05 | 165 | SER | 14 | 0.05 |
| 191 | SER | 11 | 0.13 | 212 | SER | 11 | 0.13 |
| 193 | THR | 13 | 0.21 | 246 | THR | 13 | 0.21 |
| 227 | ASP | 13 | 0.15 | 247 | ASP | 13 | 0.15 |
| 228 | ASP | 9 | 0.38 | 248 | ASP | 9 | 0.38 |
| 262 | TYR | 14 | 0 | 249 | TYR | 14 | 0 |
| 263 | ASP | 14 | 0.11 | 250 | ASP | 14 | 0.11 |
| 264 | ILE | 14 | 0 | 251 | ILE | 14 | 0 |
| 265 | ALA | 14 | 0.04 | 252 | ALA | 14 | 0.04 |
| 324 | LYS | 14 | 0.02 | 315 | LYS | 14 | 0.02 |
| 325 | LEU | 16 | 0 | 323 | LEU | 16 | 0 |
| 326 | PRO | 18 | 0.04 | | | | |
| 327 | HIS | 14 | 0 | | | | |
| 329 | ILE | 12 | 0.42 | | | | |

An average distance of zero means that the ligand used is in direct contact with the residue

Differential substrate preferences of Ll4CL isoforms

Docking was performed using AutoDock 4.2. For each substrate, the binding energy and the number of hydrogen bonds were recorded for the best docking configuration among the ten configurations obtained after each run. The best docked configurations for Ll4CL1 and Ll4CL2 were captured using Discovery Studio software (Fig. 4). The substrate preferences of each isoform were predicted from the binding energy and the number of hydrogen bonds (H bonds) between the substrate and the active site residues. With a binding energy of − 4.91 kcal/mole and three H bonds, sinapic acid was identified as the preferred substrate for Ll4CL1, followed by ferulic acid, hydroxyferulic acid and caffeic acid in decreasing order of preference. Similarly, with a binding energy of − 6.56 kcal/mole and four H bonds, caffeic acid was identified as the most favored substrate for Ll4CL2, followed by hydroxyferulic acid and ferulic acid in decreasing order of preference (Table 3). To check whether the AutoDock results agree with previously reported data from related work on A. thaliana (UniProtKB entry Q42524, 4CL1_ARATH), we modeled 4CL1_ARATH and docked caffeic acid and ferulic acid as ligands. The binding energy of the docked complex was − 6.63 kcal/mole (with 4 H bonds) for At4CL1-caffeic acid and − 5.47 kcal/mole (with 3 H bonds) for At4CL1-ferulic acid. The corresponding Km values reported in the literature for these ligands are 11 µM and 199 µM, respectively. This suggests that an approximately 1.2-fold change in binding energy (− 6.63/− 5.47), together with the loss of one H bond, could correspond to an approximately 18-fold higher affinity for caffeic acid than for ferulic acid (Table 4).
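The two fold-changes quoted for At4CL1 follow directly from the reported numbers. The short sketch below reproduces them; the variable names are hypothetical, and the Km ratio is used here only as the rough proxy for relative affinity that the text adopts.

```python
# Docked binding energies for At4CL1 (kcal/mole) and literature Km values (µM),
# as reported in the text and Table 4.
binding_energy = {"caffeic acid": -6.63, "ferulic acid": -5.47}
km_micromolar = {"caffeic acid": 11.0, "ferulic acid": 199.0}

# Fold difference in binding energy (caffeic acid relative to ferulic acid).
energy_fold = binding_energy["caffeic acid"] / binding_energy["ferulic acid"]

# Fold difference in Km (ferulic acid relative to caffeic acid); a lower Km
# is read here as a higher apparent affinity.
km_fold = km_micromolar["ferulic acid"] / km_micromolar["caffeic acid"]

print(f"binding energy ratio: {energy_fold:.2f}-fold")
print(f"Km ratio:             {km_fold:.1f}-fold")
```

The comparison rests on a single pair of ligands, so it is best read as a rough calibration of the docking scores rather than a quantitative relationship.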

Fig. 4.


Molecular docking analysis of Ll4CL isoforms: a 3D and 2D images of docked Ll4CL1 protein with different substrates (ligands): (i). Ll4CL1 docked with ADP; (ii). Ll4CL1 docked with AMP; (iii). Ll4CL1 docked with caffeic acid; (iv). Ll4CL1 docked with coumaric acid; (v). Ll4CL1 docked with ferulic acid; (vi). Ll4CL1 docked with hydroxyferulic acid; (vii). Ll4CL1 docked with sinapic acid, b 3D and 2D images of docked Ll4CL2 protein with different substrates (ligands): (i). Ll4CL2 docked with ADP; (ii). Ll4CL2 docked with AMP; (iii). Ll4CL2 docked with caffeic acid; (iv). Ll4CL2 docked with coumaric acid; (v). Ll4CL2 docked with ferulic acid; (vi). Ll4CL2 docked with hydroxyferulic acid; (vii). Ll4CL2 docked with sinapic acid; (viii). Ll4CL2 docked with trans-cinnamic acid. Specific residues interacting with the ligand can be seen clearly in the 2D images. Green dotted lines represent H bonding/van der Waals interactions, pink dotted lines represent hydrophobic interactions, and red/orange dotted lines represent ionic interactions/salt bridges. Amino acid residues and their positions are shown in colored circles (red: charged; orange: aromatic; pink: nonpolar)

Table 3.

Preferred substrates of Ll4CL1 and Ll4CL2 as obtained by docking results: lower binding energy suggests more stable ligand-receptor complex

S. no. Substrates Lignin type Ll4CL1 binding energy (kcal/mole) Ll4CL1 hydrogen bonds Ll4CL2 binding energy (kcal/mole) Ll4CL2 hydrogen bonds
1 Adenosine di-phosphate − 1.43 2 0.16 4
2 Adenosine mono-phosphate − 2.83 1 − 2.82 2
3 Adenosine tri-phosphate 0.9 2
4 Coumaric acid (a) G unit − 4.13 2 − 5.33 2
5 Ferulic acid − 4.83 4 − 6.32 4
6 Caffeic acid − 4.71 3 − 6.56 4
7 Hydroxyferulic acid S unit − 4.72 2 − 6.56 3
8 Sinapic acid − 4.91 3 − 5.0 1

Ll4CL1 shows a substrate preference for sinapic acid (lowest binding energy among all substrates, with 3 H bonds) and Ll4CL2 shows a substrate preference for caffeic acid (lowest binding energy among all substrates, with 4 H bonds)

(a) May contribute to H lignin

Table 4.

Relationship between binding energy and binding affinity of At4CL1 protein and its ligand: binding energies for docked caffeic acid and ferulic acid substrates with At4CL1 target protein (UniProt ID: Q42524) as obtained using AutoDock4.2 tool

| S. no. | Substrate | At4CL1 binding energy (kcal/mole) | At4CL1 hydrogen bonds | Km (µM) |
| --- | --- | --- | --- | --- |
| 1 | Caffeic acid | − 6.63 | 4 | 11 |
| 2 | Ferulic acid | − 5.47 | 3 | 199 |

Flexible docking was employed with grid center (40.601, 33.639, 72.534), spacing 3.52 Å and npts (126,126,126) to cover most of the protein in the grid. A lower binding energy and a higher number of H-bonds indicate a more stable complex, and a lower Km value suggests a higher affinity

Mutational analysis of Ll4CL isoforms identifies important active site residues

After in silico mutation of the two proteins (Ll4CL1 and Ll4CL2), the mutants were docked with the same substrates. All in silico mutations were intended as conservative substitutions, i.e. residues were replaced with similar amino acids, because radical substitutions could result in loss of a stable protein structure. The binding energies obtained were compared with those of the unmutated docked configurations. Residues ASP228, TYR262, and PRO326 in Ll4CL1 and residues SER165, LYS247 and PRO315 in Ll4CL2 were found to be important for functional activity, as suggested by the increase in binding energy and, in some cases, the loss of H bonds after mutation for all the preferred substrates identified earlier in this work. The significance of residues ILE264 and ILE329 in Ll4CL1 could not be assessed, because replacing ILE264 and ILE329 with proline caused a radical change in the conformation of the active site and a loss of ligand binding. Active site disruption was inferred from the failure of docking after mutation for all the preferred substrates (Table 5; Supplementary Table 3a & b). To assess the stability of the docked conformation of a modeled protein and its in silico mutated counterpart, we performed molecular dynamics simulations on a test case, namely WT Ll4CL1 and ASP228GLU Ll4CL1, each docked with sinapic acid. The simulations revealed that the running average of the RMSD is reasonably stable at ~ 0.22 nm for wild-type Ll4CL1 (50 ns) and decreases to ~ 0.19 nm for mutant Ll4CL1 (20 ns), suggesting that the structures remained stable in their docked conformations after in silico mutation as well (Supplementary Fig. 2).

Table 5.

Comparison of binding energies and hydrogen bonds before and after random mutation in active site residues of Ll4CL isoforms: (a) mutational analysis for Ll4CL1. ASP228, TYR262, PRO326 identified as important residues for functional activity as seen by decrease in number of H-bonds and increase in binding energy of docked complex post in silico mutation (ASP228GLU), (b) mutational analysis for Ll4CL2

S. no. Mutations done Substrate used Binding energy before mutation (kcal/mole) Binding energy after mutation (kcal/mole) Atoms in H-bonds before mutation Atoms in H-bonds after mutation
(a) Mutational analysis for Ll4CL1
 1 ASP-228-GLU Sinapic acid − 4.91 − 4.33 3 1
Ferulic acid − 4.84 − 4.14 4 1
Hydroxy ferulic acid − 4.72 − 4.37 2 1
Caffeic acid − 4.71 − 4.09 3 2
 2 TYR-262-TRP Sinapic acid − 4.91 − 4.88 3 3
Ferulic acid − 4.84 − 4.91 4 3
Hydroxy ferulic acid − 4.72 − 4.73 2 2
Caffeic acid − 4.71 − 4.65 3 3
 3 PRO-326-VAL Sinapic acid − 4.91 − 4.79 3 2
Ferulic acid − 4.84 − 4.52 4 4
Hydroxy ferulic acid − 4.72 − 4.60 2 2
Caffeic acid − 4.71 − 4.47 3 4
 4 PRO-326-ILE Sinapic acid − 4.91 − 4.63 3 3
Ferulic acid − 4.84 − 4.68 4 4
Hydroxy ferulic acid − 4.72 − 4.92 2 5
Caffeic acid − 4.71 − 4.46 3 3
 5 ILE-264-PRO Sinapic acid − 4.91 Mutation led to the loss of binding of ligand 3 Mutation led to the loss of binding of ligand
Ferulic acid − 4.84 4
Hydroxy ferulic acid − 4.72 2
Caffeic acid − 4.71 3
 6 ILE-264-GLY Sinapic acid − 4.91 3
Ferulic acid − 4.84 4
Hydroxy ferulic acid − 4.72 2
Caffeic acid − 4.71 3
 7 ILE-329-PRO Sinapic acid − 4.91 Mutation led to the loss of binding of ligand 3 Mutation led to the loss of binding of ligand
Ferulic acid − 4.84 4
Hydroxy ferulic acid − 4.72 2
Caffeic acid − 4.71 3
 8 ILE-329-VAL Sinapic acid − 4.91 3
Ferulic acid − 4.84 4
Hydroxy ferulic acid − 4.72 2
Caffeic acid − 4.71 3
(b) Mutational analysis for Ll4CL2
 1 SER-165-CYS Caffeic acid − 6.56 − 4.29 4 1
Hydroxy ferulic acid − 6.56 − 4.50 3 1
Ferulic acid − 6.32 − 4.88 4 1
 2 SER-165-THR Caffeic acid − 6.56 − 4.95 4 1
Hydroxy ferulic acid − 6.56 − 4.54 3 4
Ferulic acid − 6.32 − 5.09 4 1
 3 LYS-247-ARG Caffeic acid − 6.56 − 4.33 4 5
Hydroxy ferulic acid − 6.56 − 4.68 3 1
Ferulic acid − 6.32 − 4.93 4 1
 4 LYS-247-HIS Caffeic acid − 6.56 − 4.93 4 1
Hydroxy ferulic acid − 6.56 − 4.54 3 1
Ferulic acid − 6.32 − 4.54 4 1
 5 PRO-315-ILE Caffeic acid − 6.56 − 4.91 1.65 4 1
Hydroxy ferulic acid − 6.56 − 5.05 1.51 3 2
Ferulic acid − 6.32 − 5.04 1.28 4 1

SER165, LYS247 and PRO315 were identified as important residues for functional activity, as seen from the decrease in the number of H-bonds and the increase in binding energy of the docked complex after in silico mutation (SER165CYS). A detailed list of all performed mutations for (a) and (b) is given in Supplementary Table 3
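The effect of each mutation is easiest to compare as a change in binding energy, computed as the value after mutation minus the value before. The sketch below does this for two Ll4CL2 mutations using the values in Table 5(b); the data layout and function name are hypothetical. A positive difference means a less favorable docked complex after mutation.

```python
# (before, after) binding energies in kcal/mole, copied from Table 5(b).
TABLE_5B = {
    "SER-165-CYS": {
        "caffeic acid": (-6.56, -4.29),
        "hydroxyferulic acid": (-6.56, -4.50),
        "ferulic acid": (-6.32, -4.88),
    },
    "PRO-315-ILE": {
        "caffeic acid": (-6.56, -4.91),
        "hydroxyferulic acid": (-6.56, -5.05),
        "ferulic acid": (-6.32, -5.04),
    },
}


def binding_energy_change(before: float, after: float) -> float:
    """Change in binding energy on mutation (after minus before), 2 decimals."""
    return round(after - before, 2)


changes = {
    mutation: {
        substrate: binding_energy_change(before, after)
        for substrate, (before, after) in substrates.items()
    }
    for mutation, substrates in TABLE_5B.items()
}

for mutation, per_substrate in changes.items():
    for substrate, delta in per_substrate.items():
        print(f"{mutation:12s} {substrate:20s} {delta:+.2f} kcal/mole")
```

Ranking residues by the size of this change across all preferred substrates is one simple way to express the selection rule described in the Methods.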

Endler et al. (2008) reported two divergent 4CLs from Ruta graveolens (L). The recombinant Rg4CL1 and Rg4CL2 differed considerably in their affinities for ferulate, cinnamate and caffeate (activity with sinapic acid was below the detection limit). Rg4CL1 preferred caffeic acid and cinnamic acid over ferulic acid, whereas Rg4CL2 preferred ferulic acid and caffeic acid equally, with cinnamate the least favored. Since Rg4CL2 and Ll4CL2 clustered together in the phylogenetic analysis, we aligned the Rg4CL and Ll4CL protein sequences (Supplementary Fig. 3) and located residues ASP228, TYR262, and PRO326 of Ll4CL1 (black arrows) and SER165 and LYS247 of Ll4CL2 (brown arrows). On comparing Rg4CL1 and Ll4CL1, we found that ASP228 and PRO326 were conserved and that TYR262 was conservatively substituted with PHE299 in Rg4CL1; however, their substrate preferences did not match. Rg4CL2 and Ll4CL2 shared conserved SER165 and PRO315, with LYS247 conservatively substituted in Rg4CL2, and both preferred caffeic acid over ferulic acid.

Discussion

The modeled Ll4CL1 and Ll4CL2 proteins obtained with MODELLER 9.15 were energy minimized and then subjected to loop refinement to minimize the residues in disallowed regions. The residues in the disallowed region were reduced from 1.5 to 0% for Ll4CL1 and from 1.1 to 0.2% for Ll4CL2. Using the 3DLigandsite server, the active site was predicted to comprise SER190, SER191, THR193, ASP227, ASP228, TYR262, ASP263, ILE264, ALA265, LYS324, LEU325, PRO326, HIS327 and ILE329 for Ll4CL1, and SER165, HIS212, GLN246, LYS247, HIS248, LYS249, VAL250, SER251, ILE252, PRO315 and PHE323 for Ll4CL2. Docking results showed that sinapic acid (− 4.91 kcal/mole), ferulic acid (− 4.83 kcal/mole), hydroxyferulic acid (− 4.72 kcal/mole) and caffeic acid (− 4.71 kcal/mole) are the most preferred substrates for Ll4CL1, and that hydroxyferulic acid (− 6.56 kcal/mole), caffeic acid (− 6.56 kcal/mole) and ferulic acid (− 6.32 kcal/mole) are the most preferred substrates for Ll4CL2 (all in decreasing order of preference on the basis of the binding energies given in brackets). Comparison of post-mutation and pre-mutation docking data identified residues ASP228, TYR262 and PRO326 as important for the functional activity of Ll4CL1. This was indicated by an increase in binding energy (kcal/mole) after mutation, as follows: ASP228GLU: sinapic acid (− 0.58 kcal/mole), ferulic acid (− 0.70 kcal/mole), hydroxyferulic acid (− 0.35 kcal/mole) and caffeic acid (− 0.62 kcal/mole); TYR262TRP: sinapic acid (− 0.52 kcal/mole), ferulic acid (− 0.67 kcal/mole), hydroxyferulic acid (− 0.61 kcal/mole) and caffeic acid (− 0.66 kcal/mole); and PRO326VAL: sinapic acid (− 0.12 kcal/mole), ferulic acid (− 0.32 kcal/mole), hydroxyferulic acid (− 0.12 kcal/mole) and caffeic acid (− 0.24 kcal/mole). When the ILE264PRO and ILE329PRO models were used for docking, the ligands could not be docked properly. We hypothesized that proline, being an amino acid with an imino group, could have changed the active site conformation radically, resulting in loss of binding of ligands to the active site region. Therefore, the significance of these residues in the active site could not be assessed in the present context (Table 5a, Supplementary Table 3a & b). Similarly, for Ll4CL2, residues SER165, LYS247 and PRO315 were found to be important for functional activity, as suggested by an increase in binding energy (kcal/mole) after mutation, as follows: SER165CYS: caffeic acid (− 2.25 kcal/mole), hydroxyferulic acid (− 2.06 kcal/mole) and ferulic acid (− 1.44 kcal/mole); LYS247ARG: cinnamic acid (− 1.45 kcal/mole), hydroxyferulic acid (− 2.02 kcal/mole) and ferulic acid (− 1.78 kcal/mole); and PRO315ILE: caffeic acid (− 1.65 kcal/mole), hydroxyferulic acid (− 1.51 kcal/mole) and ferulic acid (− 1.28 kcal/mole). In the majority of the in silico mutations discussed above, the increase in binding energy was accompanied by a loss of H bonds between the ligand and the replaced residues (Table 5). Ruta graveolens 4CL2 clustered with Ll4CL2 in the phylogenetic analysis, and SER165/205, LYS247/299 and PRO315/355 were conserved in Ll4CL2 and Rg4CL2, respectively (Table 5a, Supplementary Table 3a & b).

Several pathway perturbation studies focusing on different enzymes of the monolignol biosynthesis pathway have been reported. C3H or HCT silencing resulted in lignin with H-unit levels as high as 100% of the total thioacidolysis lignin monomers (Franke et al. 2000; Abdulrazzak 2005; Ralph et al. 2012; Besseau et al. 2007; Coleman et al. 2008). Similarly, downregulation of F5H or COMT strongly reduced S-unit content, while upregulation of F5H increased it (Stewart et al. 2009; Franke et al. 2000). Downregulation of CAD leads to increased incorporation of cinnamaldehydes into the polymer (Baucher et al. 1999; Kim et al. 2003; Lapierre et al. 1999).

One of the earliest demonstrations that reduced 4CL activity could lead to a specific decrease in S units came from tobacco (Kajita et al. 1996); however, a decrease in G units was observed in Arabidopsis in a similar, contemporary study (Lee et al. 1997). The difference in the specific decrease in these two types of lignin units also suggested the existence of more than two isoforms in tobacco and Arabidopsis with differential substrate preferences. Possibly because of this, in the case of tobacco the appropriate isoform was downregulated, resulting in a decrease in S lignin units. Wagner et al. (2009) also supported the idea that manipulating 4CL could compromise lignin synthesis (reduced between 36 and 50%) and modify the lignin interunit linkages. In another report, Xu et al. (2011) observed a reduction in lignin content in Panicum virgatum downregulated for 4CL1, with decreased guaiacyl content and uncompromised biomass yield. In Arabidopsis, three 4CL isozymes, viz. At4CL1, At4CL2 and At4CL3, with different substrate preferences and gene expression patterns have been identified. At4CL1 and At4CL2 are involved in the monolignol biosynthesis pathway, while At4CL3 participates in flavonoid and other non-lignin biosynthesis pathways (Ehlting et al. 1999; Cukovica et al. 2005). In poplar, two functionally divergent 4CLs were identified: Ptr4CL1 is devoted to lignin biosynthesis in developing xylem tissues, whereas Ptr4CL2 is possibly involved in flavonoid biosynthesis in epidermal cells (Hu et al. 1998).

Our current work adds to the repertoire of opportunities offered by the 4CL gene, which has recently gained the attention of the scientific community. Leucaena leucocephala is a fast-growing multipurpose tree species used in the pulp and paper industries and as a lignocellulosic biomass feedstock for various purposes. On the basis of the above findings, it could be suggested that if the expression of Ll4CL1 is upregulated, more G-lignin will be produced initially because sinapic acid enters the pathway late, but after some time, as sinapic acid starts accumulating, S-lignin rather than G-lignin will become the major product. However, since the binding energy of Ll4CL1 with sinapic acid is not much stronger than that with ferulic acid (− 4.91 vs − 4.83 kcal/mole), it would be better to reduce the flux towards G-lignin at the same time, which can be achieved by downregulating the Ll4CL2 gene. Moreover, the binding energy of Ll4CL2 with ferulic acid is ~ 1.3 times lower than that of Ll4CL1. Hence the major effect on S/G ratio improvement will come from Ll4CL2 downregulation, whereas Ll4CL1 upregulation will play a subsidiary role. Figure 1 shows the proposed redirection of flux towards a higher S/G lignin ratio as a result of Ll4CL1 upregulation and Ll4CL2 downregulation.
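
The two comparisons that drive this argument are simple arithmetic on the docking energies quoted in this section, and a short Python sketch makes them explicit. The helper names `energy_gap` and `energy_ratio` are hypothetical, and the calculation assumes nothing beyond the three reported binding energies.

```python
# Binding energies (kcal/mole) quoted in this section.
LL4CL1_SINAPIC = -4.91
LL4CL1_FERULIC = -4.83
LL4CL2_FERULIC = -6.32


def energy_gap(e_preferred, e_other):
    """Return e_other - e_preferred; positive when the first ligand binds more favourably."""
    return e_other - e_preferred


def energy_ratio(e_a, e_b):
    """Ratio of two binding energies of the same sign."""
    return e_a / e_b


# Ll4CL1: sinapic acid vs ferulic acid (a small gap, roughly 0.08 kcal/mole).
gap = energy_gap(LL4CL1_SINAPIC, LL4CL1_FERULIC)

# Ferulic acid: Ll4CL2 vs Ll4CL1 (ratio of roughly 1.3).
ratio = energy_ratio(LL4CL2_FERULIC, LL4CL1_FERULIC)

if __name__ == "__main__":
    print(f"Ll4CL1 sinapic vs ferulic gap: {gap:.2f} kcal/mole")
    print(f"Ll4CL2 / Ll4CL1 ferulic acid ratio: {ratio:.2f}")
```

The small gap for Ll4CL1 and the larger ratio for ferulic acid are what motivate treating Ll4CL2 downregulation as the main lever and Ll4CL1 upregulation as a subsidiary one.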

The present study has a few limitations that need to be pointed out. When mutating active-site residues, each residue was replaced only with a residue of the same class: aromatic with aromatic, polar with polar, non-polar with non-polar, charged with charged, and uncharged with uncharged. This was done to avoid abrupt changes in DOPE score or model structure caused by large differences in the nature of the residues, so that DOPE score changes would appear only for the most significant residues and false positives would be minimized. Modeling of the protein structures with more robust and advanced docking and modeling software is desirable to improve the quality of the protein models. Further, wet-lab studies are needed to validate the role of the significant residues identified by mutational docking.
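
The same-class substitution rule can be written as a small check. The sketch below is illustrative only: the residue grouping in `RESIDUE_CLASS` is one common textbook classification assumed here (the study does not list its exact grouping), and `parse_mutation` and `is_conservative` are hypothetical helper names. The mutation labels are those named in this section.

```python
import re

# Assumed residue grouping (one common convention, not taken from the study).
RESIDUE_CLASS = {
    "PHE": "aromatic", "TYR": "aromatic", "TRP": "aromatic",
    "ASP": "charged", "GLU": "charged", "LYS": "charged",
    "ARG": "charged", "HIS": "charged",
    "SER": "polar", "THR": "polar", "CYS": "polar",
    "ASN": "polar", "GLN": "polar",
    "GLY": "non-polar", "ALA": "non-polar", "VAL": "non-polar",
    "LEU": "non-polar", "ILE": "non-polar", "PRO": "non-polar",
    "MET": "non-polar",
}

# In silico mutations named in this section.
MUTATIONS = [
    "ASP228GLU", "TYR262TRP", "PRO326VAL", "ILE264PRO", "ILE329PRO",
    "SER165CYS", "LYS247ARG", "PRO315ILE",
]

_LABEL = re.compile(r"^([A-Z]{3})(\d+)([A-Z]{3})$")


def parse_mutation(label):
    """Split a label such as 'ASP228GLU' into (wild type, position, mutant)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    wild, position, mutant = match.groups()
    return wild, int(position), mutant


def is_conservative(label):
    """True when wild-type and mutant residues fall in the same assumed class."""
    wild, _, mutant = parse_mutation(label)
    return RESIDUE_CLASS[wild] == RESIDUE_CLASS[mutant]


if __name__ == "__main__":
    for label in MUTATIONS:
        print(label, "conservative" if is_conservative(label) else "non-conservative")
```

Under this assumed grouping all eight substitutions stay within one class. A different grouping (for example, one that treats cysteine as non-polar) would change the verdict for individual labels.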

Electronic supplementary material

Below is the link to the electronic supplementary material.

13205_2020_2375_MOESM4_ESM.tif (13.1MB, tif)

Supplementary Figure 1: Multiple sequence alignment of Ll4CL1 (ACI23348.1) and Ll4CL2 (ACI23349.1) with 4CL isoforms from Arabidopsis thaliana: SBD and AMP binding domain identified from UniProtKB-Q9S725 (4CL2_ARATH). Conserved peptide motifs BOX1 and BOX2 comprising the AMP binding domain are highlighted as blue boxes. Substrate binding domains (SBD) are highlighted as orange outlined boxes. Blue and red triangles show important residues already identified in previous works and residues predicted to be an important part of the active site in the current work, respectively. (a) MSA of Ll4CL1 (ACI23348.1) and Ll4CL2 (ACI23349.1) with 4CL2_ARATH from Arabidopsis thaliana; (b) MSA of Ll4CL1 (ACI23348.1) and Ll4CL2 (ACI23349.1) with 4CL1_ARATH from Arabidopsis thaliana. (TIF 13460 kb)

13205_2020_2375_MOESM6_ESM.tif (6.1MB, tif)

Supplementary Figure 2: Molecular dynamics simulation of Ll4CL1 and its mutant: (a) RMSD of wild-type Ll4CL1 docked with sinapic acid as substrate indicates that it is reasonably stable at ~0.22 nm (run for 50 ns); (b) RMSD of the Ll4CL1-ASP228GLU mutant was ~0.19 nm (run for 20 ns), and it was identified as a stable complex. (TIF 6225 kb)

13205_2020_2375_MOESM7_ESM.tif (7MB, tif)

Supplementary Figure 3: Multiple sequence alignment of Leucaena leucocephala 4CL isoforms (GenBank Accession: ACI23348.1 & ACI23349.1) with Ruta graveolens 4CL1 & 4CL2 (GenBank Accession: EU224388 & EU224389): Black arrows represent the in silico mutated residues (D228; Y262; P326) for Ll4CL1 and brown arrows represent the in silico mutated residues (SER165; LYS247) for Ll4CL2 in the present study. The thick blue line represents the AMP binding domain. The red and green lines represent substrate binding domains 1 & 2 (substrate binding domain and AMP binding domain identified from UniProtKB-Q9S725 (4CL2_ARATH)). (TIF 7155 kb)

Acknowledgements

HS and RT are thankful to the Ministry of Human Resource and Development, Govt. of India, New Delhi, India. GK thanks DBT, India for his fellowship during this tenure. Authors also thank Dr. Tonima Kamat for critically reviewing the manuscript.

Author contributions

HS performed most of the experiments and analysis. GK and RT helped HS with the experiments and data analysis. AM, ShS and NKS reviewed the manuscript and took part in the discussions throughout. SaS conceived the idea and executed the project. HS and SaS wrote the paper.

Compliance with ethical standards

Conflict of interest

The authors declare no potential conflict of interest. The authors also confirm that this research did not involve human participants and/or animals.

Contributor Information

Himanshu Shekhar, Email: hishu0811@gmail.com.

Gaurav Kant, Email: gkpmintoo@gmail.com.

Rahul Tripathi, Email: alaneuschevalier@gmail.com.

Shivesh Sharma, Email: shiveshs@mnnit.ac.in.

Ashutosh Mani, Email: amani@mnnit.ac.in.

N. K. Singh, Email: nksingh@mnnit.ac.in

Sameer Srivastava, Email: sameers@mnnit.ac.in.

References

  1. Abdulrazzak N. A coumaroyl-ester-3-hydroxylase insertion mutant reveals the existence of nonredundant meta-hydroxylation pathways and essential roles for phenolic precursors in cell expansion and plant growth. Plant Physiol. 2005;140:30–48. doi: 10.1104/pp.105.069690.
  2. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–410. doi: 10.1016/S0022-2836(05)80360-2.
  3. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
  4. Baucher M, Bernard-Vailhé MA, Chabbert B, Besle JM, Opsomer C, Van Montagu M, Botterman J. Down-regulation of cinnamyl alcohol dehydrogenase in transgenic alfalfa (Medicago sativa L.) and the effect on lignin composition and digestibility. Plant Mol Biol. 1999;39:437–447. doi: 10.1023/A:1006182925584.
  5. Beauchet R, Monteil-Rivera F, Lavoie JM. Conversion of lignin to aromatic-based chemicals (L-chems) and biofuels (L-fuels). Biores Technol. 2012;121:328–334. doi: 10.1016/j.biortech.2012.06.061.
  6. Besseau S, Hoffmann L, Geoffroy P, Lapierre C, Pollet B, Legrand M. Flavonoid accumulation in Arabidopsis repressed in lignin synthesis affects auxin transport and plant growth. Plant Cell Online. 2007;19:148–162. doi: 10.1105/tpc.106.044495.
  7. Boerjan W, Ralph J, Baucher M. Lignin biosynthesis. Annu Rev Plant Biol. 2003;54:519–546. doi: 10.1146/annurev.arplant.54.031902.134938.
  8. Coleman HD, Park JY, Nair R, Chapple C, Mansfield SD. RNAi-mediated suppression of p-coumaroyl-CoA 3′-hydroxylase in hybrid poplar impacts lignin deposition and soluble secondary metabolism. Proc Natl Acad Sci. 2008;105:4501–4506. doi: 10.1073/pnas.0706537105.
  9. Cukovica D, Ehlting J, Van Ziffle JA, Douglas CJ. Structure and evolution of 4-coumarate: coenzyme A ligase (4CL) gene families. Biol Chem. 2005;382(4):645–654. doi: 10.1515/BC.2001.076.
  10. Ehlting J, Büttner D, Wang Q, Douglas CJ, Somssich IE, Kombrink E. Three 4-coumarate: coenzyme A ligases in Arabidopsis thaliana represent two evolutionarily divergent classes in angiosperms. Plant J. 1999;19:9–20. doi: 10.1046/j.1365-313X.1999.00491.x.
  11. Ehlting J, Shin JJK, Douglas CJ. Identification of 4CL substrate recognition domains. Plant J. 2001;27(5):455–465. doi: 10.1046/j.1365-313X.2001.01122.x.
  12. Endler A, Martens S, Wellmann F, Matern U. Unusually divergent 4-coumarate: CoA-ligases from Ruta graveolens L. Plant Mol Biol. 2008;67:335–346. doi: 10.1007/s11103-008-9323-7.
  13. Fiser A, Sali A. ModLoop: automated modeling of loops in protein structures. Bioinformatics. 2003;19:2500–2501. doi: 10.1093/bioinformatics/btg362.
  14. Franke R, McMichael CM, Meyer K, Shirley AM, Cusumano JC, Chapple C. Modified lignin in tobacco and poplar plants over-expressing the Arabidopsis gene encoding ferulate 5-hydroxylase. Plant J. 2000;22:223–234. doi: 10.1046/j.1365-313x.2000.00727.x.
  15. Hamberger B, Hahlbrock K. The 4-coumarate: CoA ligase gene family in Arabidopsis thaliana comprises one rare, sinapate-activating and three commonly occurring isoenzymes. Proc Natl Acad Sci. 2004;101:2209–2214. doi: 10.1073/pnas.0307307101.
  16. Hu W-J, Kawaoka A, Tsai C-J, Lung J, Osakabe K, Ebinuma H, Chiang VL. Compartmentalized expression of two structurally and functionally distinct 4-coumarate: CoA ligase genes in aspen (Populus tremuloides). Proc Natl Acad Sci. 1998;95(9):5407–5412. doi: 10.1073/pnas.95.9.5407.
  17. Hu Y, Gai Y, Yin L, Wang X, Feng C, Feng L, Li D, Jiang XN, Wang DC. Crystal structures of a Populus tomentosa 4-Coumarate: CoA ligase shed light on its enzymatic mechanisms. Plant Cell. 2010;22:3093–3104. doi: 10.1105/tpc.109.072652.
  18. Kajita S, Katayama Y, Omori S. Alterations in the biosynthesis of lignin in transgenic plants with chimeric genes for 4-Coumarate: coenzyme A ligase. Plant Cell Physiol. 1996;3:957–965. doi: 10.1093/oxfordjournals.pcp.a029045.
  19. Kim H, Ralph J, Lu F, Ralph SA, Boudet AM, MacKay JJ, Sederoff RR, Ito T, Kawai S, Ohashi H, Higuchi T. NMR analysis of lignins in CAD-deficient plants. Part 1. Incorporation of hydroxycinnamaldehydes and hydroxybenzaldehydes into lignins. Org Biomol Chem. 2003;1:268–281. doi: 10.1039/b209686b.
  20. Lapierre C, Pollet B, Petit-Conil M, Toval G, Romero J, Pilate G, Leplé JC, Boerjan W, Ferret V, De Nadai V, Jouanin L. Structural alterations of lignins in transgenic poplars with depressed cinnamyl alcohol dehydrogenase or caffeic acid O-methyltransferase activity have an opposite impact on the efficiency of industrial Kraft pulping. Plant Physiol. 1999;119:153–164. doi: 10.1104/pp.119.1.153.
  21. Larkin MA, Blackshields G, Brown NP, Chenna R, Mcgettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–2948. doi: 10.1093/bioinformatics/btm404.
  22. Lee D, Meyer K, Chapple C, Douglas CJ. Antisense suppression of 4-Coumarate: coenzyme A ligase activity in Arabidopsis leads to altered lignin subunit composition. Plant Cell. 1997;9:1985–1998. doi: 10.1105/tpc.9.11.1985.
  23. Li X, Weng JK, Chapple C. Improvement of biomass through lignin modification. Plant J. 2008;54:569–581. doi: 10.1111/j.1365-313X.2008.03457.x.
  24. Marti-Renom MA, Stuart AC, Sanchez R, Melo F, Sali A. Comparative protein structure modeling of genes and genomes. Annu Rev Biophys Biomol Struct. 2000;29:291–325. doi: 10.1146/annurev.biophys.29.1.291.
  25. Ralph J, Akiyama T, Coleman HD, Mansfield SD. Effects on lignin structure of Coumarate 3-hydroxylase downregulation in poplar. Bioenergy Res. 2012;5:1009–1019. doi: 10.1007/s12155-012-9218-y.
  26. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–425. doi: 10.1093/oxfordjournals.molbev.a040454.
  27. Šali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
  28. Stewart JJ, Akiyama T, Chapple C, Ralph J, Mansfield SD. The effects on lignin structure of overexpression of ferulate 5-hydroxylase in hybrid poplar. Plant Physiol. 2009;150:621–635. doi: 10.1104/pp.109.137059.
  29. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–2739. doi: 10.1093/molbev/msr121.
  30. Truhlar DG. Valence bond theory for chemical dynamics. J Comput Chem. 2009;28:73–86. doi: 10.1002/jcc.20529.
  31. Uzal EN, Gómez Ros LV, Pomar F, Bernal MA, Paradela A, Albar JP, Ros Barceló A. The presence of sinapyl lignin in Ginkgo biloba cell cultures change our views of the evolution of lignin biosynthesis. Plant Physiol. 2009;135:196–213. doi: 10.1111/j.1399-3054.2008.01185.x.
  32. Vanholme R, Demedts B, Morreel K, Ralph J, Boerjan W. Lignin biosynthesis and structure. Plant Physiol. 2010;153:895–905. doi: 10.1104/pp.110.155119.
  33. Verma SR, Dwivedi UN. Lignin genetic engineering for improvement of wood quality: applications in paper and textile industries, fodder and bioenergy production. S Afr J Bot. 2014;91:107–125. doi: 10.1016/j.sajb.2014.01.002.
  34. Voo KS, Whetten RW, O’Malley DM, Sederoff RR. 4-Coumarate: coenzyme A ligase from loblolly pine xylem (isolation, characterization, and complementary DNA cloning). Plant Physiol. 1995;108:85–97. doi: 10.1104/pp.108.1.85.
  35. Wagner A, Donaldson L, Kim H, Phillips L, Flint H, Steward D, Torr K, Koch G, Schmitt U, Ralph J. Suppression of 4-Coumarate-CoA ligase in the coniferous gymnosperm Pinus radiata. Plant Physiol. 2009;149(1):370–383. doi: 10.1104/pp.108.125765.
  36. Wass MN, Kelley LA, Sternberg MJE. 3DLigandSite: predicting ligand-binding sites using similar structures. Nucleic Acids Res. 2010;38:469–473. doi: 10.1093/nar/gkq406.
  37. Weng JK, Li X, Bonawitz ND, Chapple C. Emerging strategies of lignin engineering and degradation for cellulosic biofuel production. Curr Opin Biotechnol. 2008;19:166–172. doi: 10.1016/j.copbio.2008.02.014.
  38. Wiederstein M, Sippl MJ. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 2007;35:407–410. doi: 10.1093/nar/gkm290.
  39. Xu B, Escamilla-Treviño LL, Sathitsuksanoh N, Shen Z, Shen H, Zhang Y-HP, Dixon RA, Zhao B. Silencing of 4-coumarate:coenzyme A ligase in switchgrass leads to reduced lignin content and improved fermentable sugar yields for biofuel production. New Phytol. 2011;192(3):611–625. doi: 10.1111/j.1469-8137.2011.03830.x.
  40. Zhang C, Zang Y, Liu P, Zheng Z, Ouyang J. Characterization, functional analysis and application of 4-Coumarate: CoA ligase genes from Populus trichocarpa. J Biotechnol. 2019;302:92–100. doi: 10.1016/j.jbiotec.2019.06.300.

Articles from 3 Biotech are provided here courtesy of Springer
