Abstract
With the escalating prevalence of malaria in recent years, demand for artemisinin has placed considerable stress on its production worldwide. At present, the relatively low yield of artemisinin (0.01–1.1 %) in the source plant (Artemisia annua L.) imposes a serious limitation on commercializing the drug. Amorpha-4,11-diene synthase (ADS) has been reported to be a key enzyme in enhancing the artemisinin level in Artemisia annua L. An understanding of the structural and functional correlations of ADS may therefore help in the molecular upregulation of the enzyme. In this context, an in silico approach was used to study the ADS3963 (3963 bp) gene, cloned by us from a high artemisinin yielding strain (0.7–0.9 % dry wt basis) of A. annua L. The full-length putative gene of ADS3963 was found to encode a protein of 533 amino acid residues with a conserved aspartate-rich domain. The isoelectric point (pI) and molecular weight of the protein were 5.25 and 62.2 kDa, respectively. Phylogenetic analysis of ADS genes from various species revealed evolutionary conservation. Homology modeling was used to predict the 3D structure of the ADS3963 protein, and AutoDock version 4.0 was used to study ligand binding. The predicted 3D model and docking studies may further be used in characterizing the protein in the wet laboratory.
Keywords: Artemisia annua, artemisinin, ADS3963 gene, homology modeling, phylogenetic tree, docking
Abbreviations
ADS - amorpha-4,11-diene synthase, FPP - farnesyl pyrophosphate, ORF - open reading frame, PCR - polymerase chain reaction.
Background
Multidrug resistance in P. falciparum to the commonly used antimalarial agents is becoming increasingly widespread and poses a serious threat to conventional and current therapeutic treatments. An alternative class of antimalarial compound, the sesquiterpenoid artemisinin from Artemisia annua L., is effective against both chloroquine-resistant and chloroquine-sensitive strains of Plasmodium species as well as the species causing cerebral malaria. Its relatively low yield (0.01–1.1 % dry weight) [1, 2], however, has caused serious concern in commercializing the drug [3]. Physiological and cell culture studies carried out to improve the yield of artemisinin were mostly unsatisfactory [4]. Chemical synthesis of artemisinin is also possible, but it is complicated and economically unviable owing to the poor yield [5]. Recent reports have highlighted biotechnological approaches such as metabolic engineering and genetic modification of microbes and plants as a feasible alternative for the semisynthesis of artemisinin and its precursors [6, 7]. S. cerevisiae has been engineered to produce artemisinic acid, a precursor of artemisinin, at a significantly higher level than in A. annua [8]. The complete synthesis of artemisinin outside the source plant has, however, not yet been achieved, and the end product must be obtained either by biotransformation using plant extract [7, 9] or by semisynthesis [10]. Studies have been conducted in different laboratories to elucidate the biochemical pathway of artemisinin and its regulation, with the aim of improving the artemisinin content of A. annua L. A biosynthetic pathway for artemisinin was proposed that starts from farnesyl pyrophosphate (FPP) and proceeds through a germacrane skeleton, dihydrocostunolide, cadinanolide and arteannuin B to artemisinin [11], but these studies did not indicate artemisinic acid as a precursor of artemisinin.
In another study, two compounds (a secocadinane and a dihydroxycadinanolide) were isolated and an alternative route for artemisinin biosynthesis was postulated [12]. According to this postulate, arteannuin B is converted into a dihydroxycadinanolide, which then undergoes Grob fragmentation to yield the enolic form of a secocadinane. This undergoes enzymatic oxygenation to yield artemisitene, which is finally reduced to artemisinin. The in vitro and in vivo transformation of artemisinic acid to arteannuin B and artemisinin with an overall yield of 4.0 % was reported [13], suggesting that artemisinic acid is a common precursor of both arteannuin B and artemisinin. This was confirmed by several studies employing crude and semi-purified cell-free extracts of leaf homogenates of A. annua, in which arteannuin B was found to be an intermediate in the bioconversion of artemisinic acid to artemisinin [9]. The transformation of dihydroartemisinic acid into artemisinin by cell-free extracts from A. annua L. plants has also been reported [14]. These studies thus indicate that artemisinic acid is converted into artemisinin via either arteannuin B or dihydroartemisinic acid. Recently, amorpha-4,11-diene synthase (ADS) has been identified as the first enzyme of artemisinin biosynthesis. It has also been reported to be the key regulatory enzyme, catalyzing the cyclization of farnesyl pyrophosphate (FPP) into the sesquiterpenoid skeleton amorpha-4,11-diene [15, 16]. Thus, increasing the activity of ADS through overexpression of its gene in A. annua L. may be a promising approach to enhancing artemisinin biosynthesis. Consequently, several groups have reported the cloning, sequencing and expression of amorpha-4,11-diene synthase genes from different strains of A. annua L. [17–20]. In this paper, we report the cloning and characterization of a putative amorpha-4,11-diene synthase gene (ADS3963) from a high-yielding A. annua L. strain (0.7–0.9 % artemisinin on a dry weight basis).
Methodology
The seeds of a high artemisinin yielding strain of Artemisia annua L. (0.7–0.9 % artemisinin on a dry weight basis) were provided by Ipca Laboratories Limited, Ratlam, MP, India. Plants were grown in the experimental field of Jamia Hamdard, New Delhi, India.
Isolation of genomic DNA
Fresh leaves were collected from A. annua L. plants at the pre-flowering stage. DNA was extracted by a modified CTAB method [21].
Generation of A. annua L. primers
The PCR primers were designed using the published amorpha-4,11-diene synthase sequence from A. annua L. retrieved from NCBI GenBank (Acc. no. AF327527). The primers were derived from the total ORF region of the amorpha-4,11-diene synthase gene and were synthesized by Sigma-Aldrich Chemicals Pvt. Ltd. with the following sequences: forward primer, ATGTCACTTACAGAAGAAAAACCTATTC; reverse primer, TCATATACTCATAGGATAAACG.
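The two primer sequences can be sanity-checked with a few lines of Python. The sketch below is an illustration only and is not part of the published protocol: the helper names are hypothetical, and the Wallace rule used for the melting temperature is a crude approximation chosen here for simplicity. It reports length, GC fraction and an approximate Tm, and confirms that the forward primer opens with a start codon while the reverse primer corresponds to a stop codon on the coding strand.

```python
# Primer sequences exactly as given in the text (5'->3').
FORWARD = "ATGTCACTTACAGAAGAAAAACCTATTC"
REVERSE = "TCATATACTCATAGGATAAACG"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate Tm by the Wallace rule, 2*(A+T) + 4*(G+C).

    Assumption: this rule is only a rough guide, especially for longer oligos.
    """
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), round(gc_fraction(primer), 2), wallace_tm(primer))

# Start codon on the forward primer; last codon of the coding strand
# covered by the reverse primer.
print(FORWARD[:3], reverse_complement(REVERSE)[-3:])
```

Because the primers span the whole ORF, the reverse complement of the reverse primer is the stretch of coding strand that ends the amplified region.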
Amplification of Amorpha-4,11-diene synthase (ADS3963)
The PCR reaction mixture, containing 100 μl PCR mastermix [(0.2 mmol of each deoxynucleotide, buffer and 1.5 mmol of MgCl2), 1 μl (of 1 μg/μl) of each forward and reverse primer and 0.5 μl (5 U) of Taq DNA polymerase], was divided into ten 9 μl aliquots. One microlitre of Artemisia annua L. genomic DNA was added neat or at a 1 in 5 or 1 in 10 dilution to bring the final reaction volume to 10 μl. The cycling parameters were 95 °C for 3 min; 40 cycles of denaturation for 30 s at 94 °C, annealing for 1 min at 55 °C and extension for 1 min at 72 °C; and a final extension at 72 °C for 5 min.
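As a small worked example, the programmed hold time of this cycling protocol can be totalled as below. This is a sketch for planning purposes only: the variable names are illustrative, and ramping time between temperatures, which depends on the thermocycler, is deliberately ignored.

```python
# Hold times taken from the cycling parameters in the text, in seconds.
INITIAL_DENATURATION = 3 * 60   # 95 °C for 3 min
CYCLE_STEPS = {
    "denaturation (94 °C)": 30,
    "annealing (55 °C)": 60,
    "extension (72 °C)": 60,
}
CYCLES = 40
FINAL_EXTENSION = 5 * 60        # 72 °C for 5 min


def total_hold_seconds():
    """Sum of all programmed holds; ramp times are not included."""
    per_cycle = sum(CYCLE_STEPS.values())
    return INITIAL_DENATURATION + CYCLES * per_cycle + FINAL_EXTENSION


total = total_hold_seconds()
print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```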
Cloning of amorpha-4,11-diene synthase
The resulting PCR product was purified and ligated into the pDrive cloning vector (Qiagen Pvt. Ltd.) following the manufacturer's instructions. The ligation mixture was then used to transform competent DH5α E. coli cells (100 μl). The cells were mixed with 900 μl of Luria broth (LB) medium and incubated at 37 °C for an hour, and the E. coli cell culture (100 μl) was plated onto an X-gal/Amp/IPTG/LB plate. Following overnight incubation at 37 °C, the recombinant colonies were picked and used for plasmid extraction. Plasmid DNA was purified using the GeneJET™ Plasmid Miniprep Kit (Fermentas Life Science) following the manufacturer's instructions. The clones were confirmed by PCR.
Sequencing of Amorpha-4,11-diene synthase (ADS3963)
The ADS3963 gene cloned in pDrive was sequenced on an automated sequencer at Bangalore Genei, Bangalore.
Nucleotide and Protein Sequence Accession Number
The genomic sequence of the amorpha-4,11-diene synthase gene (3963 bp) and its annotation have been submitted to GenBank and assigned the accession numbers FJ432667 (nucleotide) and ACL15394 (protein).
Domain/Motif Search
ScanProsite at ExPASy (http://www.expasy.ch/tools/scanprosite) was used to identify PROSITE motifs in the amorpha-4,11-diene synthase protein.
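The kind of pattern matching performed by a motif scan can be illustrated with regular expressions. The sketch below is not the ScanProsite implementation; it simply encodes the two motifs discussed in the Results, the aspartate-rich DDxxD motif and the PROSITE-style casein kinase II phosphorylation pattern [ST]-x(2)-[DE], and applies them to a hypothetical toy peptide (the real ADS3963 sequence is not reproduced here). The function name is an assumption made for illustration.

```python
import re

# Aspartate-rich motif of terpene synthases: D-D-x-x-D.
DDXXD = re.compile(r"(?=(DD..D))")
# PROSITE-style casein kinase II phosphorylation site: [ST]-x(2)-[DE].
CK2_SITE = re.compile(r"(?=([ST]..[DE]))")


def find_motifs(pattern, protein):
    """Return (1-based start, matched residues) for every hit.

    The lookahead in the patterns allows overlapping hits to be reported.
    """
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein)]


# Hypothetical toy peptide, NOT the ADS3963 sequence.
toy = "MKSLWDAAGHDDVYDLL"
print(find_motifs(DDXXD, toy))
print(find_motifs(CK2_SITE, toy))
```

Positions are reported 1-based so that they can be compared directly with residue numbering in a protein record.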
Phylogenetic analysis
Nucleotide sequences of ADS3963 (FJ432667) from A. annua L. and other sesquiterpene synthase genes (EU798693, Santalum album sesquiterpene synthase mRNA; EU726270, Cistus creticus subsp. creticus germacrene B synthase mRNA; AF441124, Citrus sinensis valencene synthase (tps1) mRNA; AF288465, Citrus junos terpene synthase mRNA; NM_122301, Arabidopsis thaliana terpene synthase/cyclase family protein (AT5G23960) mRNA; AF279455, Lycopersicon hirsutum sesquiterpene synthase 1 (SSTLH1) mRNA; AY860847, Artemisia dracunculus clone 27 sesquiterpene synthase gene; AB247331, Zingiber zerumbet zss1 mRNA for alpha-humulene synthase; AY640155, Cucumis sativus beta-caryophyllene synthase mRNA; AY397644, Solidago canadensis (+)-germacrene D synthase mRNA; AY900123, Ixeris dentata leaf sesquiterpene cyclase mRNA; AF174294, Gossypium arboreum (+)-delta-cadinene synthase (vCAD1C1) gene and AY860849, Artemisia dracunculus clone 27 sesquiterpene synthase gene) were obtained from NCBI GenBank. The evolutionary relationships among the nucleotide sequences were inferred with the MEGA 4.0 program, which was used to construct a Neighbor-Joining tree [22]. Bootstrap analysis was used to assess the statistical support for groups in the phylogenetic tree [23].
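The idea behind bootstrap support can be sketched in plain Python. The example below is a deliberately simplified stand-in and does not reproduce the MEGA analysis: it uses uncorrected p-distances rather than a likelihood-based distance, and it asks only how often a given pair of sequences remains the closest pair after alignment columns are resampled with replacement, instead of rebuilding a full Neighbor-Joining tree. The sequence names and the toy alignment are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_pair(alignment):
    """Names of the two sequences with the smallest p-distance."""
    names = sorted(alignment)
    best = min(
        (p_distance(alignment[m], alignment[n]), m, n)
        for i, m in enumerate(names)
        for n in names[i + 1:]
    )
    return frozenset(best[1:])


def bootstrap_support(alignment, pair, replicates=500, seed=0):
    """Fraction of column-resampled replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {
            name: "".join(seq[c] for c in cols) for name, seq in alignment.items()
        }
        hits += closest_pair(resampled) == frozenset(pair)
    return hits / replicates


# Hypothetical toy alignment; real input would be the aligned GenBank sequences.
toy = {
    "seqA": "ATGCCGTTAGCA",
    "seqB": "ATGCCGTTAGCT",
    "seqC": "ATGACGATCGGA",
    "seqD": "TTGACCATCGGA",
}
print(closest_pair(toy))
print(bootstrap_support(toy, ("seqA", "seqB")))
```

In a full analysis the same resampling step is applied, but a tree is rebuilt for each replicate and the support for every internal branch is counted.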
Homology modeling of ADS3963
Template searching
To find a suitable template for modeling the ADS3963 protein, the (PS)2 MODELLER server [http://ps2.life.nctu.edu.tw/], an online tool that searches for templates on the basis of sequence and structural similarity, was used [24]. On the basis of percentage homology (39 %), a single template, 5-epi-aristolochene synthase [PDB ID: 5eau], was selected [25] for 3D modeling of the ADS3963 protein.
Sequence alignment
Amino acid sequence alignment of the target and template proteins was done using the Swiss-PdbViewer package (http://www.expasy.ch/spdbv/) with default parameters. The aligned sequences were checked and adjusted manually to minimize the number of gaps and insertions. Homology-based modeling was used to predict the 3D structure of ADS3963. (PS)2 MODELLER was used for alignment as well as modeling. Scripts were also employed to align the target and template sequences, and a predicted 3D model was then obtained from the default model script on the basis of the generated alignment.
Refinement and validation of predicted structure
The predicted model was solvated and subjected to constrained energy minimization, with a harmonic constraint of 100 kJ/mol/Å² applied to all protein atoms, using the steepest descent and conjugate gradient techniques to eliminate bad contacts between protein atoms and structural water molecules. Computations were carried out in vacuo with the GROMOS96 43B1 parameter set. The stereochemical quality of the predicted structure was assessed by VADAR (http://www.redpoll.pharmacy.ualberta.ca/vadar) [26] and PROCHECK (http://nihserver.mbi.ucla.edu/). Ramachandran analysis showed that more than 90 % of the amino acid residues of the ADS3963 protein were in the most favored regions, which supports the reliability of the predicted model.
Docking of Amorpha-4,11-diene synthase (ADS3963) with Substrate Farnesyl Pyrophosphate (FPP)
After obtaining the final model, the centre coordinates of probable binding sites of the enzyme were extracted using PASS (Putative Active Sites with Spheres), version pass10_2.0.36 (http://www.ccl.net/cca/software/UNIX/pass). FPP (PDB ID: 1UBY) was used as the substrate for docking studies. The ligand coordinates (FPP) were downloaded from the Protein Data Bank (http://www.rcsb.org/pdb/home/home.do). FPP was docked with ADS3963 using the Lamarckian genetic algorithm (LGA) provided by the AutoDock program, version 4 (http://autodock.scripps.edu/). Docking of FPP was performed with respect to all 17 binding sites of the enzyme. The residues lining the cavities within a 10 Å region around the ligand molecule were extracted using GETNEAREST [27], and LIGPLOT [28] was used to plot protein-ligand interactions.
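The step of collecting residues within 10 Å of the docked ligand can be illustrated with a simple distance cutoff. The sketch below is not the GETNEAREST program; it is a minimal stand-in that assumes atoms are available as (residue label, x, y, z) records, and the `Atom` type, function name and coordinates are all hypothetical.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    residue: str   # residue label, e.g. "ALA1" (hypothetical naming)
    x: float
    y: float
    z: float


def residues_near_ligand(protein, ligand, cutoff=10.0):
    """Sorted residue labels with at least one atom within `cutoff` Å of any ligand atom."""
    near = set()
    for atom in protein:
        for lig in ligand:
            # atom[1:] and lig[1:] are the (x, y, z) coordinates.
            if math.dist(atom[1:], lig[1:]) <= cutoff:
                near.add(atom.residue)
                break
    return sorted(near)


# Hypothetical coordinates in Å, for illustration only.
protein = [Atom("ALA1", 0.0, 0.0, 0.0), Atom("GLY2", 20.0, 0.0, 0.0)]
ligand = [Atom("FPP", 5.0, 0.0, 0.0)]
print(residues_near_ligand(protein, ligand))
```

For a full-size protein, a spatial index would be faster than this all-pairs loop, but the simple form keeps the cutoff logic explicit.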
Results and Discussion
Amorpha-4,11-diene synthase belongs to the class of enzymes referred to as sesquiterpene synthases (sesquiterpene cyclases). These are very similar in physical and chemical properties. On the basis of their similar reaction mechanisms and conserved structural and sequence characteristics, including amino acid sequence homology, conserved sequence motifs, intron number and exon size, several groups have suggested that plant sesquiterpene synthases have a common evolutionary origin. Because knowledge of structure-function correlation is lacking, it has so far been impossible to predict the function of terpene synthases solely from the degree of sequence identity [29]. The ADS3963 gene sequence (FJ432667) cloned by us from a high-yielding strain of A. annua showed 98 % identity with the nucleotide and amino acid sequences derived from previously published A. annua ADS genes. The genomic organization of the ADS3963 gene comprises 7 exons (17–103; 218–448; 2126–2508; 2685–2903; 3032–3170; 3268–3516 and 3629–3922) and 8 introns (1–16; 104–217; 449–2125; 2509–2684; 2904–3031; 3171–3267; 3515–3628 and 3923–3963) (Figure 1). The deduced mass and pI of the encoded protein are 62.2 kDa and 5.25, respectively. Multiple sequence alignment of the ADS3963 gene with seven previously reported ADS genes revealed 98 % homology. The ADS3963 protein (533 aa) showed more than 98 % identity with the amorpha-4,11-diene synthase proteins of other strains of Artemisia annua L. with respect to amino acid composition (Figure 2). Phylogenetic tree analysis showed relatively higher homology of the ADS3963 gene from Artemisia annua L. with sesquiterpene synthases from angiosperms (Figure 3). Six different types of domain were found in the ADS3963 protein (ACL15394) (Table 1).
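The exon coordinates can be checked against the reported protein length with a short calculation. The sketch below assumes that the coordinate strings in the text are inclusive start-end ranges; the variable and function names are illustrative. It sums the exon lengths, tests whether the total is a whole number of codons, and lists any adjacent intervals that overlap.

```python
# Coordinates as printed in the text, read as inclusive start-end ranges (assumption).
EXONS = [(17, 103), (218, 448), (2126, 2508), (2685, 2903),
         (3032, 3170), (3268, 3516), (3629, 3922)]
INTRONS = [(1, 16), (104, 217), (449, 2125), (2509, 2684),
           (2904, 3031), (3171, 3267), (3515, 3628), (3923, 3963)]


def length(interval):
    """Number of bases in an inclusive interval."""
    start, end = interval
    return end - start + 1


def overlaps(intervals):
    """Pairs of neighbouring intervals (after sorting) that share at least one base."""
    ordered = sorted(intervals)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] <= a[1]]


coding = sum(length(e) for e in EXONS)
print(coding, divmod(coding, 3))
print(overlaps(EXONS + INTRONS))
```

Under this reading the exons total 1602 bases, i.e. 534 codons, which agrees with a 533-residue protein plus a stop codon. The check also flags that the seventh non-coding interval, as printed (3515–3628), overlaps the end of the sixth exon (3268–3516) by two positions, which may be a typographical slip in the coordinates.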
Figure 1.
Genomic organization of the amorpha-4,11-diene synthase gene (ADS3963). Numbers shown below the lines are the start and end positions of exons.
Figure 2.
Multiple sequence alignment of different amorpha-4,11-diene synthase genes.
Figure 3.
Evolutionary relationships of 13 taxa of nucleotide sequence of ADS3963 and other sesquiterpene synthase genes using the Neighbor-Joining method. The bootstrap consensus tree inferred from 500 replicates is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50% bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) are shown next to the branches. The evolutionary distances were computed using the Maximum Composite Likelihood method and are in the units of the number of base substitutions per site.
The highly conserved aspartate-rich motif (DDxxD, positions 286–290) is characteristic of all terpene synthases [30]. This motif is involved in coordinating the substrate-bound divalent metal ions (Mg2+ and Mn2+) [25]. Two basic residues, Arg264 and Arg441, of 5-epi-aristolochene synthase from Nicotiana tabacum [25], corresponding to Arg249 and Arg427 in the ADS3963 protein, are brought close to each other by loop movement (Figure 4). The deletion of 13 amino acid residues from the ADS3963 protein has shown no influence on its catalytic domains, perhaps because these residues do not constitute an essential part of the structure or lie in a non-catalytic region of the protein. The presence of a new motif, SlwD, a casein kinase II phosphorylation site at positions 104–107 of the ADS3963 protein that is not found in other terpene synthases, is a unique finding of our study. It has been shown that the activity of enzymes of the terpenoid biosynthetic pathway in higher eukaryotes is regulated by phosphorylation/dephosphorylation. Enzymes whose activity is regulated by phosphorylation differ both in the spatial relationship between their active and regulatory sites and in the mechanisms by which phosphorylation modulates activity. The new site in the ADS3963 protein may therefore be part of the regulatory region of the enzyme, while the DDxxD motif may be part of the active site for substrate binding.
Figure 4.
Alignment of the deduced amino acid sequences of ADS3963 and 5-epi-aristolochene synthase (5eau) from tobacco. The consensus sequence shows the amino acid residues conserved in the above sequences.
The three-dimensional (3D) structure of a protein is of major importance in providing insights into its molecular function. The results showed that the putative ADS3963 protein contains 72 % α-helices, 23 % β-turns and 26 % random coils (Figure 5). Running through most of the secondary structure, α-helices and random coils are the most abundant structural elements of ADS3963, while β-turns are distributed intermittently in the protein. The values obtained from the Ramachandran plot for the predicted 3D model of the ADS protein (reported as total energy values) were 93 % before energy minimization and 90 % after energy minimization. The refined model of ADS3963, analyzed by VADAR to evaluate Ramachandran plot quality, was found to be satisfactory relative to the values expected for highly refined X-ray and NMR protein structures (Table 2). Describing reaction regulation in enzymes that activate and catalyze small molecules requires identifying how the ligand moves into the binding site and out of the enzyme through specific channels and docking sites.
Figure 5.

Predicted 3D model of amorpha-4,11-diene synthase isolated from Artemisia annua L., showing the N (blue) and C (red) termini.
Recent studies have revealed that the core sequences of many proteins were nearly optimized for stability by natural evolution. Surface residues, by contrast, were not so optimized, presumably because protein function is mediated through surface interactions with other molecules. Here, we sought to determine the extent to which the sequences of protein ligand-binding and enzyme active sites could be predicted by optimization of scoring functions based on protein-ligand binding affinity rather than structural stability [32]. To find the possible binding sites of FPP on amorpha-4,11-diene synthase (ADS3963), PASS was run. The PASS output contains centre coordinates for 17 binding sites. Docking of FPP was performed with respect to all 17 binding sites of the enzyme (Table 3). The aim of molecular docking is to find optimized conformations of the protein and the ligand, and their relative orientation, such that the free energy of the overall system is minimized. Lower energy corresponds to better binding; therefore, the first six of these binding sites (Table 3) were studied for their interaction with FPP (Figure 6). On the basis of the docking studies, the following stretches may contribute to substrate binding: Ala321-Ala324-Lys398; Ala234-Val237-Phe283-Thr286-Tyr287; Ser94-Met95-Trp141-Trp430-Asn434; Ser94-Arg96-Glu104-Leu107-Lys142-Lys431; Lys137-Arg143-Ile147-Ala150-Gln151-Leu478 and Ser218-Gly219-Tyr224-Arg228-Cys352-Met356-Aln450. These results may have implications for understanding the role of amorpha-4,11-diene synthase in the cyclization of FPP.
Figure 6.
Figure showing the receptor molecule with the ligand docked in six top ranking binding sites. All the six figures in different colour correspond to six binding sites containing the ligand molecule FPP docked into it.
Conclusions
Amorpha-4,11-diene synthase in A. annua L. plants is reported to be a key regulatory enzyme, catalyzing the rate-limiting step in the biosynthesis of artemisinin. In this study, we conclude that the putative amorpha-4,11-diene synthase gene (ADS3963), cloned by us from a high artemisinin yielding strain (0.7–0.9 % dry wt basis) of A. annua L., is evolutionarily conserved, as suggested by phylogenetic analysis. It encodes a protein of 533 amino acid residues with the conserved DDxxD domain. The absence of thirteen amino acids in this protein has resulted in the formation of a new motif, SlwD, which might have a role in regulating the active state of the enzyme. This finding also indicates that the ADS3963 protein may be an isoform. These results, however, have to be corroborated with wet lab data. Further, if the structure and function of the ADS3963 protein are understood, it can serve as a model for studying other related enzymes and could be used to develop new functional approaches for overexpressing this key enzyme, leading to enhanced synthesis and accumulation of artemisinin in A. annua L. plants.
Acknowledgments
P.A. thanks Jamia Hamdard for financial support in the form of a studentship. We are highly thankful to Andrew M. Lynn, Director, School of Information Technology, Jawaharlal Nehru University, New Delhi, India, for providing docking facilities in his lab. We thank Ipca Laboratories Limited, Ratlam, India, for providing the seeds of the high-yielding strain of A. annua L.
Footnotes
Citation: Alam et al, Bioinformation 4(9): 421-429 (2010)
References
- 1. Abdin, et al. Planta Med. 2003;69:289–299. doi: 10.1055/s-2003-38871.
- 2. Akhila, et al. Phytochemistry. 1990;29:2129.
- 3. Bharel, et al. J Nat Prod. 1998;61:633. doi: 10.1021/np970024s.
- 4. Bhattacharya, et al. Mendeleev Communications. 2007;1:27.
- 5. Bouwmeester, et al. Phytochemistry. 1999;52:843. doi: 10.1016/s0031-9422(99)00206-x.
- 6. Bowman, et al. Proc Natl Acad. 1990;8:9052.
- 7. Brown GD. J Nat Prod. 1993;55:1756.
- 8. Chakrabarti, et al. Proc Natl Acad Sci USA. 2005;102:10153.
- 9. Chang, et al. Arch Biochem Biophys. 2000;383:178. doi: 10.1006/abbi.2000.2061.
- 10. Chen, et al. Nucleic Acids Res. 2006;34:W152. doi: 10.1093/nar/gkl187.
- 11. Delabays, et al. Acta Hort. 1993;330:203.
- 12. Dhingra, et al. Life Sci. 2000;66:279. doi: 10.1016/s0024-3205(99)00356-2.
- 13. Khan, et al. Afr J Biotechnol. 2007;6:175.
- 14. Kim NC, Kim SU. J Kor Agric Chem Soc Rev. 1992;35:106.
- 15. Kuntz, et al. J Mol Biol. 1982;161:269. doi: 10.1016/0022-2836(82)90153-x.
- 16. Laughlin JC. Trans Royal Soc Trop Med Hyg. 1994;88:21. doi: 10.1016/0035-9203(94)90465-0.
- 17. Liu, et al. J Integr Plant Biol. 2006;48:1486.
- 18. Martin, et al. Nat Biotechnol. 2003;21:796. doi: 10.1038/nbt833.
- 19. McGarvey DJ, Croteau R. Plant Cell. 1995;7:1015. doi: 10.1105/tpc.7.7.1015.
- 20. Mercke, et al. Arch Biochem Biophys. 2000;381:173. doi: 10.1006/abbi.2000.1962.
- 21. Posner, et al. Med Chem. 1995;38:607. doi: 10.1021/jm00004a006.
- 22. Ravindranathan, et al. Tetrahedron Lett. 1990;31:755.
- 23. Ro, et al. Nature. 2007;440:940.
- 24. Sangwan, et al. Phytochemistry. 1993;34:1301.
- 25. Starks, et al. Science. 1997;277:1815. doi: 10.1126/science.277.5333.1815.
- 26. Tamura, et al. Mol Biol Evol. 2007;24:1596. doi: 10.1093/molbev/msm092.
- 27. Tamura, et al. Proceedings of the National Academy of Sciences (USA) 2004;101:11030.
- 28. Trapp SC, Croteau RB, et al. Genetics. 2001;158:811. doi: 10.1093/genetics/158.2.811.
- 29. Van, et al. Trends Pharmacol Sci. 1999;20:199. doi: 10.1016/s0165-6147(99)01302-4.
- 30. Wallaart, et al. Planta. 2001;212:460. doi: 10.1007/s004250000428.
- 31. Willard, et al. Nucleic Acids Res. 2003;31:3316. doi: 10.1093/nar/gkg565.
- 32. http://www.expasy.ch/tools/scanprosite.
- 33. http://ps2.life.nctu.edu.tw/
- 34. http://www.expasy.ch/spdbv/
- 35. http://www.redpoll.pharmacy.ualberta.ca/vadar.
- 36. http://www.ebi.ac.uk/Tools/clustalw2/index.html.





