Bioinformation. 2013 Feb 21;9(4):197–206. doi: 10.6026/97320630009197

Computational analysis of common bean (Phaseolus vulgaris L., genotype BAT93) lycopene β-cyclase and β-carotene hydroxylase gene's cDNA

Subhash Janardhan Bhore 1,2,*, Kassim Amelia 1,2, Edina Wang 2, Sindhuja Priyadharsini 2, Farida Habib Shah 1,3
PMCID: PMC3602890  PMID: 23519320

Abstract

The identification of genes, and an understanding of their expression and regulation, in common bean (Phaseolus vulgaris L.) is necessary in order to plan its improvement using genetic engineering techniques. Generation of expressed sequence tags (ESTs) is useful for the rapid isolation, identification and characterization of genes. To study gene expression in P. vulgaris pod tissue, EST generation work was initiated. Early-stage and late-stage bean-pod-tissue cDNA libraries were constructed using the CloneMiner cDNA library construction kit. In total, 5972 EST clones were isolated using the random method of gene isolation. While processing the ESTs, we found cDNAs of the lycopene β-cyclase (PvLCY-β) and β-carotene hydroxylase (PvCHY-β) genes. In the carotenoid biosynthesis pathway, PvLCY-β catalyzes the production of carotene, and PvCHY-β is known to function as a catalyst in the production of lutein and zeaxanthin. To understand more about PvLCY-β and PvCHY-β, both strands of both cDNA clones were sequenced using M13 forward and reverse primers. Nucleotide and deduced protein sequences were analyzed and annotated using online bioinformatics tools. The results showed that the PvLCY-β and PvCHY-β cDNAs are 1639 and 1107 bp in length, respectively, and that each contains an open reading frame (ORF) encoding 502 and 305 amino acid residues, respectively. Analysis of the deduced protein sequences also showed the presence of the conserved domains needed for PvLCY-β and PvCHY-β function. Phylogenetic analysis showed that the PvLCY-β and PvCHY-β proteins are closest to the LCY-β and CHY-β proteins from Glycine max, respectively. The nucleotide sequences of the PvLCY-β and PvCHY-β cDNAs and their annotation are reported in this paper.

Keywords: Expressed sequence tags, Genetic engineering, Health, Human population, Malaysia, Natural products, Nutrition, Phaseomics, Proteins, Vegetables

Background

The exact number of malnourished people is not known, but an FAO report indicates that there are about 925 million undernourished people in the world [2]. Animal products (eggs, meat, milk, etc.) are a source of dietary protein, but poor people in particular usually obtain protein from legumes (plants of the bean and pea family) [2]. There are thousands of legume species, but common bean (Phaseolus vulgaris L.) is cultivated on a large scale. In recognition of the importance of P. vulgaris, the Phaseomics international consortium was formed to establish the necessary framework of knowledge and materials for the advancement of bean genomics, transcriptomics, and proteomics; its main goal is to help generate new common bean varieties that are suitable for, and desired by, farmers and consumers [3]. As part of the international consortium for Phaseolus genomics [3], research work on the generation of P. vulgaris expressed sequence tags (ESTs) was initiated at Melaka Institute of Biotechnology, Malaysia.

Randomly isolated anonymous cDNA clones, obtained on a large scale, are treated as ESTs and are used extensively in studies of gene expression and regulation [4]. EST data are also used to evaluate genomes for gene content and structure, in computational comparisons of gene expression between different plant tissues [5], and in the discovery of novel genes [6]. In monocot and dicot plants, various novel genes have been identified by random isolation of cDNA clones followed by nucleotide sequencing [7–11]. Hence, ESTs were generated to study gene expression and regulation in bean-pod tissue, in line with the agenda of the international consortium for Phaseolus genomics [3].

To date, we have generated 5972 ESTs, and the annotated ESTs were deposited in the EST database hosted by the National Center for Biotechnology Information (NCBI) GenBank / DDBJ / EMBL (our unpublished work). While processing and analysing the generated ESTs, we found cDNAs of the lycopene β-cyclase and β-carotene hydroxylase genes [12, 13]. Because the source of these cDNAs is P. vulgaris, the lycopene β-cyclase and β-carotene hydroxylase cDNAs were designated PvLCY-β and PvCHY-β, respectively. In the carotenoid biosynthesis pathway, PvLCY-β catalyzes the production of carotene (α-carotene and β-carotene) [12, 14], and PvCHY-β is known to function as a catalyst in the production of lutein and zeaxanthin [13].

Owing to the antioxidant properties of carotenes (β-carotene), several health benefits associated with their consumption have been reported elsewhere [15]. Similarly, the benefits of lutein and zeaxanthin consumption have been reported by many researchers, and these reports reflect the importance of these natural products (carotenes, lutein and zeaxanthin) in human health [16–22].

Both the PvLCY-β and PvCHY-β cDNA clones have potential applications in the genetic engineering of P. vulgaris and other plants, which is why both clones were fully sequenced. These two cDNA clones could be used to manipulate P. vulgaris so that its levels of carotene, lutein and zeaxanthin are elevated. Hence, to understand more about PvLCY-β and PvCHY-β, the nucleotide and deduced protein sequences of both cDNAs were analyzed and annotated in this study using computational tools. The nucleotide sequences of the PvLCY-β and PvCHY-β cDNAs and their annotation are reported in this paper.

Methodology

Plant Materials:

The seeds of P. vulgaris genotype BAT93 were kindly provided by Patricia Lariguet, Laboratoire de Biologie Moléculaire des Plantes Supérieures, Department of Plant Biology, University of Geneva, Geneva, Switzerland. Seeds were germinated in soil obtained from a nursery (Melaka, Malaysia), and the seedlings were grown in an open area at Melaka Institute of Biotechnology, Malaysia.

PvLCY-β and PvCHY-β cDNA clones isolation:

The PvLCY-β and PvCHY-β cDNA clones were identified from the ESTs generated using the random method of gene isolation [7, 8, 23]. The cDNA clone encoding PvLCY-β was isolated from the 20-day-old [days after anthesis (DAA)] bean-pod-tissue cDNA Entry Library, and the cDNA clone encoding PvCHY-β was isolated from the 5-day-old bean-pod-tissue cDNA Entry Library. The cDNA libraries were constructed (our unpublished data) using the ‘CloneMiner cDNA library construction kit’ procured from Invitrogen Corporation.

Plasmid DNA isolation:

Individual cultures of Escherichia coli strain DH5α cells harbouring recombinant plasmids with the PvLCY-β and PvCHY-β cDNA clones were grown in 10 ml LB medium supplemented with 40 µg/ml kanamycin. Cultures were incubated in the dark at 37°C and 160 rpm for 18 h. Plasmid DNA was isolated from the harvested E. coli cells using the Wizard® Plus SV Minipreps DNA purification system, a commercial kit (Promega).

Nucleotide sequencing:

Purified plasmid DNA was used in sequencing reactions. Both strands of both PvLCY-β and PvCHY-β cDNA clones were sequenced using M13 (Forward) [5'-GTAAAACGACGGCCAG-3'] and M13 (Reverse) [5'-GGATAACAATTTCACACAGG-3'] primers.

cDNA and deduced protein sequence analysis:

For both the PvLCY-β and PvCHY-β cDNA clones, the nucleotide sequences of the plus (+) and minus (−) strands were aligned using the Blast (bl2seq) program available at NCBI [http://blast.ncbi.nlm.nih.gov/]. The 5' and 3' ends of the cDNA sequences were edited to eliminate adaptor and vector sequences. The finalized cDNA sequences were analyzed using online bioinformatics tools.
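The strand comparison described above can be illustrated with a short Python sketch. It is not the authors' procedure (they used bl2seq); it only shows, under simplifying assumptions, how a minus-strand read is reverse-complemented before it is compared with the plus-strand read. The function names and the toy read are hypothetical; the primer string is the M13 forward primer quoted in the sequencing section.

```python
# Minimal sketch: orient a minus-strand read and compare it with the plus-strand read.
# Assumption: both reads cover exactly the same region and contain no gaps, so a
# position-by-position comparison stands in for a real pairwise alignment.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(plus_read: str, minus_read: str) -> list[int]:
    """List 0-based positions where the reads disagree once the minus read is oriented."""
    oriented = reverse_complement(minus_read).upper()
    plus_read = plus_read.upper()
    if len(oriented) != len(plus_read):
        raise ValueError("reads must cover the same region in this simplified sketch")
    return [i for i, (a, b) in enumerate(zip(plus_read, oriented)) if a != b]


m13_forward = "GTAAAACGACGGCCAG"          # primer sequence quoted in the text
print(reverse_complement(m13_forward))

toy_plus = "ATGGATACTCTG"                 # hypothetical read, not real data
toy_minus = reverse_complement(toy_plus)  # an error-free minus-strand read
print(mismatches(toy_plus, toy_minus))    # no disagreements expected
```

Positions returned by `mismatches` would be the ones to inspect in the sequencing traces before finalizing a sequence.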

Similarity searches were performed using the blast programs (BlastN and BlastP) available at NCBI. Online bioinformatics tools available at JustBio [http://www.justbio.com/] were used to deduce the protein sequences and to determine the general features of the PvLCY-β and PvCHY-β cDNA and deduced protein sequences. EMBOSS Water - Pairwise Sequence Alignment [http://www.ebi.ac.uk/Tools/emboss/align/] was used to compare the cDNA and deduced protein sequences with their counterparts from other species and to determine their percentage similarity. Guanine and cytosine (GC %) content was calculated using the ‘DNA/RNA base composition calculator’. Multiple protein (amino acid) sequences were aligned using the ClustalW program, and the phylograms were constructed using the BioEdit and TreeView programs [24, 25]. Protein sequences were aligned using the CLUSTAL 2.1 multiple sequence alignment program to find the conserved residues in both the PvLCY-β and PvCHY-β deduced proteins.
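Two of the steps above, GC content calculation and deduction of the protein sequence from the cDNA, are simple enough to sketch in a few lines of Python. This is an illustration only, not the JustBio or base composition tools the authors used. It assumes the standard genetic code, scans only the three forward reading frames, and treats an ORF as running from an ATG to the next in-frame stop codon. The function names and the toy sequence are hypothetical.

```python
# Minimal sketch: GC content and the longest forward-strand ORF of a cDNA sequence.
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf(seq: str) -> str:
    """Protein encoded by the longest ATG..stop ORF in the three forward frames."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        protein = None  # None means no ORF is currently open in this frame
        for i in range(frame, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks ambiguous codons
            if protein is None:
                if aa == "M":
                    protein = "M"          # open an ORF at the start codon
            elif aa == "*":
                if len(protein) > len(best):
                    best = protein         # keep the longest closed ORF
                protein = None
            else:
                protein += aa
    return best


toy_cdna = "ccATGGCTTAAgg"  # hypothetical sequence, not real data
print(gc_percent(toy_cdna))
print(longest_orf(toy_cdna))
```

An ORF that has no in-frame stop codon before the end of the sequence is discarded in this sketch, which is one reason a full-length cDNA is needed to deduce the complete protein.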

Discussion

PvLCY-β and PvCHY-β cDNA clones isolation:

The full-length PvLCY-β and PvCHY-β cDNA clones were isolated from the 20-day-old and 5-day-old bean-pod-tissue cDNA libraries, respectively. The isolated clones were designated PvLCY-β and PvCHY-β to indicate their identity and the plant species from which they were obtained.

Nucleotide sequencing:

Both the sense (+) and antisense (−) strands of both cDNA clones were sequenced using M13 forward and M13 reverse primers. After elimination of the vector and adaptor sequences, the sense and antisense strand sequences of each cDNA were compared using the blast (bl2seq) program. Analysis of the results showed that the PvLCY-β and PvCHY-β cDNAs are 1639 and 1107 bp in length, respectively.
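As a quick consistency check, the cDNA lengths above can be set against the ORF sizes given in the abstract and figure captions (502 and 305 residues). The short Python sketch below works through the arithmetic. The coding and non-coding lengths it prints are implied by the reported figures and are not values stated by the authors; the function names are hypothetical.

```python
# Worked arithmetic: how much of each cDNA is taken up by the ORF.
# cDNA lengths and residue counts are the values reported in the paper.

CDNA = {
    "PvLCY-β": {"cdna_bp": 1639, "residues": 502},
    "PvCHY-β": {"cdna_bp": 1107, "residues": 305},
}


def orf_length_bp(residues: int) -> int:
    """Coding length in bp: three bases per residue plus one stop codon."""
    return 3 * (residues + 1)


def noncoding_bp(cdna_bp: int, residues: int) -> int:
    """Bases left over for the 5' and 3' non-coding regions combined."""
    return cdna_bp - orf_length_bp(residues)


for name, rec in CDNA.items():
    orf = orf_length_bp(rec["residues"])
    rest = noncoding_bp(rec["cdna_bp"], rec["residues"])
    print(f"{name}: ORF {orf} bp, non-coding {rest} bp")
```

In both cases the ORF fits within the reported cDNA length, so the two sets of figures are mutually consistent.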

cDNA and Deduced Protein Sequence Analysis:

The identity of both cDNA clones was confirmed by analyzing each finalized cDNA sequence and its deduced amino acid sequence. The annotated nucleotide sequences of the PvLCY-β and PvCHY-β cDNAs were deposited in GenBank/DDBJ/EMBL under the accession numbers HQ199604 and JN255133, respectively. The annotated general features of the cDNA nucleotide and protein sequences are summarized in Table 1 (see supplementary material), and the nucleotide sequences of the PvLCY-β and PvCHY-β cDNAs, along with their deduced amino acid sequences, are shown in Figure 1 & Figure 2, respectively.

Figure 1.


Nucleotide and deduced amino acid sequences of the Phaseolus vulgaris lycopene β-cyclase (PvLCY-β) cDNA clone. The open reading frame (ORF) and the 3' non-coding region of the cDNA are shown in capital and small letters, respectively. The deduced amino acid sequence is given below the nucleotide sequence and is numbered at both ends of each sequence line. The ORF encodes a protein of 502 amino acid residues (blue). Amino acid residues are numbered from the initial methionine (M) to the last glutamic acid (E) residue. Initiation and termination codons are shown in green and red, respectively. * represents the termination codon. This cDNA clone was isolated from the P. vulgaris 20-day-old pod tissue cDNA library.

Figure 2.


Nucleotide and deduced amino acid sequences of the Phaseolus vulgaris β-carotene hydroxylase (PvCHY-β) cDNA clone. The open reading frame (ORF) and the non-coding regions of the cDNA are shown in capital and small letters, respectively. The deduced amino acid sequence is given below the nucleotide sequence and is numbered at both ends of each sequence line. The ORF encodes a protein of 305 amino acid residues (blue). Amino acid residues are numbered from the initial methionine (M) to the last serine (S) residue. Initiation and termination codons are shown in green and red, respectively. * represents the termination codon. This cDNA clone was isolated from the P. vulgaris 5-day-old pod tissue cDNA library.

The percentage similarity of the PvLCY-β and PvCHY-β cDNA nucleotide and deduced protein sequences with their counterparts from other species is shown in Table 2 & Table 3 (see supplementary material), respectively. The amino acid sequence analysis showed that both the PvLCY-β and PvCHY-β proteins are leucine (L) rich (Supplementary Figure 1 & Figure 2). Comparison of the PvLCY-β protein with its counterparts from other species showed that 217 of its 502 residues (43.23%) are fully conserved. In the case of the PvCHY-β protein, however, only 67 of 305 residues (21.97%) are fully conserved. A subsequent search for conserved domains in the PvLCY-β and PvCHY-β protein sequences detected their conserved domains, and the results are summarised in Table 4 (see supplementary material). Phylograms were constructed in order to understand the phylogenetic relationship of the PvLCY-β and PvCHY-β proteins with their counterparts from other species. The phylograms for the PvLCY-β and PvCHY-β proteins are shown in Figure 3 & Figure 4, respectively.
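The conserved-residue percentages above follow from counting alignment columns in which every sequence carries the same residue, which is what the '*' line of a ClustalW alignment marks. The Python sketch below shows that count on a toy alignment and reproduces the two percentages from the reported residue counts. The alignment rows and function names are hypothetical; only the counts 217/502 and 67/305 come from the text.

```python
# Minimal sketch: count fully conserved columns in a multiple sequence alignment.
# Assumption: rows are already aligned to equal length, with '-' marking gaps.


def fully_conserved(alignment: list[str]) -> int:
    """Number of columns where all sequences share one residue and none has a gap."""
    if not alignment or len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and equally long")
    count = 0
    for column in zip(*alignment):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            count += 1
    return count


def percent_conserved(conserved: int, reference_length: int) -> float:
    """Conserved residues as a percentage of the reference protein length."""
    return round(100.0 * conserved / reference_length, 2)


toy = ["MKL-VA", "MKLTVA", "MRL-VA"]  # hypothetical alignment, not real data
print(fully_conserved(toy))

# Percentages recomputed from the counts reported in the text.
print(percent_conserved(217, 502), percent_conserved(67, 305))
```

Because one divergent sequence is enough to break a column, the count depends strongly on which species are included in the alignment.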

Figure 3.


Phylogram showing the phylogenetic relationship of the common bean (Phaseolus vulgaris L.) lycopene β-cyclase (PvLCY-β) protein with its counterparts from other species. The 35 available full-length LCY-β protein sequences were retrieved from the NCBI database (see supplementary Table 2). The location of the PvLCY-β protein in the phylogram is shown in a pink box.

Figure 4.


Phylogram showing the phylogenetic relationship of the common bean (Phaseolus vulgaris L.) β-carotene hydroxylase (PvCHY-β) protein with its counterparts from other species. The 16 available full-length CHY-β protein sequences were retrieved from the NCBI database (see supplementary Table 3). The location of the PvCHY-β protein in the phylogram is shown in a pink box.

An understanding of the identified genes, their expression patterns and their regulation is crucial for planning how to manipulate any biosynthesis pathway of interest in plants. To suppress the expression of a gene, a partial sequence of that gene can be used to induce posttranscriptional gene silencing (PTGS) [26–28]. However, the full-length gene or its cDNA is required for over-expression in order to increase the production of either desired vital proteins or natural products [29]. Therefore, an understanding of the gene of interest and its cDNA is a prerequisite before it can be used in recombinant DNA (rDNA) technology to genetically manipulate any plant or organism of interest [30].

The main goal of this study was to annotate the PvLCY-β and PvCHY-β cDNAs and their respective deduced proteins (amino acid sequences). The PvLCY-β cDNA clone was identified in the 20-day-old pod tissue cDNA library, which indicates that PvLCY-β is expressed in the bean's 20-day-old developing pod tissue. The PvCHY-β cDNA clone was identified in the 5-day-old pod tissue cDNA library, which indicates that PvCHY-β is expressed in the bean's 5-day-old developing pod tissue. However, the expression level, expression pattern, and tissue specificity of both genes are not clear at this moment, as we have not characterised the expression of these two genes. This can be done using either the Northern hybridization technique or the microarray technique [31].

The LCY-β protein of G. max showed the maximum similarity (95.2%) with the inferred PvLCY-β protein, whereas the LCY-β proteins from Cryptomeria japonica and Taxodium distichum showed lower similarity (83.8%). Both Cryptomeria japonica and Taxodium distichum are gymnosperms; the relatively low similarity of PvLCY-β to LCY-β from gymnosperms is in line with the evolution of plant species [33]. Interestingly, the LCY-β proteins of Salicornia europaea and Crocus sativus showed the lowest similarity with PvLCY-β. For PvCHY-β, the Glycine max CHY-β protein showed the maximum similarity (78%) with the inferred PvCHY-β protein. In contrast, the Muriella zofingiensis CHY-β protein showed lower similarity (58%). Muriella zofingiensis is an alga, and the relatively low similarity between PvCHY-β and CHY-β from M. zofingiensis is likewise in line with plant evolution. These results are similar to those reported by Bhore et al. [23]. Both the PvLCY-β and PvCHY-β proteins showed the highest similarity with the LCY-β and CHY-β of G. max, respectively. This is expected because both P. vulgaris and G. max belong to the same family, Fabaceae [34].
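The percentages compared above were obtained by local pairwise alignment with EMBOSS Water, which implements the Smith-Waterman algorithm. The Python sketch below illustrates that algorithm in its simplest form. It is not a reimplementation of EMBOSS Water: the match, mismatch and gap scores are hypothetical, the gap penalty is linear, and the function reports percent identity over the aligned region rather than the matrix-based similarity that Water reports, so its numbers would differ from those in the paper.

```python
# Minimal sketch of Smith-Waterman local alignment with percent identity.
# Assumptions: simple match/mismatch scores and a linear gap penalty
# (all hypothetical values); no substitution matrix is used.


def local_identity(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> float:
    """Percent identity over the best-scoring local alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    identical = aligned = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1   # gap in b
        else:
            j -= 1   # gap in a
        aligned += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Hypothetical peptide fragments, not real LCY-β or CHY-β sequences.
print(local_identity("MKLVA", "MKIVA"))
```

For real protein comparisons a substitution matrix and affine gap penalties, as used by EMBOSS Water, give more meaningful scores than the fixed values assumed here.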

The PvCHY-β protein contains the conserved domain for beta-carotene hydroxylase, which is a member of the fatty acid hydroxylase superfamily (Accession No: cl01132) [35]. In PvLCY-β, two main conserved domains were detected, namely the NADB_Rossmann superfamily domain and the lycopene beta cyclase domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis. The lycopene cyclase family protein conserved domain was detected in PvLCY-β; lycopene beta and epsilon cyclases are part of this protein family [36–38].

Phaseolus vulgaris is a valuable source of protein in the human diet, and it is important to increase the yield of this essential crop [3, 39]. Several research teams are using GM technology to improve the yield of beans [30, 40]. Examples include the development of P. vulgaris resistant to herbicide [41] and to viral infection [42]. In addition, there is vast scope to modify beans genetically to improve the nutritional quality of their pods and seeds. This type of genetic manipulation is feasible: rice (Oryza sativa) has been genetically engineered to increase its β-carotene content for use as a source of vitamin A [43]. Similarly, the β-carotene content of beans could be increased by over-expression of PvLCY-β in the carotenoid biosynthesis pathway (Supplementary Figure 3). Furthermore, the content of therapeutically beneficial lutein and zeaxanthin in beans could also be increased by over-expression of PvCHY-β [44].

Genetic modification of agricultural crop plants to improve yield and nutritional quality is a viable option, and it is important as far as human wellbeing is concerned [30, 40]. Both isolated cDNAs, PvLCY-β and PvCHY-β, are reasonably well annotated in this study, and we believe that the annotated cDNA sequences could be useful in designing a strategy for the construction of transformation vectors. Further research along these lines is needed to achieve the ultimate goal of generating new common bean varieties that are suitable for, and desired by, farmers and consumers.

Conclusion

This study has annotated the salient features of the PvLCY-β and PvCHY-β cDNA clones. The computational analysis of the deduced PvLCY-β and PvCHY-β proteins revealed the presence of conserved domains. Furthermore, the comparative analysis of the deduced PvLCY-β and PvCHY-β protein sequences with their counterparts from other species revealed the fully conserved amino acid residues. However, further study is required to understand the expression and regulation of the PvLCY-β and PvCHY-β genes in bean pods. Over-expression of both genes in bean pods can be considered for further research to explore the possibility of improving the nutritional quality of bean pods and bean seeds.

Conflict of Interests

Authors attest that there are no conflicts of interest to declare.

Supplementary material

Data 1
97320630009197S1.pdf (369.8KB, pdf)

Acknowledgments

Authors are grateful to the Ministry of Science, Technology and Innovation (MOSTI), Malaysia for research funding [Research Grant Code Number: BSP (M) / BTK / 004 (3)]; and to Patricia Lariguet, Department of Plant Biology, University of Geneva, Geneva, Switzerland for supplying seeds of bean, genotype BAT93.

Footnotes

Citation: Bhore et al, Bioinformation 9(4): 197-206 (2013)

References

Articles from Bioinformation are provided here courtesy of Biomedical Informatics Publishing Group
