Pharmacognosy Magazine. 2012 Jan-Mar;8(29):4–11. doi: 10.4103/0973-1296.93301

Application of deoxyribonucleic acid barcoding in Lauraceae plants

Zhen Liu 1,2, Shi-Lin Chen 3, Jing-Yuan Song 3, Shou-Jun Zhang 4, Ke-Li Chen 2
PMCID: PMC3307201  PMID: 22438656

Abstract

Background:

This study aims to determine the candidate markers that can be used as DNA barcodes in the Lauraceae family.

Material and Methods:

Four candidate DNA regions (psbA-trnH, matK, rbcL, and ITS2) were evaluated for polymerase chain reaction (PCR) amplification and sequencing efficiency, intra- and interspecific divergence, DNA barcoding gap, and identification efficiency. We tested the discrimination ability of psbA-trnH in 68 plant samples belonging to 42 species from 11 genera and found that its rate of successful identification was 82.4% at the species level. In contrast, the correct identification rates of matK and rbcL were only 30.9% and 25.0%, respectively, using BLAST1. The PCR amplification efficiency of the ITS2 region was poor; thus, ITS2 was not included in subsequent experiments. To verify the identification capacity of psbA-trnH in more samples, 175 samples belonging to 117 species of Lauraceae, drawn from the experimental data and from the GenBank database, were tested.

Results:

Using the BLAST1 method, the identification efficiency of psbA-trnH in the 175 samples was 84.0% at the species level and 92.3% at the genus level.

Conclusion:

Therefore, psbA-trnH is confirmed as a useful marker for differentiating closely related species within Lauraceae.

Keywords: Deoxyribonucleic acid barcoding, ITS2, Lauraceae, matK, psbA-trnH, rbcL

INTRODUCTION

Lauraceae is a large family of woody plants (except for the herbaceous parasite Cassytha) with about 50 genera and 2500 to 3000 species distributed throughout tropical and subtropical latitudes. Lauraceae plants are of great economic value: many are important sources of construction timber, spices, essential oils, and medicines. Because of their spacious crowns, they also have considerable ecological value for greening and environmental protection. Diverse and widely distributed, Lauraceae plants have an ancient origin, with a fossil record dating back to the mid-Cretaceous period.[1] However, the evolution of these plants has been very slow. Because the boundaries of many species in the family are unclear, the species are difficult to identify by traditional morphological methods. Thus, it is important to develop a quick, simple, and effective method to identify species in the Lauraceae family.

Deoxyribonucleic acid (DNA) barcoding has become a focus of biodiversity research worldwide in recent years. The core of this research is to choose a universal barcode with which species can be identified quickly and accurately. In 2003, Hebert et al. analyzed sequences of the cytochrome c oxidase subunit 1 (CO1) gene from 13320 species belonging to 11 phyla.[2] Since then, most researchers have agreed that the mitochondrial gene encoding CO1 is a favorable region for use as the standard DNA barcode for animals. Compared with the progress made on animal barcodes, research on plant barcodes has been relatively slow.

The Plant Working Group of the Consortium for the Barcode of Life recommended the two-locus combination of rbcL + matK for plant barcoding.[3] Chen et al. tested the discrimination ability of ITS2 in more than 6600 plant samples belonging to 4800 species from 753 distinct genera and found that the ITS2 region possesses many advantages over plastid loci, including the rbcL and matK regions. They also recommended psbA-trnH as a complementary barcode to ITS2 for a broad range of plant taxa.[4]

Although some scholars have carried out DNA barcoding research on various plant families and genera,[5–9] none has examined multiple samples across the Lauraceae family. In this study, four potential DNA regions (psbA-trnH, matK, rbcL, and ITS2) were tested for their suitability as DNA barcodes for the Lauraceae family (68 samples belonging to 42 species from 11 genera). The ability of the candidate sequences to identify Lauraceae species as a universal DNA barcode was assessed even though the samples include many closely related species.

MATERIALS AND METHODS

Experimental materials (68 samples belonging to 42 species from 11 genera) were collected from the Chinese provinces of Hubei, Jiangxi, Guangdong, and Guangxi. The materials were authenticated by Prof. Panhong Lin of Hubei College of Traditional Chinese Medicine and Engr. Zhang Shoujun of Wuhan Botanical Garden at the Chinese Academy of Sciences. All specimen and image vouchers are maintained at the herbarium of Hubei College of Traditional Chinese Medicine. To further increase the number of species represented, psbA-trnH sequences from the taxonomy database of the National Center for Biotechnology Information (NCBI) were included in the reference database.

Leaf tissues were first dried in silica gel. A total of 10 mg of each dried tissue was ground for 1 min at a frequency of 30 times/second in a FastPrep bead mill (Retsch MM400, Germany). Total DNA was extracted following the instructions of the Plant Genomic DNA Kit (Tiangen Biotech Co., China). The polymerase chain reaction (PCR) mixture consisted of 1 μL (~30 ng) of DNA, 2 μL of 25 mM MgCl2, 2.5 μL of 10× PCR buffer, 1.0 U of Taq DNA polymerase, 2 μL of 2.5 mM dNTP mix (Biocolor BioScience and Technology Co., China), and 1.0 μL of 2.5 μM primers (synthesized by Sangon Co., China) in a final volume of 25 μL. Sequences of the universal primers for the tested DNA barcodes (psbA-trnH, matK, rbcL, and ITS2), as well as the general PCR conditions, were obtained from a previous study.[4] PCR products were purified using the Gel Band Purification Kit (Tiangen Biotech Co., China) and sequenced on an ABI 3730XL sequencer (Applied Biosystems, USA). The sequences were submitted to GenBank.
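As a worked check of the reaction mixture described above, the final concentration of each component follows from the dilution rule C1·V1 = C2·V2. The sketch below is illustrative only: the function name is hypothetical, it assumes that the stated 1.0 μL of 2.5 μM primers applies to each primer, and it ignores any MgCl2 that the 10× buffer may already contain, since neither point is stated in the text.

```python
def final_concentration(stock_conc, stock_volume_ul, final_volume_ul=25.0):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_volume_ul / final_volume_ul

# Stock concentrations and volumes as stated in the text; final volume 25 uL.
mgcl2_mM = final_concentration(25.0, 2.0)   # 25 mM MgCl2, 2 uL
dntp_mM = final_concentration(2.5, 2.0)     # 2.5 mM dNTP mix, 2 uL
primer_uM = final_concentration(2.5, 1.0)   # 2.5 uM primer, 1 uL (assumed per primer)
buffer_x = final_concentration(10.0, 2.5)   # 10x PCR buffer, 2.5 uL

print(f"MgCl2:  {mgcl2_mM:.2f} mM")
print(f"dNTPs:  {dntp_mM:.2f} mM")
print(f"Primer: {primer_uM:.2f} uM")
print(f"Buffer: {buffer_x:.1f}x")
```

Under these assumptions the mixture works out to 2 mM MgCl2, 0.2 mM dNTPs, 0.1 μM primer, and 1× buffer.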

Sequence editing and contig assembly were conducted using CodonCode Aligner (CodonCode Co., Germany). Sequences were aligned using CLUSTALW and analyzed with the MEGA 4.0 software program. Average interspecific distances, theta prime, and smallest interspecific distances were used to characterize interspecific divergences.[4,10,11] Average intraspecific distances, theta, and coalescent depth were calculated to determine intraspecific variations using Kimura 2-parameter (K2P) distances.[10] Wilcoxon signed rank tests were performed as described previously.[12,13] The barcoding gap was calculated with TAXON DNA.[14] To estimate the reliability of species identification using the DNA barcoding technique, two methods (BLAST1 and the nearest genetic distance) were applied.[15]
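For readers unfamiliar with the K2P model, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P is the proportion of sites differing by a transition and Q the proportion differing by a transversion. The following is a minimal sketch of that formula, not the MEGA 4.0 implementation used in the study; the function name is hypothetical, and the handling of gaps and ambiguous bases (pairwise deletion) is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (pairwise deletion, an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Toy example: one transition (A<->G) in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

In the toy example, P = 0.1 and Q = 0, so the distance is -(1/2) ln(0.8), slightly larger than the uncorrected proportion of differing sites (0.1).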

RESULTS

PCR amplification and sequencing efficiency

Results showed that the psbA-trnH, matK, and rbcL sequences were amplified and sequenced with a 100% success rate. However, in our pilot study, the PCR amplification efficiency of the ITS2 region was poor; thus, ITS2 was not included in subsequent experiments [Table 1].

Table 1.

Efficiency of polymerase chain reaction amplification and success rate of sequencing of potential barcodes in total number of samples

Analysis of intraspecific variations and interspecific divergences

A favorable barcode should have low intraspecific variation and high interspecific divergence in order to distinguish different species. First, comparison of interspecific genetic distances among congeneric species for the three candidate barcodes showed that the chloroplast noncoding region psbA-trnH exhibited the highest interspecific divergence for all three metrics, followed by rbcL, while matK provided the lowest divergence [Table 2]. Moreover, Wilcoxon signed rank tests confirmed that psbA-trnH provided the highest interspecific divergence among congeneric species [Table 3].
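To illustrate the kind of comparison a Wilcoxon signed rank test makes, the sketch below computes the positive and negative rank sums (W+ and W-) for paired distances, for example the divergence of the same species pair measured at two loci. The pairing scheme, the function name, and the numbers are assumptions for illustration; they are not the study's data, and no P value is computed here.

```python
def signed_rank_sums(x, y):
    """Return (W_plus, W_minus, n) for paired observations x and y.

    Zero differences are dropped; tied absolute differences receive
    the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)

# Hypothetical paired distances for five species pairs at two loci.
locus_a = [0.030, 0.012, 0.045, 0.020, 0.008]
locus_b = [0.010, 0.015, 0.020, 0.020, 0.002]
print(signed_rank_sums(locus_a, locus_b))
```

A W+ much larger than W- indicates that the first locus tends to give the larger distance; a statistics package would be needed to turn the rank sums into a significance level.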

Table 2.

Analysis of interspecific divergence between congeneric species and intraspecific variation of candidate barcodes

Table 3.

Wilcoxon signed rank test for interspecific divergences

Second, matK showed the lowest level of intraspecific variation for all three parameters, followed by rbcL, while psbA-trnH showed the highest variation [Table 2]. Wilcoxon signed rank tests showed that rbcL and matK had the lowest variation between conspecific individuals, whereas psbA-trnH had the highest [Table 4].

Table 4.

Wilcoxon signed rank test for intraspecific variations

Assessment of the barcoding gap

Ideally, a barcode shows separate, non-overlapping distributions of intra- and interspecific variation.[10,16] The results of the present study showed that psbA-trnH had a slight gap, whereas matK and rbcL exhibited substantial overlap without any gap [Figures 1 and 2].
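The idea of a barcoding gap can be sketched numerically: pairwise distances are split into intraspecific and interspecific sets, and a gap exists when the smallest interspecific distance exceeds the largest intraspecific distance. This is a simplified, single-number illustration rather than the distribution-based analysis of the study; the function names, species labels, and distance values are hypothetical.

```python
from itertools import combinations

def split_distances(labels, dist):
    """Split pairwise distances into intraspecific and interspecific lists."""
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            intra.append(dist[i][j])
        else:
            inter.append(dist[i][j])
    return intra, inter

def barcoding_gap(intra, inter):
    """Smallest interspecific minus largest intraspecific distance.

    A positive value means the two distributions do not overlap.
    """
    return min(inter) - max(intra)

# Hypothetical symmetric distance matrix for four samples of two species.
labels = ["sp_A", "sp_A", "sp_B", "sp_B"]
dist = [
    [0.000, 0.002, 0.030, 0.028],
    [0.002, 0.000, 0.031, 0.029],
    [0.030, 0.031, 0.000, 0.004],
    [0.028, 0.029, 0.004, 0.000],
]
intra, inter = split_distances(labels, dist)
print(f"gap = {barcoding_gap(intra, inter):.3f}")
```

With real data the two sets usually overlap for at least some species pairs, which is the situation described above for matK and rbcL.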

Figure 1.

Schematic representation of the deoxyribonucleic acid barcoding gap between interspecific and intraspecific divergences for three candidate DNA barcodes. (a) matK; (b) rbcL; and (c) psbA-trnH

Figure 2.

The interspecific divergence of the psbA-trnH region in Lauraceae

Evaluation of identifying ability of barcodes

With the BLAST1 method, psbA-trnH correctly identified 82.4% of the samples at the species level and 88.1% at the genus level. In contrast, the correct identification rates of matK and rbcL were much lower at the species level with both the BLAST1 and nearest genetic distance methods. At the species level, the correct identification rates of the two-locus combinations rbcL + matK, matK + psbA-trnH, and rbcL + psbA-trnH were 38.2%, 82.4%, and 82.4%, respectively, using BLAST1 [Table 5]. To verify the identification capacity of psbA-trnH in more samples, 175 samples belonging to 117 species of Lauraceae, drawn from the experimental data and from the GenBank database, were tested [Tables S1 and S2]. Using the BLAST1 method, the identification efficiency was 84.0% at the species level and 92.3% at the genus level.
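The nearest genetic distance approach can be sketched as follows: each sample is treated in turn as a query, the reference sample with the smallest distance is found, and the identification is scored as correct, ambiguous, or incorrect. The leave-one-out scheme, the tie-handling rule, the function names, and the data are all assumptions made for illustration; they are not taken from the study or from the cited method.

```python
def identify_by_nearest_distance(labels, dist):
    """Score each sample against all others by smallest pairwise distance."""
    outcomes = []
    for q in range(len(labels)):
        others = [j for j in range(len(labels)) if j != q]
        best = min(dist[q][j] for j in others)
        nearest_species = {labels[j] for j in others if dist[q][j] == best}
        if nearest_species == {labels[q]}:
            outcomes.append("correct")      # all nearest matches are conspecific
        elif labels[q] in nearest_species:
            outcomes.append("ambiguous")    # tie between own and another species
        else:
            outcomes.append("incorrect")    # nearest match is another species
    return outcomes

def success_rate(outcomes):
    """Percentage of samples scored as correct."""
    return 100.0 * outcomes.count("correct") / len(outcomes)

# Hypothetical symmetric distance matrix for five samples of three species.
labels = ["sp_A", "sp_A", "sp_B", "sp_B", "sp_C"]
dist = [
    [0.000, 0.002, 0.030, 0.028, 0.010],
    [0.002, 0.000, 0.031, 0.029, 0.012],
    [0.030, 0.031, 0.000, 0.004, 0.025],
    [0.028, 0.029, 0.004, 0.000, 0.004],
    [0.010, 0.012, 0.025, 0.004, 0.000],
]
outcomes = identify_by_nearest_distance(labels, dist)
print(outcomes, f"{success_rate(outcomes):.1f}%")
```

In this toy data set, a species represented by a single sample (sp_C) can never be scored as correct, which shows why unreplicated species lower the apparent identification rate.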

Table 5.

Comparison of identification efficiency for potential deoxyribonucleic acid barcode loci using different methods of species identification

Table S1.

Samples for testing potential barcodes and accession numbers in GenBank

Table S2.

Samples for determining the ability of the psbA-trnH barcode to identify species and accession numbers in GenBank

DISCUSSION

This work, which focused on the four popular candidate sequences matK, rbcL, psbA-trnH, and nrDNA ITS2, was a comparative study of 68 samples representing 42 species in 11 genera of Lauraceae. In the experiments, matK, rbcL, rbcL + matK, and ITS2 were found to be unsuitable as barcodes for the Lauraceae family. The psbA-trnH region is short, easy to sequence, and has a powerful ability to identify species of Lauraceae. Compared with matK, rbcL, and ITS2, the psbA-trnH region was the best marker for the identification of Lauraceae species.

Selection of the DNA barcode for the Lauraceae family

In the present research, psbA-trnH showed excellent results as a barcode sequence. First, the psbA-trnH region is short (195–423 base pairs) and can therefore be easily amplified and sequenced. The success rate of PCR amplification and sequencing for psbA-trnH in the 68 samples from 11 genera of Lauraceae was 100%. Second, the determination of genetic divergences using six metrics and statistical tests confirmed that the psbA-trnH region possesses sufficiently high interspecific variation. There were significant differences between interspecific and intraspecific variations. Third, according to BLAST1, the identification efficiency of the psbA-trnH region was 84.0% at the species level for the 175 samples from 117 species in 35 genera of Lauraceae. Moreover, the two-locus combinations matK + psbA-trnH and rbcL + psbA-trnH did not improve identification. The psbA-trnH region identified all the species that were identified by matK, rbcL, and the two-locus combination of rbcL + matK.

The rbcL sequence has the advantages of universality and easy amplification and alignment. However, variation in the rbcL region exists mainly above the species level, and the variation at the species level is insufficient to discriminate between species.[12,13,17,18] The matK segment evolves faster than other coding regions, but Rohwer[19] reported that the matK sequence has a low evolutionary rate in Lauraceae (ie, only 9.7% of sites are informative). In this study, the two loci could be easily amplified and sequenced, but they were also found to be too conserved for Lauraceae plants; their interspecific divergences were very low. Although matK and rbcL provided good PCR efficiency (both at 100%) and satisfactory sequencing efficiency (both at 100%), the successful identification rates of matK and rbcL were 30.9% and 25.0%, respectively, according to BLAST1. The success rate was only 38.2% at the species level when the two loci were combined.

Many researchers have proposed ITS2 as a suitable marker for phylogenetic reconstruction and taxonomic classification.[4,20,21] In our study, the success rate of PCR amplification with ITS2 was poor in Lauraceae; thus, ITS2 was not included in subsequent experiments. We strictly followed the standard operating procedure for PCR, and the experiment was repeated three times. The success rates for ITS2 sequences were 32.35%, 32.35%, and 30.88%, respectively. We then compared the success rate of PCR amplification between Lauraceae and Caprifoliaceae using the same ITS2 primers and PCR conditions. ITS2 sequences were relatively easy to amplify in Caprifoliaceae, whereas the success rate of PCR amplification in Lauraceae was much lower. Furthermore, in our experiments, ITS2 showed unsatisfactory PCR efficiency (32.35%) and poor sequencing efficiency (27.27%) because homologous sequences existed. Our work shows that direct PCR amplification and sequencing of ITS2 has a high success rate in some taxonomic groups but a low success rate in others. Because the ITS2 region had a low success rate in direct PCR amplification and sequencing in Lauraceae species, it is unsuitable as a DNA barcode for Lauraceae.

Discussion on samples with unsuccessful identification

In our study, the psbA-trnH sequence was chosen as the DNA barcode for identifying species of the Lauraceae family. Among the 175 samples tested, 28 could not be identified. At present, there is no consensus on the taxonomy of Lauraceae, and the relationships among the species of the family are still poorly understood.[22] The present study found that ambiguous identification occurred mainly in five genera (Persea, Ocotea, Litsea, Machilus, and Cinnamomum) that have always been a source of taxonomic dispute. It was difficult to distinguish species within the same genus because they show few morphological differences. The relationships among species of these genera are complex and the boundaries between groups are vague, which could result in improper classification.[23–27] The species that could not be identified by matK, rbcL, or the two-locus combination of rbcL + matK also could not be identified by psbA-trnH in this study. Whole chloroplast genome sequencing may be a possible method for identifying species of these genera.
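The figure of 28 unidentified samples is consistent with the 84.0% species-level identification efficiency reported for the 175 samples, as the short arithmetic check below shows. The variable names are illustrative only.

```python
total_samples = 175   # samples tested with psbA-trnH
unidentified = 28     # samples that could not be identified

identified = total_samples - unidentified
rate = 100.0 * identified / total_samples

print(f"{identified} of {total_samples} identified = {rate:.1f}%")
```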

The present research explored a new application of DNA barcode technology and provided new approaches and evidence for the classification and phylogenetic evolution of Lauraceae plants. However, some flaws remain because of sampling constraints, the lack of replicate individuals for some species, and the absence from the analysis of some highly related species (ie, sister species). Hopefully, as more materials become available and the study progresses, DNA barcode technology can provide more effective information and a more reliable method for the identification of Lauraceae plants.

Footnotes

Source of Support: Nil

Conflict of Interest: None declared.

REFERENCES

1. Drinnan AN, Crane PR, Friis EM, Pedersen KR. Lauraceous Flowers from the Potomac Group (Mid-Cretaceous) of Eastern North America. Bot Gaz. 1990;151:370–84.
2. Hebert PD, Ratnasingham S, de Waard JR. Barcoding animal life: Cytochrome c oxidase subunit 1 divergences among closely related species. Proc Biol Sci. 2003;270(Suppl 1):S96–9. doi: 10.1098/rsbl.2003.0025.
3. CBOL Plant Working Group. A DNA barcode for land plants. Proc Natl Acad Sci U S A. 2009;106:12794–7. doi: 10.1073/pnas.0905845106.
4. Chen SL, Yao H, Han JP, Liu C, Song JY, Shi LC, et al. Validation of the ITS2 region as a novel DNA barcode for identifying medicinal plant species. PLoS One. 2010;5:e8613. doi: 10.1371/journal.pone.0008613.
5. Gao T, Chen SL. Authentication of the medicinal plants in Fabaceae by DNA barcoding technique. Planta Med. 2009;75:417.
6. Pang XH, Chen SL. Using DNA barcodes to identify Rosaceae. Planta Med. 2009;75:417.
7. Song JY, Yao H, Li Y, Li XW, Liu C, Han JP, et al. Authentication of the family Polygonaceae in Chinese pharmacopoeia by DNA barcoding technique. J Ethnopharmacol. 2009;124:434–9. doi: 10.1016/j.jep.2009.05.042.
8. Yao H, Song JY, Ma XY, Liu C, Li Y, Xu HX, et al. Identification of Dendrobium species by a candidate DNA barcode sequence: The chloroplast psbA-trnH intergenic region. Planta Med. 2009;75:667–9. doi: 10.1055/s-0029-1185385.
9. Zhu YJ, Chen SL, Yao H, Tan R, Song JY, Luo K, et al. DNA barcoding for the identification plants of the genus Paris. Yao Xue Xue Bao. 2010;45:376–82.
10. Meyer CP, Paulay G. DNA barcoding: Error rates based on comprehensive sampling. PLoS Biol. 2005;3:2229–38. doi: 10.1371/journal.pbio.0030422.
11. Meier R, Zhang GY, Ali F. The use of mean instead of smallest interspecific distances exaggerates the size of the “Barcoding Gap” and leads to misidentification. Syst Biol. 2008;57:809–13. doi: 10.1080/10635150802406343.
12. Kress WJ, Erickson DL. A two-locus global DNA barcode for land plants: The coding rbcL gene complements the non-coding trnH-psbA spacer region. PLoS One. 2007;2:e508. doi: 10.1371/journal.pone.0000508.
13. Lahaye R, van der Bank M, Bogarin D, Warner J, Pupulin F, Gigot G, et al. DNA barcoding the floras of biodiversity hotspots. Proc Natl Acad Sci U S A. 2008;105:2923–8. doi: 10.1073/pnas.0709936105.
14. Slabbinck B, Dawyndt P, Martens M, De Vos P, De Baets B. TaxonGap: A visualization tool for intra- and inter-species variation among individual biomarkers. Bioinformatics. 2008;24:866–7. doi: 10.1093/bioinformatics/btn031.
15. Ross HA, Murugan S, Li WL. Testing the reliability of genetic methods of species identification via simulation. Syst Biol. 2008;57:216–30. doi: 10.1080/10635150802032990.
16. Moritz C, Cicero C. DNA Barcoding: Promise and Pitfalls. PLoS Biol. 2004;2:e354. doi: 10.1371/journal.pbio.0020354.
17. Fazekas AJ, Burgess KS, Kesanakurti PR, Graham SW, Newmaster SG, Husband BC, et al. Multiple multilocus DNA barcodes from the plastid genome discriminate plant species equally well. PLoS One. 2008;3:e2802. doi: 10.1371/journal.pone.0002802.
18. Newmaster SG, Fazekas AJ, Steeves RA, Janovec J. Testing candidate plant barcode regions in the Myristicaceae. Mol Ecol Resour. 2008;8:480–90. doi: 10.1111/j.1471-8286.2007.02002.x.
19. Rohwer JG. Toward a phylogenetic classification of the Lauraceae: Evidence from matK sequences. Syst Bot. 2000;25:60–71.
20. Schultz J, Maisel S, Gerlach D, Müller T, Wolf M. A common core of secondary structure of the internal transcribed spacer 2 (ITS2) throughout the Eukaryota. RNA. 2005;11:361–4. doi: 10.1261/rna.7204505.
21. Miao M, Warren A, Song WB, Wang S, Shang HM, Chen ZG. Analysis of the internal transcribed spacer 2 (ITS2) region of scuticociliates and related taxa (Ciliophora, Oligohymenophorea) to infer their evolution and phylogeny. Protist. 2008;159:519–33. doi: 10.1016/j.protis.2008.05.002.
22. Li J, Li XW. Advances in Lauraceae systematic research on the world scale. Acta Bot Yunnan. 2004;26:1–11.
23. Kojoma M, Kurihara K, Yamada K, Sekita S, Satake M, Iida O. Genetic identification of cinnamon (Cinnamomum spp.) based on the trnL-trnF chloroplast DNA. Planta Med. 2002;68:94–6. doi: 10.1055/s-2002-20051.
24. Van der Werff H. A synopsis of Persea (Lauraceae) in Central America. Novon St. Louis Mo. 2002;12:575–86.
25. Van der Werff H. A synopsis of Ocotea (Lauraceae) in Central America and Southern Mexico. Ann Mo Bot Gard. 2002;89:429–51.
26. Li J, Christophel DC, Conran JG, Li HW. Phylogenetic relationships within the ‘core’ Laureae (Litsea complex, Lauraceae) inferred from sequences of the chloroplast gene matK and nuclear ribosomal DNA ITS regions. Plant Syst Evol. 2004;246:19–34.
27. Wei FN, Tang SC. On the circumscription of Machilus and of Persea (Lauraceae). Acta Phytotaxon Sin. 2006;44:437–42.

Articles from Pharmacognosy Magazine are provided here courtesy of Wolters Kluwer -- Medknow Publications
