3 Biotech. 2020 Jan 6;10(1):29. doi: 10.1007/s13205-019-2020-1

Analysis of complete chloroplast genome sequence of Korean landrace Cymbidium goeringii

Heng Wang 1,2, So-Yeon Park 1, Su-Hyang Song 1, Mar-Lar San 1, Yong-Chul Kim 1, Tae-Ho Ham 3,4, Dong-Yong Kim 5, Tae-Sung Kim 4, Joohyun Lee 3, Soon-Wook Kwon 1
PMCID: PMC6944737  PMID: 32015946

Abstract

The complete chloroplast genome sequence of Korean Cymbidium goeringii acc. smg222 was analyzed. A comparison with Chinese C. goeringii revealed losses of nine ndh subunits (ndhA, ndhB, ndhC, ndhD, ndhE, ndhF, ndhH, ndhJ, and ndhK), three protein-coding genes (ycf1-like, ycf15, and ycf68), six transfer RNAs, and one conserved open reading frame (orf42). In addition, 219 InDels (insertions or deletions) and 171 simple sequence repeats were observed. Twenty-five InDel markers, designed from chloroplast genome polymorphisms between Korean C. goeringii acc. smg222 and Chinese C. goeringii, were evaluated and found useful for distinguishing Korean and Chinese Cymbidium accessions and for assessing genetic diversity. Finally, the phylogenetic relationships of 39 Korean and 22 Chinese accessions were reconstructed from five of these InDel markers and received high support, indicating that our data may be useful in resolving relationships in this genus. The information on chloroplast DNA structure and gene variants in the C. goeringii acc. smg222 chloroplast genome should provide phylogenetic information for resolving evolutionary relationships. The molecular markers developed here will contribute to further research on Cymbidium species and to the conservation of endemic Cymbidium species.

Electronic supplementary material

The online version of this article (10.1007/s13205-019-2020-1) contains supplementary material, which is available to authorized users.

Keywords: C. goeringii acc. smg222, Chloroplast genome, InDel marker, Molecular evolution

Introduction

Chloroplasts contain their own genome, and chloroplast DNA structure (i.e., the number, content, and order of genes) is similar across green plants. The complete chloroplast genome usually has a conserved quadripartite structure, i.e., a pair of inverted repeats (IRs) separated by a large single copy (LSC) region and a small single copy (SSC) region. In typical flowering plants, most complete chloroplast genomes consist of a circular molecule ranging from 120 to 160 kb in size (Wicke et al. 2011). In addition, the highly conserved chloroplast genomes typically contain 4 ribosomal RNAs, approximately 30 transfer RNAs, and up to 80 unique proteins (Guisinger et al. 2011). Although the basic structure and gene content of chloroplast genomes are highly conserved, comparative genomic studies have revealed that some species have a unique chloroplast genome structure (i.e., gene losses and gains, inversions, and large-scale genomic rearrangements) (Asaf et al. 2017). The most obvious examples of reduced chloroplast genomes have been found in the parasitic plants Cuscuta (Funk et al. 2007), Epifagus (Wolfe et al. 1992), and Rhizanthella (Delannoy et al. 2011), which have lost some or all of their photosynthetic capacity or whose photosynthetic genes have become pseudogenes. The loss of a chloroplast gene can occur if the gene is transferred to and integrated into the host nuclear genome or is functionally replaced by a nuclear gene (Dong et al. 2013). Another hypothesis is that some species have compact chloroplast DNA that facilitates adaptation to harsh and competitive conditions (Wicke et al. 2011).

Cymbidium (Orchidaceae) is one of the most popular orchid genera in the world. The 44 species included in the genus Cymbidium are widely distributed in tropical and subtropical Asia and northeastern Australia (Du Puy et al. 2007). Spring orchid (C. goeringii) is one of the most important species in Orchidaceae owing to its aesthetic appeal, variegated leaves, scented flowers, and highly decorative flower structures, which make it suitable as a decorative pot flower. The most commercially important Cymbidiums, which have attractive, long-lasting, and larger flowers, have been produced by selection of somatic mutations during vegetative propagation and by hybridization among the 44 Cymbidium species (Wang et al. 2009; Liu et al. 2014). The ancestry of these currently recognized variants is often complex, involving several species (Obara-Okeyo and Kako 1998). Recently, various molecular methods, such as random amplified polymorphic DNA (RAPD) (Obara-Okeyo and Kako 1998; Wang et al. 2004; Choi et al. 2006), amplified fragment length polymorphism (AFLP) (Wang et al. 2004), inter-simple sequence repeat (ISSR) (Wang et al. 2009), simple sequence repeat (SSR) (Moe et al. 2010), and expressed sequence tag-simple sequence repeat (EST-SSR) markers (Liu et al. 2014), have been used to detect genetic diversity in Cymbidium resources. However, information on genetic diversity within and between horticultural groups of C. goeringii is still quite limited, and the genetic relationships among major elite lines remain unclear because of the limited number of available Cymbidium DNA sequences (Moe et al. 2010; Choi et al. 2006). Because the chloroplast is an essential cell organelle and its genome acts as a single unit, chloroplast genome analysis has become an effective method for studying plant population genetics and phylogenetics at lower taxonomic levels (Parks et al. 2009).

The Korean landrace C. goeringii acc. smg222 was used to analyze the complete chloroplast genome in the current study. Special emphasis was placed on changes in genome structure; variation in the number, content, and order of genes and in genome size; and the interrelationships among these factors. In addition, we developed polymorphic DNA markers that efficiently distinguish Korean and Chinese Cymbidium accessions and can be applied to assess genetic diversity among Cymbidium species.

Materials and methods

Sampling and DNA extraction

Cymbidium goeringii acc. smg222 plants from the Saemangeum Bio Center, Republic of Korea, were used. A total of 61 Cymbidium accessions were collected for validation of sequence variation (Table S1). DNA was extracted from young leaves using a DNeasy Plant Mini Kit (QIAGEN, Valencia, CA, USA) according to the manufacturer’s instructions. The relative purity and concentration of the extracted DNA were estimated using a NanoDrop ND-1000 (NanoDrop Technologies, Inc., Wilmington, DE, USA). The final concentration of the DNA sample was adjusted to 20 ng/ml.

Library preparation and sequencing

An Illumina paired-end DNA library (average insert size of 500 bp) was constructed using the Illumina TruSeq Library Preparation Kit (San Diego, CA, USA) following the manufacturer’s instructions. The library was sequenced (2 × 300 bp) using the MiSeq instrument at LabGenomics (https://www.labgenomics.co.kr/).

Chloroplast genome assembly

Prior to de novo assembly of the chloroplast genome, low-quality sequences (quality score < 30; Q30) were filtered out. The remaining high-quality reads were assembled using the CLC Genome Assembler (version beta 4.6; CLC Bio, Aarhus, Denmark) with a minimum overlap size of 200 bp and maximum bubble size of 50 bp for the de Bruijn graph. Chloroplast contigs were selected from the initial assembly by performing a BLAST (version 2.2.31) search against the reference chloroplast genome of C. kanran (GenBank accession: KU179435) using CLC with the following parameters: 0.5 for length fraction, 0.8 for similarity, and 200–600 bp of overlap (Jo et al. 2011). The selected chloroplast contigs were merged into four contigs, and iterative contig extensions were performed to construct a complete C. goeringii acc. smg222 chloroplast genome by mapping raw reads to the contigs. Ambiguous nucleotides or gaps were corrected manually to build the complete cp genome.
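The quality-filtering step can be illustrated with a minimal sketch. It assumes FASTQ quality strings in Phred+33 encoding and assumes that a read is discarded when its mean quality falls below 30; the text states only the threshold, so the per-read criterion, the function names, and the toy reads are assumptions and not the authors' pipeline.

```python
from statistics import mean

PHRED_OFFSET = 33  # assumption: Phred+33 (Sanger / Illumina 1.8+) encoding


def mean_phred(quality_string):
    """Return the mean Phred score of one FASTQ quality string."""
    return mean(ord(ch) - PHRED_OFFSET for ch in quality_string)


def filter_reads(records, min_quality=30):
    """Keep (name, sequence, quality) records with mean Phred score >= min_quality."""
    return [rec for rec in records if mean_phred(rec[2]) >= min_quality]


# Toy records (hypothetical): 'I' encodes Q40 and '+' encodes Q10.
reads = [
    ("read1", "ACGTACGT", "IIIIIIII"),
    ("read2", "ACGTACGT", "++++++++"),
]
kept = filter_reads(reads)
print([name for name, _, _ in kept])
```

Only the first toy read passes the threshold. Real pipelines often trim low-quality bases instead of dropping whole reads, which this sketch does not attempt.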

Gene annotation and comparative analysis of genome structure

Dual Organellar GenoMe Annotator (DOGMA) (Wyman et al. 2004) and CpGAVAS (Liu et al. 2012) were used with default parameters to annotate the assembled chloroplast genome and to predict protein-coding, rRNA, and tRNA genes. To further confirm the identified tRNA genes, the tRNAscan-SE search server was used to predict their corresponding structures (Schattner et al. 2005). Subsequently, BLASTN was used to identify intron-containing genes by searching against a published cp genome of C. kanran. The genome map of C. goeringii acc. smg222 was drawn using Organellar Genome DRAW (OGDRAW, https://ogdraw.mpimp-golm.mpg.de). The mVISTA program was employed to plot sequence identity between the complete chloroplast genomes of C. goeringii acc. smg222 and Chinese C. goeringii.

Discovery of variations and primers design

SSR markers in the chloroplast genome of C. goeringii acc. smg222 were identified using Sputnik (https://espressosoftware.com/pages/sputnik.jsp). The software uses a recursive algorithm to search for repeats with lengths between 2 and 5 nucleotides and finds perfect, compound, and imperfect repeats. Sputnik has been used to identify SSRs in many species, including Arabidopsis and barley (Cardle et al. 2000). For identification of SNP and InDel variants in the chloroplast genome of C. goeringii acc. smg222, BWA (Li and Durbin 2009) was used with the 'mem' command and the options '-k 19 -w 100 -d 100 -r 1.5 -y 20 -c 500 -D 0.5 -W 0 -m 50', and SAMtools was used with the 'mpileup' command. The detailed methods and algorithms are described in Li (2012). Twenty-five InDels were selected to develop markers using Oligo 7.0 software, and the markers were used to screen for polymorphisms in Korean and Chinese Cymbidium accessions. PCRs were performed in a 20-μl volume containing 10 ng of DNA template, 10 pmol of each primer, 1X PCR buffer, 0.2 mM of dNTPs, and 1 unit of Taq DNA polymerase (Nurotics, Korea). PCRs were performed with an MJ Research PTC-100 thermocycler (Waltham, MA, USA) using the following conditions: initial denaturation at 94 °C for 5 min, followed by 36 cycles of denaturation at 94 °C for 30 s, annealing at 58 °C for 30 s, and extension at 72 °C for 1 min, with a final extension at 72 °C for 10 min. The PCR products were separated for 100 min using a Fragment Analyzer™ (Advanced Analytical Technologies, Inc).
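To make the SSR search described above more concrete, the following sketch finds perfect tandem repeats with a regular expression. It is not Sputnik's recursive scoring algorithm and does not detect imperfect or compound repeats; the function name, the minimum of three repeat units, and the toy sequence are assumptions chosen for illustration.

```python
import re


def find_perfect_ssrs(sequence, min_motif=2, max_motif=5, min_repeats=3):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    start is a 0-based position. Motifs that are themselves repeats of a
    shorter unit (e.g. 'AA' or 'ATAT') are skipped so each repeat is reported once.
    """
    sequence = sequence.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            reducible = any(
                motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0
            )
            if reducible:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence (hypothetical) with one di- and one tri-nucleotide repeat.
print(find_perfect_ssrs("GGATATATATCCGTTGTTGTTGTTAA"))
```

On the toy sequence the sketch reports an (AT)4 repeat starting at position 2 and a (GTT)4 repeat starting at position 12. Mono-nucleotide runs are ignored unless `min_motif` is set to 1.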

Genetic diversity and phylogenetic analysis

Genetic variability was assessed from the average number of alleles, gene diversity, major allele frequency, and polymorphic information content (PIC), which were calculated for each InDel marker using PowerMarker version 3.25 (Liu and Muse 2005, https://brcwebportal.cos.ncsu.edu/powermarker/index.html). A phylogenetic tree was constructed by the unweighted pair group method with arithmetic mean (UPGMA) implemented in PowerMarker 3.0. Genetic distances were calculated using MEGA 7.0 software (Kumar et al. 2016, https://www.megasoftware.net/).
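As an illustration of the summary statistics named above, the sketch below computes gene diversity and PIC from a list of allele frequencies using the commonly used definitions, PIC = 1 − Σp_i² − Σ_{i<j} 2p_i²p_j². The text does not state which formula PowerMarker applied here, so the choice of formula and the function names are assumptions.

```python
def gene_diversity(freqs):
    """Gene diversity (expected heterozygosity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = sum(
        2.0 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - sum(squares) - cross


print(round(pic([0.5, 0.5]), 4))  # two equally frequent alleles
print(round(pic([0.9, 0.1]), 4))  # one common and one rare allele
print(round(pic([1.0]), 4))       # monomorphic marker
```

With these inputs the function gives 0.375, 0.1638, and 0.0. Values of 0.3750 and 0.1638 also appear in Table 2, which is consistent with, though not proof of, this definition having been used.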

Results and discussion

Chloroplast genome sequencing

Illumina sequencing, one of the most widely employed next-generation sequencing technologies, has significantly improved high-throughput sequence data collection. It can generate high-quality sequence assemblies at great depth, with an average error rate of 1–1.5%. Previous research demonstrated that sequencing depths greater than tenfold have little additional effect on genome coverage (Li et al. 2008). Using the Illumina platform, the coverage depth for the C. goeringii acc. smg222 chloroplast genome was high (356×). Next-generation sequencing technologies are subject to sequence-dependent GC content bias, in which the GC content affects the number of reads produced during high-throughput sequencing (Price et al. 2017). The complete chloroplast genome sequence of C. goeringii acc. smg222 had an overall GC content of 37.1% (Wang et al. 2018), similar to those of other Cymbidium species. These results indicate that the whole-genome sequencing approach for the chloroplast of C. goeringii acc. smg222 was successful, yielding reasonable sequencing depth and coverage. Four contigs comprising 148,441 bp covered the whole chloroplast genome of C. goeringii acc. smg222 (Fig. 1). However, previously reported Cymbidium chloroplast genomes range in length from 154,769 to 156,904 bp (Yang et al. 2013), so the chloroplast genome of C. goeringii acc. smg222 in the present study is smaller. The complete C. goeringii acc. smg222 chloroplast genome was 8,751 bp shorter than the Chinese C. goeringii chloroplast genome (GenBank accession: NC_028524) and 6,328–8,463 bp shorter than the C. tortisepalum 1, C. tortisepalum 2, C. tortisepalum 3, C. mannii 1, C. mannii 1, C. aloifolium, C. sinense, and C. tracyanum chloroplast genomes (GenBank accessions: KC876124, KC876128, KC876125, KC876129, KC876126, KC876122, KC876123, and KC876127). Additionally, the LSC, SSC, and IR regions of C. goeringii acc. smg222 were 1,609–2,330, 2,619–4,019, and 711–1,100 bp shorter than those of the other genomes, respectively. Species with shorter life cycles tend to have shorter chloroplast genomes than species with longer life cycles (Dong et al. 2013). Differences in genome size might reflect chloroplast genome evolution in C. goeringii acc. smg222.
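The size comparison above can be checked with simple arithmetic. The sketch below adds the reported differences to the 148,441 bp length of C. goeringii acc. smg222 to obtain the lengths they imply for the other genomes; the dictionary labels are illustrative, and the results are implied values, not lengths taken from GenBank.

```python
SMG222_LENGTH = 148_441  # bp, as reported for C. goeringii acc. smg222

# Reported amounts (bp) by which acc. smg222 is shorter than other genomes.
shorter_by = {
    "Chinese C. goeringii (NC_028524)": 8_751,
    "other accessions, smallest difference": 6_328,
    "other accessions, largest difference": 8_463,
}

implied_lengths = {name: SMG222_LENGTH + diff for name, diff in shorter_by.items()}

for name, length in implied_lengths.items():
    print(f"{name}: {length:,} bp")
```

The last two implied lengths, 154,769 and 156,904 bp, coincide with the range cited from Yang et al. (2013); the implied length for the Chinese C. goeringii genome is 157,192 bp.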

Fig. 1. Gene map of the complete chloroplast genome of C. goeringii acc. smg222. Genes on the outside of the map are transcribed in the clockwise direction, while genes on the inside of the map are transcribed in the counterclockwise direction. Genes belonging to different functional groups are color-coded. The dark gray plot in the inner circle corresponds to GC content

Features of the C. goeringii acc. smg222 chloroplast genome

The C. goeringii acc. smg222 chloroplast genome contained 104 unique genes: 70 protein-coding, 30 transfer RNA, and 4 ribosomal RNA genes (Wang et al. 2018). Among the 70 protein-coding genes, 21 were ribosomal subunit genes (12 small subunit and 9 large subunit) and 4 were DNA-directed RNA polymerase genes. Thirty-seven genes were linked with photosynthesis, of which two encoded subunits of NADH oxidoreductase, seven were in photosystem I, 15 were in photosystem II, six were in the cytochrome b6/f complex, six encoded different subunits of ATP synthase, and one encoded the large chain of ribulose bisphosphate carboxylase. Six genes had functions unrelated to photosynthesis, and two genes were of unknown function. Seven protein-coding genes [petB, petD, atpF, rpl16, rps16, rpoC1, and rpl2 (IR)] and six tRNA genes (trnA-UGC, trnG-GCC, trnI-GAU, trnK-UUU, trnL-UAA, and trnV-UAC) contained one intron, and three genes (ycf3, rps12, and clpP) contained two introns. We also detected some alternative start codons, for instance, ACG for rpl2, ACG for rps12, and GTG for rps19. Non-canonical start codons have been reported in other angiosperms (Raubeson et al. 2007) and ferns (Gao et al. 2009). Overall, the gene order of C. goeringii acc. smg222 was generally in agreement with that of previously reported chloroplast genomes of the genus Cymbidium.
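The gene counts in this paragraph can be tallied to confirm that the categories add up to the stated totals. The sketch below uses only the numbers given in the text; the category labels are shorthand introduced for illustration.

```python
# Protein-coding gene categories as counted in the text.
protein_coding = {
    "ribosomal subunits (12 small + 9 large)": 12 + 9,
    "DNA-directed RNA polymerase": 4,
    "photosynthesis-related": 2 + 7 + 15 + 6 + 6 + 1,
    "other functions": 6,
    "unknown function": 2,
}

n_protein = sum(protein_coding.values())
n_unique = n_protein + 30 + 4  # plus transfer RNA and ribosomal RNA genes

print(n_protein, n_unique)
```

The categories sum to 70 protein-coding genes and 104 unique genes, matching the totals reported.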

Gene content differences

The chloroplast genome of C. goeringii acc. smg222 was aligned with that of Chinese C. goeringii (Fig. 2). C. goeringii acc. smg222 exhibited a small genome size, gene losses, and intron losses. The main difference in gene number and content from the sequenced chloroplast DNA of C. goeringii concerned the 11 ndh subunits: in C. goeringii acc. smg222, ndhA, ndhB, ndhC, ndhD, ndhE, ndhF, ndhH, ndhJ, and ndhK had lost their function, and only remnants of the ndhG and ndhI sequences were detected. In the two C. goeringii chloroplast genomes studied, the ndhA and ndhC genes are truncated (partial sequences remain), with ndhA truncated at the 3′-end and ndhC at the 5′-end; the ndhF, ndhH, ndhJ, and ndhK genes are absent (no sequence exists); ndhB carries two nucleotide substitutions, ndhD has a 17 bp deletion, and ndhE has a 2 bp insertion that creates a frameshift mutation. Non-functional ndh genes also occur in other orchids (Chang et al. 2006; Wu et al. 2010). In Oncidiinae varieties, most ndh genes, with the exception of ndhE in some species, have no function; ndhB has a stop codon in the first exon, ndhD and ndhJ are truncated, ndhK and ndhF are absent, and ndhC has a frameshift mutation associated with a 17 bp deletion and a premature stop codon (Wu et al. 2010). Phalaenopsis aphrodite lacks ndhA, ndhF, and ndhH and contains only remnants of the other eight subunit sequences; moreover, the 11 ndh genes are presumably nonfunctional based on either truncation or frameshift mutations (Chang et al. 2006). Here, we showed that the truncation and/or absence of ndh genes in the chloroplast genome is a general phenomenon in Cymbidium and is not restricted to Oncidiinae and Phalaenopsis.

Fig. 2. Visualization of the alignment of the Korean and Chinese Cymbidium goeringii chloroplast genome sequences. The vertical scale indicates the percentage of identity, ranging from 50 to 100%. The horizontal axis indicates the coordinates within the chloroplast genome. Genome regions are color-coded as protein coding, rRNA coding, tRNA coding, or conserved noncoding sequences

Compared with other Cymbidium chloroplast genomes, C. goeringii acc. smg222 also exhibited losses of three protein-coding genes of unknown function (ycf1-like, ycf15, and ycf68), six transfer RNAs, and one conserved open reading frame (orf42). Of these, ycf1-like, an incomplete duplication of the normal ycf1, is absent, and the non-functional ycf68 gene is located in the intron of the C. goeringii acc. smg222 trnI-GAU gene. In previous reports, the ycf1-like, ycf15, and ycf68 genes were non-functional pseudogenes in other Cymbidium genomes, but they were conserved in the IRs among different species (Yang et al. 2013). We also found that the non-functional ORF42 gene is located in the intron of the C. goeringii acc. smg222 trnA-UGC gene. ORF42 is a fragment of the pvs-trnA gene, which is related to a mitochondrial gene (Do et al. 2013); however, the characteristics and function of ORF42 remain to be determined. Based on these observations, it has been hypothesized that selection for a reduced nuclear genome is beneficial owing to its lower maintenance cost when deletions are not deleterious or genes are dispensable. Gray et al. (1998) suggested that an economic strategy can explain organellar genomes in which pseudogenes are usually rare and the non-coding content is low. In gnetophytes, the reduced chloroplast genome with deletions of genes and non-coding sequences increases survivorship in harsh and competitive environments. It is plausible that C. goeringii acc. smg222 exists in resource-poor conditions in which a low-cost strategy is indispensable for survival.

Discovery of DNA variations

We analyzed the occurrence, type, and distribution of SSRs in the C. goeringii acc. smg222 chloroplast genome. In total, we identified 171 SSRs (Table S2). The majority of the SSRs in this chloroplast genome were mono-nucleotide repeats (108; 63.2%), followed by di-nucleotide repeats (53; 31.0%). In addition, tri-nucleotide (3; 1.8%), tetra-nucleotide (5; 2.9%), and compound (2; 1.2%) SSR motifs were present in the C. goeringii chloroplast genome (Fig. 3). Among these, 113 were located in intergenic spacers and 58 in coding regions. The detected frequency of SSR sites was about 1 per 868 bp of the chloroplast genome. These regions can be used to assess intraspecific polymorphism and for population structure and phytogeographic studies in Cymbidium.
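The percentages and the SSR density quoted above follow directly from the counts and the genome length, as the short calculation below shows. All numbers are taken from the text; only the variable names are introduced here.

```python
GENOME_LENGTH = 148_441  # bp, C. goeringii acc. smg222 chloroplast genome
ssr_counts = {"mono": 108, "di": 53, "tri": 3, "tetra": 5, "compound": 2}

total = sum(ssr_counts.values())
# Share of each SSR class as a percentage of all SSRs, to one decimal place.
shares = {kind: round(100 * n / total, 1) for kind, n in ssr_counts.items()}
# Average spacing between SSR sites, in bp.
bp_per_ssr = round(GENOME_LENGTH / total)

print(total, shares, bp_per_ssr)
```

The calculation reproduces the 171 SSRs, the class percentages, and the density of roughly one SSR per 868 bp.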

Fig. 3. SSR candidates in the C. goeringii acc. smg222 chloroplast genome (A: A mono-nucleotides; T: T mono-nucleotides; C: C mono-nucleotides; G: G mono-nucleotides; di: di-nucleotides; tri: tri-nucleotides; tetra: tetra-nucleotides; com: compound nucleotides)

In a comparison of the chloroplast genome sequence of C. goeringii acc. smg222 with that of Chinese C. goeringii, we detected 219 InDels (Table S3). Of these, 31 were in coding regions, and 152, 13, 41, and 13 were located in the LSC, IRa, SSC, and IRb regions, respectively. The InDel loci observed in this study are valuable for further studies of genetic diversity, population structure, and evolution.
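To illustrate what counting InDels between two genomes involves, the sketch below lists gap runs in a pairwise alignment. It assumes two already-aligned sequences with '-' as the gap character, which is a simplification: the study called InDels from read mappings with BWA and SAMtools, not from an alignment of this kind, and the function name and toy alignment are hypothetical.

```python
def list_indels(ref_aln, qry_aln):
    """Return (ref_position, kind, length) for each gap run in a pairwise alignment.

    ref_position is the 0-based index of the next reference base at the start
    of the run; kind is 'insertion' (gap in the reference) or 'deletion'
    (gap in the query).
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    indels = []
    ref_pos = 0
    current = None  # [ref_position, kind, length] of the gap run being extended
    for r, q in zip(ref_aln, qry_aln):
        kind = "insertion" if r == "-" else "deletion" if q == "-" else None
        if kind is None:
            current = None
        elif current is not None and current[1] == kind:
            current[2] += 1
        else:
            current = [ref_pos, kind, 1]
            indels.append(current)
        if r != "-":
            ref_pos += 1
    return [tuple(item) for item in indels]


# Toy alignment (hypothetical): one 2-bp insertion and one 2-bp deletion.
print(list_indels("ACGT--ACGTAC", "ACGTTTAC--AC"))
```

For the toy alignment the sketch reports a 2-bp insertion before reference position 4 and a 2-bp deletion starting at reference position 6.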

InDel markers diversity in Cymbidium

Complete chloroplast genome sequences have been widely accepted as valuable and informative genomic resources for understanding evolutionary biology and for the practical application of DNA markers because of their relatively stable genome structure and uniparental inheritance (Dong et al. 2012; Kim et al. 2015; Suo et al. 2012; Song et al. 2015).

Twenty-five InDel-based markers were developed from these sequence variations (Table 1) and used to analyze genetic diversity among 10 Korean and 10 Chinese Cymbidium accessions (Table 2). The mean number of alleles across all accessions was 2.32, and the average numbers of alleles in Korean Cymbidium accessions (Subgroup 1) and Chinese Cymbidium accessions (Subgroup 2) were 1.28 and 1.96, respectively. The overall PIC values ranged from 0 to 0.6324 with an average of 0.2360 (Table 2). The markers showed higher diversity in Subgroup 2 than in Subgroup 1 (mean PIC of 0.1964 versus 0.0516). A marker was considered subgroup-specific when it had a single allele in one subgroup and the frequency of that allele in the other subgroup (Fsub1 or Fsub2) was 0. For example, the Fsub1 values of ID001520, ID058556, and ID076063 were 0, which means that these InDel markers were specific for Subgroup 1. Both the Fsub1 and Fsub2 values of ID009101 and ID116414 were 0, which shows that these two InDel markers were specific for both Subgroup 1 and Subgroup 2. ID015557, ID070173, ID097251, ID102357, ID110602, ID133740, and ID147362 had Fsub1 and Fsub2 values of 1, which means that the bands were monomorphic across the two subgroups (Table 2).

Table 1. Details of InDel markers developed in this study

No. ID Left primer Right primer Locus
1 ID001520 GGAACTAGTCGGATGGAGTAGA CATCATATTCGTGGTGAGATTG trnK-UUU~matK
2 ID003480 TTTTGTTGCCGAAATCTATCTT GATCTGTAGATTGGGCTCTCTG matK~trnK-UUU
3 ID009101 CATTTTCATTGCCATTCCTAAT TTTCATTGATGAATTTCCGAAT trnS-GCU~trnG-UCC
4 ID012838 CCCATATTTTTGAGCCTATCCT CGAATCGACGACCTATGTATTG atpF~atpF
5 ID015557 ACTTGGGCATGAATTTGTAAAC TATTTTGGTTTGCATCTTTTGG atpI~rps2
6 ID037977 AAGATTCCGTCGGTATAGTTGA AGTAGAGCAGTTTGGTAGCTCG trnG-GCC~trnfM-CAU
7 ID044404 TATCTTTTCTCCCACCTTCAGA TAAAGTATTGAGCAGCGGTGTA ycf3~ycf3
8 ID048067 CATTACAAATGCGATGCTCTAA TTCCTCCTCCTTCATTTTTACA trnT-UGU~trnL-UAA
9 ID053814 CTTCATTGTGTTCAATTTGTGG GGCGGTTTTGCTAGAATAAGTA trnM-CAU~atpE
10 ID058556 CAGCAATTCCTTTTTGTTCTTC TCTATGGCCTGAAACTAAGGAA rbcL~accD
11 ID065011 TACGAGTCCAAGGTCTTCTGTT AGGAGGAATCCATCGATTTTAT petA~psbJ
12 ID070173 TTTTCTTCTATCTTCCCGGAGT TTTGGTTTCTTCTCATCGAAAT rps18~rpl20
13 ID073915 GACAAGTCGCACTATACGTCAA ATCCCCTTCGTTACAATCTTTC clpP~psbB
14 ID076063 TACAAAATTCCTTTGCCATCTT CCACACCTATTCATTTTGGATT psbB~psbT
15 ID078919 TGTGGAATGATCCAATGTTCTA CCTGTTCTTCCTTAGATCCCTT petD~petD
16 ID096984 TCCAAACGGACTCCTATAAAGA GACATGACCGATCGATAGAAAT ycf2~trnL-CAA
17 ID097251 TCTATCGATCGGTCATGTCATA TATTATGAGAAGGGGTCATTCG trnL-CAA~rps7
18 ID102357 ATCCTTTCGATGACCTATGTTG TCTCTCATGGTACAACCCTCTT rps12~trnV-GAC
19 ID110602 GCGACAGAAGTATTGAGAATCC TACAGTATCGTCACCGCAGTAG rrn4.5~rrn5
20 ID116414 CCAAATTTCCATTTTTGAATTG GATTTTGAGACCCAACACCTTA rpl32~trnL-UAG
21 ID116964 TATTATTTAAGGTAAGCCGCCA TCACAATTGAGATGATCGAAAA trnL-UAG~ccsA
22 ID120564 GCCAAGATACTTTGATTTCCAT TTAGTCATAATTCTTCTATTTTCCCA psaC~ndhG
23 ID133740 TACAGTATCGTCACCGCAGTAG GCGACAGAAGTATTGAGAATCC rrn5~rrn4.5
24 ID147109 TATTATGAGAAGGGGTCATTCG TCTATCGATCGGTCATGTCATA rps7~trnL-CAA
25 ID147362 GACATGACCGATCGATAGAAAT TCCAAACGGACTCCTATAAAGA trnL-CAA

Table 2. Summary statistics for the 25 InDel markers analyzed in this study

Columns: Marker; Subgroup 1 (K01~K10) allele no. (A) and PIC value; Subgroup 2 (C01~C10) allele no. (B) and PIC value; All accessions allele no. (C) and PIC value; Fsub1 (a); Fsub2; Remark
ID001520 1 0.0000 3 0.4992 4 0.5812 0 Specific for sub1
ID003480 1 0.0000 2 0.3648 2 0.3318 0.4
ID009101 1 0.0000 1 0.0000 2 0.3750 0 0 Specific for sub1 and sub2
ID012838 2 0.1638 1 0.0000 2 0.0905 0.9
ID015557 1 0.0000 1 0.0000 1 0.0000 1 1 Monomorphic
ID037977 2 0.1638 2 0.2688 3 0.2469
ID044404 1 0.0000 2 0.1638 2 0.0905 0.9
ID048067 3 0.2688 3 0.1769 3 0.4064
ID053814 2 0.1638 3 0.5478 4 0.6324
ID058556 1 0.0000 3 0.4992 4 0.5812 0 Specific for sub1
ID065011 1 0.0000 2 0.1638 2 0.0905 0.9
ID070173 1 0.0000 1 0.0000 1 0.0000 1 1 Monomorphic
ID073915 1 0.0000 3 0.4992 3 0.4662 0.2
ID076063 1 0.0000 2 0.2688 3 0.4918 0 Specific for sub1
ID078919 1 0.0000 4 0.6035 4 0.4059 0.5
ID096984 1 0.0000 2 0.0905 2 0.0476 0.9
ID097251 1 0.0000 1 0.0000 1 0.0000 1 1 Monomorphic
ID102357 1 0.0000 1 0.0000 1 0.0000 1 1 Monomorphic
ID110602 1 0.0000 1 0.0000 1 0.0000 1 1 Monomorphic
ID116414 1 0.0000 1 0.0000 2 0.3750 0 0 Specific for sub1 and sub2
ID116964 1 0.0000 2 0.1638 2 0.0905 0.9
ID120564 2 0.3648 5 0.6005 5 0.5054
ID133740 1 0.0000 1 0.0000 1 0.0000 1 1 Monomorphic
ID147109 2 0.1638 1 0.0000 2 0.0905 0.9
ID147362 1 0.0000 1 0.0000 1 0.0000 1 1 Monomorphic
Mean 1.28 0.0516 1.96 0.1964 2.32 0.2360

PIC: polymorphism information content

(a) Fsub1: frequency of the subgroup 1-specific allele in subgroup 2; Fsub2: frequency of the subgroup 2-specific allele in subgroup 1
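The Fsub1 and Fsub2 columns can be illustrated with a small sketch that follows the footnote definition: the frequency, in one subgroup, of the single allele carried by the other subgroup. The allele calls below are hypothetical fragment sizes, and the function names are introduced only for this example.

```python
from collections import Counter


def allele_frequencies(calls):
    """Map each allele (e.g. fragment size in bp) to its frequency among the calls."""
    counts = Counter(calls)
    n = len(calls)
    return {allele: count / n for allele, count in counts.items()}


def cross_frequency(fixed_group, other_group):
    """Frequency, in other_group, of the single allele carried by fixed_group.

    Returns None when fixed_group carries more than one allele.
    """
    alleles = set(fixed_group)
    if len(alleles) != 1:
        return None
    (allele,) = alleles
    return allele_frequencies(other_group).get(allele, 0.0)


# Hypothetical fragment sizes (bp) for one marker in 10 + 10 accessions.
korean = [210] * 10
chinese = [225] * 6 + [231] * 3 + [240] * 1

fsub1 = cross_frequency(korean, chinese)  # Korean allele among Chinese accessions
fsub2 = cross_frequency(chinese, korean)  # undefined: Chinese group is polymorphic
print(fsub1, fsub2)
```

In this toy case Fsub1 is 0.0, the pattern described in the text for a Subgroup 1-specific marker, while Fsub2 is undefined because Subgroup 2 carries three alleles.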

Finally, five InDel markers (ID001520, ID009101, ID058556, ID076063, and ID116414) were selected as Subgroup 1-specific markers and tested among 61 Cymbidium accessions (Table S1). The PCR results for ID001520 showed a specific band in Korean Cymbidium accessions and polymorphism among Chinese Cymbidium accessions. The ID009101 and ID116414 markers distinguished Korean from Chinese Cymbidium accessions, except for one accession that differed from the others. The ID058556 and ID076063 markers also showed a specific band in Korean Cymbidium accessions but were polymorphic in Chinese Cymbidium accessions (Table S1). These results were largely consistent with those above, which supports the reliability of the C. goeringii acc. smg222 chloroplast sequence.

The neighbor-joining tree of 61 Cymbidium accessions based on Nei’s genetic distance revealed that Korean and Chinese Cymbidium accessions were divided into two different groups (Fig. 4). The phylogenetic relationships of the Cymbidium accessions were associated with their geographic origin, indicating that these InDel markers may be useful for authentication of Cymbidium genetic resources across different regions and for improving germplasm conservation methods.
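As background for the distance on which the tree is based, the sketch below computes Nei's (1972) standard genetic distance, D = −ln(J_xy / √(J_x J_y)), between two groups from per-locus allele frequencies. The text does not say which of Nei's distances was used, so this choice, the function name, and the toy frequencies are assumptions; tree construction itself is not shown.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance between two populations.

    pop_x and pop_y are lists with one dict per locus mapping allele -> frequency;
    loci must be listed in the same order in both populations.
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    if jxy == 0.0:
        return math.inf  # no alleles shared at any locus
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Toy single-locus example (hypothetical frequencies).
group_a = [{"A": 1.0}]
group_b = [{"A": 0.5, "B": 0.5}]
print(round(nei_standard_distance(group_a, group_b), 4))
```

Identical groups give a distance of zero, and groups that share no alleles give an infinite distance, which is the situation expected for markers that are fully specific to one subgroup.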

Fig. 4. Neighbour-joining phylogenetic tree of Cymbidium inferred from the five InDel markers in 61 Cymbidium accessions. A: Korean Cymbidium group; B: Chinese Cymbidium group

Conclusion

Because chloroplasts are inherited maternally and their genomes are highly conserved among species, chloroplast genome sequences can provide useful information on plant evolution, systematics, and biogeography. The analysis of the complete chloroplast genome sequence of Korean Cymbidium goeringii revealed genetic variation relative to Chinese Cymbidium and provided useful DNA markers, including 171 SSRs and 219 InDels. Of these, 25 InDel markers were further evaluated among 10 Korean Cymbidium goeringii and 10 Chinese Cymbidium accessions. In the validation analysis, in which a neighbor-joining tree of 61 Cymbidium accessions was constructed, 5 markers successfully distinguished Korean Cymbidium from Chinese Cymbidium. The results of this analysis can be used for further studies of biogeography, plant biotechnology, and population structure in Cymbidium, as well as for phylogenetic studies of species of this genus.

Acknowledgements

This research was supported by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture, Forestry and Fisheries (IPET) through the Technology Commercialization Support Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA) (815004–03).

Author contributions

HW, SYP, and SWK developed ideas, designed and performed all experiments, and wrote the manuscript. YCK and DYK produced all Cymbidium materials. THH, JL, and TSK analyzed the sequencing data. SHS and SML analyzed the InDel markers for experiments. All authors read and approved the final manuscript.

Funding

This work was supported by grants from the Ministry of Agriculture, Food and Rural Affairs of Korea (MAFRA).

Compliance with ethical standards

Conflict of interest

All the authors declare that they have no conflict of interest in the publication.

Footnotes

The complete chloroplast genome sequence of C. goeringii acc. smg222 has been deposited at GenBank under the accession number MF421552.

Heng Wang and So-Yeon Park contributed equally.

References

  1. Asaf S, Khan AL, Khan MA, Imran QH, Kang SM, Al-Hosni K, Jeong EJ, Lee KE, Lee IJ. Comparative analysis of complete plastid genomes from wild soybean (Glycine soja) and nine other Glycine species. PLoS ONE. 2017;12:e0182281. doi: 10.1371/journal.pone.0182281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Cardle L, Ramsay L, Milbourne D, Macaulay M, Marshall D, Waugh R. Computational and experimental characterization of physically clustered simple sequence repeats in plants. Genetics. 2000;156(2):847–854. doi: 10.1093/genetics/156.2.847. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Chang CC, Lin HC, Lin IP, Chow TY, Chen HH, Chen WH, Cheng CH, Lin CY, Liu SM, Chang CC, Chaw SM. The chloroplast genome of Phalaenopsis aphrodite (Orchidaceae): comparative analysis of evolutionary rate with that of grasses and its phylogenetic implications. Mol Biol Evol. 2006;23(2):279–291. doi: 10.1093/molbev/msj029. [DOI] [PubMed] [Google Scholar]
  4. Choi SH, Kim MJ, Lee JS, Ryu KH. Genetic diversity and phylogenetic relationships among and within species of oriental cymbidiums based on RAPD analysis. Sci Hortic. 2006;108(1):79–85. [Google Scholar]
  5. Delannoy E, des Fujii S, Francs-Small CC, Brundrett M, Small I. Rampant gene loss in the underground orchid Rhizanthella gardneri highlights evolutionary constraints on plastid genomes. Mol Biol Evol. 2011;28(7):2077–2086. doi: 10.1093/molbev/msr028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Do HDK, Kim JS, Kim JH. Comparative genomics of four Liliales families inferred from the complete chloroplast genome sequence of Veratrum patulum O. Loes. (Melanthiaceae) Gene. 2013;530(2):229–235. doi: 10.1016/j.gene.2013.07.100. [DOI] [PubMed] [Google Scholar]
  7. Dong W, Liu J, Yu J, Wang L, Zhou S. Highly variable chloroplast markers for evaluating plant phylogeny at low taxonomic levels and for DNA barcoding. PLoS ONE. 2012;7(4):e35071. doi: 10.1371/journal.pone.0035071. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Dong W, Xu C, Cheng T, Zhou S. Complete chloroplast genome of Sedum sarmentosum and chloroplast genome evolution in Saxifragales. PLoS ONE. 2013;8(10):e77965. doi: 10.1371/journal.pone.0077965. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Du Puy D, Cribb P, Tibbs M. The genus Cymbidium. Kew: Royal Botanic Gardens; 2007. [Google Scholar]
  10. Funk HT, Berg S, Krupinska K, Maier UG, Krause K. Complete DNA sequences of the plastid genomes of two parasitic flowering plant species, Cuscuta reflexa and Cuscuta gronovii. BMC Plant Biol. 2007;7(1):45. doi: 10.1186/1471-2229-7-45. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Gao L, Yi X, Yang YX, Su YJ, Wang T. Complete chloroplast genome sequence of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern chloroplast genomes. BMC Evol Biol. 2009;9(1):130. doi: 10.1186/1471-2148-9-130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Gray MW, Lang BF, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Brossard N, Delage E, Littlejohn TG, Plante I, Rioux P, Saint-Louis D, Zhu Y, Burger G. Genome structure and gene content in protist mitochondrial DNAs. Nucleic Acids Res. 1998;26(4):865–878. doi: 10.1093/nar/26.4.865.
  13. Guisinger MM, Kuehl JV, Boore JL, Jansen RK. Extreme reconfiguration of plastid genomes in the angiosperm family Geraniaceae: rearrangements, repeats, and codon usage. Mol Biol Evol. 2011;28(1):583–600. doi: 10.1093/molbev/msq229.
  14. Jo YD, Park J, Kim J, Song W, Hur CG, Lee YH, Kang BC. Complete sequencing and comparative analyses of the pepper (Capsicum annuum L.) plastome revealed high frequency of tandem repeats and large insertion/deletions on pepper plastome. Plant Cell Rep. 2011;30(2):217–229. doi: 10.1007/s00299-010-0929-2.
  15. Kim K, Lee SC, Lee J, Lee HO, Joh HJ, Kim NH, Park HS, Yang TJ. Comprehensive survey of genetic diversity in chloroplast genomes and 45S nrDNAs within Panax ginseng species. PLoS ONE. 2015;10(6):e0117159. doi: 10.1371/journal.pone.0117159.
  16. Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33(7):1870–1874. doi: 10.1093/molbev/msw054.
  17. Li H. Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly. Bioinformatics. 2012;28(14):1838–1844. doi: 10.1093/bioinformatics/bts280.
  18. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–1760. doi: 10.1093/bioinformatics/btp324.
  19. Li R, Li Y, Kristiansen K, Wang J. SOAP: short oligonucleotide alignment program. Bioinformatics. 2008;24(5):713–714. doi: 10.1093/bioinformatics/btn025.
  20. Liu K, Muse SV. PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics. 2005;21(9):2128–2129. doi: 10.1093/bioinformatics/bti282.
  21. Liu C, Shi L, Zhu Y, Chen H, Zhang J, Lin X, Guan X. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genom. 2012;13(1):715. doi: 10.1186/1471-2164-13-715.
  22. Liu X, Huang Y, Li F, Xu C, Chen K. Genetic diversity of 129 spring orchid (Cymbidium goeringii) cultivars and its relationship to horticultural types as assessed by EST-SSR markers. Sci Hortic. 2014;174:178–184.
  23. Moe KT, Zhao W, Song HS, Kim YH, Chung JW, Cho YI, Park PH, Park HS, Chae SC, Park YJ. Development of SSR markers to study diversity in the genus Cymbidium. Biochem Syst Ecol. 2010;38(4):585–594.
  24. Obara-Okeyo P, Kako S. Genetic diversity and identification of Cymbidium cultivars as measured by random amplified polymorphic DNA (RAPD) markers. Euphytica. 1998;99(2):95–101.
  25. Parks M, Cronn R, Liston A. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes. BMC Biol. 2009;7(1):84. doi: 10.1186/1741-7007-7-84.
  26. Price A, Garhyan J, Gibas C. The impact of RNA secondary structure on read start locations on the Illumina sequencing platform. PLoS ONE. 2017;12(2):e0173023. doi: 10.1371/journal.pone.0173023.
  27. Raubeson LA, Peery R, Chumley TW, Dziubek C, Fourcade HM, Boore JL, Jansen RK. Comparative chloroplast genomics: analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus. BMC Genom. 2007;8(1):174. doi: 10.1186/1471-2164-8-174.
  28. Schattner P, Brooks AN, Lowe TM. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 2005;33(suppl 2):W686–W689. doi: 10.1093/nar/gki366.
  29. Song Y, Dong W, Liu B, Xu C, Yao X, Gao J, Corlett RT. Comparative analysis of complete chloroplast genome sequences of two tropical trees Machilus yunnanensis and Machilus balansae in the family Lauraceae. Front Plant Sci. 2015;6:662. doi: 10.3389/fpls.2015.00662.
  30. Suo Z, Zhang C, Zheng Y, He L, Jin X, Hou B, Li J. Revealing genetic diversity of tree peonies at micro-evolution level with hyper-variable chloroplast markers and floral traits. Plant Cell Rep. 2012;31(12):2199–2213. doi: 10.1007/s00299-012-1330-0.
  31. Wang HZ, Wang YD, Zhou XY, Ying QC, Zheng KL. Analysis of genetic diversity of 14 species of Cymbidium based on RAPDs and AFLPs. Shi Yan Sheng Wu Xue Bao. 2004;37(6):482–486.
  32. Wang HZ, Wu ZX, Lu JJ, Shi NN, Zhao Y, Zhang ZT, Liu JJ. Molecular diversity and relationships among Cymbidium goeringii cultivars based on inter-simple sequence repeat (ISSR) markers. Genetica. 2009;136(3):391–399. doi: 10.1007/s10709-008-9340-0.
  33. Wang H, Park SY, Lee AR, Jang SG, Im DE, Jun TH, Lee JH, Chung JW, Kwon SW. Next-generation sequencing yields the complete chloroplast genome of C. goeringii acc. smg222 and phylogenetic analysis. Mitochondrial DNA B. 2018;3(1):215–216. doi: 10.1080/23802359.2018.1437812.
  34. Wicke S, Schneeweiss GM, Müller KF, Quandt D. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 2011;76(3–5):273–297. doi: 10.1007/s11103-011-9762-4.
  35. Wolfe KH, Morden CW, Palmer JD. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc Natl Acad Sci. 1992;89(22):10648–10652. doi: 10.1073/pnas.89.22.10648.
  36. Wu FH, Chan MT, Liao DC, Hsu CT, Lee YW, Daniell H, Duval MR, Lin CS. Complete chloroplast genome of Oncidium Gower Ramsey and evaluation of molecular markers for identification and breeding in Oncidiinae. BMC Plant Biol. 2010;10(1):68. doi: 10.1186/1471-2229-10-68.
  37. Wyman SK, Jansen RK, Boore JL. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 2004;20(17):3252–3255. doi: 10.1093/bioinformatics/bth352.
  38. Yang JB, Tang M, Li HT, Zhang ZR, Li DZ. Complete chloroplast genome of the genus Cymbidium: lights into the species identification, phylogenetic implications and population genetic analyses. BMC Evol Biol. 2013;13(1):84. doi: 10.1186/1471-2148-13-84.

Articles from 3 Biotech are provided here courtesy of Springer