Abstract
Melia dubia Cav. (Meliaceae), a fast-growing tropical tree, is used in plywood, pulp and high-value solid wood products. Increasing its productivity requires capturing genetic diversity and identifying genotypes with superior wood properties. This study aimed to develop novel microsatellite markers from genomic data and validate them in M. dubia. A direct Seq-to-SSR approach was adopted and, using an in-house Perl script, 426,390 SSR markers were identified. For validation, 151 markers were selected, of which 50 were genomic markers chosen randomly and 101 were genic markers identified through BLAST2GO. Amplification was observed in all loci, and 81.4% generated high-quality, reproducible amplicons of the expected size. Of the 50 genomic markers, ten highly polymorphic markers were used to assess genetic diversity among 75 genotypes from three populations. One hundred fourteen alleles were recorded, with a moderate level of diversity and a positive fixation index. Twenty-nine genic markers representing 13 enzymes and showing polymorphism for wood stiffness were selected for diversity assessment of 24 genotypes (12 genotypes each with high and low stress wave velocity). The product size ranged from 87 to 279 bp, covering the majority of the genome. Cluster and structure analysis segregated ~80% of the genotypes based on the trait. This is the first report of the development of genic markers from a genomic survey, and the markers proved efficient in differentiating genotypes based on the trait. The markers developed in this study will be useful for genetic mapping, diversity estimation, marker-assisted selection for desired traits and breeding for wood traits in M. dubia.
Supplementary Information
The online version contains supplementary material available at 10.1007/s13205-021-02858-w.
Keywords: NGS, Genomic, Genic SSR, Genetic diversity, Wood stiffness
Introduction
Melia dubia Cav. (Family: Meliaceae) is a fast-growing, deciduous multipurpose tree species native to India. The sapwood is greyish white; the heartwood is light pink to light red, turning pale russet brown on ageing. The wood was traditionally used for fuelwood, musical instruments, agricultural implements, cigar boxes, match sticks, splints, packaging, catamarans, etc. (Swaminathan et al. 2012). It has tremendous potential as plywood, pulp and high-value solid wood (Parthiban et al. 2009; Saravanan et al. 2013). Its excellent processing and peeling ability make it an ideal species for core veneer in the plywood industry. Its exceptional wood properties, fast growth, short rotation period and rapid returns on investment make it lucrative for commercial plantations. The species is scattered in its natural habitat and has fragmented natural populations (Rawat et al. 2018), is often self-pollinated (Johar et al. 2015), and has low germination ability (Anand et al. 2012). As extensive plantations of M. dubia are being raised, there is an urgent need to initiate tree improvement programmes to identify superior genotypes and build extensive collections for developing breeding strategies. The primary step in improving a species is assessing genetic diversity, which can be accomplished efficiently and rapidly through molecular breeding approaches (Collard et al. 2008).
Allelic information in a species helps in the selection of desired traits. Dominant markers do not allow detection of allelic information, whereas co-dominant markers do. Simple Sequence Repeat (SSR) markers are co-dominant, short tandem repeat motifs (Zane et al. 2002). They vary from mono- to hexanucleotide, and are highly polymorphic (Byrne et al. 1996; Lowe et al. 2000), multi-allelic, reproducible (Powell et al. 1996; Liu and Cordes 2004) and neutral (Li et al. 2002). SSRs are used for genetic diversity estimation, discovery of quantitative trait loci (QTL), construction of linkage maps, marker-assisted selection (MAS) for desired traits, forensics and parentage analysis, cultivar DNA fingerprinting, genome-wide association studies (GWAS), germplasm characterization, etc. (Taheri et al. 2018).
SSRs or microsatellites can be developed through the traditional Sanger method (Sanger et al. 1997). However, this procedure is time- and labour-intensive and yields few SSRs (Gilmore et al. 2013). Recent developments in sequencing, such as Next Generation Sequencing (NGS) on high-throughput platforms like Illumina, facilitate whole-genome sequencing of uncharacterized species, large-scale re-sequencing in well-characterized species and de novo transcriptome sequencing for species without a reference sequence. Illumina sequencers such as the HiSeq can produce millions of reads up to 150 bp long and can accommodate paired-end sequencing from both ends of 200–600 bp fragments (Castoe et al. 2015). With Illumina technology and de novo assembly, sequence reads can be obtained in many non-model organisms without a reference sequence (Feldmeyer et al. 2011; Mizrachi et al. 2010; Li et al. 2012; Fu et al. 2013; Wang et al. 2010). The large amount of sequence data produced on these high-throughput platforms can be used to mine genetic markers (Ossowski et al. 2008; Varshney et al. 2009; Bai et al. 2010; Garg et al. 2011; Malausa et al. 2011). SSR markers are of two types: (i) genomic SSRs, developed from random segments of DNA from the target species, and (ii) genic SSRs, derived from transcripts and therefore residing in functional genes. The latter are useful for studying phenotypic traits and functional diversity in populations (Varshney et al. 2005). Genic or EST-SSR markers are sequence repeats designed from a known cDNA library, constructed from messenger RNA isolated from the desired plants (Varshney et al. 2005). As an alternative to de novo transcriptome analysis, specific biochemical and enzyme-related functional markers can be developed directly from DNA sequencing data sets using software such as BLAST+ version 2.2.26 (Ambreen et al. 2015).
Microsatellite markers have been developed and used in the molecular analysis of numerous tree species. In the Meliaceae, SSR markers have been developed through traditional methods in Swietenia macrophylla (Lemes et al. 2002; Lemes et al. 2003; Novick et al. 2003; Lemes et al. 2011), Dysoxylum malabaricum (Sumangala et al. 2013; Hemmila et al. 2010; Ismail et al. 2014), Cabralea canjerana (Pereira et al. 2011; Melo et al. 2014) and Azadirachta indica var. indica and A. indica var. siamensis (Boontong et al. 2009). Using genome sequencing, SSRs have been developed in Khaya senegalensis and Azadirachta indica (Karan et al. 2012; Krishnan et al. 2012). Rawat (2016) reported that, in M. dubia, primers from D. malabaricum (Sumangala et al. 2013) and A. indica (Boontong et al. 2009) tested for cross-amplification did not amplify, whereas primers from M. volkensii (Hanaoka et al. 2012) amplified but did not show polymorphism.
The primary goal of any tree improvement programme is to improve the quality and quantity of wood. Wood quality traits are considered quantitative, controlled by multiple genes with a moderate to high degree of heritability (Thumma et al. 2010). Traits with high heritability are selected for genetic gain in tree improvement programmes, and Marker Assisted Selection (MAS) is used to identify trees with desired characters. MAS depends on detecting DNA markers that relate to a high proportion of additive variation in phenotypic traits.
In the present study, SSR markers were developed from the M. dubia genome through Illumina next-generation sequencing (Genotypic Technology [P] Ltd., India). SSR loci and flanking PCR primer sites were identified by the Seq-to-SSR approach using software, without library enrichment or post-sequencing assembly of reads. From the developed SSR markers, genic markers were identified using BLAST2GO. Further, 151 SSR markers consisting of 50 genomic and 101 genic markers were validated; 10 genomic markers were used to assess the genetic diversity of 75 genotypes from three locations, and 29 genic markers were used to assess the diversity of 24 genotypes with contrasting values for wood stiffness.
Materials and methods
Plant material and genomic DNA extraction
Leaf samples from mature trees were collected and dried with silica gel. Total genomic DNA was isolated using a standardized protocol (Rawat et al. 2016). The total gDNA was assessed for its quality and quantity. The DNA concentration was 25.8 ng/μl and the yield was 516 ng.
Library preparation
Sequencing libraries were prepared using the NEXTflex DNA Sequencing Kit (Cat # 5140–02) with 3 μg DNA as per the manufacturer’s instructions. The quantity and size distribution of the prepared libraries were determined using a Qubit fluorometer and the Agilent High Sensitivity DNA Kit (Agilent Technologies) as per the manufacturer’s instructions. The Bioanalyzer profile of the prepared library showed fragments in a size range of 300–700 bp.
Genome sequencing, SSR prediction, primer design and annotation
Paired-end sequencing (150 bp) was performed on the Illumina NextSeq platform. The Illumina paired-end raw reads were quality checked using the FastQC program. Read quality processing (adapter and low-quality trimming) was done using an in-house Perl script. The script provides details for SSR type, repeat number, start and end position of the repeat in query sequences, length of repeats and the complete sequence, and has separate modules for different SSR types (di- to hexanucleotides). Processed reads were joined using the fastq-join tool; a modified MISA program was used for SSR prediction from quality-checked raw reads (fastq file). As the MISA program can predict both simple and compound repeats, the minimum criteria for identifying SSRs were dimers with at least 6 repeats, trimers with at least 5 repeats, and tetramers, pentamers and hexamers with at least 4 repeats. In compound SSR prediction, the minimum length between two simple SSRs was set to 30 bp. The Primer3 tool was used for primer generation. The primer design parameters were a primer size of 18–25 bases, an amplicon size of 50–300 bp, an optimum annealing temperature between 57 and 63 °C and a GC content of 30% to 70% with an optimum value of 50%. M. dubia markers having similarity with Eucalyptus and Populus species were queried using BLAST2GO based on Gene Ontology, and further selection was made based on enzyme ontology for wood properties.
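The repeat thresholds described above can be illustrated with a minimal Python sketch. It is not the in-house Perl script or the modified MISA program used in the study; it only applies the stated minimum repeat numbers to perfect repeats, and the function name `find_ssrs` and the example sequence are hypothetical.

```python
import re

# Minimum repeat counts per motif length, as stated in the text:
# di >= 6, tri >= 5, tetra/penta/hexa >= 4.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return (start, end, motif, repeats) for perfect SSRs (0-based, end exclusive)."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{n}) captures one motif; \1{m,} demands at least m more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "TATA" is really the dinucleotide "TA", "AA" is a mononucleotide).
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            repeats = (match.end() - match.start()) // unit_len
            hits.append((match.start(), match.end(), motif, repeats))
    return sorted(hits)


# Hypothetical read containing a (TA)8 and an (AAT)5 repeat.
example = "GGC" + "TA" * 8 + "CCGTGACC" + "AAT" * 5 + "G"
for start, end, motif, repeats in find_ssrs(example):
    print(f"({motif}){repeats} at {start}-{end}")
```

Compound repeats and the spacing rule between adjacent SSRs are deliberately left out of this sketch.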
Validation of microsatellite markers
For validation, 151 markers were selected by two approaches: (a) 50 genomic markers randomly selected from the markers developed by Illumina sequencing; (b) 101 genic markers selected based on enzyme annotation of Eucalyptus and Populus species using BLAST2GO. The 50 genomic markers were screened, and ten markers with clear bands, stable amplification and high polymorphism were used for assessing the genetic diversity of 75 genotypes from 3 locations (Table 1). From the 101 genic markers, 29 markers showing high polymorphism for wood stiffness were selected, and diversity was estimated for 24 genotypes with contrasting values for wood stiffness. For selecting genotypes, stress wave velocity (SWV) in standing trees was measured using a microsecond timer tool (FAKOPP). The two probes of the timer were inserted in line in the stem, facing each other, at a distance (L) of 1000 mm. The start probe was tapped gently with a hammer and the transit time (t, in μs) of the pulse was recorded. Each measurement was recorded three times. SWV was calculated as SWV = L/t (m/s). Twenty-four individuals, 12 genotypes each with high (HA1 to HA12) and low (LA1 to LA12) stress wave velocity, were identified as lying 1.7σp (phenotypic standard deviation) above or below the mean velocity. For both experiments, leaf samples were collected from the selected genotypes and DNA was extracted by the standardized protocol (Rawat et al. 2016).
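The SWV calculation and the 1.7σp selection rule can be sketched as follows. This is an illustration only: the probe spacing of 1.0 m, the use of the sample standard deviation as σp, the function names and all numeric values are assumptions, not data from the study.

```python
from statistics import mean, stdev


def stress_wave_velocity(distance_m, transit_times_us):
    """SWV = L / t in m/s, using the mean of repeated transit times (microseconds)."""
    t_seconds = mean(transit_times_us) * 1e-6
    return distance_m / t_seconds


def select_contrasting(velocities, k=1.7):
    """Return (high, low) tree IDs lying at least k standard deviations from the mean."""
    values = list(velocities.values())
    mu, sd = mean(values), stdev(values)  # sample SD used as a stand-in for sigma_p
    high = [tree for tree, v in velocities.items() if v >= mu + k * sd]
    low = [tree for tree, v in velocities.items() if v <= mu - k * sd]
    return high, low


# Hypothetical example: three transit-time readings for one tree, 1.0 m probe spacing.
print(round(stress_wave_velocity(1.0, [248, 250, 252])))

# Hypothetical velocities (m/s) for ten trees.
trees = {f"T{i}": 4000 for i in range(1, 9)}
trees.update({"H": 5000, "L": 3000})
print(select_contrasting(trees))
```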
Table 1.
Details of Melia dubia trees selected from plantations in Karnataka
| Sl. no. | Location | District | Trees selected | Age of plantation (years) | Latitude (North) | Longitude (East) |
|---|---|---|---|---|---|---|
| 1 | Yeshwantpur Plantation, Hoskote (Y) | Bangalore Rural | 40 | 8 years | 13°06′8.20″ | 77°50′44.04″ |
| 2 | Galli Bore Estate, Kamplapura, Periyapatna (P) | Mysore | 20 | 8 years | 12°23′31.9″ | 76°10′08.8″ |
| 3 | Arepalya, Kollegal (K) | Chamrajnagar | 15 | 8 years | 12°04′30.6″ | 77°12′20.1″ |
PCR was carried out on an Eppendorf Mastercycler (Eppendorf AG) with standardized conditions (Rawat et al. 2016). Genetic diversity estimation with the 10 genomic markers was carried out with capillary electrophoresis. The forward primers were labelled with either FAM or HEX dye for fragment detection. PCR product sizing was done by capillary electrophoresis using an ABI3730xl Genetic Analyzer (Applied Biosystems). The allelic data were then analyzed using GeneMarker 2.2.0 (SoftGenetics, State College, Pennsylvania). Diversity estimation with the 29 genic markers was carried out with agarose gel electrophoresis. Amplified PCR products were electrophoresed on 4% agarose gel in 1X TAE buffer at 70 V for 4.5–7 h, depending on the fragment size of the marker. A low molecular weight ladder (50 bp; Fermentas) was used to size the bands, and Gene Tools software was used to score them.
In both cases, genetic diversity of the populations was characterized by the number of observed alleles (Na), the effective number of alleles (Ne) (Nei 1978), observed heterozygosity (Ho), expected heterozygosity (He) estimated by reciprocal of homozygosity (Kimura and Crow 1964), Shannon’s Information Index (I) and Wright’s fixation index (Wright 1978). Data analysis for genetic diversity was done using GenAlEx ver. 6.5 (Peakall and Smouse 2006, 2012). Clustering of genotypes was done by the Neighbour Joining method using DARwin (version 6) and by Bayesian model-based analysis using STRUCTURE. STRUCTURE was used to test the hypothesis of one to ten subpopulations (K = 1 to K = 10), assuming the admixture model with correlated allele frequencies. Ten iterations and a burn-in period of 100,000 were carried out for each run. The results were imported to STRUCTURE HARVESTER, and the value of K was detected by an ad hoc quantity (∆K) based on the second-order rate of change of the likelihood function with respect to K, as suggested by Evanno et al. (2005).
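The ∆K statistic can be sketched in a few lines of Python. The version below uses the absolute second difference of the mean log-likelihood across replicate runs divided by the standard deviation of the replicates at that K, which is one common way the Evanno quantity is computed; the function name and the log-likelihood values are hypothetical and are not output from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """lnp maps K -> list of replicate ln P(D|K) values; returns {K: delta_K}."""
    ks = sorted(lnp)
    avg = {k: mean(lnp[k]) for k in ks}
    delta = {}
    for k in ks:
        # Both neighbours are needed, so the smallest and largest K get no value.
        if k - 1 not in avg or k + 1 not in avg:
            continue
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        sd = stdev(lnp[k])
        if sd > 0:
            delta[k] = second_diff / sd
    return delta


# Hypothetical replicate log-likelihoods for K = 1..4.
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-700.0, -702.0],
    4: [-695.0, -697.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

The K with the largest ∆K is taken as the most likely number of subpopulations; by construction the statistic cannot be evaluated at the first or last K tested.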
Results
Genome sequencing, SSR prediction, primer design and annotation
Sequencing of the M. dubia genome using Illumina paired-end technology generated 28,924,426 raw reads with an average read length of 150 bp, giving 3.0X coverage of the genome. The FastQ files were submitted to the Sequence Read Archive (SRA) of the National Centre for Biotechnology Information (NCBI) under the accession number SRX2834475.
A total of 27,294,460 processed reads (94.36% of raw reads) were obtained, and 19,952,623 sequences with a total size of 3,576,475,640 bp were examined. In these, 426,390 SSRs were identified, of which 11.73% were compound microsatellites. There were 136,571 sequences with repeat length > 20 bases. The GC content of the sequences was 36%.
Illumina genome sequencing produced a large number of sequences, from which numerous markers were developed in M. dubia using the in-house Perl script. The frequency, motif type and repeat length of the SSRs were markedly heterogeneous. Of the total SSRs, dinucleotides were the most frequent (63.48%), followed by tetra- (16.25%), tri- (15.60%), penta- (2.74%) and hexanucleotides (1.90%) (Fig. 1). Among the dinucleotide motifs, AT/TA repeats were the most abundant (70.08%) and CG/GC repeats the least (0.20%) (Figs. 2 and 3). Among trinucleotide repeats, motifs with AT repeats (AAT, ATA, TTA, ATT, TAA and TAT) accounted for 51.77%, with AAT the most frequent motif (13.55%). Motifs with GC repeats accounted for 8.7%, with CGG the lowest (0.09%) (Figs. 2 and 4). Among tetranucleotide repeats, the TATG motif was the most predominant (Fig. 2).
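Motif frequencies such as AT/TA are usually reported after pooling motifs that differ only by rotation or strand. A minimal sketch of that pooling and of the per-class percentages is given below; the pooling rule (smallest rotation of the motif or its reverse complement) is one common convention and may differ from the one implemented in the authors' script, and the function names and example motifs are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif):
    """Smallest string among all rotations of the motif and of its reverse complement."""
    motif = motif.upper()
    revcomp = motif.translate(COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(m))]
    return min(rotations)


def motif_class_shares(motifs):
    """Percentage of SSRs in each motif-length class (2 = di, 3 = tri, ...)."""
    counts = Counter(len(m) for m in motifs)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in sorted(counts.items())}


# Hypothetical list of detected motifs.
detected = ["TA", "AT", "AAT", "TATG"]
print(Counter(canonical_motif(m) for m in detected))
print(motif_class_shares(detected))
```

Under this rule the six trinucleotides AAT, ATA, TAA, ATT, TTA and TAT all collapse to the single class AAT.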
Fig. 1.

Distribution of SSR motifs in M. dubia genome
Fig. 2.

Distribution of SSR motifs with most and least frequent motif in each class
Fig. 3.

Characterization of dinucleotide microsatellites in M. dubia genome
Fig. 4.

Characterization of trinucleotide microsatellites in M. dubia genome
To find similarity with other timber-yielding species, annotation was performed against Eucalyptus and Populus species. Out of 426,390 microsatellite sequences in M. dubia, 3370 and 2054 were found to have similarity with Eucalyptus and Populus SSR sequences, respectively, and the product size ranged from 54 to 286 bp. Among 16 species of Eucalyptus, 98.01% of the microsatellite sequences were from Eucalyptus grandis, and among 18 Populus species, 94.6% were from Populus trichocarpa. Specific to wood properties, 3757 microsatellite sequences were identified in M. dubia, of which 626 and 473 sequences were found to have similarity with Eucalyptus and Populus SSR sequences, respectively (Fig. 5). Further, 24 enzymes for wood properties having similarity with these two species were filtered and 101 genic markers were annotated (Fig. 6).
Fig. 5.

Venn diagram depicting similarities between annotated SSRs specific to wood properties of M. dubia, Eucalyptus and Populus species
Fig. 6.
Annotated 101 genic SSRs of M. dubia for 24 enzymes based on wood properties
Validation of microsatellite markers
Selection of microsatellites
A set of 151 microsatellite loci, designated MSSR 1 to MSSR 151, was chosen for marker validation. The majority of loci were tri- (67), followed by di- (58), tetra- (18), penta- (5) and hexanucleotide (3) repeats. All 151 SSR primer pairs tested generated PCR products, and 121 generated high-quality, reproducible amplicons of the expected size (Supplementary Tables 1 and 2). DNA fragment length ranged from 71 to 276 bp. The thirty primer pairs that produced multiple bands were difficult to evaluate and were excluded from further analysis.
Validation of genomic markers
To assess the genetic relationship among 75 genotypes from three populations, genotyping was carried out by capillary electrophoresis with 10 highly polymorphic markers (Table 2). The species showed a moderate level of diversity (Na = 9.17; Ne = 4.43; Ho = 0.63 and He = 0.71) with a positive fixation index (Table 3). Among the three populations, the Yeshwantpur population showed greater variation in terms of observed alleles (11.40) and effective alleles (4.66). For all the populations, the observed heterozygosity was lower than expected, with a positive fixation index. Among the ten markers, MSSR 2 showed the highest observed heterozygosity with a negative fixation index (Table 4). Cluster analysis of the 75 genotypes revealed three major clusters, in which only 33% of genotypes were distinctly segregated as per geographical location (Fig. 7). Cluster I had the majority of its genotypes from Yeshwantpur and cluster II from the Hunsur area. Cluster III had genotypes from all three locations. It can therefore be inferred that the majority of genotypes did not segregate as per geographical location. STRUCTURE results analysed in STRUCTURE HARVESTER showed the highest ∆K peak at K = 3, suggesting three subpopulations for the 75 genotypes of M. dubia (Fig. 8). At K = 3 (SG1 to SG3), 29 genotypes were pure and the remainder were admixtures.
Table 2.
Details of genomic markers (nucleotide, primer sequence, expected and observed product size) used for diversity analysis
| Sl. no. | Primer name | SSR motif | Primer sequence | Expected product size (bp) | Observed product size (bp) |
|---|---|---|---|---|---|
| 1 | MSSR2 | (TTA)12 | F: CGTCGAACAAGCGAGCAGAACA R: AACGCCGACCGAGCGTAACT | 113 | 89–154 |
| 2 | MSSR3 | (ATA)12 | F: GCAACAAGTGGCATTAGCATAGGCA R: AACGTATGCATCAGCCGAGATTTGA | 177 | 143–179 |
| 3 | MSSR7 | (TAA)15 | F: ACGCAAAGCTTCGAGAACCTTCAA R: ACGATGTGGGCGTTCTACGCA | 164 | 146–166 |
| 4 | MSSR10 | (ATC)14 | F: ACGCGATACCAAGTCATGTGGGA R: AGCATGGACCGAGCCAACCA | 117 | 97–136 |
| 5 | MSSR11 | (TA)19 | F: CACAAGTACACCACATGCGCCA R: ACACCAATCTGGTCCTCCGTCC | 149 | 139–153 |
| 6 | MSSR12 | (AT)16 | F: CCTGCCTAGTTGAAGTGAGTGGCA R: AACCTCGTTGGATTGAGGCATGTT | 183 | 157–194 |
| 7 | MSSR15 | (TC)16 | F: TCCTTACTATGTTCGGCGGGCA R: ACCGACAGACCCACCAGTGT | 156 | 145–177 |
| 8 | MSSR18 | (AT)17 | F: CAAGCCAGTCGCAGTCTCGT R: AGCTAGCTGTCGTCCCTGACTT | 132 | 113–145 |
| 9 | MSSR34 | (ATC)10 | F: TCTGTGTTGACGTTTGCCTCCAA R: CCGAGAAGTCCTCTTTGCTCATCG | 100 | 90–130 |
| 10 | MSSR45 | (AACA)6 | F: TCCAATTCTCGAATTCCTTGGAGCC R: ACCAGTGCCCAGATTGAGTTTGCT | 221 | 197–221 |
Table 3.
Overall genetic diversity estimates of 3 populations with 10 genomic SSR markers in M. dubia
| Parameter | Pop 1 (n = 40) | Pop 2 (n = 20) | Pop 3 (n = 15) | Mean |
|---|---|---|---|---|
| Na | 11.40 | 8.50 | 7.60 | 9.17 |
| Ne | 4.66 | 4.06 | 4.58 | 4.43 |
| I | 1.74 | 1.62 | 1.58 | 1.65 |
| Ho | 0.61 | 0.63 | 0.65 | 0.63 |
| He | 0.71 | 0.71 | 0.70 | 0.71 |
| F | 0.13 | 0.15 | 0.09 | 0.12 |
Number of alleles (Na), effective number of alleles (Ne), Shannon’s Information Index(I), observed heterozygosity (Ho), expected heterozygosity (He) and fixation index (F)
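For readers who want to reproduce these parameters, the sketch below computes them for a single locus from diploid genotypes. It assumes the usual frequency-based definitions (Ne = 1/Σp², He = 1 − Σp², F = 1 − Ho/He, I = −Σ p ln p); the function name and the allele sizes are hypothetical, and the study itself used GenAlEx rather than this code.

```python
from collections import Counter
from math import log


def locus_diversity(genotypes):
    """genotypes: list of (allele1, allele2) tuples for one locus in one population."""
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)            # expected homozygosity
    he = 1.0 - sum_p2                             # expected heterozygosity
    ho = sum(1 for a, b in genotypes if a != b) / n
    return {
        "Na": len(freqs),                         # number of observed alleles
        "Ne": 1.0 / sum_p2,                       # effective number of alleles
        "I": -sum(p * log(p) for p in freqs),     # Shannon's Information Index
        "Ho": ho,
        "He": he,
        "F": 1.0 - ho / he if he > 0 else float("nan"),  # fixation index
    }


# Hypothetical genotypes (allele sizes in bp) for four individuals at one locus.
sample = [(100, 100), (100, 104), (104, 104), (100, 104)]
print(locus_diversity(sample))
```

Under these definitions He and Ne are linked by He = 1 − 1/Ne; for example, Ne = 6.85 for MSSR 2 in Pop 1 (Table 4) corresponds to He ≈ 0.85.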
Table 4.
Genetic diversity estimates of 3 populations with 10 genomic SSR markers in M. dubia
| Population | Parameter | MSSR 2 | MSSR 3 | MSSR 7 | MSSR 10 | MSSR 11 | MSSR 12 | MSSR 15 | MSSR 18 | MSSR 34 | MSSR 45 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pop 1 (n = 40) | Na | 19 | 8 | 12 | 10 | 8 | 12 | 12 | 12 | 19 | 2 |
| | Ne | 6.85 | 4.22 | 4.32 | 3.86 | 3.59 | 3.87 | 3.78 | 5.54 | 9.52 | 1.05 |
| | I | 2.37 | 1.65 | 1.89 | 1.77 | 1.59 | 1.80 | 1.70 | 1.95 | 2.55 | 0.12 |
| | Ho | 0.98 | 0.63 | 0.80 | 0.50 | 0.15 | 0.73 | 0.78 | 0.60 | 0.90 | 0.05 |
| | He | 0.85 | 0.76 | 0.77 | 0.74 | 0.72 | 0.74 | 0.74 | 0.82 | 0.90 | 0.05 |
| | F | −0.14 | 0.18 | −0.04 | 0.33 | 0.79 | 0.02 | −0.05 | 0.27 | −0.01 | −0.03 |
| Pop 2 (n = 20) | Na | 10 | 9 | 7 | 6 | 7 | 8 | 11 | 10 | 12 | 5 |
| | Ne | 5.76 | 4.28 | 3.72 | 3.35 | 3.27 | 2.53 | 5.93 | 3.74 | 6.50 | 1.54 |
| | I | 2.01 | 1.67 | 1.58 | 1.41 | 1.47 | 1.36 | 2.04 | 1.75 | 2.15 | 0.78 |
| | Ho | 0.95 | 0.70 | 0.85 | 0.55 | 0.15 | 0.50 | 0.95 | 0.75 | 0.80 | 0.10 |
| | He | 0.83 | 0.77 | 0.73 | 0.70 | 0.69 | 0.61 | 0.83 | 0.73 | 0.85 | 0.35 |
| | F | −0.15 | 0.09 | −0.16 | 0.22 | 0.78 | 0.17 | −0.14 | −0.02 | 0.05 | 0.71 |
| Pop 3 (n = 15) | Na | 12 | 4 | 7 | 6 | 3 | 11 | 11 | 10 | 10 | 2 |
| | Ne | 7.26 | 2.26 | 5.92 | 4.21 | 2.53 | 7.63 | 4.21 | 5.63 | 5.06 | 1.14 |
| | I | 2.20 | 1.00 | 1.85 | 1.56 | 1.01 | 2.19 | 1.87 | 2.00 | 1.90 | 0.24 |
| | Ho | 1.00 | 0.53 | 0.87 | 0.80 | 0.00 | 0.93 | 0.93 | 0.53 | 0.73 | 0.13 |
| | He | 0.86 | 0.56 | 0.83 | 0.76 | 0.60 | 0.87 | 0.76 | 0.82 | 0.80 | 0.12 |
| | F | −0.16 | 0.04 | −0.04 | −0.05 | 1.00 | −0.07 | −0.22 | 0.35 | 0.09 | −0.07 |
| Mean | Na | 13.67 | 7.00 | 8.67 | 7.33 | 6.00 | 10.33 | 11.33 | 10.67 | 13.67 | 3.00 |
| | Ne | 6.62 | 3.59 | 4.66 | 3.81 | 3.13 | 4.68 | 4.64 | 4.97 | 7.03 | 1.24 |
| | I | 2.19 | 1.44 | 1.77 | 1.58 | 1.36 | 1.78 | 1.87 | 1.90 | 2.20 | 0.38 |
| | Ho | 0.98 | 0.62 | 0.84 | 0.62 | 0.10 | 0.72 | 0.89 | 0.63 | 0.81 | 0.09 |
| | He | 0.85 | 0.70 | 0.78 | 0.73 | 0.67 | 0.74 | 0.78 | 0.79 | 0.85 | 0.17 |
| | F | −0.15 | 0.10 | −0.08 | 0.16 | 0.86 | 0.04 | −0.14 | 0.20 | 0.04 | 0.21 |
Number of alleles (Na), effective number of alleles (Ne),Shannon’s Information Index(I), observed heterozygosity (Ho), expected heterozygosity (He) and fixation index (F)
Fig. 7.
Cluster analysis of 75 genotypes with 10 SSR markers
Fig. 8.
Structure analysis of 75 genotypes with 10 SSR markers
Validation of genic markers
To validate SSR markers for the trait wood stiffness, 29 genic markers showing trait-related polymorphism were used to estimate diversity in 24 genotypes (12 with high and 12 with low stress wave velocity). These 29 markers represented 13 enzymes relating to wood properties (cellulose synthase, cinnamyl alcohol dehydrogenase, caffeate O-methyltransferase, 4-coumarate CoA ligase, laccase, alpha expansion 11 family, pectate lyase, auxin-responsive protein, glycosyltransferase, pectinesterase, aminoacylase, peroxidase and MYB83-like protein).
Among the 29 markers, nine were dinucleotide, 15 trinucleotide and 5 tetranucleotide repeats. The product size for the 29 markers ranged from 87 to 279 bp, covering the majority of the genome. Diversity estimates are presented in Table 5. The observed number of alleles ranged from 5 to 26, with the highest for MSSR 94 followed by MSSR 54 and 52. For most markers (25), observed heterozygosity was nil, with the highest positive fixation index, and only four markers, viz. MSSR 52, 54, 65 and 94, showed heterozygosity. Cluster analysis of the 24 genotypes with 29 markers resulted in two major clusters, in which 83.3% of genotypes segregated as per the trait (Fig. 9). Structure analysis showed 100% pure genotypes and, considering K = 2, 85% of genotypes segregated as per the trait (Fig. 10).
Table 5.
Genetic diversity parameters with 29 genic SSR markers in 24 genotypes for wood stiffness in M. dubia
| Sl. no. | Primers | Enzyme details | Repeats | Expected size | Observed size | Na | Ne | Ho | He | F | I |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | MSSR 51 | Cellulose synthase | (TA)12 | 162 | 128–177 | 19.0 | 15.2 | 0.00 | 0.93 | 1.00 | 2.85 |
| 2 | MSSR 52 | Cellulose synthase | (TA)10 | 181 | 165–195 | 18.0 | 14.6 | 0.04 | 0.93 | 0.96 | 2.79 |
| 3 | MSSR 54 | Cellulose synthase | (TA)10 | 99 | 92–112 | 18.0 | 11.0 | 0.46 | 0.91 | 0.50 | 2.62 |
| 4 | MSSR 64 | Cinnamyl alcohol dehydrogenase | (ATAC)6 | 211 | 201–210 | 9.0 | 4.8 | 0.00 | 0.79 | 1.00 | 1.85 |
| 5 | MSSR 65 | Cinnamyl alcohol dehydrogenase | (ATAC)6 | 217 | 198–219 | 10.0 | 7.0 | 0.13 | 0.86 | 0.85 | 2.08 |
| 6 | MSSR 68 | Caffeate o methyl transferase | (ATTT)4 | 125 | 145–162 | 14.0 | 9.9 | 0.00 | 0.90 | 1.00 | 2.46 |
| 7 | MSSR 69 | Caffeate o methyl transferase | (TG)6 | 276 | 268–281 | 8.0 | 6.8 | 0.04 | 0.85 | 0.95 | 1.99 |
| 8 | MSSR 70 | Caffeate o methyl transferase | (GAA)6 | 156 | 146–158 | 11.0 | 7.8 | 0.04 | 0.87 | 0.95 | 2.19 |
| 9 | MSSR 79 | 4 coumarate CoA ligase | (CGA)5 | 105 | 99–105 | 7.0 | 6.0 | 0.00 | 0.83 | 1.00 | 1.85 |
| 10 | MSSR 82 | 4 coumarate CoA ligase | (CGA)5 | 145 | 134–151 | 10.0 | 3.9 | 0.00 | 0.74 | 1.00 | 1.79 |
| 11 | MSSR 83 | 4 coumarate CoA ligase | (TCG)5 | 161 | 157–166 | 8.0 | 4.7 | 0.00 | 0.79 | 1.00 | 1.79 |
| 12 | MSSR 84 | 4 coumarate CoA ligase | (CGA)5 | 161 | 145–162 | 14.0 | 9.9 | 0.00 | 0.90 | 1.00 | 2.46 |
| 13 | MSSR 88 | Laccase | (GT)6 | 160 | 156–163 | 7.0 | 5.0 | 0.00 | 0.80 | 1.00 | 1.74 |
| 14 | MSSR 90 | Laccase | (AAT)7 | 176 | 163–180 | 10.0 | 8.2 | 0.00 | 0.88 | 1.00 | 2.20 |
| 15 | MSSR 93 | Laccase | (TA)7 | 267 | 260–269 | 8.0 | 5.9 | 0.00 | 0.83 | 1.00 | 1.92 |
| 16 | MSSR 94 | Alpha expansion 11 family | (TA)8 | 217 | 179–257 | 26.0 | 20.9 | 0.67 | 0.95 | 0.30 | 3.14 |
| 17 | MSSR 97 | Pectate lyase | (TGCA)4 | 240 | 216–245 | 15.0 | 9.0 | 0.00 | 0.89 | 1.00 | 2.48 |
| 18 | MSSR 98 | Pectate lyase | (CAA)6 | 210 | 202–211 | 9.0 | 5.5 | 0.00 | 0.82 | 1.00 | 1.94 |
| 19 | MSSR 108 | Auxin-responsive protein | (ATA)7 | 242 | 239–243 | 5.0 | 3.4 | 0.00 | 0.70 | 1.00 | 1.39 |
| 20 | MSSR 109 | Glycosyltransferase | (ATG)5 | 201 | 200–205 | 5.0 | 3.2 | 0.00 | 0.69 | 1.00 | 1.33 |
| 21 | MSSR 110 | Glycosyltransferase | (ATG)5 | 197 | 188–203 | 12.0 | 10.3 | 0.00 | 0.90 | 1.00 | 2.41 |
| 22 | MSSR 111 | Aminoacylase-1 | (GAA)5 | 214 | 200–218 | 14.0 | 10.7 | 0.00 | 0.91 | 1.00 | 2.50 |
| 23 | MSSR 114 | Pectinesterase | (GAA)5 | 179 | 156–185 | 16.0 | 12.0 | 0.00 | 0.92 | 1.00 | 2.64 |
| 24 | MSSR 117 | Pectinesterase | (TCA)5 | 111 | 102–114 | 10.0 | 6.9 | 0.00 | 0.85 | 1.00 | 2.11 |
| 25 | MSSR 119 | Auxin-responsive protein | (CAT)5 | 245 | 241–250 | 8.0 | 5.8 | 0.00 | 0.83 | 1.00 | 1.87 |
| 26 | MSSR 123 | MYB83-like protein | (CTA)6 | 252 | 212–250 | 19.0 | 16.0 | 0.00 | 0.94 | 1.00 | 2.87 |
| 27 | MSSR 131 | Peroxidase | (ATTA)5 | 97 | 87–104 | 15.0 | 12.5 | 0.00 | 0.92 | 1.00 | 2.61 |
| 28 | MSSR 136 | Peroxidase | (TC)7 | 177 | 167–177 | 11.0 | 7.0 | 0.00 | 0.86 | 1.00 | 2.16 |
| 29 | MSSR 150 | Laccase | (AT)8 | 120 | 119–124 | 5.0 | 3.7 | 0.00 | 0.73 | 1.00 | 1.41 |
| Mean | | | | | | 11.8 | 8.5 | 0.05 | 0.85 | 0.95 | 2.19 |
Number of alleles (Na), effective number of alleles (Ne), observed heterozygosity (Ho), expected heterozygosity (He), fixation index (F), Shannon’s Information Index (I)
Fig. 9.
Cluster analysis of 24 genotypes with 29 SSR markers
Fig. 10.
Structure analysis of 24 genotypes with 29 SSR markers
Discussion
In non-model plants for which no genetic information is available, high-throughput NGS technologies are the most cost-effective and rapid means of developing SSRs. They provide more coverage than the conventional whole-genome sequencing approach, facilitating the development of many markers, as observed in our study. In Meliaceae members, genome sequencing for the development of SSRs has been reported in Khaya senegalensis and Azadirachta indica (Karan et al. 2012; Krishnan et al. 2012). To our knowledge, this is the first report on the development of microsatellites through genome sequencing in M. dubia. Using the direct Seq-to-SSR approach, 426,390 SSRs were generated, with 88.27% perfect microsatellites and 11.73% compound microsatellites. According to Macaubas et al. (1996), perfect repeats have higher mutation rates than imperfect loci and are expected to yield more polymorphism. Higher allelic variation is observed with an increase in the number of repeats (Queller et al. 1993). Various characteristics of the markers were also analysed. The sequenced genome had an average GC content of 36%, which is similar to other plant species such as tomato (36.2%), potato (35.6%), rubber (36.2%) and safflower (38%) (Jaillon et al. 2007; Zhu et al. 2008; Pootakham et al. 2012; Tangphatsornruang et al. 2009; Ambreen et al. 2015). In Azadirachta indica, a Meliaceae member, the mean GC content in the raw genome sequence was ~30% (Krishnan et al. 2012). Dicot genomes are reported to have lower GC content than monocots, which is reflected in an average difference in codon usage between them (Fennoy and Bailey-Serres 1993; Carels and Bernardi 2000). Among all the motifs, dinucleotides were the most abundant, accounting for 63.48% of the total markers. However, in another Meliaceae member, Khaya senegalensis, trinucleotides were more frequent than dinucleotides (Karan et al. 2012).
The prominence of dinucleotide motifs has been reported in the genomes of several plant species (Liu et al. 2020; Dutta et al. 2011; Wei et al. 2011; Wang et al. 2010). Interestingly, in M. dubia the dominant di-, tri- and tetranucleotide motifs were all A/T rich. In dicots, A/T-rich repeats have been reported to be more frequent (Sonah et al. 2011). AT-rich nucleotide composition has also been reported in A. indica (Krishnan et al. 2012), and the prominence of AT repeats has been reported in several other plant genomes (Ambreen et al. 2015; Powell et al. 1996; Tóth et al. 2000; Ellegren 2004). In plant genomes, the dominance of repeat motifs of a particular sequence and length is the consequence of selection pressures applied on that specific motif during evolution (Sonah et al. 2011).
For validation of the microsatellites, 151 were chosen with a maximum representation of trimeric repeats. Trimeric repeats have a greater probability of occurring in coding regions (Morgante et al. 2002), and a stronger marker-gene or marker-trait association can be expected from these repeats (Ambreen et al. 2015). It has been suggested that selective forces allow the expansion of only trimeric repeats in coding regions, to avoid frameshift mutations that could alter protein functionality (Ellegren 2004). All the microsatellites tested generated a PCR product, and 81.4% produced a high-quality amplicon, signifying the accuracy of genome sequencing and assembly and suggesting that the microsatellites are valid and applicable. The process of SSR development is subject to attrition at each step. Attrition of 50% between primer design and successful amplification of SSR loci has been reported (Pootakham et al. 2012; Squirrell et al. 2003).
In this study, we validated both genomic and genic SSRs. Genomic markers have no defined genic function; therefore, they are less likely to have close linkages to transcribed regions (Hu et al. 2011). However, genic SSR markers derived from EST sequences are functionally relevant as they are identified from expressed regions (Casu et al. 2001; Carson and Botha 2000).
Genetic relationships among 75 genotypes from three populations were assessed with 10 highly polymorphic genomic markers. A total of 114 alleles were detected, ranging from 2 to 19 per locus, and Shannon’s information index was 1.65 (Tables 3 and 4), indicating that the primers provided abundant polymorphic information for the 75 genotypes.
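For a single locus, Shannon's information index is commonly computed as I = -Σ p_i ln p_i over the allele frequencies p_i. The sketch below assumes this standard form with the natural logarithm; the function name is hypothetical and the allele counts are invented for illustration, not taken from this study.

```python
import math


def shannon_index(allele_counts):
    """Shannon's information index I = -sum(p_i * ln p_i) for one locus,
    computed from the observed counts of each allele."""
    total = sum(allele_counts)
    freqs = [count / total for count in allele_counts if count > 0]
    return -sum(p * math.log(p) for p in freqs)


# Invented example: a locus with four alleles at unequal frequencies.
print(round(shannon_index([60, 40, 30, 20]), 3))
```

The index grows with both the number of alleles and the evenness of their frequencies, so a locus with many alleles at similar frequencies scores higher than one dominated by a single allele.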
In a population, gene heterozygosity is considered an optimum parameter to measure genetic diversity (Ott 1991). The study indicated that the germplasm was diverse, with 9.17 observed alleles, an observed heterozygosity Ho = 0.63 and an expected heterozygosity He = 0.71. Thus, the genetic diversity observed in M. dubia is moderate. Our results are consistent with similar studies in other members of Meliaceae, namely Khaya senegalensis (Karan et al. 2012), Cedrela balansae (Soldati et al. 2013), Swietenia humilis (White and Powell 1997), and S. macrophylla (Novick et al. 2003), in which the He values were 0.739, 0.643, 0.548 and 0.657 using 11, 7, 10 and 7 highly polymorphic SSRs, respectively. However, He values higher than 0.8 have been observed in some other Meliaceae species such as C. fissilis (Kageyama et al. 2004) and C. odorata (Hernández 2008).
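The relationship between the two heterozygosity values and the positive fixation index can be checked with a short sketch. It assumes the standard definitions He = 1 - Σ p_i² and F = 1 - Ho/He, with Ho read as observed and He as expected heterozygosity, as is conventional; the function names are hypothetical, and the exact estimator used by the analysis software may differ (for example, by a small-sample correction).

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity He = 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in allele_freqs)


def fixation_index(h_obs, h_exp):
    """Fixation index F = 1 - Ho / He; positive values indicate a
    deficit of heterozygotes relative to expectation."""
    return 1.0 - h_obs / h_exp


# Values reported in the text: Ho = 0.63, He = 0.71.
print(round(fixation_index(0.63, 0.71), 3))  # 0.113
```

A value of about 0.11 is positive but modest, in line with the moderate diversity and positive fixation index described for these populations.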
Genetic differentiation among the 75 genotypes from three locations was not based on geographic distance. The three locations, though geographically distinct, were plantations raised from seed sources. This could be one of the reasons for the relatively moderate diversity. In another study on M. dubia, clustering of natural populations was likewise not based on geographical distance (Communicated). Another reason could be the scattered distribution of M. dubia. The species has a restricted distribution in South India and North East India (Troup 1921), though it is currently being popularized through plantations as a raw material for the plywood industry. Hamrick et al. (1992) suggest that natural distribution range is a good predictor of a species’ genetic variation. This implies that species like M. dubia, which have a restricted distribution, tend to exhibit lower genetic diversity than more widely distributed Meliaceae members like A. indica.
Most tree improvement programmes are aimed at improving biomass and wood traits. Considering the long gestation period, such programmes would benefit tremendously from advanced molecular technologies that contribute to molecular breeding through marker-assisted or gene-assisted selection. Many studies report the analysis of genes expressed during wood formation and xylogenesis in globally important species like Eucalyptus (Moran et al. 2002; Paux et al. 2004; Creux et al. 2008; Grattapaglia and Kirst 2008) and Populus (Creux et al. 2008; Wegrzyn et al. 2010), where some important metabolic pathways are now well known.
However, when non-model species are taken up for tree improvement on account of their potential as an important resource, only small to medium breeding programmes are drawn up. Genome sequencing is an ideal strategy for generating DNA sequence information, analysing sequences and completely decoding a species' genome. Where financial resources are inadequate, further advancements such as transcriptome analysis and differential gene expression studies may not be feasible. However, to develop genic markers, transcriptome analysis of coding sequences with known assigned functions is essential.
M. dubia has exceptional wood properties and tremendous potential for wood-based products; identifying superior trait-specific genotypes is therefore of utmost importance for improving the species. In the absence of a reference genome, identifying trait-specific genic markers was difficult, and transcriptome sequencing would have demanded greater financial resources. We therefore attempted a new strategy for identifying wood-related sequences using the whole-genome sequence of M. dubia. Our approach was to target trait-specific regions in the reference genomes of other woody species, identify similar sequences using bioinformatics tools, and validate the resulting SSR markers. Since eucalypts and poplars are the most commonly used timber species globally and their genomic resources are publicly available, we selected these databases to conduct a BLAST analysis against the whole genome of M. dubia. Such information is not available for members of the Meliaceae family. We obtained 0.79% and 0.48% matches of Eucalyptus and Populus microsatellites, respectively, to M. dubia. The identified markers were annotated, and 101 genic SSR markers representing 24 enzymes with significant similarity to wood trait-related genes were identified. Though polymorphism in genic regions is low, the likelihood of identifying functional variability among genotypes is high (MathiThumilan et al. 2016). In our study, amplification was observed in 84 genic markers, and 29 genic markers representing 13 enzymes showed polymorphism for wood stiffness. The mean number of alleles was 11.8 and Shannon’s information index was 2.19, indicating that the primers were polymorphic for the trait. Twenty-five markers were homozygous with a high fixation index, which might be because of higher inbreeding. Structure analysis showed that 85% of genotypes could be segregated as per trait, indicating that these markers could differentiate the high and low stress wave velocity groups.
The usefulness of a genic SSR marker is higher if it is transferable. Though genomic resources are available for other Meliaceae species like A. indica (Kuravadi et al. 2015), information on wood traits is lacking. In our case, adopting sequences from unrelated species, namely Eucalyptus and Populus, and validating them in M. dubia suggests the transferability of the genic SSR markers. In contrast, in a separate study (unpublished; data not shown), we observed that 20 genic SSRs for wood traits obtained directly from the Eucalyptus and Populus genomes did not cross-amplify in M. dubia. This suggests that, evolutionarily, these traits may have a certain degree of relatedness at the genome level. Therefore, sequences related to common traits may be transferable across species. Segregation of the genotypes based on trait using the genic SSRs is considered a robust validation of the identified markers. Considering the scarce availability of functional markers in this commercial species, this strategy appears to be a plausible and resource-efficient one for identifying genic markers. In the absence of transcriptome information, this study provides a strong impetus for developing genic co-dominant SSR markers by assigning functionality to genomic markers.
Conclusion
This study is a novel attempt at genome sequencing and developing SSR markers in M. dubia using Illumina paired-end sequencing technology in the absence of a reference genome. Using the direct Seq-to-SSR approach, 426,390 SSR-containing sequences were identified and 151 markers were experimentally validated. All markers generated PCR products and 121 generated amplicons of the expected size. Genomic markers efficiently estimated genetic diversity. To the best of our knowledge, this paper is the first to report the identification of genic markers related to wood traits from genomic markers through BLAST analysis. The generated markers were used successfully to segregate the M. dubia genotypes into high and low stress wave velocity groups. More genic markers for wood traits can be identified in the future by annotating against other tree species. These genic SSR markers would provide a valuable resource for developing trait-specific markers for marker-assisted selection in M. dubia and a set of valuable tools for the future breeding of this species.
Data archiving statement
This research contains no data that requires submission to a public database. The details have already been deposited in the GenBank—National Centre for Biotechnology Information (NCBI) under the accession number SRX2834475. The SSRs selected for the study are listed in Supplementary Tables 1 and 2.
Accession numbers
The FastQ files containing the raw reads were submitted to the sequence read archive (SRA) at the National Centre for Biotechnology Information (NCBI) under the accession number SRX2834475.
Acknowledgements
We are grateful to the Director and the Group Coordinator of Research at the Institute of Wood Science and Technology, Bengaluru for their encouragement and providing the necessary facilities. The authors appreciate the timely suggestion by Dr. Nataraj Karaba for genome sequencing of Melia dubia and also Dr. Modhumita Dasgupta for the initial stage of negotiation with Genotypic Technology [P] Ltd., India. We are grateful to Dr. S. S. Chauhan for helping in identifying genotypes for wood stiffness. We sincerely acknowledge Karnataka State Forest Department and farmers for allowing us to visit the plantations of M. dubia to collect data and samples for this study.
Authors' contributions
AD was involved in sample collection, methodology, laboratory work, data curation, formal analysis and manuscript writing. RRW contributed to data curation, manuscript reviewing and editing. ANA carried out fieldwork, partial funding acquisition and manuscript reviewing. AR and SCN supported laboratory work. GJ contributed to conceptualization, fieldwork, methodology, resources, supervision, partial funding acquisition, writing the original draft and editing. All authors read and approved the final manuscript.
Funding
This work was supported by the Karnataka Forest Department and Indian Council of Forestry Research and Education.
Declarations
Conflict of interest
The authors declare no conflicts of interest.
References
- Ambreen H, Kumar S, Variath MT, Joshi G, Bali S, Agarwal M, Kumar A, Jagannath A, Goel S. Development of genomic microsatellite markers in Carthamus tinctorius L. (Safflower) using next generation sequencing and assessment of their cross-species transferability and utility for diversity analysis. PLoS ONE. 2015;10(8):e0135443. doi: 10.1371/journal.pone.0135443.
- Anand B, Devagiri GM, Maruti G, Vasudev HS, Anil KK. Effects of pre-sowing seed treatments on germination and seedling growth performance of Melia dubia Cav.: an important multipurpose tree. Inter J Life Sci. 2012;1(3):59–63.
- Bai X, Zhang W, Orantes L, Jun TH, Mittapalli O, Mian MR, Michel AP. Combining next-generation sequencing strategies for rapid molecular resource development from an invasive aphid species Aphis glycines. PLoS ONE. 2010;5:e11370. doi: 10.1371/journal.pone.0011370.
- Boontong C, Pandey M, Changtragoon S. Isolation and characterization of microsatellite markers in Indian neem (Azadirachta indica var. indica A. Juss) and cross-amplification in Thai neem (A. indica var. siamensis Valeton). Conserv Genet. 2009;10(3):669–671. doi: 10.1007/s10592-008-9610-5.
- Byrne M, Marquezgarcia MI, Uren T, Smith DS, Moran GF. Conservation and genetic diversity of microsatellite loci in the genus Eucalyptus. Aust J Bot. 1996;44(3):331–341. doi: 10.1071/BT9960331.
- Carels N, Bernardi G. Two classes of genes in plants. Genetics. 2000;154(4):1819–2182. doi: 10.1093/genetics/154.4.1819.
- Carson DL, Botha FC. Preliminary analysis of expressed sequence tags for sugarcane. Crop Sci. 2000;40:1769–1779. doi: 10.2135/cropsci2000.4061769x.
- Castoe TA, Poole AW, de Koning AJ, Jones KL, Tomback DF, Oyler-McCance SJ, Fike JA, Lance SL, Streicher JW, Smith EN, Pollock DD. Correction: Rapid microsatellite identification from Illumina paired-end genomic sequencing in two birds and a snake. PLoS ONE. 2015;10(8):e0136465. doi: 10.1371/journal.pone.0136465.
- Casu R, Dimmock C, Thomas M, Bower N, Knight D, Grof C. Genetic and expression profiling in sugarcane. Proc Int Soc Sugarcane Technol. 2001;24:626–627.
- Collard BC, Mackill DJ. Marker-assisted selection: an approach for precision plant breeding in the twenty-first century. Philos Trans R Soc Lond B Biol Sci. 2008;363:557–572. doi: 10.1098/rstb.2007.2170.
- Creux NM, Ranik M, Berger DK, Myburg AA. Comparative analysis of orthologous cellulose synthase promoters from Arabidopsis, Populus and Eucalyptus: evidence of conserved regulatory elements in angiosperms. New Phytol. 2008;179(3):722–737. doi: 10.1111/j.1469-8137.2008.02517.x.
- Dutta S, Kumawat G, Singh BP, Gupta DK, Singh S, Dogra V, Gaikwad K, Sharma TR, Raje RS, Bandhopadhya TK, Datta S. Development of genic-SSR markers by deep transcriptome sequencing in pigeon pea (Cajanus cajan (L.) Millspaugh). BMC Plant Biol. 2011;11(1):17. doi: 10.1186/1471-2229-11-17.
- Ellegren H. Microsatellites: Simple sequences with complex evolution. Nat Rev Genet. 2004;5:435–445. doi: 10.1038/nrg1348.
- Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol. 2005;14:2611–2620. doi: 10.1111/j.1365-294X.2005.02553.x.
- Feldmeyer B, Wheat CW, Krezdorn N, Rotter B, Pfenninger M. Short read Illumina data for the de novo assembly of a non-model snail species transcriptome (Radix balthica, Basommatophora, Pulmonata), and a comparison of assembler performance. BMC Genomics. 2011;12:317. doi: 10.1186/1471-2164-12-317.
- Fennoy SL, Bailey-Serres J. Synonymous codon usage in Zea mays L. nuclear genes is varied by levels of C and G-ending codons. Nucleic Acids Res. 1993;21(23):5294–5300. doi: 10.1093/nar/21.23.5294.
- Fu N, Wang Q, Shen HL. De novo assembly, gene annotation and marker development using Illumina paired-end transcriptome sequences in celery (Apium graveolens L.). PLoS ONE. 2013;8:e57686. doi: 10.1371/journal.pone.0057686.
- Garg R, Patel RK, Tyagi AK, Jain M. De novo assembly of chickpea transcriptome using short reads for gene discovery and marker identification. DNA Res. 2011;18:53–63. doi: 10.1093/dnares/dsq028.
- Gilmore B, Bassil N, Nyberg A, Knaus B, Smith D, Barney DL, Hummer K. Microsatellite marker development in Peony using next generation sequencing. J Am Soc Hort Sci. 2013;138:64–74. doi: 10.21273/JASHS.138.1.64.
- Grattapaglia D, Kirst M. Eucalyptus applied genomics: from gene sequences to breeding tools. New Phytol. 2008;179(4):911–929. doi: 10.1111/j.1469-8137.2008.02503.x.
- Hamrick JL, Godt MJW, Sherman-Broyles SL. Factors influencing levels of genetic diversity in woody plant species. New For. 1992;6:95–124. doi: 10.1007/BF00120641.
- Hanaoka S, Muturi GM, Watanabe A. Isolation and characterization of microsatellite markers in Melia volkensii Gurke. Conserv Genet Resour. 2012;4(2):395–398. doi: 10.1007/s12686-011-9558-5.
- Hemmilä S, Mohana Kumara P, Ravikanth G, Gustafsson S, Sreejayan N, Vasudeva R, Ganeshaiah KN, Uma Shaanker R, Lascoux M. Development of polymorphic microsatellite loci in the endangered tree species Dysoxylum malabaricum. Mol Ecol Resour. 2010;10:404–408. doi: 10.1111/j.1755-0998.2009.02827.x.
- Hernández LG (2008) Genetic diversity and mating system analysis of Cedrela odorata L. (Meliaceae) populations under different human dominated landscapes and primary forests. CATIE, Masters Thesis, pp 89
- Hu J, Wang L, Li J. Comparison of genomic SSR and EST-SSR markers for estimating genetic diversity in cucumber. Biol Plantarum. 2011;55(3):577–580. doi: 10.1007/s10535-011-0129-0.
- Ismail SA, Ghazoul J, Ravikanth G, Kushalappa CG, Shaanker CU, Kettle CJ. Forest trees in human modified landscapes: ecological and genetic drivers of recruitment failure in Dysoxylum malabaricum (Meliaceae). PLoS ONE. 2014;9(2):e89437. doi: 10.1371/journal.pone.0089437.
- Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, Vezzi A. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007;449(7161):463–467. doi: 10.1038/nature06148.
- Johar V, Dhillon RS, Bangarwa KS, Ajit HAK. Phenological behaviour and reproductive biology of Melia composita. Ind J Agro. 2015;17(1):62–67.
- Kageyama P, Caron D, Gandara F, Dagoberto do Santos J. Conservation of Mata Atlântica forest fragments in the State of São Paulo, Brazil. In: Vinceti B, Amaral W, Meilleur B, editors. Challenges in managing forest genetic resources for livelihoods: examples from Argentina and Brazil. Rome, Italy: International Plant Genetic Resources Institute; 2004. pp. 167–185.
- Karan M, Evans DS, Reilly D, Schulte K, Wright C, Innes D, Holton TA, Nikles DG, Dickinson GR. Rapid microsatellite marker development for African mahogany (Khaya senegalensis, Meliaceae) using next-generation sequencing and assessment of its intra-specific genetic diversity. Mol Ecol Resour. 2012;12(2):344–353. doi: 10.1111/j.1755-0998.2011.03080.x.
- Kimura M, Crow JF. The number of alleles that can be maintained in a finite population. Genetics. 1964;49(4):725–738. doi: 10.1093/genetics/49.4.725.
- Krishnan NM, Pattnaik S, Jain P, Gaur P, Choudhary R, Vaidyanathan S, Deepak S, Hariharan AK, Bharath Krishna PG, Nair J, Varghese L, Valivarthi NK, Dhas K, Ramaswamy K, Panda B. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica. BMC Genomics. 2012;13:464. doi: 10.1186/1471-2164-13-464.
- Kuravadi NA, Yenagi V, Rangiah K, Mahesh HB, Rajamani A, Shirke MD, Russiachand H, Loganathan RM, Shankara LC, Siddappa S, Ramamurthy A, Sathyanarayana BN, Gowda M. Comprehensive analyses of genomes, transcriptomes and metabolites of neem tree. PeerJ. 2015;3:e1066. doi: 10.7717/peerj.1066.
- Lemes MR, Brondani RPV, Grattapaglia D. Multiplexed systems of microsatellite markers for genetic analysis of Mahogany, Swietenia macrophylla King (Meliaceae), a threatened neotropical timber species. J Hered. 2002;93(4):287–291. doi: 10.1093/jhered/93.4.287.
- Lemes MR, Esashika T, Gaoue OG. Microsatellites for mahoganies: Twelve new loci for Swietenia macrophylla and its high transferability to Khaya senegalensis. Amer J Bot. 2011;98(8):e207–e209. doi: 10.3732/ajb.1100074.
- Lemes MR, Gribel R, Proctor J, Grattapaglia D. Population genetic structure of mahogany (Swietenia macrophylla King, Meliaceae) across the Brazilian Amazon based on variation at microsatellite loci: implication for conservation. Mol Ecol. 2003;12(11):2875–2883. doi: 10.1046/j.1365-294X.2003.01950.x.
- Li D, Deng Z, Qin B, Liu X, Men Z. De novo assembly and characterization of bark transcriptome using Illumina sequencing and development of EST-SSR markers in rubber tree (Hevea brasiliensis Müll. Arg.). BMC Genomics. 2012;13(1):1–14. doi: 10.1186/1471-2164-13-192.
- Li YC, Korol AB, Fahima T, Beiles A, Nevo E. Microsatellites: genomic distribution, putative functions and mutational mechanisms: a review. Mol Ecol. 2002;11:2453–2465. doi: 10.1046/j.1365-294X.2002.01643.x.
- Liu C, Li J, Qin G. Genome-wide distribution of simple sequence repeats in pomegranate and their application to the analysis of genetic diversity. Tree Genet Genomes. 2020;16:36. doi: 10.1007/s11295-020-1428-4.
- Liu ZJ, Cordes JF. DNA marker technologies and their applications in aquaculture genetics. Aquaculture. 2004;238:1–37. doi: 10.1016/j.aquaculture.2004.05.027.
- Lowe AJ, Gillies ACM, Wilson J, Dawson IK. Conservation genetics of bush mango from central/west Africa: implications from random amplified polymorphic DNA analysis. Mol Ecol. 2000;9(7):831–841. doi: 10.1046/j.1365-294x.2000.00936.x.
- Malausa T, Gilles A, Meglécz E, Blanquart H, Duthoy S, Costedoat C, Dubut V, Pech N, Philippe CS, Délye C, Feau N, Frey P, Gauthier P, Guillemaud T, Hazard L, Le Corre V, Lung Escarement B, Malé PG, Ferreira S, Martin JF. High-throughput microsatellite isolation through 454 GS-FLX Titanium pyrosequencing of enriched DNA libraries. Mol Ecol Resour. 2011;11:638–644. doi: 10.1111/j.1755-0998.2011.02992.x.
- MathiThumilan B, Sajeevan RS, Biradar J, Madhuri T, Nataraja NK, Sreeman MS. Development and characterization of genic SSR markers from Indian Mulberry transcriptome and their transferability to related species of Moraceae. PLoS ONE. 2016;11(9):e0162909. doi: 10.1371/journal.pone.0162909.
- Melo ATO, Coelho ASG, Pereira MF, Blanco AJV, Franceschinelli EV. High genetic diversity and strong spatial genetic structure in Cabralea canjerana (Vell.) Mart. (Meliaceae): implications to Brazilian Atlantic Forest tree conservation. Brazilian J Nat Conser. 2014;12(2):129–133. doi: 10.1016/j.ncon.2014.08.001.
- Mizrachi E, Hefer CA, Ranik M, Joubert F, Myburg AA. De novo assembled expressed gene catalog of a fast-growing Eucalyptus tree produced by Illumina mRNA-Seq. BMC Genomics. 2010;11:681. doi: 10.1186/1471-2164-11-681.
- Moran GF, Thamarus K, Raymond CA, Qiu D, Uren T, Southerton SG. Genomics of Eucalyptus wood traits. Ann For Sci. 2002;59(5–6):645–650. doi: 10.1051/forest:2002050.
- Morgante M, Hanafey M, Powell W. Microsatellites are preferentially associated with non-repetitive DNA in plant genomes. Nat Genet. 2002;30(2):194–200. doi: 10.1038/ng822.
- Nei M. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics. 1978;89:583–590. doi: 10.1093/genetics/89.3.583.
- Novick RR, Dick CW, Lemes MR, Navarro C, Caccone S, Bermingham E. Genetic structure of Mesoamerican populations of big leaf mahogany (Swietenia macrophylla) inferred from microsatellite analysis. Mol Ecol. 2003;12:2885–2893. doi: 10.1046/j.1365-294X.2003.01951.x.
- Ott J. Analysis of human genetic linkage. Revised. Baltimore: Johns Hopkins University Press; 1991.
- Ossowski S, Schneeberger K, Clark RM, Lanz C, Warthmann N, Weigel D. Sequencing of natural strains of Arabidopsis thaliana with short reads. Genome Res. 2008;18:2024–2033. doi: 10.1101/gr.080200.108.
- Parthiban KT, Bharathi AK, Seenivasan R, Kamala K, Rao MG. Integrating Melia dubia in agroforestry farms as an alternate pulpwood species. Int J Life Sci. 2009;1(3):59–63.
- Paux E, Tamasloukht M, Ladouce N, Sivadon P, Grima-Pettenati J. Identification of genes preferentially expressed during wood formation in Eucalyptus. Plant Mol Biol. 2004;55(2):263–280. doi: 10.1007/s11103-004-0621-4.
- Peakall R, Smouse PE. Genalex 6: genetic analysis in Excel. Population genetic software for teaching and research. Mol Ecol Notes. 2006;6:288–295. doi: 10.1111/j.1471-8286.2005.01155.x.
- Peakall R, Smouse PE. GenAlEx 6.5: Genetic analysis in Excel. Population genetic software for teaching and research: an update. Bioinformatics. 2012;28:2537–2539. doi: 10.1093/bioinformatics/bts460.
- Pereira MF, Bandeira LF, Blanco AJ, Coelho AS, Ciampi AY, Franceschinelli EV. Isolation and characterization of microsatellite loci in Cabralea canjerana (Meliaceae). Amer J Bot. 2011;98(1):10–12. doi: 10.3732/ajb.1000336.
- Pootakham W, Chanprasert J, Jomchai N, Sangsrakru D, Yoocha T, Tragoonrung S, Tangphatsornruang S. Development of genomic-derived simple sequence repeat markers in Hevea brasiliensis from 454 genome shotgun sequences. Plant Breed. 2012;131(4):555–562. doi: 10.1111/j.1439-0523.2012.01982.x.
- Powell W, Machray GC, Provan J. Polymorphism revealed by simple sequence repeats. Trends Plant Sci. 1996;1(7):215–222. doi: 10.1016/S1360-1385(96)86898-0.
- Queller DC, Strassmann JE, Hughes CR. Microsatellites and kinship. Trends Ecol Evol. 1993;8:285–288. doi: 10.1016/0169-5347(93)90256-O.
- Rawat S (2016) Assessment of morphological and genetic diversity in populations of Melia dubia in Karnataka. PhD Thesis, Forest Research Institute University Dehra Dun, Uttarakhand, India
- Rawat S, Arunkumar AN, Annapurna D, Karaba NN, Joshi G. Genetic diversity of Melia dubia using ISSR markers for natural populations and plantations. Inter J Genetics. 2018;10(9):490–494.
- Rawat S, Joshi G, Annapurna D, Arunkumar AN, Karaba NK. Standardization of DNA extraction method from mature dried leaves and ISSR-PCR conditions for Melia dubia Cav.: a fast-growing multipurpose tree species. Amer J Plant Sci. 2016;7:437–445. doi: 10.4236/ajps.2016.73037.
- Sanger F, Nicklen S, Coulson A. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 1977;74:5463–5467. doi: 10.1073/pnas.74.12.5463.
- Saravanan V, Parthiban KT, Kumar P, Marimuthu P. Wood characterization studies on Melia dubia Cav. for pulp and paper industry at different age gradation. Res J Recent Sci. 2013;2:183–188.
- Soldati MC, Fornes L, Van Zonneveld M, Thomas E, Zelener N. An assessment of the genetic diversity of Cedrela balansae C. DC. (Meliaceae) in Northwestern Argentina by means of combined use of SSR and AFLP molecular markers. Biochem Syst Ecol. 2013;47:45–55. doi: 10.1016/j.bse.2012.10.011.
- Sonah H, Deshmukh RK, Sharma A, Singh VP, Gupta DK, Gacche RN, Rana JC, Singh NK, Sharma TR. Genome-wide distribution and organization of microsatellites in plants: an insight into marker development in Brachypodium. PLoS ONE. 2011;6:e21298. doi: 10.1371/journal.pone.0021298.
- Squirrell J, Hollingsworth PM, Woodhead M, Russell J, Lowe AJ, Gibby M, Powell W. How much effort is required to isolate nuclear microsatellites from plants? Mol Ecol. 2003;12(6):1339–1348. doi: 10.1046/j.1365-294X.2003.01825.x.
- Sumangala RC, Mohana Kumara P, Shaanker RU. Development and characterization of microsatellite markers for Dysoxylum binectariferum, a medicinally important tree species in Western Ghats, India. J Genet. 2013;92:85–88. doi: 10.1007/s12041-013-0207-5.
- Swaminathan C, Vijendrarao R, Shashikala S. Preliminary evaluations of variations in anatomical properties of Melia dubia Cav. wood. Int Res J Biol Sci. 2012;4:1–6.
- Taheri S, Abdullah TL, Yusop MR, Hanafi MM, Sahebi M, Azizi P, Shamshiri RR. Mining and development of novel SSR markers using next generation sequencing (NGS) data in plants. Molecules. 2018;23:399. doi: 10.3390/molecules23020399.
- Tangphatsornruang S, Somta P, Uthaipaisanwong P, Chanprasert J, Sangsrakru D, Seehalak W, Sommanas W, Tragoonrung S, Srinives P. Characterization of microsatellites and gene contents from genome shotgun sequences of mungbean (Vigna radiata L. Wilczek). BMC Plant Biol. 2009;9(1):137. doi: 10.1186/1471-2229-9-137.
- Thumma BR, Southerton SG, Bell JC, Owen JV, Henery ML, Moran GF. Quantitative trait locus (QTL) analysis of wood quality traits in Eucalyptus nitens. Tree Genet Genomes. 2010;6:305–317. doi: 10.1007/s11295-009-0250-9.
- Tóth G, Gáspári Z, Jurka J. Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 2000;10(7):967–981. doi: 10.1101/gr.10.7.967.
- Troup RS. Silviculture of Indian trees. Clarendon Press London. 1921;1:152.
- Varshney RK, Graner A, Sorrells MW. Genic microsatellite markers in plants: features and applications. Trends Biotechnol. 2005;23:48–55. doi: 10.1016/j.tibtech.2004.11.005.
- Varshney RK, Nayak SN, May GD, Jackson SA. Next-generation sequencing technologies and their implications for crop genetics and breeding. Trends Biotechnol. 2009;27:522–530. doi: 10.1016/j.tibtech.2009.05.006.
- Wang Z, Fang B, Chen J, Zhang X, Luo Z, Huang L, Chen X, Li Y. De novo assembly and characterization of root transcriptome using Illumina paired-end sequencing and development of cSSR markers in sweet potato (Ipomoea batatas). BMC Genomics. 2010;11:726. doi: 10.1186/1471-2164-11-726.
- Wegrzyn JL, Eckert AJ, Choi M, Lee JM, Stanton BJ, Sykes R, Davis MF, Tsai CJ, Neale DB. Association genetics of traits controlling lignin and cellulose biosynthesis in black cottonwood (Populus trichocarpa, Salicaceae) secondary xylem. New Phytol. 2010;188(2):515–532. doi: 10.1111/j.1469-8137.2010.03415.x.
- Wei W, Qi X, Wang L, Zhang Y, Hua W, Li D, Lv H, Zhang X. Characterization of the sesame (Sesamum indicum L.) global transcriptome using Illumina paired-end sequencing and development of EST-SSR markers. BMC Genomics. 2011;12(1):451. doi: 10.1186/1471-2164-12-451.
- White G, Powell W. Isolation and characterization of microsatellite loci in Swietenia humilis (Meliaceae): an endangered tropical hardwood species. Mol Ecol. 1997;6:851–860. doi: 10.1111/j.1365-294X.1997.tb00139.x.
- Wright S (1978) Evolution and the genetics of populations, vol IV. Variability within and among natural populations. University of Chicago Press, Chicago
- Zane L, Bargelloni L, Patarnello T. Strategies for microsatellite isolation: a review. Mol Ecol. 2002;11(1):1–16. doi: 10.1046/j.0962-1083.2001.01418.x.
- Zhu W, Ouyang S, Lovene M, O'Brien K, Vuong H, Jiang J, Buell CR. Analysis of 90 Mb of the potato genome reveals conservation of gene structures and order with tomato but divergence in repetitive sequence composition. BMC Genomics. 2008;9(1):286. doi: 10.1186/1471-2164-9-286.