Abstract
• Premise of the study: New microsatellite (simple sequence repeat [SSR]) primers were developed from Eucalyptus expressed sequence tags (ESTs) and optimized for genetic studies of the southwestern Australian tree E. gomphocephala, which is severely impacted by tree health decline and habitat fragmentation.
• Methods and Results: A total of 133 gene-homologous EST-SSR primer pairs were designed for Eucalyptus, and 44 were screened in E. gomphocephala. Of these, 17 produced reliable amplification products and 11 were polymorphic. Between two and 13 alleles were observed per locus, and observed heterozygosities ranged from 0.172 to 0.867. All 17 EST-SSRs that amplified E. gomphocephala cross-amplified to at least one of E. marginata, E. camaldulensis, and E. victrix.
• Conclusions: This set of EST-SSR primer pairs will be valuable tools for future population genetic studies of E. gomphocephala and other eucalypts, particularly for studying gene-linked variation and informing seed-sourcing strategies for ecological restoration.
Keywords: ecologically important genetic variation, EST-microsatellite, Eucalyptus gomphocephala, Myrtaceae, SSR mining, tuart
The Australasian tree genus Eucalyptus L’Hér. (Myrtaceae) comprises more than 700 species (Brooker, 2000) and is economically significant to the forestry industry. Vast genomic resources for model species have allowed molecular markers to be developed in related nonmodel taxa of conservation significance. Eucalyptus gomphocephala DC. (subg. Symphyomyrtus, common name tuart) is endemic to the Swan Coastal Plain of southwestern Australia, and is a culturally iconic species in the region. Severe pathogen-mediated tree decline (Cai et al., 2010), together with extensive habitat fragmentation in urbanized areas, has required rapid conservation actions for the species. To inform seed sourcing strategies for ecological restoration and forest management in the context of climate change, measures of adaptive population genetic diversity and structure are critical, but no microsatellite markers (simple sequence repeats [SSRs]) have yet been tested or optimized for E. gomphocephala. Specifically, SSRs located within expressed sequence tags (ESTs), which derive from transcribed genes, will be valuable for investigating variation in functional and potentially adaptive regions of the genome.
Here, we design novel PCR primer pairs for SSRs within Eucalyptus ESTs, targeting those that are homologous to annotated genes and therefore of putative ecological relevance. For the first time, we optimize a subset for utility in population genetic analysis of E. gomphocephala to inform future conservation programs. To demonstrate the markers’ broader utility in the genus, we report cross-species amplification in three additional ecologically important species (E. camaldulensis Dehnh., E. marginata Sm., E. victrix L. A. S. Johnson & K. D. Hill). To our knowledge, this is the first set of gene-homologous EST-SSRs to be reported and tested in natural populations of nonmodel eucalypt species of ecological and conservation significance. These primers add to the growing set of EST-SSRs available for Eucalyptus, increasing the variety of genes that can be investigated in novel population and conservation genetic studies of nonmodel species.
METHODS AND RESULTS
A total of 36,001 Eucalyptus EST sequences were downloaded from GenBank in 2007 and screened for microsatellite repeats with Tandem Repeats Finder (Benson, 1999) using default parameters, and organized using Tandem Repeats Database version 2.30 (Gelfand et al., 2007). A total of 1098 repeats were detected in 1073 (3%) of 36,001 ESTs. Sequences were clustered using the sequence assembly program CAP3 (Huang and Madan, 1999) with default parameters, into a set of 128 nonredundant contigs and 401 singletons (Appendices S1 and S2). A total of 154 of 529 (29%) ESTs were homologous to various annotated genes, following BLASTN searches of the National Center for Biotechnology Information’s nucleotide (nr/nt) collection, using an E-value threshold of 10⁻¹⁰. PCR primers were designed for all 154 sequences using Primer3 (Rozen and Skaletsky, 2000). Annotated gene homologs were specifically targeted so that loci could inform hypotheses relating to adaptive genetic variation in future studies of E. gomphocephala. All primer pairs and putative gene functions are provided in Appendix S3.
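As an illustration of the repeat-mining step, the following minimal Python sketch flags perfect di- to pentanucleotide repeats with a regular expression. It is not the algorithm used by Tandem Repeats Finder, which also detects imperfect repeats; the function name `find_perfect_ssrs`, the unit-length bounds, and the five-repeat minimum are assumptions chosen for the example, not parameters reported here.

```python
import re


def find_perfect_ssrs(seq, min_unit=2, max_unit=5, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Assumed thresholds (hypothetical, for illustration only):
    motifs of 2-5 bp repeated at least `min_repeats` times.
    """
    seq = seq.upper()
    # Lazy unit match so the shortest repeating unit is reported
    # (e.g. AG rather than AGAG), followed by back-referenced copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAA... reported as "AA" units.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    # Toy sequence with an (AG)17 repeat, the motif class listed for EGM09.
    toy = "TTGACC" + "AG" * 17 + "CCGTA"
    for start, motif, count in find_perfect_ssrs(toy):
        print(f"({motif}){count} at position {start}")
```

A regular expression is adequate for perfect repeats in short EST reads; a dedicated tool remains preferable when interrupted or compound repeats matter.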
We prioritized a random subset of 44 EST-SSRs for further PCR optimization in E. gomphocephala. Primer pairs were synthesized by GeneWorks Pty Ltd (Hindmarsh, Australia) and initially tested for amplification products on a screening panel of seven E. gomphocephala individuals. DNA was extracted from freeze-dried leaf material using a NucleoSpin 8/96 Plant II Core Kit, with buffer set PL2/3 (Macherey-Nagel GmbH & Co., Düren, Germany), as per the manufacturer’s instructions. PCRs were carried out in a total volume of 10 μL, containing 10 ng genomic DNA template, 1× PCR Polymerization Buffer containing dNTPs, 0.2 μM of each primer, 0.44 units Taq DNA polymerase, and 2 mM of MgCl2 (reagents from Fisher Biotech, Perth, Western Australia, Australia). Thermocycling was carried out as follows: denaturation at 94°C for 5 min; followed by 30 cycles of 94°C for 1 min, annealing at 55°C for 30 s, and extension at 72°C for 30 s; followed by a final extension at 72°C for 15 min. Annealing temperatures (Ta) were optimized for each locus and products were visualized on 2% agarose gels stained with SYBR Safe (Invitrogen, Carlsbad, California, USA).
A set of 17 (39%) primer pairs reliably amplified a product of expected size in E. gomphocephala. These loci were screened for allele size polymorphism and genotyped in multiplex PCR of between two and four loci per reaction, using a commercial kit (PCR Multiplex Kit using Q Solution; QIAGEN, Hilden, Germany). Forward primers were 5′ end-labeled with WellRED fluorescent dyes (D2, D3, D4; Sigma-Aldrich, St. Louis, Missouri, USA). Reactions were carried out in a total volume of 12.5 μL, containing 5–30 ng DNA template, 1× QIAGEN Multiplex PCR Master Mix, 0.5× Q-solution, and 0.2 μM of each primer (with some exceptions, Table 1). PCR was conducted with an initial activation step at 95°C for 15 min; followed by 30 cycles of 94°C for 30 s, annealing at Ta (Table 1) for 90 s, and extension at 72°C for 60 s; followed by final extension at 72°C for 10 min. Products were genotyped using a CEQ 8800 Genetic Analysis System and analyzed using CEQ Fragment Analysis Software (Beckman Coulter, Brea, California, USA). Two loci (EGM09 and EGM24) exhibited amplification failures or dubious peaks in multiplex and were analyzed in singleplex for further reactions. Following genotyping tests, 11 EST-SSRs were polymorphic and produced reliable electrophoretic profiles in E. gomphocephala (Tables 1 and 2). For these loci, we further investigated the most likely gene annotations by conducting a BLASTX search against proteins in the UniRef100 database (European Molecular Biology Laboratory–European Bioinformatics Institute [EMBL-EBI], http://www.ebi.ac.uk/Tools/blastall/) (Table 1).
Table 1.
Characteristics of 11 new EST-SSR loci developed and optimized for Eucalyptus gomphocephala.
| Locus | Primer sequences (5′–3′) | Repeat motif | GenBank accession no.ᵃ | Source speciesᵇ | Putative function and E-valueᶜ | Dye | AT | Allele size range (bp) | Mᵈ | [Final] (μM) | Ta (°C) |
| EGM09 | F: ATTTGCTGAAGTGGGTCTCG | (AG)17 | ES594818 | E. globulus | Ricinus communis glucan endo-1,3-beta-glucosidase, putative [2.0e-68] | D4 | 4 | 160–166 | — | 0.2 | 56 |
| R: ACAGGTCCAGAAGCATGAGC | |||||||||||
| EGM12 | F: GCGCCGAGAATCAATACG | (CAG)10 | CD668471 | E. tereticornis | Populus deltoides CONSTANS-like protein CO1 [3.0e-22] | D3 | 4 | 188–197 | A | 0.2 | 56 |
| R: GTAGCTGTTGGCAGCTTTGG | |||||||||||
| EGM14 | F: CACTGCCACTTACCAGAGTCG | (CT)18 | CB968019 | E. grandis | Glycine max heat shock transcription factor 21 [1.0e-17] | D4 | 8 | 350–372 | B | 0.1 | 54 |
| R: CCTCCACCATCTCGAACG | |||||||||||
| EGM24 | F: CCTGCAACGCTTCTCGTC | (CT)17 | ES589764 | E. globulus | Prunus dulcis putative S-adenosylmethionine decarboxylase (SAMDC) [1.0e-26] | D2 | 3 | 204–212 | — | 0.2 | 56 |
| R: TCTGTATTGAGGCTCGCGTA | |||||||||||
| EGM25 | F: CCAGAAGCAACCTCAATTTCC | (TC)15 | ES589925 | E. globulus | Arabidopsis thaliana metal transporter Nramp3 [1.0e-109] | D2 | 2 | 350–356 | C | 0.1 | 56 |
| R: AGCCACAGCAGGGAGTAGC | |||||||||||
| EGM30 | F: AGTGCAGCACCTTTCAGACC | (AG)18 | CU398186 | E. gunnii | Ricinus communis chlorophyll A/B binding protein, putative [5.0e-45] | D3 | 13 | 225–255 | C | 0.1 | 56 |
| R: AAGATTGATTGCTAGATCAGTCACC | |||||||||||
| EGM35 | F: ATACGCGTCCCAGTGATTTC | (AG)18 | Contig92* | E. globulus | Ricinus communis fructose-1,6-bisphosphatase, cytosolic [1.0e-143] | D2 | 9 | 196–212 | C | 0.1 | 56 |
| R: AGGAGCAGACGAACTTGCAT | |||||||||||
| EGM37 | F: TGAGGTCACTTCAAGCACCAAGA | (GCTTA)5 | Contig85* | E. globulus | Gossypium hirsutum quinone oxidoreductase [1.0e-106] | D4 | 2 | 256–261 | B | 0.025 | 54 |
| R: GGAAGCGGCAACAACCTTAACA | |||||||||||
| EGM46 | F: ATATTCGGCCTCTTCGCATT | (CAG)4(AG)12 | Contig18* | E. gunnii | Glycine max desiccation protectant protein Lea14 homolog [4.0e-79] | D2 | 13 | 233–257 | A | 0.2 | 56 |
| R: ACCTTGGCGTTGTACTCGAC | |||||||||||
| EGM47 | F: TCGTTCGGTTTCTGTTCTGA | (AATCG)6 | Contig79* | E. grandis + E. gunnii | Nicotiana tabacum small GTPase Rab2 [3.0e-65] | D2 | 3 | 94–114 | C | 0.05 | 56 |
| R: ACATCCTTCGATCCAACCAG | |||||||||||
| EGM48 | F: TCACACTCCAATCTCCAACG | (CT)12 | CU396026 | E. gunnii | Quercus macrocarpa aquaporin PIP2;1 [1.0e-125] | D2 | 6 | 141–155 | A | 0.2 | 56 |
Note: AT = total number of alleles observed, based on 60 individuals from two natural populations of E. gomphocephala; Dye = WellRED dye label; [Final] = final concentration of primer pairs in PCR reaction (μM); M = PCR multiplex group; Ta = annealing temperature.
ᵃ GenBank accession number of source EST, or nonredundant contig number. Contigs are marked with an asterisk (*) and were derived from multiple redundant singleton ESTs. Contig sequence assembly is reported in Appendix S1; EST contig FASTA sequences are reported in Appendix S2.
ᵇ Species from which EST(s) were derived.
ᶜ Putative EST function based on a BLASTX search of the UniRef100 database; E-value of the match is given in brackets.
ᵈ Loci allocated the same letter (A, B, or C) were amplified together in the same multiplex PCR reaction; a dash (—) indicates the locus was amplified in singleplex.
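The multiplex groups, dye labels, and allele size ranges in Table 1 can be checked for compatibility with a short Python sketch. It assumes a common fragment-analysis design constraint that is not stated in the text: loci amplified in the same multiplex and carrying the same dye should have non-overlapping allele size ranges. The function name `dye_conflicts` is hypothetical, and the two singleplex loci (EGM09, EGM24) are omitted.

```python
from itertools import combinations

# (locus, multiplex group, dye, min allele size, max allele size in bp),
# transcribed from Table 1 for the nine multiplexed loci.
LOCI = [
    ("EGM12", "A", "D3", 188, 197),
    ("EGM46", "A", "D2", 233, 257),
    ("EGM48", "A", "D2", 141, 155),
    ("EGM14", "B", "D4", 350, 372),
    ("EGM37", "B", "D4", 256, 261),
    ("EGM25", "C", "D2", 350, 356),
    ("EGM30", "C", "D3", 225, 255),
    ("EGM35", "C", "D2", 196, 212),
    ("EGM47", "C", "D2", 94, 114),
]


def dye_conflicts(loci):
    """Return pairs of loci sharing a multiplex and dye with overlapping ranges."""
    conflicts = []
    for a, b in combinations(loci, 2):
        same_group_and_dye = a[1] == b[1] and a[2] == b[2]
        ranges_overlap = a[3] <= b[4] and b[3] <= a[4]
        if same_group_and_dye and ranges_overlap:
            conflicts.append((a[0], b[0]))
    return conflicts


if __name__ == "__main__":
    print(dye_conflicts(LOCI))  # empty list: no same-dye overlaps in Table 1
```

Under this assumed constraint, none of the three multiplex groups in Table 1 contains a same-dye pair with overlapping size ranges.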
Table 2.
Population genetic properties of the 11 newly developed EST-SSRs in two natural populations of Eucalyptus gomphocephala.ᵃ
| | Yalgorup National Park (n = 30) | | | Ludlow Tuart Forest (n = 30) | | |
| Locus | AT | Ho | He | AT | Ho | He |
| EGM09 | 4 | 0.400 | 0.639 | 4 | 0.233 | 0.558 |
| EGM12 | 4 | 0.600 | 0.629 | 4 | 0.621 | 0.640 |
| EGM14 | 7 | 0.733 | 0.779 | 8 | 0.733 | 0.776 |
| EGM24 | 3 | 0.345 | 0.580 | 3 | 0.172 | 0.506 |
| EGM25 | 2 | 0.233 | 0.255 | 2 | 0.207 | 0.238 |
| EGM30 | 9 | 0.667 | 0.728 | 12 | 0.833 | 0.839 |
| EGM35 | 9 | 0.667 | 0.832 | 6 | 0.767 | 0.748 |
| EGM37 | 2 | 0.433 | 0.495 | 2 | 0.300 | 0.406 |
| EGM46 | 12 | 0.867 | 0.816 | 9 | 0.862 | 0.861 |
| EGM47 | 3 | 0.533 | 0.515 | 3 | 0.500 | 0.452 |
| EGM48 | 6 | 0.833 | 0.776 | 5 | 0.862 | 0.794 |
Note: AT = total number of alleles observed; He = expected heterozygosity; Ho = observed heterozygosity; n = number of individuals sampled.
ᵃ Geographic coordinates for each population are: Yalgorup National Park = 32°51′17″S, 115°39′54″E; Ludlow Tuart Forest = 33°34′42″S, 115°29′42″E.
To screen genetic diversity of the 11 EST-SSRs, we genotyped 60 E. gomphocephala individuals from two natural populations: ‘Yalgorup’ (voucher: R. Davis 1390, PERTH 04473167) and ‘Ludlow’ (voucher: C. A. Gardner, PERTH 01350501) (Table 2). Genetic diversity parameters were calculated with GenAlEx version 6.4.1 (Peakall and Smouse, 2006). Deviation from Hardy–Weinberg equilibrium (HWE) and linkage disequilibrium among loci were calculated with GENEPOP version 4.0.10 (Raymond and Rousset, 1995; Rousset, 2008), using the Bonferroni correction for multiple testing. The total number of alleles per locus ranged from two to 13 (mean = 6) (Table 1). Observed and expected heterozygosities ranged from 0.172 to 0.867 and 0.238 to 0.861, respectively (Table 2). Significant deviation from HWE was detected for EGM09 at Ludlow. We did not detect evidence of linkage disequilibrium between any pair of loci in more than one population. The potential presence of null alleles and their frequency (r) in each population were estimated using MICRO-CHECKER version 2.2.3 (van Oosterhout et al., 2004), and were predicted for EGM09 and EGM24 in both populations (r range = 0.17–0.29) and for EGM35 at Yalgorup (r = 0.09). The potential for stuttering was predicted for EGM09 at Ludlow.
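For readers unfamiliar with the diversity parameters in Table 2, the following Python sketch computes the number of alleles, observed heterozygosity (Ho), and expected heterozygosity (He = 1 − Σp²) for a single locus from diploid genotypes. This is the textbook calculation, not a reimplementation of GenAlEx; the function name `locus_diversity` and the toy genotypes are hypothetical, and whether the published He values include a small-sample correction is not stated in the text.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Summarize one locus from diploid genotypes.

    genotypes: list of (allele1, allele2) tuples; individuals that failed
    to amplify are given as None and are skipped.
    """
    scored = [g for g in genotypes if g is not None]
    n = len(scored)
    allele_counts = Counter(allele for g in scored for allele in g)
    total_alleles = 2 * n
    # Observed heterozygosity: proportion of scored individuals with two
    # different alleles.
    ho = sum(1 for a, b in scored if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return {"n": n, "alleles": len(allele_counts), "Ho": ho, "He": he}


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) for four individuals plus one failure.
    toy = [(160, 160), (160, 162), (162, 164), (160, 160), None]
    print(locus_diversity(toy))
```

A locus where Ho falls well below He, as for EGM09 and EGM24 in Table 2, is the pattern that motivates the Hardy–Weinberg and null-allele tests described above.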
All 17 primer pairs that initially amplified E. gomphocephala DNA were screened for cross-amplification in three additional species, belonging to diverse sections within two subgenera: E. camaldulensis (sect. Exsertaria) and E. victrix (sect. Adnataria) (both subg. Symphyomyrtus), and E. marginata (sect. Longistylus) (subg. Eucalyptus) according to the methods described above. Samples were collected from natural populations (Appendix 1). Sixteen out of 17 (94%) loci cross-amplified within subg. Symphyomyrtus, and 14 out of 17 (82%) cross-amplified across the subgenus boundary to subg. Eucalyptus, demonstrating high transferability rates within the genus.
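The per-species transferability rates can be tallied directly from the amplification calls in Appendix 1, as in the Python sketch below. The calls are transcribed from that appendix; the function name `transfer_summary` is hypothetical, and the percentages are computed per species, which is one reading of how the rates above were derived.

```python
SPECIES = ("E. camaldulensis", "E. victrix", "E. marginata")

# Loci that amplified in all three additional species (Appendix 1).
FULLY_POSITIVE = (
    "EGM09", "EGM12", "EGM14", "EGM24", "EGM30", "EGM35", "EGM37",
    "EGM46", "EGM47", "EGM48", "EGM19", "EGM33", "EGM34", "EGM42",
)

# "+" = positive amplification; "NA" = no or inconsistent amplification.
RESULTS = {locus: ("+", "+", "+") for locus in FULLY_POSITIVE}
RESULTS.update({
    "EGM25": ("NA", "+", "NA"),
    "EGM28": ("+", "+", "NA"),
    "EGM26": ("+", "NA", "NA"),
})


def transfer_summary(results):
    """Return {species: (loci amplified, percent of loci tested)}."""
    n_loci = len(results)
    summary = {}
    for index, species in enumerate(SPECIES):
        amplified = sum(1 for calls in results.values() if calls[index] == "+")
        summary[species] = (amplified, round(100 * amplified / n_loci))
    return summary


if __name__ == "__main__":
    for species, (count, percent) in transfer_summary(RESULTS).items():
        print(f"{species}: {count}/{len(RESULTS)} ({percent}%)")
```

Tallied this way, 16 of 17 loci amplify in each of the two Symphyomyrtus species and 14 of 17 amplify in E. marginata.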
CONCLUSIONS
This new set of 11 polymorphic EST-SSR loci will enable the characterization of population genetic diversity and structure throughout the range of E. gomphocephala, in conjunction with putatively neutral nuclear genomic SSRs (gSSRs). Given their gene-linked nature, the EST-SSRs can be used to test for signatures of selection in natural populations. The high transferability of the loci demonstrates their suitability for application to other species of ecological, conservation, or economic importance in the genus. The broader set of 133 EST-SSRs represents a resource that could be exploited to optimize additional gene-linked EST-SSRs for Eucalyptus and further expand the types of genes available for investigation. Data obtained using these markers will be particularly valuable to inform seed sourcing and conservation strategies in natural E. gomphocephala populations.
Supplementary Material
Appendix 1.
Cross-species amplification of 17 EST-SSR loci from Eucalyptus gomphocephala to three additional eucalypt species from two subgenera.ᵃᵇᶜ
| Locus | Annotationᵈ | Sourceᵉᶠ | E. gomphocephala | E. camaldulensis | E. victrix | E. marginata |
| EGM09 | Glucan endo-1,3-beta-glucosidase | E. globulus | + | + | + | + |
| EGM12 | CONSTANS-like protein CO1 | E. tereticornis | + | + | + | + |
| EGM14 | Heat shock transcription factor 21 | E. grandis | + | + | + | + |
| EGM24 | Putative S-adenosylmethionine decarboxylase (SAMDC) | E. globulus | + | + | + | + |
| EGM25 | Metal transporter Nramp3 | E. globulus | + | NA | + | NA |
| EGM30 | Chlorophyll A/B binding protein, putative | E. gunnii | + | + | + | + |
| EGM35 | Fructose-1,6-bisphosphatase, cytosolic | E. globulus | + | + | + | + |
| EGM37 | Quinone oxidoreductase | E. globulus | + | + | + | + |
| EGM46 | Desiccation protectant protein Lea14 homolog | E. gunnii | + | + | + | + |
| EGM47 | Small GTPase Rab2 | E. grandis + E. gunnii | + | + | + | + |
| EGM48 | Aquaporin PIP2;1 | E. gunnii | + | + | + | + |
| EGM28 | Chloroplastic phosphoglycerate kinase | E. globulus + E. grandis | + | + | + | NA |
| EGM19 | Aquaporin (PIP1) | E. globulus + E. gunnii | + | + | + | + |
| EGM33 | Synaptobrevin-related protein 1 (SAR1) | E. tereticornis | + | + | + | + |
| EGM34 | Rapid alkalinization factor (RALF) | E. globulus | + | + | + | + |
| EGM26 | Magnesium transporter CorA-like family protein MRS2-2 | E. globulus | + | + | NA | NA |
| EGM42 | bZIP (basic leucine zipper) transcription factor | E. globulus | + | + | + | + |
Note: + = positive amplification; NA = no amplification or inconsistent amplification.
ᵃ Eucalyptus gomphocephala is taxonomically classified within the monotypic sect. Bolites (Brooker, 2000) but has been shown to have affinities with sect. Bisectae I according to ITS (Steane et al., 2007) and Diversity Arrays Technology (DArT) analyses (Steane et al., 2011).
ᵇ Taxonomic information for tested species: E. gomphocephala DC. = sect. Bolites/Bisectae I, subg. Symphyomyrtus; E. camaldulensis Dehnh. = sect. Exsertaria, subg. Symphyomyrtus; E. victrix L. A. S. Johnson & K. D. Hill = sect. Adnataria, subg. Symphyomyrtus; E. marginata Sm. = sect. Longistylus, subg. Eucalyptus.
ᶜ Source locations: E. camaldulensis = 23°06′S, 119°34′E (n = 2) and 23°10′S, 119°45′E (n = 1); E. victrix = 22°55′S, 119°10′E (n = 3); E. marginata = 31°56′S, 115°46′E (n = 1), 32°46′S, 116°26′E (n = 1), and 32°42′S, 116°03′E (n = 1).
ᵈ Predicted gene annotation based on BLAST searches.
ᵉ Species from which the original EST was derived.
ᶠ Taxonomic information for source species: E. globulus Labill. = sect. Maidenaria, subg. Symphyomyrtus; E. gunnii Hook. f. = sect. Maidenaria, subg. Symphyomyrtus; E. tereticornis Sm. = sect. Exsertaria, subg. Symphyomyrtus; E. grandis W. Hill = sect. Latoangulatae, subg. Symphyomyrtus.
LITERATURE CITED
- Benson G. 1999. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Research 27: 573–580.
- Brooker M. I. H. 2000. A new classification of the genus Eucalyptus L’Hér. (Myrtaceae). Australian Systematic Botany 13: 79–148.
- Cai Y. F., Barber P., Dell B., O’Brien P., Williams N., Bowen B., Hardy G. 2010. Soil bacterial functional diversity is associated with the decline of Eucalyptus gomphocephala. Forest Ecology and Management 260: 1047–1057.
- Gelfand Y., Rodriguez A., Benson G. 2007. TRDB: The Tandem Repeats Database. Nucleic Acids Research 35: D80–D87.
- Huang X., Madan A. 1999. CAP3: A DNA sequence assembly program. Genome Research 9: 868–877.
- Peakall R., Smouse P. 2006. GenAlEx 6: Genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes 6: 288–295.
- Raymond M., Rousset F. 1995. GENEPOP (version 1.2): Population genetics software for exact tests and ecumenicism. Journal of Heredity 86: 248–249.
- Rousset F. 2008. GENEPOP’007: A complete reimplementation of the GENEPOP software for Windows and Linux. Molecular Ecology Resources 8: 103–106.
- Rozen S., Skaletsky H. 2000. Primer3 on the WWW for general users and for biologist programmers. In S. Misener and S. A. Krawetz [eds.], Methods in molecular biology, vol. 132: Bioinformatics methods and protocols, 365–386. Humana Press, Totowa, New Jersey, USA.
- Steane D. A., Nicolle D., Potts B. M. 2007. Phylogenetic positioning of anomalous eucalypts by using ITS sequence data. Australian Systematic Botany 20: 402–408.
- Steane D. A., Nicolle D., Sansaloni C. P., Petroli C. D., Carling J., Kilian A., Myburg A. A., et al. 2011. Population genetic analysis and phylogeny reconstruction in Eucalyptus (Myrtaceae) using high-throughput, genome-wide genotyping. Molecular Phylogenetics and Evolution 59: 206–224.
- van Oosterhout C., Hutchinson W. F., Wills D. P. M., Shipley P. 2004. MICRO-CHECKER: Software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes 4: 535–538.
