Abstract
We present the whole-genome sequences of 56 wild Erythroxylum species from Africa, China, and the American tropics. Deep Illumina sequencing was performed on a single leaf from each voucher. We assembled the sequence reads de novo, then identified conserved regions across all preassemblies and used them to join contigs in a finishing step. The raw and assembled data are publicly available via GenBank.
Keywords: Erythroxylum, genomes
Introduction
The coca genus, Erythroxylum, is a pantropical clade of small trees and shrubs best known for the cocaine-alkaloid-bearing coca crops (Erythroxylum coca and E. novogranatense). Within this species-rich genus, about 205 species (75%) occur in the American tropics, with centers of diversity in eastern Brazil, Colombia, and Venezuela. Another center of diversity is Madagascar, with about 33 species (12% of Erythroxylum; Daly 2004; White 2019).
Many species of Erythroxylum are used in traditional medicine, most conspicuously E. monogynum of India and the coca crops, which have been cultivated and used for more than 9,000 years (Dillehay et al. 2010; White et al. 2020). A recent review (Lv et al. 2022) summarizes the current state of knowledge on the natural products and bioactivity of the group. The diversity of tropane alkaloids, diterpenes, triterpenes, flavonoids, and derivatives found in Erythroxylum provides a rich source of bioactive compounds for medicinal applications.
Since Darwin (1877), Erythroxylum has also served as a case study of the diversity and biology of heterostyly. All species in the group are distylous, with the exception of 11 dioecious species in the Caribbean region (Fuentes Marrero 2018). The genomic basis of distyly in Erythroxylum remains unstudied.
These genome assemblies add to a growing repository of plant biology resources that will assist investigations focused on plant ecology, evolution, and natural products.
Methods
Leaf tissue was collected from individual wild trees for this study. DNA was extracted with the Qiagen DNeasy genomic extraction kit following the standard protocol. A paired-end sequencing library was constructed with the Illumina TruSeq kit according to the manufacturer’s instructions.
The library was sequenced on an Illumina HiSeq platform in paired-end, 2 × 150 bp format. The resulting FASTQ files were trimmed of adapter/primer sequence and low-quality regions with Trimmomatic v0.33 (Bolger, Lohse, and Usadel 2014). The trimmed reads were assembled with SPAdes v2.5 (Bankevich et al. 2012), followed by a finishing step using Zanfona (Kieras, O’Neill, and Pirro 2021).
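As a rough illustration of the trimming and assembly steps, the sketch below builds (but does not run) command lines for Trimmomatic in paired-end mode and for SPAdes. The file names, the jar path, the adapter file, and the trimming settings (`ILLUMINACLIP:...:2:30:10`, `SLIDINGWINDOW:4:15`, `MINLEN:36`) are assumptions chosen for illustration; the parameters actually used in this study are not stated in the text. The Zanfona finishing step is omitted because its command-line interface is not described here.

```python
import shlex


def trimmomatic_cmd(r1, r2, out_prefix,
                    jar="trimmomatic-0.33.jar",   # assumed jar location
                    adapters="TruSeq3-PE.fa"):    # assumed adapter file
    """Build a Trimmomatic paired-end command as an argument list.

    Output order follows Trimmomatic PE convention:
    forward paired, forward unpaired, reverse paired, reverse unpaired.
    """
    return [
        "java", "-jar", jar, "PE",
        str(r1), str(r2),
        f"{out_prefix}_1P.fastq.gz", f"{out_prefix}_1U.fastq.gz",
        f"{out_prefix}_2P.fastq.gz", f"{out_prefix}_2U.fastq.gz",
        f"ILLUMINACLIP:{adapters}:2:30:10",  # illustrative settings only
        "SLIDINGWINDOW:4:15",
        "MINLEN:36",
    ]


def spades_cmd(out_prefix, out_dir):
    """Build a SPAdes command that assembles the surviving read pairs."""
    return [
        "spades.py",
        "-1", f"{out_prefix}_1P.fastq.gz",
        "-2", f"{out_prefix}_2P.fastq.gz",
        "-o", str(out_dir),
    ]


if __name__ == "__main__":
    # Hypothetical sample name; print the commands instead of executing them.
    sample = "sample"
    print(shlex.join(trimmomatic_cmd(f"{sample}_R1.fastq.gz",
                                     f"{sample}_R2.fastq.gz", sample)))
    print(shlex.join(spades_cmd(sample, f"{sample}_spades")))
```

Building argument lists rather than shell strings keeps file names with spaces safe and makes it straightforward to pass each command to `subprocess.run` once the paths and settings have been checked.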
Results
Data Availability
Raw read data and assembled genome sequences are available via GenBank.
| Species | Genome size (bp) | N50 | SRA accession | Genome accession |
|---|---|---|---|---|
| Erythroxylum acuminatum | 520,944,951 | 26.6 MB | SRR16518260 | JANKOD000000000 |
| Erythroxylum alaternifolium | 1,140,126,797 | 54.9 MB | SRR19167553 | JANIIA000000000 |
| Erythroxylum anguifugum | 1,398,821,111 | 61.1 MB | SRR19071309 | JANIPN000000000 |
| Erythroxylum areolatum | 1,181,226,222 | 63.9 MB | SRR19067407 | JANIJH000000000 |
| Erythroxylum baracoense | 674,802,793 | 32.4 MB | SRR20897294 | JAOTAY000000000 |
| Erythroxylum bequaertii | 1,069,376,509 | 66.2 MB | SRR19169396 | JANIJF000000000 |
| Erythroxylum brevipes | 934,221,715 | 62.1 MB | SRR16641246 | JANIHY000000000 |
| Erythroxylum carthagenense | 1,260,613,813 | 66.4 MB | SRR16134036 | JANZLD000000000 |
| Erythroxylum cassinoides | 1,151,892,187 | 66.6 MB | SRR19077077 | JANIPM000000000 |
| Erythroxylum cataractarum | 1,016,125,520 | 69.3 MB | SRR13024521 | JANJEY000000000 |
| Erythroxylum citrifolium | 1,079,479,909 | 61.8 MB | SRR17913416 | JANKOE000000000 |
| Erythroxylum confusum | 1,475,026,490 | 94.5 MB | SRR17999774 | JANZXO000000000 |
| Erythroxylum coriaceum | 1,397,282,891 | 60.7 MB | SRR17999775 | JANQAP000000000 |
| Erythroxylum densum | 961,004,938 | 34.7 MB | SRR16133662 | JANPWT000000000 |
| Erythroxylum divaricatum | 883,246,698 | 33.4 MB | SRR19579570 | JANTPM000000000 |
| Erythroxylum echinodendron | 767,751,986 | 32.0 MB | SRR19170120 | JANZXR000000000 |
| Erythroxylum engleri | 592,837,144 | 30.0 MB | SRR17999777 | JANZLJ000000000 |
| Erythroxylum fimbriatum | 623,511,794 | 25.1 MB | SRR17839727 | JANZLH000000000 |
| Erythroxylum flavicans | 692,607,313 | 32.8 MB | SRR17853861 | JANZXP000000000 |
| Erythroxylum foetidum | 616,347,371 | 36.4 MB | SRR16132259 | JANZLF000000000 |
| Erythroxylum glaucum | 679,154,896 | 34.3 MB | SRR17834921 | JANZLK000000000 |
| Erythroxylum gonoclados | 524,222,889 | 24.5 MB | SRR17999776 | JANZLY000000000 |
| Erythroxylum gracilipes | 675,766,074 | 35.3 MB | SRR13004464 | JANZLC000000000 |
| Erythroxylum guanchezii | 729,565,920 | 31.0 MB | SRR20897173 | JAOTAX000000000 |
| Erythroxylum haughtii | 657,627,051 | 34.1 MB | SRR19169397 | JANSVB000000000 |
| Erythroxylum havanense | 823,408,479 | 30.7 MB | SRR16128326 | JANTNU000000000 |
| Erythroxylum hondense | 991,084,625 | 65.4 MB | SRR19579446 | JANKYF000000000 |
| Erythroxylum incrassatum | 698,894,607 | 27.8 MB | SRR16511100 | JANVCL000000000 |
| Erythroxylum kapplerianum | 838,545,021 | 30.4 MB | SRR19612725 | JANSVI000000000 |
| Erythroxylum kunthianum | 1,325,754,177 | 50.8 MB | SRR19612843 | JANIMX000000000 |
| Erythroxylum lineolatum | 500,963,852 | 31.2 MB | SRR19612731 | JAOQMO000000000 |
| Erythroxylum longipes | 725,526,181 | 33.6 MB | SRR20971339 | JAOQMS000000000 |
| Erythroxylum macrophyllum | 678,289,703 | 22.1 MB | SRR17839751 | JAOQMK000000000 |
| Erythroxylum mucronatum | 712,302,707 | 23.3 MB | SRR20897490 | JAOSBP000000000 |
| Erythroxylum orinocense | 682,537,204 | 28.9 MB | SRR19076290 | JANZLX000000000 |
| Erythroxylum oxycarpum | 651,926,524 | 28.3 MB | SRR20971318 | JAOTAZ000000000 |
| Erythroxylum panamense | 572,741,246 | 34.2 MB | SRR17839679 | JAOTIZ000000000 |
| Erythroxylum pedicellare | 772,611,573 | 33.9 MB | SRR19612799 | JAOQMP000000000 |
| Erythroxylum pelleterianum | 574,749,498 | 28.0 MB | SRR20971396 | JAOQMR000000000 |
| Erythroxylum platyclados | 505,405,536 | 19.5 MB | SRR19624006 | JAOQMM000000000 |
| Erythroxylum plowmanianum | 930,928,887 | 29.8 MB | SRR19071311 | JANVCQ000000000 |
| Erythroxylum raimondii | 507,564,953 | 25.4 MB | SRR19622179 | JAOQMN000000000 |
| Erythroxylum reticulatum | 933,378,646 | 32.7 MB | SRR19169603 | JANSVA000000000 |
| Erythroxylum roigii | 624,502,766 | 26.1 MB | SRR16638488 | JANZXV000000000 |
| Erythroxylum roraimae | 659,978,004 | 27.6 MB | SRR19579568 | JAOSBL000000000 |
| Erythroxylum rotundifolium | 884,866,973 | 30.0 MB | SRR20912047 | JAOSBO000000000 |
| Erythroxylum rufum | 1,026,915,537 | 66.4 MB | SRR19169398 | JANIHZ000000000 |
| Erythroxylum savannarum | 545,546,056 | 27.3 MB | SRR19612969 | JAOTAW000000000 |
| Erythroxylum shatona | 692,895,338 | 36.7 MB | SRR19579569 | JAOSBK000000000 |
| Erythroxylum squamatum | 552,878,497 | 30.0 MB | SRR20971441 | JAOTBA000000000 |
| Erythroxylum suave | 858,558,633 | 31.2 MB | SRR20912045 | JAOSBN000000000 |
| Erythroxylum subrotundum | 548,290,955 | 26.5 MB | SRR19616481 | JAOQMQ000000000 |
| Erythroxylum tenue | 639,465,533 | 26.9 MB | SRR19580238 | JAOSBM000000000 |
| Erythroxylum ulei | 1,042,824,208 | 62.9 MB | SRR16633785 | JANITX000000000 |
| Erythroxylum urbanii | 763,890,317 | 28.6 MB | SRR19626866 | JAOQML000000000 |
| Erythroxylum williamsii | 1,013,529,337 | 66.2 MB | SRR16133663 | JANITW000000000 |
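The table reports an N50 for each assembly without describing how it was computed. The sketch below shows the conventional definition, assuming contigs are read from a FASTA file: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function names are hypothetical, and this is not necessarily the procedure used to produce the values above.

```python
def contig_lengths(fasta_lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # None until the first header line is seen
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


if __name__ == "__main__":
    # Toy FASTA records, for illustration only.
    toy = [">contig1", "ACGTAC", "GT", ">contig2", "ACG", ">contig3", "A"]
    print(n50(contig_lengths(toy)))
```

For a real assembly, the same functions can be applied to an open file handle, for example `n50(contig_lengths(open("assembly.fasta")))`, where the file name is a placeholder.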
Funding
Funding was provided by Iridian Genomes, grant # IRGEN_RG_2021-1345 Genomic Studies of Eukaryotic Taxa. Dawson White is supported by the Grainger Bioinformatics Center at the Field Museum and an NSF Postdoctoral Fellowship in Biology, award 2010821.
References
- Bankevich Anton, Nurk Sergey, Antipov Dmitry, Gurevich Alexey A., Dvorkin Mikhail, Kulikov Alexander S., Lesin Valery M., et al. 2012. “SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing.” Journal of Computational Biology 19 (5): 455–77. 10.1089/cmb.2012.0021.
- Bolger Anthony M., Lohse Marc, and Usadel Bjoern. 2014. “Trimmomatic: A Flexible Trimmer for Illumina Sequence Data.” Bioinformatics 30 (15): 2114–20. 10.1093/bioinformatics/btu170.
- Daly DC. 2004. “Erythroxylaceae (Coca Family).” In Flowering Plants of the Neotropics, edited by Smith N, Mori SA, Henderson A, Stevenson DW, and Heald SV, 143–45. Princeton: Princeton University Press.
- Darwin C. 1877. The Different Forms of Flowers on Plants of the Same Species. Cambridge, UK: Cambridge University Press.
- Dillehay Tom D., Rossen Jack, Ugent Donald, Karathanasis Anathasios, Vásquez Víctor, and Netherly Patricia J. 2010. “Early Holocene Coca Chewing in Northern Peru.” Antiquity 84 (326): 939–53. 10.1017/s0003598x00067004.
- Fuentes Marrero IM. 2018. Morfología floral y polimorfismos sexuales en el género Erythroxylum (Erythroxylaceae) en Cuba.
- Kieras M, O’Neill K, and Pirro S. 2021. “Zanfona, a Genome Assembly Finishing Tool for Paired-End Illumina Reads.” https://github.com/zanfona734/zanfona.
- Lv Y, Tian T, Wang Y-J, Huang J-P, and Huang S-X. 2022. “Advances in Chemistry and Bioactivity of the Genus Erythroxylum.” Nat Prod Bioprospect 12: 15.
- White Dawson M. 2019. “Biogeography, Diversification, and Domestication in the Coca Family (Erythroxylaceae).”
- White Dawson M, Huang Jen-Pan, Jara-Muñoz Orlando Adolfo, Madriñán Santiago, Ree Richard H, and Mason-Gamer Roberta J. 2020. “The Origins of Coca: Museum Genomics Reveals Multiple Independent Domestications from Progenitor Erythroxylum gracilipes.” Systematic Biology 70 (1): 1–13. 10.1093/sysbio/syaa074.
