Author manuscript; available in PMC: 2023 Mar 30.
Published in final edited form as: Biodivers Genomes. 2023 Mar 17;2023:10.56179/001c.73625. doi: 10.56179/001c.73625

The Complete Genome Sequence of Curcuma longa (Zingiberaceae, Zingiberales), Turmeric

Zainab El Ouafi 1,2, Stacy Pirro 3, Najib Al Idrissi 1,2, Hassan Ghazal 1,2,4
PMCID: PMC10062376  NIHMSID: NIHMS1884921  PMID: 37009557

Biodiversity Genomes

Curcuma longa is a perennial native to India and Southeast Asia. We present the whole genome sequence of this species. Illumina paired-end reads were assembled by a de novo method followed by a finishing step. The raw and assembled data are publicly available via GenBank: Sequence Read Archive (SRR11229490) and assembled genome (JAOBBC000000000).

Keywords: turmeric, genome, Viridiplantae

Introduction

Curcuma longa, commonly known as turmeric, is a perennial native to India and Southeast Asia. Its rhizomes are used as a coloring and flavoring agent in many Asian cuisines and for dyeing cloth a deep orange-yellow.

Methods

A single cultivated specimen was used for this study. DNA was extracted with the Qiagen DNeasy genomic extraction kit following the standard protocol. A paired-end sequencing library was constructed with the Illumina TruSeq kit according to the manufacturer’s instructions and sequenced on an Illumina HiSeq platform in paired-end, 2 × 150 bp format. The resulting FASTQ files were trimmed of adapter/primer sequences and low-quality regions with Trimmomatic v0.33 (Bolger, Lohse, and Usadel 2014). The trimmed reads were assembled with SPAdes v2.5 (Bankevich et al. 2012), followed by a finishing step with Zanfona v1.0 (Kieras 2021) to make additional contig joins based on conserved regions in related species.
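The trimming and assembly steps can be sketched as command lines built in Python. This is a minimal illustration, not the authors' actual invocation: the manuscript does not report the parameters used, so the trimming settings (`ILLUMINACLIP:...:2:30:10`, `SLIDINGWINDOW:4:15`, `MINLEN:36`) are taken from the example in the Trimmomatic manual, and the jar name, adapter file, and output file names are hypothetical. The Zanfona finishing step is left out because its command-line interface is not described here.

```python
from pathlib import Path


def trimmomatic_pe_command(r1, r2, out_dir,
                           jar="trimmomatic-0.33.jar",   # assumed jar location
                           adapters="TruSeq3-PE.fa"):    # assumed adapter file
    """Build a Trimmomatic paired-end command as an argument list.

    Trimming settings are the Trimmomatic manual's example values,
    NOT parameters reported in the manuscript.
    """
    out = Path(out_dir)
    return [
        "java", "-jar", jar, "PE",
        str(r1), str(r2),
        # Paired and unpaired outputs for each mate (hypothetical names).
        str(out / "R1.paired.fq.gz"), str(out / "R1.unpaired.fq.gz"),
        str(out / "R2.paired.fq.gz"), str(out / "R2.unpaired.fq.gz"),
        f"ILLUMINACLIP:{adapters}:2:30:10",  # remove adapter sequences
        "SLIDINGWINDOW:4:15",                # cut where window quality drops
        "MINLEN:36",                         # discard very short reads
    ]


def spades_command(r1_paired, r2_paired, out_dir):
    """Build a SPAdes command for one paired-end library (default settings)."""
    return ["spades.py", "-1", str(r1_paired), "-2", str(r2_paired),
            "-o", str(out_dir)]


if __name__ == "__main__":
    # Print the commands rather than running them; pass each list to
    # subprocess.run(cmd, check=True) to execute on a machine with the tools.
    trim = trimmomatic_pe_command("SRR11229490_1.fastq.gz",
                                  "SRR11229490_2.fastq.gz", "trimmed")
    asm = spades_command(trim[6], trim[8], "assembly")
    print(" ".join(trim))
    print(" ".join(asm))
```

Building the commands as lists keeps the sketch inspectable without the tools installed and avoids shell-quoting problems if the lists are later handed to `subprocess.run`.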

Results

The genome assembly yielded a total sequence length of 581,175,068 bp.
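A total sequence length such as the one reported above can be recomputed from an assembly FASTA file by summing the lengths of all records. The sketch below is one way to do that with the Python standard library; the function name and the example file name are hypothetical, and it assumes an uncompressed FASTA file.

```python
def assembly_stats(fasta_path):
    """Return the number of sequences and total bases in a FASTA file."""
    lengths = []
    current = 0
    in_record = False
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A header starts a new record; store the previous one.
                if in_record:
                    lengths.append(current)
                current = 0
                in_record = True
            elif in_record:
                # Sequence lines may be wrapped, so accumulate their lengths.
                current += len(line)
    if in_record:
        lengths.append(current)
    return {"contigs": len(lengths), "total_bp": sum(lengths)}


if __name__ == "__main__":
    import sys
    # Usage: python assembly_stats.py assembly.fasta  (hypothetical file name)
    if len(sys.argv) > 1:
        stats = assembly_stats(sys.argv[1])
        print(f"{stats['contigs']} sequences, {stats['total_bp']:,} bp")
```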

Funding

Funding was provided by Iridian Genomes, grant# IRGEN_RG_2021-1345 Genomic Studies of Eukaryotic Taxa. Hassan Ghazal is a US NIH grant recipient through the H3ABioNet/H3Africa consortium (U24HG006941).

Footnotes

Conflict of Interest Statement

The authors declare they have no conflicts of interest.

Data availability

Raw and assembled data are publicly available via GenBank:

RAW GENOME DATA

https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR11229490

ASSEMBLED GENOME

https://www.ncbi.nlm.nih.gov/nuccore/JAOBBC000000000

REFERENCES

  1. Bankevich Anton, Nurk Sergey, Antipov Dmitry, Gurevich Alexey A., Dvorkin Mikhail, Kulikov Alexander S., Lesin Valery M., et al. 2012. “SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing.” Journal of Computational Biology 19 (5): 455–77. 10.1089/cmb.2012.0021.
  2. Bolger Anthony M., Lohse Marc, and Usadel Bjoern. 2014. “Trimmomatic: A Flexible Trimmer for Illumina Sequence Data.” Bioinformatics 30 (15): 2114–20. 10.1093/bioinformatics/btu170.
  3. Kieras M. 2021. “Zanfona, a Genome Finishing Process for Use with Paired-End Short Reads.” https://github.com/zanfona734/zanfona.

Articles from Biodiversity genomes are provided here courtesy of Health Research Alliance manuscript submission
