The Complete Genome Sequence of Curcuma longa (Zingiberaceae, Zingiberales), Turmeric

Zainab El Ouafi; Stacy Pirro; Najib Al Idrissi; Hassan Ghazal

doi:10.56179/001c.73625

Author manuscript; available in PMC: 2023 Mar 30.

Published in final edited form as: Biodivers Genomes. 2023 Mar 17;2023:10.56179/001c.73625. doi: 10.56179/001c.73625

PMCID: PMC10062376 NIHMSID: NIHMS1884921 PMID: 37009557

Biodiversity Genomes

Curcuma longa is a perennial native to India and Southeast Asia. We present the whole genome sequence of this species. Illumina paired-end reads were assembled by a de novo method followed by a finishing step. The raw and assembled data are publicly available via GenBank: Sequence Read Archive (SRR11229490) and assembled genome (JAOBBC000000000).

Keywords: turmeric, genome, viridiplantae

Introduction

Curcuma longa, commonly known as turmeric, is a perennial native to India and Southeast Asia. Its rhizomes are used as a coloring and flavoring agent in many Asian cuisines and to dye cloth a deep orange-yellow.

Methods

A single cultivated specimen was used for this study. DNA was extracted with the Qiagen DNeasy genomic extraction kit following the standard protocol. A paired-end sequencing library was constructed with the Illumina TruSeq kit according to the manufacturer’s instructions and sequenced on an Illumina HiSeq platform in paired-end, 2 × 150 bp format. The resulting FASTQ files were trimmed of adapter/primer sequences and low-quality regions with Trimmomatic v0.33 (Bolger, Lohse, and Usadel 2014). The trimmed reads were assembled with SPAdes v2.5 (Bankevich et al. 2012), followed by a finishing step with Zanfona v1.0 (Kieras 2021), which makes additional contig joins based on conserved regions in related species.
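The trimming and assembly steps can be sketched as command lines built in Python. This is a minimal illustration only: the paper does not report the parameters used, so the Trimmomatic step settings below are the ones from that tool's published paired-end example, the SPAdes option names follow current SPAdes releases, and all file names, the JAR path, and the adapter file are assumptions. The Zanfona finishing step is omitted because its invocation is not described in the source.

```python
def trimmomatic_cmd(r1, r2, out_dir,
                    jar="trimmomatic-0.33.jar",   # assumed JAR location
                    adapters="TruSeq3-PE.fa"):    # assumed adapter file
    """Build a Trimmomatic paired-end command as an argument list."""
    return [
        "java", "-jar", jar, "PE", "-phred33",
        str(r1), str(r2),
        # Trimmomatic PE writes paired and unpaired survivors per mate.
        f"{out_dir}/R1.paired.fq.gz", f"{out_dir}/R1.unpaired.fq.gz",
        f"{out_dir}/R2.paired.fq.gz", f"{out_dir}/R2.unpaired.fq.gz",
        # Illustrative settings, not the authors' (which are unreported).
        f"ILLUMINACLIP:{adapters}:2:30:10",
        "SLIDINGWINDOW:4:15",
        "MINLEN:36",
    ]


def spades_cmd(r1_paired, r2_paired, out_dir):
    """Build a SPAdes command for one paired-end library."""
    return ["spades.py", "-1", str(r1_paired), "-2", str(r2_paired),
            "-o", str(out_dir)]


if __name__ == "__main__":
    # Hypothetical input names derived from the SRA run accession.
    trim = trimmomatic_cmd("SRR11229490_1.fastq.gz",
                           "SRR11229490_2.fastq.gz", "trimmed")
    asm = spades_cmd(trim[7], trim[9], "assembly")
    print(" ".join(trim))
    print(" ".join(asm))
```

The functions only construct the argument lists; passing them to `subprocess.run(cmd, check=True)` would execute them, provided both tools are installed.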

Results

The genome assembly yielded a total sequence length of 581,175,068 bp.
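A total sequence length of this kind is the sum of the lengths of all sequences in the assembly. The sketch below shows one way to compute it from FASTA-formatted lines; the function name and the toy input are hypothetical, and this is not necessarily how the reported figure was obtained.

```python
def total_assembly_length(fasta_lines):
    """Sum the lengths of all sequence lines in FASTA-formatted input."""
    total = 0
    for line in fasta_lines:
        line = line.strip()
        # Header lines start with '>' and carry no sequence.
        if line and not line.startswith(">"):
            total += len(line)
    return total


# Toy example: two contigs of 8 bp and 4 bp.
example = [">contig_1", "ACGTAC", "GT", ">contig_2", "GGCC"]
print(total_assembly_length(example))  # 12
```

Because the function accepts any iterable of lines, an open file handle for a downloaded assembly FASTA can be passed directly.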

Funding

Funding was provided by Iridian Genomes, grant # IRGEN_RG_2021-1345, Genomic Studies of Eukaryotic Taxa. Hassan Ghazal is a US NIH grant recipient through the H3ABioNet/H3Africa consortium (U24HG006941).

Footnotes

Conflict of Interest Statement

The authors declare they have no conflicts of interest.

Data availability

Raw and assembled data are publicly available via GenBank:

RAW GENOME DATA

https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR11229490

ASSEMBLED GENOME

https://www.ncbi.nlm.nih.gov/nuccore/JAOBBC000000000

REFERENCES

Bankevich Anton, Nurk Sergey, Antipov Dmitry, Gurevich Alexey A., Dvorkin Mikhail, Kulikov Alexander S., Lesin Valery M., et al. 2012. “SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing.” Journal of Computational Biology 19 (5): 455–77. doi:10.1089/cmb.2012.0021.
Bolger Anthony M., Lohse Marc, and Usadel Bjoern. 2014. “Trimmomatic: A Flexible Trimmer for Illumina Sequence Data.” Bioinformatics 30 (15): 2114–20. doi:10.1093/bioinformatics/btu170.
Kieras M. 2021. “Zanfona, a Genome Finishing Process for Use with Paired-End Short Reads.” https://github.com/zanfona734/zanfona.
