Open Res Eur. 2025 Oct 22;5:323. [Version 1] doi: 10.12688/openreseurope.21514.1

ERGA-BGE reference genome of Eunicella cavolini, an IUCN Near Threatened Gorgonian of the Mediterranean Sea

Didier Aurelle 1,2, Dorian Guillemain 3, Frédéric Zuberer 3, Denys Malengros 3, Astrid Böhne 4, Rita Monteiro 4, Thomas Marcussen 5,6, Torsten H Struck 5, Rebekah A Oomen 5,6,7,8,9; Genoscope Sequencing Team, Alice Moussy 10, Corinne Cruaud 10, Karine Labadie 10, Lola Demirdjian 11, Caroline Belser 11, Patrick Wincker 11, Pedro H Oliveira 11, Jean-Marc Aury 11, Chiara Bortoluzzi 12,a
PMCID: PMC12759274  PMID: 41488400

Abstract

The Eunicella cavolini reference genome provides an important resource for studying the adaptation of this species to different environments and anthropogenic pressures. The species is impacted by human activities, including climate change, and this reference genome will be useful for studying its genomic evolution. The entirety of the genome sequence was assembled into 17 contiguous chromosomal pseudomolecules. This chromosome-level assembly encompasses 0.49 Gb, composed of 159 contigs and 46 scaffolds, with contig and scaffold N50 values of 7.7 Mb and 51.1 Mb, respectively.

Keywords: Eunicella cavolini, genome assembly, European Reference Genome Atlas, Biodiversity Genomics Europe, Earth Biogenome Project, Eunicellidae, yellow gorgonian

Introduction

Eunicella cavolini, also known by the French local name Gorgone jaune, is a cnidarian of the subclass Octocorallia and the family Eunicellidae (McFadden et al., 2022) and is one of the most common gorgonians in the Mediterranean Sea. This species occurs in both the Western and Eastern Mediterranean Sea, at depths ranging from less than 10 meters to more than 150 meters (Carugati et al., 2022; Sini et al., 2015). Eunicella cavolini is currently classified as 'Near Threatened' on the IUCN Red List (Otero et al., 2017). Like other Mediterranean octocorals, Eunicella species are impacted by mass mortality events linked with marine heat waves, fishing activities, or pollution (Garrabou et al., 2022; Sini et al., 2015; Topçu & Öztürk, 2015). Along with other gorgonians, E. cavolini plays an important ecological role by creating 3D habitats supporting a diversity of species (Gibson et al., 2006), including various bacterial species depending on the environment. Nonetheless, the dominant bacterial taxa are from the genus Endozoicomonas (Gammaproteobacteria) (Bayer et al., 2013). A high-quality reference genome for E. cavolini is useful for studying the genomic diversity of this species and its adaptive potential in the face of contrasting and changing environments (Aurelle et al., 2022). Such genomic resources are pivotal for species management, allowing, for example, the identification of key populations to be conserved. The genome will also be used to study species limits and hybridization with other Eunicella species (Aurelle et al., 2024).

The generation of this reference resource was coordinated by the European Reference Genome Atlas (ERGA) initiative’s Biodiversity Genomics Europe (BGE) project, supporting ERGA’s aims of fostering transnational cooperation to advance the application of genomics technologies to protect and restore biodiversity (Mazzoni et al., 2023).

Materials & methods

ERGA's sequencing strategy includes Oxford Nanopore Technology (ONT) and/or Pacific Biosciences (PacBio) for long-read sequencing, along with Hi-C sequencing for chromosomal architecture, Illumina Paired-End (PE) for polishing (i.e. recommended for ONT-only assemblies), and RNA sequencing for transcriptomic profiling, to facilitate genome assembly and annotation.

Sample and sampling information

Dorian Guillemain, Frédéric Zuberer and Denys Malengros, all from the Pythéas Institute, sampled one specimen of Eunicella cavolini (sex unknown), which was identified based on macro-morphology and colour. The identification was performed by Dorian Guillemain, Frédéric Zuberer and Didier Aurelle. The sample was collected by scuba diving at the Frioul archipelago, Marseille, France, on 19 July 2023, under permission Arrêté n°107 issued by the Direction Interrégionale de la mer Méditerranée. The specimen was euthanized by flash freezing at -80 °C and was preserved at this temperature until DNA and RNA extraction.

Vouchering information

Physical reference materials for the specimen sequenced here have been deposited in the Museum National d'Histoire Naturelle, Paris (https://www.mnhn.fr/fr) under the accession number MNHN-IK-2019-2701.

Frozen reference tissue material from the same individual is being deposited at the Museum National d'Histoire Naturelle, Paris (https://www.mnhn.fr/fr).

Genetic information

The genome size, estimated by Genomes on a Tree (GoaT) (Challis et al., 2023) through ancestral state reconstruction, is 0.62 Gb. This is a diploid genome with a haploid number of 6 chromosomes (2n=12). All information for this species was retrieved from GoaT.

DNA/RNA processing

DNA was extracted by first grinding 400 mg of reproductive tissue in liquid nitrogen, followed by digestion in 5 mL of buffer containing 30 mM Tris-HCl, 10 mM EDTA, 1% SDS, and proteinase K (20 µL/mL) at 53 °C for 4 h. The lysate was centrifuged at 500 × g for 5 min at room temperature to remove debris, including sclerites. To the resulting supernatant, 10 mL of absolute ethanol was added, and the mixture was incubated at −20 °C for 1 h. DNA was pelleted by centrifugation at 8500 × g for 15 min at 4 °C, washed once with 70% ethanol, and centrifuged again under the same conditions. The pellet was then resuspended in 9.5 mL of G2 buffer from the Genomic Tip 100/G kit (QIAGEN, MD, USA) and incubated overnight at 4 °C. The following day, RNase A (19 µL; 100 mg/mL) was added and the sample was incubated for 1 h at 50 °C. DNA was subsequently purified using the Genomic Tip 100/G kit according to the manufacturer’s protocol. DNA fragment size selection was performed using the Short Read Eliminator (PacBio). Quantification was performed using a Qubit dsDNA HS Assay kit (Thermo Fisher Scientific) and integrity was assessed on a FemtoPulse system (Agilent). DNA was stored at 4 °C until use.

RNA was extracted from reproductive tissue (70 mg) using the RNeasy Plus Universal Kit (Qiagen) following the manufacturer's instructions. Residual genomic DNA was removed with 6 U of TURBO DNase (2 U/μL) (Thermo Fisher Scientific). Quantification was performed using a Qubit RNA HS Assay and integrity was assessed on a Bioanalyzer system (Agilent). RNA was stored at -80 °C.

Library preparation and sequencing

The long-read DNA library was prepared with the SMRTbell prep kit 3.0 following the manufacturer's instructions and sequenced on a Revio system (PacBio). The Hi-C library was generated from reproductive tissue using the Arima High Coverage HiC Kit (following the Animal Tissues low input protocol v01) and sequenced on a NovaSeq6000 instrument (Illumina) with 2 x 150 read length. Poly(A) RNA-Seq libraries were constructed using the Illumina Stranded mRNA Prep, Ligation Prep kit (Illumina) and sequenced on an Illumina NovaSeq X Plus instrument.

In total, 53x PacBio and 130x Hi-C coverage were sequenced to generate the assembly.
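As a rough illustration of what these coverage figures imply, the sketch below converts fold coverage into an approximate sequencing yield in bases, and back. It assumes that coverage is expressed relative to the final assembly length (489,969,558 bp, as reported in the Results); the article does not state which genome size was used as the denominator, so the yields are indicative only. The function names are hypothetical.

```python
# Assumption: coverage is computed against the final assembly length.
ASSEMBLY_LENGTH_BP = 489_969_558


def yield_from_coverage(coverage: float, genome_size_bp: int) -> float:
    """Approximate number of sequenced bases implied by a fold coverage."""
    return coverage * genome_size_bp


def coverage_from_yield(total_bases: float, genome_size_bp: int) -> float:
    """Fold coverage implied by a total number of sequenced bases."""
    return total_bases / genome_size_bp


pacbio_bases = yield_from_coverage(53, ASSEMBLY_LENGTH_BP)
hic_bases = yield_from_coverage(130, ASSEMBLY_LENGTH_BP)

print(f"PacBio yield: ~{pacbio_bases / 1e9:.1f} Gb")  # ~26.0 Gb
print(f"Hi-C yield:   ~{hic_bases / 1e9:.1f} Gb")     # ~63.7 Gb
```

If the GoaT estimate of 0.62 Gb were used as the denominator instead, the same fold coverage would correspond to proportionally larger yields.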

Genome assembly methods

The genome of Eunicella cavolini was assembled using the Genoscope GALOP pipeline (https://workflowhub.eu/workflows/1200). Briefly, raw PacBio HiFi reads were assembled using Hifiasm v0.19.8-r603 (Cheng et al., 2021). Remaining allelic duplications were removed using purge_dups v1.2.5 (Guan et al., 2020) with default parameters and the proposed cutoffs, except that all contigs larger than 1 Mb were kept in the purged assembly. This assembly was scaffolded using YaHS v1.2.2 (Zhou et al., 2023), and the assembled scaffolds were then curated through manual inspection using PretextView v0.2.5 to remove false joins and incorporate sequences not automatically scaffolded into their respective locations within the chromosomal pseudomolecules. Chromosome-scale scaffolds confirmed by Hi-C data were named in order of size. The mitochondrial genome was assembled as one circular contig using Oatk v1.0 (Zhou et al., 2025) and included in the released assembly. Summary analysis of the released assembly was performed using the ERGA-BGE Genome Report ASM Galaxy workflow (doi: 10.48546/workflowhub.workflow.1104.1).

Results

Genome assembly

The genome assembly has a total length of 489,969,558 bp in 46 scaffolds (Figure 1 & Figure 2), with a GC content of 37.3%. The assembly has a contig N50 of 7,689,143 bp and L50 of 20, and a scaffold N50 of 51,102,291 bp and L50 of 4. The assembly has a total of 113 gaps, totalling 15.3 kb in cumulative size. The single-copy gene content analysis using the Eukaryota database (odb10) with BUSCO (Manni et al., 2021) resulted in 95.3% completeness (94.5% single and 0.8% duplicated). In total, 80.6% of read k-mers were present in the assembly, and the assembly has a base accuracy Quality Value (QV) of 62.4 as calculated by Merqury (Rhie et al., 2020).
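For readers less familiar with these metrics, the following sketch shows how N50 and L50 are conventionally computed from a list of sequence lengths, how a Phred-scaled QV maps to a per-base error probability, and how the reported scaffold, gap, and contig counts relate to one another. It is a minimal illustration with hypothetical function names and a toy length list, not the code of the ERGA-BGE reporting workflow, and it assumes that each gap separates two contigs within a scaffold.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total;
    L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must not be empty")


def qv_to_error_probability(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


# Toy example (not the real scaffold lengths): total 260, half 130.
toy_lengths = [80, 70, 50, 30, 20, 10]
toy_n50, toy_l50 = n50_l50(toy_lengths)
print(toy_n50, toy_l50)  # 70 2

# QV 62.4 corresponds to an error probability on the order of 6e-7.
error_probability = qv_to_error_probability(62.4)
print(f"{error_probability:.2e} errors per base")

# Assumption: every gap joins two contigs, so contigs = scaffolds + gaps.
scaffolds, gaps = 46, 113
contigs = scaffolds + gaps
print(contigs)  # 159, matching the contig count given in the Abstract
```

Under these assumptions, a QV of 62.4 corresponds to roughly one expected base error per 1.7 million bases.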

Figure 1. Snail plot summary of assembly statistics.

The main plot is divided into 1,000 size-ordered bins around the circumference, with each bin representing 0.1% of the 489,969,558 bp assembly. The distribution of sequence lengths is shown in dark grey, with the plot radius scaled to the longest sequence present in the assembly (84.7 Mb, shown in red). Orange and pale-orange arcs show the scaffold N50 and N90 sequence lengths (51,102,291 and 16,173,997 bp), respectively. The pale grey spiral shows the cumulative sequence count on a log-scale, with white scale lines showing successive orders of magnitude. The blue and pale-blue area around the outside of the plot shows the distribution of GC, AT, and N percentages in the same bins as the inner plot. A summary of complete, fragmented, duplicated, and missing BUSCO genes found in the assembled genome from the Eukaryota database (odb10) is shown in the top right.

Figure 2. Hi-C contact map showing spatial interactions between regions of the genome.

The diagonal corresponds to intra-chromosomal contacts, depicting chromosome boundaries. The frequency of contacts is shown on a logarithmic heatmap scale. Hi-C matrix bins were merged into a 100 kb bin size for plotting.
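The bin merging mentioned in this caption can be sketched as a simple block-sum over a square contact matrix. The example below is only an illustration: the native resolution of the Hi-C matrix is not given in the article, so the merge factor is a hypothetical parameter (for instance, a factor of 10 would turn 10 kb bins into 100 kb bins), and the function name `coarsen` is not taken from any Hi-C library.

```python
def coarsen(matrix, factor):
    """Merge adjacent bins of a square contact matrix by summing blocks.

    Each factor x factor block of the input becomes one cell of the
    output; a trailing partial block is kept as a smaller final bin.
    """
    n = len(matrix)
    m = (n + factor - 1) // factor  # number of merged bins (ceiling division)
    merged = [[0] * m for _ in range(m)]
    for i, row in enumerate(matrix):
        for j, contacts in enumerate(row):
            merged[i // factor][j // factor] += contacts
    return merged


# Toy symmetric 3 x 3 contact matrix merged with a factor of 2.
toy_matrix = [
    [1, 2, 3],
    [2, 4, 5],
    [3, 5, 6],
]
print(coarsen(toy_matrix, 2))  # [[9, 8], [8, 6]]
```

Summing rather than averaging keeps the total number of contacts unchanged, which is convenient when the merged counts are later displayed on a logarithmic colour scale.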

Acknowledgements

We acknowledge the support of the Freiburg Galaxy Team: Saim Momin and Björn Grüning, Bioinformatics, University of Freiburg (Germany), funded by the German Federal Ministry of Education and Research BMBF grant 031 A538A de.NBI-RBC and the Ministry of Science, Research and the Arts Baden-Württemberg (MWK) within the framework of LIBIS/de.NBI Freiburg. We would like to acknowledge the assembly reviewer, Michael Paulini from the Wellcome Sanger Institute. We acknowledge Magalie Castelin, who is responsible for the cnidaria collection at MNHN.

Funding Statement

Biodiversity Genomics Europe (Grant no. 101059492) is funded by Horizon Europe under the Biodiversity, Circular Economy and Environment call (REA.B.3); co-funded by the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract numbers 22.00173 and 24.00054; and by the UK Research and Innovation (UKRI) under the Department for Business, Energy and Industrial Strategy’s Horizon Europe Guarantee Scheme. The project leading to this publication has received funding from the European FEDER Fund under project 1166-39417 and the Excellence Initiative of Aix-Marseille University - A*MIDEX, a French ‘Investissements d’Avenir’ programme. This work was supported by the Genoscope, the Commissariat à l'Énergie Atomique et aux Énergies Alternatives (CEA), France Génomique (ANR-10-INBS-09-08), and the exploratory research programme ‘ATLASea: Atlas of marine genomes’ and its targeted project SEQ-Sea (ANR-22-EXAT-0003-SEQ-Sea).

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

[version 1; peer review: 2 approved]

Data availability

Eunicella cavolini and the related genomic study were assigned to Tree of Life ID (ToLID) 'jaEunCavo1' and all sample, sequence, and assembly information are available under the umbrella BioProject PRJEB79972. The sample information is available at the following BioSample accession SAMEA115358980. The genome assembly is accessible from ENA under accession number GCA_965177985.1. The annotated genome will be made available through the Ensembl website (https://projects.ensembl.org/erga-bge/). Sequencing data produced as part of this project are available from ENA at the following accessions: ERX14096372, ERX14096373, ERX14096374, ERX14096395, ERX14096397, ERX14169049, and ERX14169050. Documentation related to the genome assembly and curation can be found in the ERGA Assembly Report (EAR) document available at https://github.com/ERGA-consortium/EARs/tree/main/Assembly_Reports/Eunicella_cavolini/jaEunCavo1. Further details and data about the project are hosted on the ERGA portal at https://portal.erga-biodiversity.eu/data_portal/317547.

Author contributions

DA coordinated the project; DG, FZ, and DM collected the species; DG, FZ, and DA identified the species; DA sampled and preserved biological material and provided metadata; DG, DA, AsB, RM, TM, THS, and RAM provided support in sampling, shipping of biological material, metadata collection, and management; GST extracted DNA, prepared libraries, and performed sequencing under the supervision of AM, CC, KL, PHO, and PW; LD, CB, and JMA performed genome assembly and curation under the supervision of JMA; CB generated the analysis and report. All authors contributed to the writing, review, and editing of this genome note and read and approved the final version. This work is part of the species assigned to Genoscope, which was instrumental in the wet lab, sequencing, and assembly processes, and represents a key contribution to BGE's outputs.

Author information

Members of the Genoscope Sequencing Team are listed here: https://zenodo.org/records/14611490.

References

  1. Aurelle D, Haguenauer A, Bally M, et al.: Symbiosis, hybridization, and speciation in Mediterranean octocorals (Octocorallia, Eunicellidae). Biol J Linn Soc. 2024;143(4):blae116. doi: 10.1093/biolinnean/blae116
  2. Aurelle D, Thomas S, Albert C, et al.: Biodiversity, climate change, and adaptation in the Mediterranean. Ecosphere. 2022;13(4):e3915. doi: 10.1002/ecs2.3915
  3. Bayer T, Arif C, Ferrier-Pagès C, et al.: Bacteria of the genus Endozoicomonas dominate the microbiome of the Mediterranean gorgonian coral Eunicella cavolini. Mar Ecol Prog Ser. 2013;479:75–84. doi: 10.3354/meps10197
  4. Carugati L, Moccia D, Bramanti L, et al.: Deep-dwelling populations of Mediterranean Corallium rubrum and Eunicella cavolini: distribution, demography, and co-occurrence. Biology (Basel). 2022;11(2):333. doi: 10.3390/biology11020333
  5. Challis R, Kumar S, Sotero-Caio C, et al.: Genomes on a Tree (GoaT): a versatile, scalable search engine for genomic and sequencing project metadata across the eukaryotic Tree of Life [version 1; peer review: 2 approved]. Wellcome Open Res. 2023;8:24. doi: 10.12688/wellcomeopenres.18658.1
  6. Cheng H, Concepcion GT, Feng X, et al.: Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021;18(2):170–175. doi: 10.1038/s41592-020-01056-5
  7. Garrabou J, Gómez-Gras D, Medrano A, et al.: Marine heatwaves drive recurrent mass mortalities in the Mediterranean Sea. Glob Chang Biol. 2022;28(19):5708–5725. doi: 10.1111/gcb.16301
  8. Gibson R, Atkinson R, Gordon J, et al.: Mediterranean coralligenous assemblages: a synthesis of present knowledge. Oceanography and Marine Biology: An Annual Review. 2006;44:123–195. doi: 10.1201/9781420006391-7
  9. Guan D, McCarthy SA, Wood J, et al.: Identifying and removing haplotypic duplication in primary genome assemblies. Bioinformatics. 2020;36(9):2896–2898. doi: 10.1093/bioinformatics/btaa025
  10. Manni M, Berkeley MR, Seppey M, et al.: BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 2021;38(10):4647–4654. doi: 10.1093/molbev/msab199
  11. Mazzoni CJ, Ciofi C, Waterhouse RM: Biodiversity: an atlas of European reference genomes. Nature. 2023;619(7969):252. doi: 10.1038/d41586-023-02229-w
  12. McFadden CS, Van Ofwegen LP, Quattrini AM: Revisionary systematics of Octocorallia (Cnidaria: Anthozoa) guided by phylogenomics. Bull Soc Syst Biol. 2022;1(3). doi: 10.18061/bssb.v1i3.8735
  13. Otero MM, Numa C, Bo M, et al.: Overview of the conservation status of Mediterranean Anthozoa. International Union for Conservation of Nature, 2017. doi: 10.2305/IUCN.CH.2017.RA.2.en
  14. Rhie A, Walenz BP, Koren S, et al.: Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 2020;21(1):245. doi: 10.1186/s13059-020-02134-9
  15. Sini M, Kipson S, Linares C, et al.: The yellow gorgonian Eunicella cavolini: demography and disturbance levels across the Mediterranean Sea. PLoS One. 2015;10(5):e0126253. doi: 10.1371/journal.pone.0126253
  16. Topçu EN, Öztürk B: Composition and abundance of octocorals in the Sea of Marmara, where the Mediterranean meets the Black Sea. Sci Mar. 2015;79(1):125–135. doi: 10.3989/scimar.04120.09A
  17. Zhou C, McCarthy SA, Durbin R: YaHS: Yet another Hi-C Scaffolding tool. Bioinformatics. 2023;39(1):btac808. doi: 10.1093/bioinformatics/btac808
  18. Zhou C, Brown M, Blaxter M, et al.: Oatk: a de novo assembly tool for complex plant organelle genomes. Genome Biol. 2025;26(1):235. doi: 10.1186/s13059-025-03676-6
Open Res Eur. 2026 Jan 2. doi: 10.21956/openreseurope.23272.r66318

Reviewer response for version 1

Alan Elena 1

Aurelle et al. describe the generation of the only available E. cavolini genome. The assembly quality is high enough to be classified as a reference sequence. The provided data is relevant since it provides a high quality benchmark for future comparison. Authors highlight the relevance of such sequences in a proper way. I believe the communication provides what it is supposed to provide and there are no major comments on my side.

Just a few minor questions:

1) Was this analysis done only once on one individual? Maybe to minimise genetic variability in a reference sequence and ensure robustness of results, this protocol could be repeated on independent organisms.

2) Given the fragmentation of the genome and the status of this entry as RefSeq, was there/will there be an attempt to close the scaffolds?

3) Why was reproductive tissue selected for sequencing?

4) How did the authors deal with potential DNA contamination from colonising species?

5) Aside from the present, there are very few sequences belonging to this species and most of these focus on specific structures (e.g.: Cytochromes) rather than on broader fragments of DNA. What is the reason for this?

Are sufficient details of methods and materials provided to allow replication by others?

Yes

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

Bioinformatics, genomics, microbiology, environment, clinic.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Open Res Eur. 2025 Nov 5. doi: 10.21956/openreseurope.23272.r62849

Reviewer response for version 1

Benjamin D Young 1

This is a fantastic resource for the soft coral community and the assembly pipeline follows best practices. I have a few small comments that I believe will improve the manuscript readability and should be super easy to incorporate!

  • Along with other gorgonians, E. cavolini plays an important ecological role by creating 3D habitats supporting a diversity of species ( Gibson et al., 2006), including various bacterial species depending on the environment. Nonetheless, the dominant bacterial taxa are from the genus Endozoicomonas (Gammaproteobacteri) ( Bayer et al., 2013).

This sentence reads a little weird to me, especially as the Endozoicomonas are within the coral microbiome from the Bayer paper and the statement you are making seems to be about the habitat the coral generates. I would recommend rewording this section or removing the section about the bacteria and microbiome as I do not think it is needed.

  • The specimen was euthanized by being put flash frozen at -80 °C.

Remove the “put”; it makes the sentence read weird.

You say “reproductive tissue” a few times, can you be a little more specific? I understand the sex was unknown so did you just grind up several whole polyps? Or did you extract mesenteries from within the polyps?

  • In total 53x PacBio and 130x HiC data were sequenced to generate the assembly.

This is wonderful but did you get any results from the RNA-sequencing? You do specify you did RNA-prep and sequencing in the methods but then there is no further mention anywhere in the report of why you did this or what it was used for. I assume it was/is being used for annotation so maybe stating that would be good, and potentially providing some preliminary results for the gene prediction and annotation? If not, then I would recommend removing the RNA aspects completely.

  • and assembled scaffolds were then curated through manual inspection using PretextView v0.2.5 to remove false joins and incorporate sequences not automatically scaffolded into their respective locations within the chromosomal pseudomolecules.

Is there an article/github citation you can include for PretextView ?

  • The genome assembly has a total length of 489,969,558 bp in 46 scaffolds ( Figure 1 &  Figure 2), with a GC content of 37.3%. The assembly has a contig N50 of 7,689,143 bp and L50 of 20 and a scaffold N50 of 51,102,291 bp and L50 of 4.

This is a little nit-picky, but can you add at the beginning of the results that the statistics you are stating are for the 17 contiguous chromosomal pseudomolecules? It just makes it instantly obvious which assembly (initial versus chromosomal) you are talking about. I had to scroll to the abstract to make sure this was the chromosomal stats.

This is not necessary but could be nice, before giving the chromosomal results could you give the initial assembly size/stats from just HiFiAsm? I think it is interesting to see how use of HiC can improve the assembly. Again, not necessary at all if this is too much extra work!

  • The single-copy gene content analysis using the Eukaryota database with BUSCO 

Can you please include the version (i.e. odb10, odb12) BUSCO database you used. 

Again, this is wonderful work and an amazing resource for researchers using this species!

Are sufficient details of methods and materials provided to allow replication by others?

Yes

Is the rationale for creating the dataset(s) clearly described?

Yes

Are the datasets clearly presented in a useable and accessible format?

Yes

Are the protocols appropriate and is the work technically sound?

Yes

Reviewer Expertise:

I work in a number of different organisms (e.g., bacteria, fungi, coral, insects, fish) assembling genomes and undertaking genetic research.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
