Data in Brief. 2024 Feb 20;53:110200. doi: 10.1016/j.dib.2024.110200

A global dataset of demosponge distribution records

Ariadni Vafeiadou, Eliza Fragkopoulou, Jorge Assis
PMCID: PMC10907141  PMID: 38435734

Abstract

Biodiversity information in the form of species occurrence records is key for monitoring and predicting current and future biodiversity patterns, as well as for guiding conservation and management strategies. However, the reliability and accuracy of this information are frequently undermined by taxonomic and spatial errors. Additionally, biodiversity information facilities often share data in diverse incompatible formats, precluding seamless integration and interoperability. We provide a comprehensive quality-controlled dataset of occurrence records of the Class Demospongiae, which comprises 81% of the entire Porifera phylum. Demosponges are ecologically significant as they structure rich habitats and play a key role in nutrient cycling within marine benthic communities. The dataset aggregates occurrence records from multiple sources, applies dereplication and taxonomic curation, and flags potentially incorrect records based on expert knowledge of each species’ bathymetric and geographic distributions. It yields 417,626 records of 1,816 accepted demosponge species (of which 321,660 records of 1,495 species are not flagged as potentially incorrect), which are provided under the FAIR principles of Findability, Accessibility, Interoperability and Reusability in the Darwin Core Standard. This dataset constitutes the most up-to-date baseline for studying demosponge diversity at the global scale, enabling researchers to examine biodiversity patterns (e.g., species richness and endemicity), and forecast potential distributional shifts under future scenarios of climate change.

Keywords: Marine biodiversity, Foundational biodiversity information, Global biogeography, Demospongiae occurrence records, Sponges


Specifications Table

Subject Biodiversity
Specific subject area Marine macroecology, marine biogeography, biodiversity data, marine conservation and management, climate change assessments
Data format Excel files (Raw, Filtered)
Type of data Table, Chart, Graph, Figure
Data collection Georeferenced occurrence records of the class Demospongiae, dereplicated, taxonomically curated, flagged for potentially incorrect entries regarding each species’ bathymetric and geographic distributions based on expert knowledge available in major databases of biological traits, and standardized with Darwin Core Standard. Data were processed using R statistical computing software, version 4.2.2 (2023).
Data source location Institution: CCMAR- Centre of Marine Sciences
City/Town/Region: Faro, Algarve
Country: Portugal
Occurrence records of demosponge species compiled from the biodiversity information facilities:
(1) Ocean Biodiversity Information System (https://obis.org)
(2) Global Biodiversity Information Facility (https://www.gbif.org)
(3) Deep-Sea Coral & Sponge Map Portal, National Oceanic and Atmospheric Administration (https://www.ncei.noaa.gov/maps/deep-sea-corals/mapSites.htm)
(4) National Biodiversity Network, NBN atlas (https://nbnatlas.org/)
(5) Vulnerable Marine Ecosystems, International Council for the Exploration of the Sea (https://vme.ices.dk/download.aspx)
(6) PANGAEA – Data Publisher for Earth & Environmental Science (https://www.pangaea.de)
(7) BioTIME, A database of biodiversity time series for the Anthropocene (https://biotime.st-andrews.ac.uk)
(8) Integrated Digitized Biocollections (https://www.idigbio.org)
(9) European Marine Observation and Data Network (EMODnet) – Data Ingestion Portal (https://www.emodnet-ingestion.eu/)
(10) Aquamaps, a global online database containing standardized distribution maps for marine species (https://aquamaps.org)
Expert knowledge of demosponge species compiled from the biodiversity information facility:
(10) Aquamaps, a global online database containing standardized distribution maps for marine species (https://aquamaps.org)
(11) SeaLifeBase, a global online database of information about marine life (https://www.sealifebase.ca)
Data accessibility Repository name: Figshare
Data identification number: 10.6084/m9.figshare.24591012
Direct URL to data: https://doi.org/10.6084/m9.figshare.24591012

1. Value of the Data

  • The most up-to-date dataset of demosponge distribution records at a global scale. Marine sponges are keystone components of marine benthic communities, promoting biodiversity through the provisioning of habitat for numerous organisms, and influencing nutrient cycling [1]. Additionally, they constitute a valuable source of natural products with various applications in biomedical research, pharmaceuticals, and biotechnology [2]. Yet, sponges face numerous threats from environmental changes and human activities, including deep-sea industrialization and fishing. Given their ecological role and sensitivity to human disturbances, sponges are considered indicator species of Vulnerable Marine Ecosystems (VMEs) in the deep sea [3].

  • The dataset is curated, ensuring that records are dereplicated and standardized taxonomically. It includes flags for potentially incorrect records and it is made available under the FAIR principle in Darwin Core Standard. This facilitates smooth integration into statistical analyses and promotes interoperability across biodiversity datasets.

  • The dataset serves as a foundational reference for describing species distributions at the global scale and exploring niche-related inquiries, which comprise projections of climate-induced range shifts across space and time [4]. It can also be used in modelling applications to identify suitable habitats of overlooked species and assist in locating VME in poorly known regions [3,5].

  • The dataset can assist researchers in tackling priority questions associated with demosponge macroecology, biogeography and climate change responses and impacts. It can assist in unveiling biodiversity patterns such as endemicity centers and species richness hotspots [6], which together can support the implementation of well-informed strategies for conserving, managing, and restoring marine biodiversity.

2. Background

Macroecology, biogeography and conservation research rely heavily on complete and precise occurrence data describing the distribution of species [7]. Although open-access biodiversity databases like the Ocean Biodiversity Information System [8] provide access to such information, they often contain spatial and taxonomic errors and can be incomplete. Additionally, the presence of duplicated data in various formats hampers seamless integration and interoperability [9]. Here, we provide a dataset of demosponge distribution records at the global scale, comprising dereplicated records of 1816 taxonomically standardized species and incorporating a quality control system flagging potentially incorrect records [10]. Data are made available under the FAIR principle of Findability, Accessibility, Interoperability and Reusability in the Darwin Core Standard [11].

3. Data Description

The dataset of occurrence records of species belonging to the class Demospongiae is provided in Excel format. Rows refer to occurrence records and columns are compatible with the data fields of Darwin Core Standard [11], with a focus on the date, source, location of records, taxonomy, and finally quality flag of records (Table 1).

Table 1.

Data fields of the global dataset of demosponge distribution records (Additional information on Darwin Core Standard [11]: https://dwc.tdwg.org).

Field Description
aphiaID Identifier of the taxon, linked to the World Register of Marine Species
scientificName Name of the taxon, as originally reported
acceptedName Accepted name of the taxon, retrieved from the World Register of Marine Species
kingdom Higher taxonomic classification
phylum Higher taxonomic classification
class Higher taxonomic classification
order Higher taxonomic classification
family Higher taxonomic classification
genus Higher taxonomic classification
decimalLongitude Geographical longitude in decimal degrees of the record's location
decimalLatitude Geographical latitude in decimal degrees of the record's location
coordinateUncertaintyInMeters Distance (in meters) from the decimalLatitude and decimalLongitude that describes the center of the circle containing the record's location
depthAccuracy Depth uncertainty of the record (in meters), as originally reported
locality Name of the record's location
minimumDepthInMeters Minimum depth of the record (in meters), as originally reported
maximumDepthInMeters Maximum depth of the record (in meters), as originally reported
year Four-digit year in which the observation occurred
month Two-digit month in which the observation occurred
day Two-digit day in which the observation occurred
bibliographicCitation Bibliographic reference of the record
license “A legal document giving official permission to do something with the resource”
georeferenceProtocol A description or reference to the methods used to determine the spatial footprint, coordinates, and uncertainties.
scientificNameAuthorship Authorship information for the scientificName
taxonomicStatus The status of the use of the scientificName as a label for a taxon.
coordinatePrecision A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude.
country The name of the country or major administrative unit in which the Location occurs
individualCount The number of individuals present at the time of the Occurrence.
basisOfRecord The specific nature of the data record.
measurementOrFact Quality control based on the flagging system: flagGeographicRange ‘-1’ for records outside the known geographic distribution of the species; flagVerticalRange ‘-1’ for records outside the known depth range of the species; flagLand ‘-1’ for records over land

At first, 4,776,338 occurrence records of species belonging to the class Demospongiae were gathered from online biodiversity databases. Records were taxonomically standardized using the World Register of Marine Species, and duplicated and non-georeferenced records were removed. This resulted in a dataset with 417,626 records of 1816 species. Expert knowledge on the bathymetric and geographical distribution of species belonging to the class Demospongiae was gathered from SeaLifeBase [12], an online database with information about marine life, and Aquamaps [13], a database providing expert-curated species range maps. Only species with current expert knowledge were further considered. Occurrence records falling outside the known bathymetric and geographical distribution, as well as on land, were then flagged as potentially incorrect. Removing them resulted in a pruned dataset with 321,660 records of 1495 species belonging to 257 genera, 86 families and 21 orders of the Class Demospongiae (Table 2, Fig. 1), covering the period from 1776 to 2023 (Fig. 2) and a depth range from 0 to 4820 m [14].

Table 2.

Number of species, records and flagged records falling (1) over land or out of the known (2) bathymetric and (3) geographical distribution. Numbers in parentheses represent percentages.

Order | Species | Records | Flagged: on land | Flagged: bathymetric range | Flagged: geographical range
Agelasida | 26 | 5368 | 3 (0.06) | 2184 (40.69) | 103 (1.92)
Axinellida | 116 | 29,015 | 108 (0.37) | 4047 (13.95) | 3338 (11.5)
Biemnida | 30 | 1692 | 3 (0.18) | 314 (18.56) | 185 (10.93)
Bubarida | 38 | 4994 | 21 (0.42) | 687 (13.76) | 356 (7.13)
Chondrillida | 17 | 5909 | 38 (0.64) | 539 (9.12) | 639 (10.81)
Chondrosiida | 5 | 4202 | 72 (1.71) | 45 (1.07) | 171 (4.07)
Clionaida | 72 | 46,730 | 369 (0.79) | 2246 (4.81) | 2316 (4.96)
Dendroceratida | 27 | 4627 | 44 (0.95) | 511 (11.04) | 1025 (22.15)
Desmacellida | 9 | 89 | - | 19 (21.35) | 29 (32.58)
Dictyoceratida | 107 | 41,168 | 223 (0.54) | 4137 (10.05) | 4061 (9.86)
Haplosclerida | 292 | 37,001 | 443 (1.2) | 8683 (23.47) | 2951 (7.98)
Merliida | 9 | 77 | - | 16 (20.78) | 15 (19.48)
Poecilosclerida | 518 | 69,566 | 241 (0.35) | 7598 (10.92) | 8297 (11.93)
Polymastiida | 34 | 19,977 | 54 (0.27) | 2623 (13.13) | 1845 (9.24)
Scopalinida | 13 | 2659 | 4 (0.15) | 689 (25.91) | 209 (7.86)
Sphaerocladina | 1 | 13 | - | 7 (53.85) | -
Suberitida | 122 | 73,795 | 575 (0.78) | 5111 (6.93) | 8948 (12.13)
Tethyida | 46 | 6694 | 60 (0.9) | 1306 (19.51) | 1140 (17.03)
Tetractinellida | 303 | 51,453 | 89 (0.17) | 11,380 (22.12) | 12,308 (23.92)
Trachycladida | 2 | 170 | 2 (1.18) | 2 (1.18) | 143 (84.12)
Verongiida | 29 | 12,427 | 173 (1.39) | 2575 (20.72) | 1768 (14.23)
Total | 1816 | 417,626 | 2522 (0.60) | 54,719 (13.10) | 49,847 (11.94)
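
Each percentage in Table 2 is the number of flagged records divided by the number of records of that order. The short Python sketch below reproduces this arithmetic for three rows of the table. The function name `flag_percentage` and the layout of `rows` are illustrative choices, not part of the published dataset.

```python
def flag_percentage(flagged: int, records: int) -> float:
    """Share of flagged records, in percent, rounded to two decimals."""
    return round(100 * flagged / records, 2)

# (records, on land, bathymetric range, geographical range), copied from Table 2
rows = {
    "Agelasida": (5368, 3, 2184, 103),
    "Chondrosiida": (4202, 72, 45, 171),
    "Trachycladida": (170, 2, 2, 143),
}

# One tuple of three percentages per order, in the same column order as Table 2
percentages = {
    order: tuple(flag_percentage(n, records) for n in flags)
    for order, (records, *flags) in rows.items()
}

for order, values in percentages.items():
    print(order, values)
```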

Fig. 1.

Global map of demosponge records. Points in orange represent occurrences flagged as correct, while points in purple indicate potentially incorrect records, i.e. records falling outside the species' known geographical and bathymetric ranges and/or on land.

Fig. 2.

Number of demosponge (a) records and (b) species available in the dataset per year. Data are available from 1776 onwards; to improve visualization, the few records before 1900 were removed from the graph.

The global dataset of demosponge distribution records [10] is publicly available in a permanent repository (https://doi.org/10.6084/m9.figshare.24591012) containing 2 main Excel files:

  • (1) The flagged database, comprising all records.

  • (2) The pruned database, comprising only records that fall within each species' known geographic and bathymetric distribution ranges and are not located over land.
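
The relation between the two files can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: it assumes each record is available as a dictionary in which the three flags named in Table 1 appear as separate keys holding `-1` when set, whereas the actual serialization of the measurementOrFact field in the Excel files may differ. It also ignores the additional restriction to species with expert knowledge. The species names are placeholders.

```python
# Flag names as listed in Table 1; storing them as separate keys is an assumption.
FLAG_FIELDS = ("flagGeographicRange", "flagVerticalRange", "flagLand")

def is_flagged(record: dict) -> bool:
    """True when any quality flag of the record is set to -1."""
    return any(record.get(field) == -1 for field in FLAG_FIELDS)

def prune(records: list[dict]) -> list[dict]:
    """Keep only the records that carry no quality flag."""
    return [r for r in records if not is_flagged(r)]

# Placeholder records standing in for rows of the flagged database
flagged_db = [
    {"acceptedName": "Species A", "flagLand": -1},
    {"acceptedName": "Species B"},
    {"acceptedName": "Species C", "flagVerticalRange": -1, "flagGeographicRange": -1},
]
pruned_db = prune(flagged_db)
print([r["acceptedName"] for r in pruned_db])
```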

4. Experimental Design, Materials and Methods

The collection and curation steps of the global dataset of demosponge distribution records follow previous studies [9,15] and are detailed below.

  • Step 1. Collating the list of sponge species belonging to the Class Demospongiae

The taxonomy of sponges covers a broad spectrum of species. The scope of this dataset is focused on marine species of the class Demospongiae, the largest sponge class comprising 81% of all sponges [10]. A list of taxonomically accepted species of the class Demospongiae was collated from the World Register of Marine Species (WoRMS) [16] and was used to search for occurrence records. WoRMS is an authoritative reference system for marine species that offers a unique identification code (aphiaID) associated with a standardized accepted name, and related taxonomic information.

  • Step 2. Acquisition of occurrence records

Occurrence records of the targeted species were collected from 10 major online biodiversity databases: (1) Ocean Biodiversity Information System [8], (2) Global Biodiversity Information Facility [17], (3) Deep-Sea Coral & Sponge Map Portal, National Oceanic and Atmospheric Administration [18], (4) National Biodiversity Network, NBN atlas [19], (5) Vulnerable Marine Ecosystems, International Council for the Exploration of the Sea [20], (6) PANGAEA – Data Publisher for Earth & Environmental Science [21], (7) BioTIME, A database of biodiversity time series for the Anthropocene [22], (8) Integrated Digitized Biocollections [23], (9) European Marine Observation and Data Network, Data Ingestion Portal [24], (10) Aquamaps [13]. The original source of each record is reported in the respective fields of the Darwin Core Standard.

The dataset exclusively contains occurrence records that are either copyright-free and unrestricted for use or allow any use with appropriate attribution (e.g., CC0 or CC BY, www.creativecommons.org).

  • Step 3. Taxonomic curation

Taxonomic standardization was performed for each entry using WoRMS [16]. Entries with a status other than accepted were matched with the currently valid species names. Records that did not belong to the class Demospongiae were discarded from the dataset.
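
The curation logic can be illustrated with a minimal Python sketch. It assumes a local lookup table standing in for an export of WoRMS; the table layout, the function name `curate` and the taxon names (`Examplea prima`, etc.) are invented for illustration and do not refer to real species or to the WoRMS API. Dropping names that are absent from the lookup table is likewise an assumption of this sketch.

```python
# Hypothetical lookup: reported name -> (status, accepted name, class)
TAXONOMY = {
    "Examplea prima": ("accepted", "Examplea prima", "Demospongiae"),
    "Examplea secunda": ("unaccepted", "Examplea prima", "Demospongiae"),
    "Otherus tertius": ("accepted", "Otherus tertius", "Calcarea"),
}

def curate(records, taxonomy, target_class="Demospongiae"):
    """Attach accepted names and discard records outside the target class."""
    curated = []
    for rec in records:
        entry = taxonomy.get(rec["scientificName"])
        if entry is None:
            continue  # name not found in the reference list (assumed to be dropped)
        status, accepted, taxon_class = entry
        if taxon_class != target_class:
            continue  # not a demosponge
        curated.append({**rec, "acceptedName": accepted, "taxonomicStatus": status})
    return curated

raw = [
    {"scientificName": "Examplea prima"},
    {"scientificName": "Examplea secunda"},
    {"scientificName": "Otherus tertius"},
    {"scientificName": "Unknownus nomen"},
]
curated = curate(raw, TAXONOMY)
print([(r["scientificName"], r["acceptedName"]) for r in curated])
```

The original name is kept in `scientificName` and the valid name is stored in `acceptedName`, mirroring the two fields described in Table 1.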

  • Step 4. Pruning of occurrence records

Records lacking coordinate information were discarded from the dataset. Additionally, duplicate records of the same species sharing the same spatial (longitude, latitude, depth) and temporal (year, month, day) information were discarded from the dataset.
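
A minimal Python sketch of this pruning step follows. It is an illustration rather than the authors' R workflow: the choice of Darwin Core fields forming the duplicate key (in particular using both depth fields) and the function name `dereplicate` are assumptions, and the species name is a placeholder.

```python
# Fields assumed to define a duplicate: species, position, depth and date
KEY_FIELDS = (
    "acceptedName", "decimalLongitude", "decimalLatitude",
    "minimumDepthInMeters", "maximumDepthInMeters",
    "year", "month", "day",
)

def dereplicate(records):
    """Drop records without coordinates, then keep the first of each duplicate set."""
    seen = set()
    kept = []
    for rec in records:
        if rec.get("decimalLongitude") is None or rec.get("decimalLatitude") is None:
            continue  # non-georeferenced record
        key = tuple(rec.get(field) for field in KEY_FIELDS)
        if key in seen:
            continue  # same species, place, depth and date already kept
        seen.add(key)
        kept.append(rec)
    return kept

records = [
    {"acceptedName": "Examplea prima", "decimalLongitude": -8.0, "decimalLatitude": 37.0, "year": 2001},
    {"acceptedName": "Examplea prima", "decimalLongitude": -8.0, "decimalLatitude": 37.0, "year": 2001},
    {"acceptedName": "Examplea prima", "decimalLongitude": None, "decimalLatitude": 37.0, "year": 2001},
    {"acceptedName": "Examplea prima", "decimalLongitude": -8.0, "decimalLatitude": 37.0, "year": 2002},
]
unique = dereplicate(records)
print(len(unique))
```

Missing fields are read as `None`, so two records that both lack a depth or a date are still treated as duplicates when all remaining key fields match.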

  • Step 5. Quality control flagging of occurrence records

The large volume of records requires the establishment of a quality control system that can flag potentially incorrect records, which could inadvertently be propagated across repositories via automatic interoperability, despite their source being considered reliable [9]. To address this concern, a quality control protocol, as outlined by Assis et al., 2020 [9,15], was applied to flag records on land and/or with geographical and depth distributions outside currently known species information.

Records over land were identified with a land polygon provided by Natural Earth [25], a public domain map dataset available at several scales. Here, the 1:10 million scale layer (labelled 1:10m) was used as a reference. The criterion for flagging records was based on a 1 km Euclidean distance from the ocean, as in Assis et al., 2020 [9].

Additionally, the depth of each record was extracted from the General Bathymetric Chart of the Oceans, a global terrain model providing elevation data, in meters, on a 15 arc-second interval grid [26]. The depth values were compared to the known bathymetric distribution of the corresponding species based on expert knowledge provided by SeaLifeBase [12] and Aquamaps [13]. More specifically, records were flagged when their depth values fell outside the known bathymetric range. Likewise, the geographical location of each record (longitude and latitude) was compared with the expert knowledge for the corresponding species provided by SeaLifeBase [12] and Aquamaps [13]. Known geographical locations were reported in the form of Food and Agriculture Organization (FAO) Major Fishing Areas [27].
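
The three checks can be summarized in a short Python sketch. It is a simplified illustration under several assumptions: the depth of the record is given as a positive value in meters, the FAO Major Fishing Area code of the record and its on-land status have already been derived from the coordinates (the spatial overlay itself is not shown), and range bounds are treated as inclusive. The names `ExpertRange` and `flag_record` are hypothetical; the flag names and the value `-1` follow Table 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExpertRange:
    """Expert knowledge for one species (hypothetical structure)."""
    min_depth_m: float
    max_depth_m: float
    fao_areas: frozenset  # FAO Major Fishing Area codes where the species is known

def flag_record(depth_m, fao_area, on_land, expert):
    """Return the quality flags (-1) raised for a single record."""
    flags = {}
    if on_land:
        flags["flagLand"] = -1
    if not (expert.min_depth_m <= depth_m <= expert.max_depth_m):
        flags["flagVerticalRange"] = -1
    if fao_area not in expert.fao_areas:
        flags["flagGeographicRange"] = -1
    return flags

# Illustrative species known from 0 to 200 m in FAO areas 27 and 37
expert = ExpertRange(min_depth_m=0, max_depth_m=200, fao_areas=frozenset({27, 37}))

inside = flag_record(depth_m=50, fao_area=27, on_land=False, expert=expert)
outside = flag_record(depth_m=900, fao_area=61, on_land=False, expert=expert)
print(inside, outside)
```

Flags are independent of one another, so a single record can be flagged for depth, geography and land at the same time, which is why the flagged counts in Table 2 can overlap.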

  • Step 6. Dataset format standardization

The dataset was aligned with the Darwin Core Standard, which provides a framework comprising identifiers, labels, and specific definitions to facilitate the exchange of information about biodiversity [11]. The dataset provides standardized information for each record, on source, taxonomy, date, location, depth and quality flag (Table 1).
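
In practice, such an alignment amounts to renaming source-specific columns to Darwin Core terms. The Python sketch below shows the idea; the source column names on the left-hand side of `FIELD_MAP` are invented for illustration, while the Darwin Core terms on the right-hand side are those listed in Table 1. Dropping unmapped columns is a simplification of this sketch.

```python
# Hypothetical source column names mapped to Darwin Core terms from Table 1
FIELD_MAP = {
    "lon": "decimalLongitude",
    "lat": "decimalLatitude",
    "species": "scientificName",
    "depth_min": "minimumDepthInMeters",
    "depth_max": "maximumDepthInMeters",
}

def to_darwin_core(raw: dict) -> dict:
    """Rename known columns to Darwin Core terms and drop unmapped columns."""
    return {FIELD_MAP[key]: value for key, value in raw.items() if key in FIELD_MAP}

raw_record = {"lon": -8.0, "lat": 37.0, "species": "Examplea prima", "notes": "n/a"}
dwc_record = to_darwin_core(raw_record)
print(dwc_record)
```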

Limitations

The dataset has two main limitations. First, its taxonomic curation was based on the information available in WoRMS [16]. Because taxonomic statuses change as new species are discovered and described, WoRMS may not yet contain all recent updates. Second, the quality control flagging was based on expert knowledge provided by SeaLifeBase [12] and Aquamaps [13], which may change as more information becomes available.

Ethics statement

The present work complies with ethical requirements and does not involve human subjects, animal experiments, or any data collected from social media platforms. No permission was required to use the primary data sources, as they were either copyright-free and unrestricted to use or allowed any use with appropriate attribution.

CRediT authorship contribution statement

Ariadni Vafeiadou: Conceptualization, Data curation, Writing – original draft. Eliza Fragkopoulou: Conceptualization, Writing – original draft. Jorge Assis: Conceptualization, Data curation, Writing – original draft, Supervision.

Acknowledgments

This study was funded by (1) the Horizon Europe Framework Programme through project MPAEurope (HORIZON-CL6-2021-BIODIV-01-12) and (2) the Portuguese National Funds from FCT – Foundation for Science and Technology through projects UIDB/04326/2020 (DOI:10.54499/UIDB/04326/2020), UIDP/04326/2020 (DOI:10.54499/UIDP/04326/2020), LA/P/0101/2020 (DOI:10.54499/LA/P/0101/2020), PTDC/BIA-CBI/6515/2020 (DOI:10.54499/PTDC/BIA-CBI/6515/2020), the Individual Call to Scientific Employment Stimulus 2022.00861.CEECIND/CP1729/CT0003 (DOI:10.54499/2022.00861.CEECIND/CP1729/CT0003) to J.A., and the fellowship SFRH/BD/144878/2019 to E.F.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Contributor Information

Eliza Fragkopoulou, Email: efragkopoulou@ualg.pt.

Jorge Assis, Email: jorgemfa@gmail.com.

Data Availability

References

  • 1. Bell J.J. The functional roles of marine sponges. Estuar. Coast. Shelf Sci. 2008;79(3):341–353.
  • 2. Mehbub M.F., Lei J., Franco C., Zhang W. Marine sponge derived natural products between 2001 and 2010: trends and opportunities for discovery of bioactives. Mar. Drugs. 2014;12(8):4539–4577. doi: 10.3390/md12084539.
  • 3. Ardron J.A., Clark M.R., Penney A.J., Hourigan T.F., Rowden A.A., Dunstan P.K., Watling L., Shank T.M., Tracey D.M., Dunn M.R., Parker S.J. A systematic approach towards the identification and protection of vulnerable marine ecosystems. Mar. Policy. 2014;49. doi: 10.1016/j.marpol.2013.11.017.
  • 4. Boavida J., Assis J., Silva I., Serrão E.A. Overlooked habitat of a vulnerable gorgonian revealed in the Mediterranean and Eastern Atlantic by ecological niche modelling. Sci. Rep. 2016;6. doi: 10.1038/srep36460.
  • 5. Assis J., Serrão E.A., Claro B., Perrin C., Pearson G.A. Climate-driven range shifts explain the distribution of extant gene pools and predict future loss of unique lineages in a marine brown alga. Mol. Ecol. 2014;23:2797–2810. doi: 10.1111/mec.12772.
  • 6. Fragkopoulou E., Serrão E.A., de Clerck O., Costello M.J., Araújo M.B., Duarte C.M., Krause-Jensen D., Assis J. Global biodiversity patterns of marine forests of brown macroalgae. Glob. Ecol. Biogeogr. 2022;31:636–648. doi: 10.1111/geb.13450.
  • 7. Aubry K.B., Raley C.M., McKelvey K.S. The importance of data quality for generating reliable distribution models for rare, elusive, and cryptic species. PLoS One. 2017;12. doi: 10.1371/journal.pone.0179152.
  • 8. OBIS, Ocean Biodiversity Information System. https://obis.org (Accessed March 2023).
  • 9. Assis J., Fragkopoulou E., Frade D., Neiva J., Oliveira A., Abecasis D., Faugeron S., Serrão E.A. A fine-tuned global distribution dataset of marine forests. Sci. Data. 2020;7:119. doi: 10.1038/s41597-020-0459-x.
  • 10. Vafeiadou A., Fragkopoulou E., Assis J. Global demosponge diversity dataset. Figshare Dataset. 2023. doi: 10.6084/m9.figshare.24591012.
  • 11. Darwin Core Maintenance Group. List of Darwin Core terms. Biodiversity Information Standards (TDWG). 2021. http://rs.tdwg.org/dwc/doc/list/2021-07-15
  • 12. Palomares M.L.D., Pauly D. SeaLifeBase. World Wide Web electronic publication. www.sealifebase.org, version (04/2023).
  • 13. Kaschner K., Kesner-Reyes K., Garilao C., Segschneider J., Rius-Barile J., Rees T., Froese R. AquaMaps: Predicted range maps for aquatic species. Retrieved from https://www.aquamaps.org (October 2019).
  • 14. Morrow C., Cárdenas P. Proposal for a revised classification of the Demospongiae (Porifera). Front. Zool. 2015;12:1–27. doi: 10.1186/s12983-015-0099-8.
  • 15. Balogh V., Fragkopoulou E., Serrão E.A., Assis J. A dataset of cold-water coral distribution records. Data Br. 2023;48. doi: 10.1016/j.dib.2023.109223.
  • 16. WoRMS Editorial Board. World Register of Marine Species. 2023. Available from https://www.marinespecies.org at VLIZ (Accessed September 2023).
  • 17. GBIF: The Global Biodiversity Information Facility. What is GBIF? 2023. Available from https://www.gbif.org/what-is-gbif (Accessed March 2023).
  • 18. Deep-Sea Coral & Sponge Map Portal, National Oceanic and Atmospheric Administration. https://www.ncei.noaa.gov/maps/deep-sea-corals/mapSites.htm (Accessed January 2023).
  • 19. National Biodiversity Network, NBN atlas. https://nbnatlas.org/ (Accessed January 2023).
  • 20. Vulnerable Marine Ecosystems, International Council for the Exploration of the Sea. https://vme.ices.dk/download.aspx (Accessed January 2023).
  • 21. PANGAEA – Data Publisher for Earth & Environmental Science. https://www.pangaea.de (Accessed January 2023).
  • 22. BioTIME, A database of biodiversity time series for the Anthropocene. https://biotime.st-andrews.ac.uk (Accessed January 2023).
  • 23. Integrated Digitized Biocollections. https://www.idigbio.org (Accessed January 2023).
  • 24. European Marine Observation and Data Network (EMODnet) – Data Ingestion Portal. https://www.emodnet-ingestion.eu/ (Accessed January 2023).
  • 25. Natural Earth. https://www.naturalearthdata.com/ (Accessed May 2023).
  • 26. GEBCO Compilation Group. GEBCO 2022 Grid. 2022. (Accessed May 2023).
  • 27. FAO. FAO Major Fishing Areas. Fisheries and Aquaculture Division [online]. Rome. 2023. (Accessed May 2023).


Articles from Data in Brief are provided here courtesy of Elsevier