2022 May 25;2022:baac035. doi: 10.1093/database/baac035

Table 1.

Desired features of a mapping standard, with examples of cases where the desired feature is met and examples where the desired feature is not met (negative examples)

| Feature | Why | Example | Negative example |
| --- | --- | --- | --- |
| Explicit relationship types | Applications that demand highly accurate results require mapping relations with explicit precision and semantics | EC:2.2.1.2 exactMatch GO:0004801 (transaldolase activity) | Two-column file that maps FMA ‘limb’ to Uberon ‘limb’, hiding differences in species-specificity |
| Explicit confidence | Different use cases require different levels of confidence and accuracy | A mapping tool assigns a confidence score based on the amount of evidence that is explicitly recorded | Without the confidence score, we cannot filter out automated mappings with low confidence |
| Provenance | Understanding how a mapping was created (e.g. automatically or by a human expert curator) is crucial to interpreting it | Mapping file that records automated mappings with a link to the tool used; curated mapping file with curators’ ORCIDs provided | Two-column mapping file with no indication of how the mapping was made, and no supplementary metadata file |
| Explicit declaration of completeness | Must be able to distinguish between absence due to lack of information and deliberate omission | Mapping file where rejected mappings are explicitly recorded | Mapping file where absence of a mapping can mean either that the mapping was explicitly rejected OR that it was not considered/reviewed |
| FAIR principles | Mappings should be Findable, Accessible, Interoperable and Reusable | Mapping file available on the web with clear licensing conditions, in a standard format, with full metadata and a persistent identifier | Mapping files exchanged via email |
| Unambiguous identifiers | Mappings should make use of standard, globally unambiguous identifiers such as CURIEs or IRIs | Standard ontology CURIEs like UBERON:0002101 for entities, with prefixes registered in a registry or as part of the metadata | Identifiers are used without explicitly defined prefixes; mappings are created between strings rather than identifiers |
| Allows composability | Mappings from different sources should be combinable, and it should be possible to chain mappings together | Defined mapping predicates (relations) such that reasoning about chains A -> B -> C is possible (where allowed by the semantics of the predicate) | Two mapping files with implicit or undefined relationships, so it is unclear whether these can be combined or composed |
| Follows Linked Data principles | Allows interoperation with semantic data tooling, facilitates data merging | All mapped entities have URIs, and metadata elements also have defined URIs; available in JSON-LD/RDF | No reuse of existing vocabularies for metadata or for relating mapped entities |
| Well-described data model | Allows interoperation and standard tooling | Data model provided in both human- and machine-readable form | Ad hoc file format with unclear semantics |
| Tabular representation | Ease of curation and rapid analysis | A mapping available as a TSV that is directly usable in common data science frameworks; may complement a richer serialization | Ad hoc flat-file format requiring a custom parser |
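
Several of the features in Table 1 (explicit relationship types, explicit confidence, composability and a tabular representation) can be illustrated together in a small sketch. The Python example below is illustrative only and is not taken from the paper: the column names (`subject_id`, `predicate_id`, `object_id`, `confidence`), the `skos:` predicate prefix, the `EX:` identifiers, the confidence values and the helper functions are all assumptions made for this example. Only the EC:2.2.1.2 to GO:0004801 exact match comes from the table. Multiplying confidences when chaining is likewise just one possible convention.

```python
import csv
import io

# Hypothetical mapping table. Column names, EX: identifiers and confidence
# values are assumptions for illustration; only the first pair of
# identifiers appears in Table 1.
MAPPINGS_TSV = (
    "subject_id\tpredicate_id\tobject_id\tconfidence\n"
    "EC:2.2.1.2\tskos:exactMatch\tGO:0004801\t0.95\n"
    "GO:0004801\tskos:exactMatch\tEX:0000001\t0.90\n"
    "EX:0000001\tskos:closeMatch\tEX:0000002\t0.80\n"
    "EX:0000003\tskos:exactMatch\tEX:0000004\t0.30\n"
)

# Predicates whose semantics allow chaining. skos:exactMatch is transitive
# in SKOS; skos:closeMatch is not, so it is deliberately left out.
CHAINABLE = {"skos:exactMatch"}


def load_mappings(text):
    """Parse a TSV string into a list of dicts with numeric confidence."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    for row in rows:
        row["confidence"] = float(row["confidence"])
    return rows


def filter_by_confidence(rows, threshold):
    """Keep only mappings whose explicit confidence meets the threshold."""
    return [row for row in rows if row["confidence"] >= threshold]


def compose(rows):
    """Derive A -> C from A -> B and B -> C for chainable predicates only."""
    derived = []
    for left in rows:
        for right in rows:
            if (
                left["object_id"] == right["subject_id"]
                and left["predicate_id"] == right["predicate_id"]
                and left["predicate_id"] in CHAINABLE
            ):
                derived.append(
                    {
                        "subject_id": left["subject_id"],
                        "predicate_id": left["predicate_id"],
                        "object_id": right["object_id"],
                        # Assumed convention: product of the two confidences.
                        "confidence": left["confidence"] * right["confidence"],
                    }
                )
    return derived


mappings = load_mappings(MAPPINGS_TSV)
trusted = filter_by_confidence(mappings, 0.5)  # drops the 0.30 mapping
chained = compose(trusted)

for m in chained:
    print(m["subject_id"], m["predicate_id"], m["object_id"])
```

With explicit predicates, the exact-match chain from EC:2.2.1.2 through GO:0004801 can be composed, while the close-match step is not chained because its semantics do not license it. In a two-column file with no predicate or confidence column, neither the filtering step nor this distinction would be possible.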