Explicit relationship types
Applications that demand highly accurate results require mapping relations with explicit precision and semantics
EC:2.2.1.2 exactMatch GO:0004801 (transaldolase activity)
Two-column file that maps FMA ‘limb’ to Uberon ‘limb’, hiding differences in species-specificity |
Explicit confidence |
Different use cases require different levels of confidence and accuracy |
A mapping tool assigns a confidence score based on the amount of evidence that is explicitly recorded |
Without the confidence score we cannot filter out automated mappings with low confidence |
Provenance |
Understanding how a mapping was created (e.g. automatically or by a human expert curator) is crucial to interpreting it |
Mapping file that annotates automated mappings with a link to the tool used; curated mapping file with curators’ ORCIDs provided
Two-column mapping file with no indication of how the mapping was made, and no supplementary metadata file |
Explicit declaration of completeness |
Users must be able to distinguish the absence of a mapping due to lack of information from a deliberate omission
Mapping file where rejected mappings are explicitly recorded |
Mapping file where the absence of a mapping can mean either that the mapping was explicitly rejected OR that it was not considered/reviewed
FAIR principles |
Mappings should be Findable, Accessible, Interoperable and Reusable |
Mapping file available on the web with clear licensing conditions, in standard format, with full metadata and a persistent identifier |
Mapping files exchanged via email |
Unambiguous identifiers |
Mappings should make use of standard, globally unambiguous identifiers such as CURIEs or IRIs
Standard ontology CURIEs like UBERON:0002101 for entities, with prefixes registered in a registry or as part of the metadata |
Identifiers are used without explicitly defined prefixes; mappings are created between strings rather than identifiers |
Allows composability |
Mappings from different sources should be combinable, and it should be possible to chain mappings together
Defined mapping predicates (relations) such that reasoning about chains A -> B -> C is possible (where allowed by the semantics of the predicate)
Two mapping files with implicit or undefined relationships -> unclear whether these can be combined or composed |
Follows Linked Data principles |
Allows interoperation with semantic data tooling, facilitates data merging |
All mapped entities have URIs, and metadata elements also have defined URIs; available in JSON-LD/RDF |
No reuse of existing vocabularies for metadata or for relating mapped entities |
Well-described data model |
Allows interoperation and standard tooling |
Data model provided in both human and machine-readable form |
Ad hoc file format with unclear semantics |
Tabular representation |
Ease of curation and rapid analysis |
A mapping available as a TSV that is directly usable in common data science frameworks; may complement a richer serialization |
Ad hoc flat-file format requiring a custom parser |
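Several of the features above (explicit relationship types, explicit confidence, provenance, composability and a tabular representation) can be illustrated together in a minimal sketch. The column names, the `skos:` prefix on the predicate, the rows other than the EC/GO example, and the rule of multiplying confidences when chaining are all assumptions made for illustration; they are not a prescribed format or method.

```python
import csv
import io

# Hypothetical mapping table. Column names and every row except the EC/GO
# example are invented for illustration only.
MAPPINGS_TSV = (
    "subject_id\tpredicate_id\tobject_id\tconfidence\tjustification\n"
    "EC:2.2.1.2\tskos:exactMatch\tGO:0004801\t0.95\tmanual curation\n"
    "A:1\tskos:exactMatch\tB:1\t0.90\tmanual curation\n"
    "B:1\tskos:exactMatch\tC:1\t0.80\tlexical matching\n"
    "A:2\tskos:closeMatch\tB:2\t0.40\tlexical matching\n"
)

# Predicates whose semantics permit chaining. skos:exactMatch is defined as
# transitive in SKOS; skos:closeMatch is not.
TRANSITIVE = {"skos:exactMatch"}


def load_mappings(text):
    """Parse a TSV string into a list of dicts with numeric confidence."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    for row in rows:
        row["confidence"] = float(row["confidence"])
    return rows


def filter_by_confidence(rows, threshold):
    """Keep only mappings whose confidence is at or above the threshold."""
    return [row for row in rows if row["confidence"] >= threshold]


def compose(rows):
    """Derive A -> C from A -> B and B -> C for shared transitive predicates."""
    derived = []
    for first in rows:
        for second in rows:
            if (
                first["object_id"] == second["subject_id"]
                and first["predicate_id"] == second["predicate_id"]
                and first["predicate_id"] in TRANSITIVE
                and first["subject_id"] != second["object_id"]
            ):
                derived.append({
                    "subject_id": first["subject_id"],
                    "predicate_id": first["predicate_id"],
                    "object_id": second["object_id"],
                    # Assumption: confidences are combined by multiplication.
                    "confidence": first["confidence"] * second["confidence"],
                    "justification": "composed from two mappings",
                })
    return derived


if __name__ == "__main__":
    mappings = load_mappings(MAPPINGS_TSV)
    for row in filter_by_confidence(mappings, 0.5):
        print(row["subject_id"], row["predicate_id"], row["object_id"])
    for row in compose(mappings):
        print("derived:", row["subject_id"], "->", row["object_id"])
```

Because every row carries an explicit predicate, the chaining step can refuse to compose relations that do not permit it. A two-column file with implicit relations offers no such check, and without the confidence column the low-confidence lexical match could not be filtered out.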