Table 4.
Feature | BeCAS [36] | Whatizit [38] | ConceptMapper [21] | Neji [40] |
---|---|---|---|---|
Modularity/configuration options | Semantic types (i.e. types of entities to annotate) | Pre-built pipelines for several biomedical types (see Specific features) | Text processing pipeline; term matching options and strategies | Modular text processing pipeline |
Disambiguation of terms | No information available | Not supported | Not supported | Does not perform WSD; instead, it applies a set of heuristic rules to identify and remove annotations of lower importance |
Vocabulary (terminology) | Custom-built vocabulary combining concepts from multiple sources, such as UMLS, NCBI BioSystems, ChEBI, and the Gene Ontology | The vocabulary used depends on the type of entity a pipeline is specialized for (e.g. NCBI KB for species, or Gene Ontology for genes) | General-purpose dictionary lookup tool, not tied to any vocabulary | Not tied to any particular vocabulary |
Speed^a | Suitable for real-time processing | Suitable for real-time processing | Suitable for real-time processing | Suitable for real-time processing |
Implementation form | Software (Python) library; RESTful Web service; JavaScript widget | SOAP Web service | Software (Java) library; part of the UIMA NLP framework [28] | RESTful Web service |
Availability | Open source; available under Attribution-NonCommercial 3.0 Unported license | Closed source, but freely available | Open source; available under Apache License, v.2.0 | Open source; available under Attribution-NonCommercial 3.0 Unported license |
Specific features | Primarily aimed at annotating biomedical research papers; focused on several (11) types of biomedical entities, including species, microRNAs, enzymes, chemicals, drugs, diseases, etc. | Offers several pre-built pipelines for specific entity types; e.g. whatizitGO identifies proteins based on the Gene Ontology (GO), while whatizitChemical annotates chemical entities based on ChEBI | Not specifically developed for the biomedical domain; it is a general-purpose dictionary lookup tool | Includes modules for both ML and dictionary-based annotation; can automatically combine annotations generated by different modules |
URL | http://bioinformatics.ua.pt/becas/ | http://www.ebi.ac.uk/webservices/whatizit | https://uima.apache.org/sandbox.html#concept.mapper.annotator | https://github.com/BMDSoftware/neji |
^a Speed estimates are based on experimental results reported in the literature. Those experiments used corpora of up to 200 documents (paper abstracts or clinical notes), so the estimates might not hold for significantly larger corpora.