. 2018 Mar 20;20(4):1477–1491. doi: 10.1093/bib/bby015

Table 2.

Summary of IR algorithms

| Algorithm | Scoring | Global | WPM | Remarks |
|---|---|---|---|---|
| tf-idf | Term frequency | No | No | Resources that are frequent in the collection receive a low score. In ontologies, a common term is not necessarily less relevant: frequent terms can be a product of reuse by other ontologies |
| BM25 | Term frequency | Yes | No | Suffers from the same issue as tf-idf, but the cumulative score ranks domain ontologies higher |
| VSM | Vector similarity | No | No | Uses tf-idf to weight vectors and also considers the tf-idf of the query, aggravating the tf-idf drawback |
| PageRank | Links between ontologies | Yes | No | Ranks based on popularity, which may lead to popular but less relevant resources being ranked higher |
| CMM | Coverage of the set of queries | Yes | Yes | Ontologies with a large number of partial matches will be scored higher than ontologies with few exact matches |
| SMM | Closeness between ontological resources | Yes | No | Useful for multi-keyword queries, where it considers the similarity among the resources matched by two or more query terms, but it performs poorly on single-word queries |

Note: Scoring summarizes the main scoring method of the algorithm. Global indicates whether the algorithm attributes its score per ontology (Yes) or per resource (No). WPM (weights partial matches) shows whether the algorithm distinguishes between partial and exact matches.
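
Two of the remarks in the table can be illustrated with a small numerical sketch. The code below is not taken from the article: it uses a textbook tf-idf formulation and assumes, for illustration only, that CMM is a weighted sum of exact and partial match counts. The toy collection, the function names and the weights `alpha` and `beta` are all hypothetical.

```python
import math

def idf(term, collection):
    """Inverse document frequency: log(N / number of ontologies containing the term)."""
    n_containing = sum(1 for labels in collection.values() if term in labels)
    return math.log(len(collection) / n_containing) if n_containing else 0.0

def tf_idf(term, labels, collection):
    """Textbook tf-idf of a term within one ontology's list of labels."""
    tf = labels.count(term) / len(labels)
    return tf * idf(term, collection)

def cmm(exact, partial, alpha=0.6, beta=0.4):
    """Assumed CMM form: weighted sum of exact and partial match counts.
    The weights are illustrative placeholders, not values from the article."""
    return alpha * exact + beta * partial

# Hypothetical collection: each ontology is reduced to a list of term labels.
collection = {
    "onto_a": ["disease", "gene", "protein"],
    "onto_b": ["disease", "phenotype"],
    "onto_c": ["disease", "melanoma"],
}

# "disease" is reused by every ontology, so its idf (and tf-idf) collapses to zero,
# even though it may be highly relevant to the query.
common = tf_idf("disease", collection["onto_c"], collection)
rare = tf_idf("melanoma", collection["onto_c"], collection)

# One exact match versus five partial matches under the assumed weights.
few_exact = cmm(exact=1, partial=0)
many_partial = cmm(exact=0, partial=5)

print(f"tf-idf common term: {common:.3f}, rare term: {rare:.3f}")
print(f"CMM few exact: {few_exact:.1f}, many partial: {many_partial:.1f}")
```

Under these assumptions the widely reused term scores lower than the rare one, and the ontology with only partial matches outscores the one with an exact match, which is the behaviour the tf-idf and CMM remarks describe.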