2023 Feb 14;3:1107467. doi: 10.3389/fbinf.2023.1107467

TABLE 2.

The strategies to measure CASBERT performance. There are three query sets and eight retrieval methods including BM25 as the gold standard.

Q-E type | # of query-entities (PMR-CA) | # of query-entities (BioModels-CA) | Description
--- | --- | --- | ---
noPredicate | 338 | 834 | The original query-entity pairs extracted from the PMR and the BioModels-CA.
withPredicate | 534 | 1541 | The noPredicate set, expanded by randomly adding terms from the composite annotation predicates to the associated existing query terms.
combine | 509 | 1777 | Combination of noPredicate and withPredicate, with the data used for QC model training removed.

Retrieval method | Terms used to generate query embedding | List of entity embeddings used
--- | --- | ---
macro | whole query terms | E_1 (w_p = 0)
macroWP | whole query terms | E_2 (0 < w_p < 1)
micro | ontology class concept phrase | E_1 (w_p = 0)
microWP | ontology class concept phrase | E_2 (0 < w_p < 1)
mixed | whole query terms & ontology class concept phrases | E_1 (w_p = 0)
mixedWP | whole query terms & ontology class concept phrases | E_2 (0 < w_p < 1)
mixedCl | whole query terms & ontology class concept phrases | select between L_1 or L_2
BM25 | Retrieval uses a bag-of-words method, BM25 |
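
For readers unfamiliar with the bag-of-words baseline in the last row, the following is a minimal sketch of Okapi BM25 scoring. The table does not state which BM25 variant or parameters were used, so the smoothed IDF form and the defaults `k1 = 1.5` and `b = 0.75` are assumptions. The function name `bm25_scores` and the toy entity descriptions are hypothetical and are not taken from the PMR or BioModels data.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy entity descriptions, invented for illustration only.
entities = [
    "concentration of calcium in cytosol".split(),
    "membrane potential of cardiac myocyte".split(),
    "flux of sodium through membrane".split(),
]
query = "calcium concentration".split()

scores = bm25_scores(query, entities)
# Entity indices ordered from best to worst match.
ranking = sorted(range(len(entities)), key=scores.__getitem__, reverse=True)
```

Because BM25 matches on exact tokens only, an entity whose description shares no term with the query scores zero. This is the kind of lexical gap that the embedding-based methods in the table are meant to address.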