Table 3. Evaluation results on UMNSRS datasets.
Method | Corpus | UMNSRS-Sim # | UMNSRS-Sim Pearson | UMNSRS-Sim Spearman | UMNSRS-Rel # | UMNSRS-Rel Pearson | UMNSRS-Rel Spearman |
---|---|---|---|---|---|---|---|
Mikolov et al.1 | Google News | 336 | 0.421 | 0.409 | 329 | 0.359 | 0.347 |
Pyysalo et al.21 | PubMed + PMC | 493 | 0.549 | 0.524 | 496 | 0.495 | 0.488 |
Chiu et al.8 | PubMed | 462 | 0.662 | 0.652 | 467 | 0.600 | 0.601 |
BioWordVec (win20) | PubMed | 521 | 0.665 | 0.654 | 532 | 0.608 | 0.607 |
BioWordVec (win20) | PubMed + MeSH | 521 | **0.667** | **0.657** | 532 | **0.619** | **0.617** |
“#” denotes the number of term pairs that can be mapped by each set of word embeddings. “Pearson” and “Spearman” denote Pearson’s and Spearman’s correlation coefficients, respectively. “win20” denotes that BioWordVec was trained with a context window size of 20. The highest value in each column is shown in bold.
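
To make the three reported quantities concrete, the following is a minimal sketch of how such an evaluation could be computed. It assumes that a term pair counts as "mapped" when both terms are present in the embedding vocabulary, and that the model score for a pair is the cosine similarity of its two vectors; neither detail is stated in the table itself. The function names, the toy embeddings, and the toy human scores are all hypothetical and are not taken from UMNSRS or from any of the evaluated models.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's coefficient: Pearson's coefficient computed on ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def evaluate(embeddings, pairs):
    """Return (#mapped pairs, Pearson, Spearman) for one embedding set.

    Assumption: pairs with an out-of-vocabulary term are skipped,
    which is why "#" differs between embedding sets.
    """
    model_scores, human_scores = [], []
    for term1, term2, human in pairs:
        if term1 in embeddings and term2 in embeddings:
            model_scores.append(cosine(embeddings[term1], embeddings[term2]))
            human_scores.append(human)
    return (
        len(model_scores),
        pearson(model_scores, human_scores),
        spearman(model_scores, human_scores),
    )


# Toy data, invented for illustration only.
toy_embeddings = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.5, 0.5],
}
toy_pairs = [
    ("a", "b", 9.0),
    ("a", "c", 1.0),
    ("a", "d", 5.0),
    ("a", "zzz", 3.0),  # "zzz" is out of vocabulary, so this pair is skipped
]

n_mapped, r, rho = evaluate(toy_embeddings, toy_pairs)
print(n_mapped, round(r, 3), round(rho, 3))
```

Under these assumptions, an embedding set with a larger vocabulary is scored on more pairs, so correlations in different rows are not necessarily computed over the same subset of the dataset.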