Table 4.
Same-Stem Pairs | | Different-Stem Pairs | |
---|---|---|---|
Data set | Difference in Average Precision (95% CI) | Data set | Difference in Average Precision (95% CI) |
word2vec-GloVe* | 0.0370 (0.0190–0.0564) | fastText-GloVe | 0.0699 (−0.0026 to 0.1416) |
word2vec-fastText* | 0.0619 (0.0321–0.0942) | fastText-word2vec | 0.0453 (−0.0242 to 0.1157) |
GloVe-fastText | 0.0249 (−0.0087 to 0.0594) | GloVe-word2vec | 0.0246 (−0.0171 to 0.0649) |
Note: Each data set was resampled 10 000 times with the bootstrap. For each resample, we computed the average precision of each of the 3 embedding methods and took the pairwise differences in average precision between methods; the table reports the mean and 95% confidence interval of each difference across resamples. On the same-stem pairs, word2vec performed significantly better than both GloVe and fastText. *Significant difference at the 5% level.
CI: confidence interval.
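
The bootstrap comparison described in the note can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes that each word pair carries a binary relevance label and that each embedding method assigns the pair a similarity score, that resampling is paired (the same resampled pairs are scored by both methods), and that the interval is a percentile interval. The function names `average_precision` and `bootstrap_ap_difference` are hypothetical.

```python
import random
from statistics import mean


def average_precision(labels, scores):
    """Average precision of the ranking induced by scores (higher is ranked first).

    Assumption: labels are 0/1 relevance judgments, one per word pair.
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this relevant item
    return total / hits if hits else 0.0


def bootstrap_ap_difference(labels, scores_a, scores_b, n_resamples=10_000, seed=0):
    """Mean and 95% percentile interval of AP(a) - AP(b) over paired resamples."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    for _ in range(n_resamples):
        # Draw n pair indices with replacement; both methods see the same draw.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(average_precision(y, a) - average_precision(y, b))
    diffs.sort()
    # Integer arithmetic avoids floating-point rounding in the percentile indices.
    lo = diffs[(n_resamples * 25) // 1000]
    hi = diffs[max((n_resamples * 975) // 1000 - 1, 0)]
    return mean(diffs), lo, hi
```

Under these assumptions, a comparison would be flagged as significant at the 5% level when the returned interval excludes zero, which matches how the asterisked rows in the table read.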