Table 2.
Name | Corpus | Dimension | Vocabulary size |
---|---|---|---|
GloVe<sup>a</sup> | Wikipedia and English Gigaword | 200 | 400 000 |
fastText<sup>b</sup> | Wikipedia | 300 | 2 519 370 |
nlplab<sup>c</sup> | PubMed and PMC | 200 | 2 231 684 |
word2vecGN | Google News | 300 | 3 000 000 |
Numberbatch<sup>d</sup> | Hybrid of ConceptNet, word2vecGN and GloVe | 300 | 417 194 |
BioWordVec<sup>e</sup> | PubMed and MIMIC-III | 200 | 16 545 451 |
word2vecMIMIC | MIMIC-III | 300 | 320 313 |
ConcatenatedVec | Hybrid of GloVe, fastText, and word2vecMIMIC | 700 | 228 763 |
AddedVec | Hybrid of fastText and word2vecMIMIC | 300 | 46 404 |
PurifiedVec | Postprocessed GloVe vectors | 200 | 400 000 |
MIMIC-III: Medical Information Mart for Intensive Care III; PMC: PubMed Central.
<sup>c</sup>This is the embedding we used during the n2c2 ADME track, which is available at http://evexdb.org/pmresources/vec-space-models/.
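
The table lists two hybrid spaces, ConcatenatedVec and AddedVec, without saying how they were built. The sketch below shows one plausible reading: concatenation joins each word's vectors end to end, and addition sums equal-length vectors element-wise. Restricting both to the words shared by all source spaces is an assumption, not a detail given in the table. The function names and toy vectors are hypothetical.

```python
from typing import Dict, List

Vectors = Dict[str, List[float]]


def shared_vocab(*spaces: Vectors) -> List[str]:
    """Return the words present in every embedding space, sorted."""
    common = set(spaces[0])
    for space in spaces[1:]:
        common &= set(space)
    return sorted(common)


def concatenate(*spaces: Vectors) -> Vectors:
    """Join each shared word's vectors end to end; dimensions add up."""
    return {
        word: [x for space in spaces for x in space[word]]
        for word in shared_vocab(*spaces)
    }


def add(a: Vectors, b: Vectors) -> Vectors:
    """Sum the two vectors of each shared word element-wise."""
    result: Vectors = {}
    for word in shared_vocab(a, b):
        if len(a[word]) != len(b[word]):
            raise ValueError(f"dimension mismatch for {word!r}")
        result[word] = [x + y for x, y in zip(a[word], b[word])]
    return result


# Toy 2-dimensional spaces standing in for a general and a clinical embedding.
general = {"fever": [0.5, 0.25], "aspirin": [0.75, 1.0], "river": [2.0, 3.0]}
clinical = {"fever": [1.0, 2.0], "aspirin": [3.0, 4.0], "intubated": [5.0, 6.0]}

concatenated = concatenate(general, clinical)  # 4 dimensions per word
added = add(general, clinical)                 # 2 dimensions per word
```

Under this assumption a hybrid can only cover words found in every source space, so its vocabulary is never larger than that of its smallest source.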