J Am Med Inform Assoc. 2019 Jul 23;27(1):47–55. doi: 10.1093/jamia/ocz120

Table 2.

The pretrained word embeddings used in the study

| Name | Corpus | Dimension | Vocab. size |
| --- | --- | --- | --- |
| GloVe^a | Wikipedia and English Gigaword | 200 | 400 000 |
| fastText^b | Wikipedia | 300 | 2 519 370 |
| nlplab^c | PubMed and PMC | 200 | 2 231 684 |
| word2vecGN | Google News | 300 | 3 000 000 |
| Numberbatch^d | Hybrid of ConceptNet, word2vecGN, and GloVe | 300 | 417 194 |
| BioWordVec^e | PubMed and MIMIC-III | 200 | 16 545 451 |
| word2vecMIMIC | MIMIC-III | 300 | 320 313 |
| ConcatenatedVec | Hybrid of GloVe, fastText, and word2vecMIMIC | 700 | 228 763 |
| AddedVec | Hybrid of fastText and word2vecMIMIC | 300 | 46 404 |
| PurifiedVec | Postprocessed GloVe vectors | 200 | 400 000 |
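For context, most of the embeddings listed above (GloVe, fastText, word2vec-family) ship in the common plain-text format of one token per line followed by its vector components; the table's "Dimension" and "Vocab. size" columns correspond to the vector length and the number of such lines. A minimal sketch of parsing that format, using a tiny made-up sample rather than any of the actual files:

```python
import io

# Toy data in the GloVe/word2vec text format; the vector values are
# made up for illustration and do not come from any real embedding file.
sample = """\
the 0.1 0.2 0.3
of 0.4 0.5 0.6
embedding 0.7 0.8 0.9
"""

def load_text_embeddings(fh):
    """Parse "token v1 v2 ... vd" lines into a dict of token -> vector."""
    vectors = {}
    for line in fh:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:  # skip blank or malformed lines
            continue
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

emb = load_text_embeddings(io.StringIO(sample))
vocab_size = len(emb)        # analogous to the "Vocab. size" column
dimension = len(emb["the"])  # analogous to the "Dimension" column
print(vocab_size, dimension)  # 3 3
```

For the real files (e.g. the 400 000-word, 200-dimensional GloVe vectors), the same loop applies, just over an opened file handle instead of the in-memory sample.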