2019 Mar 28;2(2):246–253. doi: 10.1093/jamiaopen/ooz007

Table 1.

The number of semantically similar terms identified by human experts among the 40 top-ranked word2vec terms for each of the 14 DS, from 7 corpora

| Time span of clinical notes (7 corpora) | 3 months | 6 months | 9 months | 12 months | 15 months | 18 months | 21 months |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Vocabulary size | 214 948 | 312 557 | 388 891 | 454 459 | 520 127 | 577 362 | 635 176 |
| Semantic variants | 12 | 14 | 13 | 13 | 11 | 10 | 9 |
| Brand names | 7 | 9 | 8 | 9 | 6 | 7 | 5 |
| Misspellings | 4 | 8 | 10 | 14 | 13 | 14 | 21 |
| Total | 23 | 31 | 31 | 36 | 30 | 31 | 35 |
| MAP | 0.313 | 0.294 | 0.356 | 0.247 | 0.242 | 0.280 | 0.263 |

MAP: mean average precision; DS: dietary supplements.
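
For readers unfamiliar with the MAP row, the following is a minimal sketch of how mean average precision is commonly computed from ranked candidate lists. It is not the authors' evaluation code. The function names and the example relevance judgments are hypothetical, and the sketch assumes average precision is normalized by the number of relevant terms found within the ranked list (for example, the top 40), which may differ from the normalization used in the study.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one ranked list.

    relevance[i] is True if the term at rank i + 1 was judged relevant.
    Assumption: normalize by the number of relevant terms in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision (one list per query term)."""
    if not ranked_lists:
        return 0.0
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Hypothetical judgments for two query terms (not data from the study).
example = [
    [True, False, True],  # relevant terms at ranks 1 and 3
    [False, False],       # no relevant terms retrieved
]
print(round(mean_average_precision(example), 3))
```

Under this definition, a list with no relevant terms contributes 0 to the mean, so queries with few expert-confirmed terms pull MAP down even when the relevant terms that do appear are ranked highly.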