2017 Dec 1;17:155. doi: 10.1186/s12911-017-0556-8

Table 2.

Dimension of feature sets using different data representations

Dimension of the feature set                                       iDASH      MGH
Bag-of-words (Vocabulary size)                                      8704  145,991
UMLS concepts                                                       4751   25,457
UMLS concepts restricted to five semantic groups                    4532   24,458
UMLS concepts restricted to 15 semantic types                       3635   18,521
Bag-of-words + UMLS concepts                                      13,455  171,448
Bag-of-words + UMLS concepts restricted to five semantic groups   13,236  170,449
Bag-of-words + UMLS concepts restricted to 15 semantic types      12,339  164,512
Paragraph vector (distributed memory model)                          600      600
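
The combined rows are consistent with simple concatenation of the two feature vectors: in both datasets, each "Bag-of-words + UMLS" dimension equals the bag-of-words dimension plus the corresponding UMLS dimension. The table itself does not state how the features were combined, so concatenation is an assumption here. A minimal Python sketch of that check follows; the variable and function names are illustrative, and only the numbers come from the table.

```python
# Dimensions copied from Table 2 as (iDASH, MGH) pairs.
bag_of_words = (8704, 145_991)

umls = {
    "all concepts": (4751, 25_457),
    "five semantic groups": (4532, 24_458),
    "15 semantic types": (3635, 18_521),
}

reported_combined = {
    "all concepts": (13_455, 171_448),
    "five semantic groups": (13_236, 170_449),
    "15 semantic types": (12_339, 164_512),
}


def concatenated_dims(a, b):
    """Dimension of two concatenated feature vectors, per dataset.

    Assumes the combined representation is a plain concatenation,
    so the dimensions simply add.
    """
    return tuple(x + y for x, y in zip(a, b))


# Collect every UMLS variant whose reported combined size differs
# from the sum of its parts.
mismatches = [
    name
    for name, dims in umls.items()
    if concatenated_dims(bag_of_words, dims) != reported_combined[name]
]

print("rows that do not match the sum:", mismatches)
```

An empty list means every combined row matches the sum of its parts. The paragraph vector row is fixed at 600 for both datasets because its dimension is a model setting rather than a function of vocabulary size.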