2017 Jan 31;18:75. doi: 10.1186/s12859-017-1476-4

Table 3.

Feature set description

Feature | Description
Bag of words | Bag of words in a 5-word window.
Part of speech | Part-of-speech tags in a 7-word window.
Capitalization | All alphabetic characters of each word are converted to uppercase [31]. The window size is 5.
Case pattern | Similar to [32], each uppercase alphabetic character is replaced by “A” and each lowercase one by “a”. In the same way, any number is replaced by “0”. The window size is 3.
Word representation | word2vec is used to derive 700 clusters from the unlabeled clinical narratives, and each cluster is assigned a distinct serial number. The serial number of a word's cluster is used as the feature. The window size is 3.
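
The surface features in Table 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each window is centered on the current token, that positions outside the sentence are filled with a hypothetical `<PAD>` marker, and that "any number" means each individual digit. The helper names (`case_pattern`, `window_features`, `token_features`) are invented for illustration. The part-of-speech and word-representation features are omitted because they require an external tagger and a trained word2vec model.

```python
def case_pattern(token):
    """Map uppercase letters to 'A', lowercase to 'a', digits to '0'.

    Assumption: every digit is replaced individually; other characters
    (punctuation, hyphens) are left unchanged.
    """
    out = []
    for ch in token:
        if ch.isdigit():
            out.append("0")
        elif ch.isupper():
            out.append("A")
        elif ch.islower():
            out.append("a")
        else:
            out.append(ch)
    return "".join(out)


def window_features(values, i, size, name, pad="<PAD>"):
    """Collect values in a window of `size` positions centered on index i.

    Assumption: the window is symmetric (size 5 means offsets -2..+2)
    and out-of-range positions are filled with `pad`.
    """
    half = size // 2
    feats = {}
    for offset in range(-half, half + 1):
        j = i + offset
        inside = 0 <= j < len(values)
        feats[f"{name}[{offset:+d}]"] = values[j] if inside else pad
    return feats


def token_features(tokens, i):
    """Build the surface features of Table 3 for the token at index i."""
    feats = {}
    # Bag of words, 5-word window.
    feats.update(window_features(tokens, i, 5, "word"))
    # Capitalization: uppercased words, window size 5.
    upper = [t.upper() for t in tokens]
    feats.update(window_features(upper, i, 5, "upper"))
    # Case pattern, window size 3.
    patterns = [case_pattern(t) for t in tokens]
    feats.update(window_features(patterns, i, 3, "case"))
    return feats


if __name__ == "__main__":
    sentence = ["Pt", "took", "Aspirin", "81", "mg"]
    for key, value in token_features(sentence, 2).items():
        print(key, value)
```

Each token yields a flat dictionary of position-tagged features, a shape commonly accepted by sequence-labeling toolkits. The cluster serial numbers from the word-representation row could be added with another `window_features` call over a list of cluster IDs.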