TABLE 5.
The 20 most frequent tokens, including stop words and punctuation marks, within various window sizes around entities incorrectly labeled by human reviewers. Tokens that are neither stop words nor punctuation marks are in boldface.
# | Token (window size = 1) | Count | Token (window size = 3) | Count | Token (window size = 5) | Count |
---|---|---|---|---|---|---|
1 | — | 27 | — | 49 | — | 74 |
2 | — | 14 | And | 21 | — | 28 |
3 | And | 14 | Mg | 19 | And | 28 |
4 | with | 8 | — | 18 | Mg | 23 |
5 | ( | 7 | ) | 13 | ) | 21 |
6 | Is | 7 | With | 10 | Of | 18 |
7 | Of | 6 | Of | 10 | ( | 17 |
8 | Was | 6 | ( | 9 | with | 13 |
9 | Or | 4 | Is | 9 | to | 12 |
10 | 300 | 4 | Or | 7 | the | 11 |
11 | oral | 3 | once/day | 7 | a | 11 |
12 | Has | 3 | A | 7 | in | 10 |
13 | To | 3 | The | 6 | is | 9 |
14 | [ | 3 | Was | 6 | or | 9 |
15 | dose | 3 | To | 6 | daily | 8 |
16 | intravenous | 2 | treatment | 6 | was | 8 |
17 | In | 2 | In | 5 | treatment | 8 |
18 | may | 2 | 300 | 5 | ] | 8 |
19 | 500 | 2 | Be | 4 | once/day | 8 |
20 | A | 2 | Treated | 4 | were | 7 |
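
Counts like those in the table could be produced along the following lines. This is only a minimal sketch, not the procedure used to build the table: the function name `window_token_counts`, the symmetric window (the same number of tokens on each side of the entity), the end-exclusive span format, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter


def window_token_counts(tokens, entity_spans, window_size):
    """Count tokens lying within `window_size` positions of each entity span.

    tokens:        list of str, one tokenized document
    entity_spans:  list of (start, end) token indices, end exclusive
                   (hypothetical format, assumed for this sketch)
    window_size:   number of tokens taken on EACH side of the entity
                   (assumption: the window is symmetric)
    """
    counts = Counter()
    for start, end in entity_spans:
        # Tokens immediately to the left of the entity, clipped at the document start.
        left = tokens[max(0, start - window_size):start]
        # Tokens immediately to the right; slicing clips at the document end.
        right = tokens[end:end + window_size]
        counts.update(left + right)
    return counts


# Toy sentence, not taken from the study's data.
tokens = ["treated", "with", "DRUG_X", "300", "mg", "once/day", "and", "rest"]
spans = [(2, 3)]  # hypothetical mislabeled entity: "DRUG_X"

for size in (1, 3, 5):
    top = window_token_counts(tokens, spans, size).most_common(20)
    print(f"window size = {size}: {top}")
```

The entity's own tokens are excluded from the counts, and stop words and punctuation are kept, matching the caption's description. Whether tokens are case-folded before counting is left open here.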