Table 2. Basic statistics of the two DDI corpus datasets used, MEDLINE and DrugBank.
Statistic | MEDLINE Test | MEDLINE Train | MEDLINE Total | DrugBank Test | DrugBank Train | DrugBank Total |
---|---|---|---|---|---|---|
Documents | 33 | 142 | 175 | 158 | 572 | 730 |
Sentences | 326 | 1301 | 2308 | 973 | 5675 | 6648 |
Drug Names | 426 | 1836 | 2308 | 2512 | 12,929 | 15,441 |
True DDI candidates | 95 | 232 | 327 | 884 | 3788 | 4672 |
False DDI candidates | 356 | 1555 | 1911 | 4381 | 22,217 | 26,598 |
Candidates with clause connectors | 126 | 478 | 604 | 2067 | 9215 | 11,282 |
Number of Tokens | 14,358 | 61,525 | 75,883 | 244,658 | 1,163,072 | 1,407,730 |
DDI Candidates with negation | 43 | 316 | 359 | 1367 | 4558 | 5925 |
Total number of DDI candidates | 482 | 2033 | 1787 | 5265 | 31,432 | 36,697 |
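
The true and false candidate rows imply a strong class imbalance. The following is a minimal sketch of how to quantify it, using only the Total columns of those two rows as copied from the table. The dictionary layout and the function name `positive_share` are illustrative assumptions, not part of the original work.

```python
# Counts copied from the Total columns of Table 2
# ("True DDI candidates" and "False DDI candidates" rows).
candidates = {
    "MEDLINE": {"true": 327, "false": 1911},
    "DrugBank": {"true": 4672, "false": 26598},
}


def positive_share(counts):
    """Return the fraction of candidate pairs labelled as true DDIs."""
    total = counts["true"] + counts["false"]
    return counts["true"] / total


# Hypothetical helper structure: one share per dataset.
shares = {name: positive_share(c) for name, c in candidates.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of candidates are true DDIs")
```

On these counts, roughly 15% of the candidate pairs in each dataset are true interactions, so a classifier trained on them would likely need to account for the skew toward negative examples.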