Table 1.
Corpus | # Documents | # Sentences | # Entities |
---|---|---|---|
i2b2-Pittsburgh | 477 | 27,627 | Problem: 12,586; Treatment: 9,343; Test: 9,225 |
i2b2-Beth | 73 | 8,798 | Problem: 4,187; Treatment: 3,072; Test: 3,036 |
i2b2-Partners | 97 | 7,517 | Problem: 2,885; Treatment: 1,768; Test: 1,570 |
GENIA | 2,000 | 18,546 | protein: 24,966; DNA: 8,557; RNA: 719; cell type: 6,221; cell line: 3,663 |