Table 2. Descriptive statistics for the SimConcept corpus. Each cell gives the number of composite mentions, overall (All) and for each of the five types, followed in parentheses by the number of individual mentions after decomposition.
Concept | # of abstracts | All types | CR | C | I | IA | OA |
---|---|---|---|---|---|---|---|
Gene | 694 | 810 (1895) | 14 (60) | 101 (246) | 442 (1089) | 253 (534) | 41 (107) |
Disease | 793 | 1012 (2293) | 2 (18) | 245 (583) | 303 (809) | 486 (1045) | 52 (123) |
Chemical | 937 | 1012 (2944) | 99 (505) | 201 (771) | 496 (1389) | 302 (716) | 0 (0) |