2023 Jan 2;102:107808. doi: 10.1016/j.compbiolchem.2022.107808

Table 3.

CHEMDNER dataset characteristics.

| Dataset statistics | Training dataset | Development dataset | Testing dataset | Total dataset |
|---|---|---|---|---|
| Abstracts | 3,500 | 3,500 | 3,000 | 10,000 |
| No. of characters | 4,883,753 | 4,864,558 | 4,199,068 | 13,947,379 |
| No. of tokens | 770,855 | 766,331 | 662,571 | 2,199,757 |
| Abstracts with classes | 2,916 | 2,907 | 2,478 | 8,301 |
| No. of mentions | 29,478 | 29,526 | 25,351 | 84,355 |
| No. of chemicals | 8,520 | 8,677 | 7,563 | 19,805 |
| No. of journals | 193 | 188 | 188 | 203 |