2020 May 19;8(5):e17643. doi: 10.2196/17643

Table 2. Statistics of the ChemProt corpus.

| Annotations | Training set, n | Development set, n | Test set, n |
|---|---|---|---|
| Documents | 1020 | 612 | 800 |
| Chemicals | 13,017 | 8004 | 10,810 |
| Proteins | 12,752 | 7567 | 10,019 |
| CPR:3 (a) | 768 | 550 | 665 |
| CPR:4 | 2254 | 1094 | 1661 |
| CPR:5 | 173 | 116 | 195 |
| CPR:6 | 235 | 199 | 293 |
| CPR:9 | 727 | 457 | 644 |
| Evaluated CPIs (b) | 4157 | 2416 | 3458 |
| Evaluated CPIs in one sentence | 4122 | 2412 | 3444 |

(a) CPR: chemical-protein relation.

(b) CPI: chemical-protein interaction.
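
The counts in the table can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it copies the figures from the table, sums the five CPR classes per data set, compares the result with the "Evaluated CPIs" row, and computes what share of evaluated CPIs fall within one sentence. The variable names (`cpr_counts`, `evaluated`, `in_one_sentence`, `totals`, `shares`) are assumptions introduced here, not names from the article.

```python
# Counts copied from the table, ordered as (training, development, test).
cpr_counts = {
    "CPR:3": (768, 550, 665),
    "CPR:4": (2254, 1094, 1661),
    "CPR:5": (173, 116, 195),
    "CPR:6": (235, 199, 293),
    "CPR:9": (727, 457, 644),
}
evaluated = (4157, 2416, 3458)        # "Evaluated CPIs" row
in_one_sentence = (4122, 2412, 3444)  # "Evaluated CPIs in one sentence" row

# Sum the five CPR classes column by column (one total per data set).
totals = tuple(sum(column) for column in zip(*cpr_counts.values()))

# Percentage of evaluated CPIs whose two entities share a sentence.
shares = tuple(100 * s / e for s, e in zip(in_one_sentence, evaluated))

for name, total, share in zip(("training", "development", "test"), totals, shares):
    print(f"{name}: class total = {total}, in one sentence = {share:.2f}%")
```

With these figures, the per-class totals match the "Evaluated CPIs" row in all three data sets, and more than 99% of evaluated CPIs occur within a single sentence in each.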