2019 Dec 5;19(Suppl 5):235. doi: 10.1186/s12911-019-0933-6

Table 3.

Distribution of entities in two datasets

| Entity type (CCKS 2018) | Training set (600) | Test set (400) | Entity type (CCKS 2017) | Training set (300) | Test set (100) |
|---|---|---|---|---|---|
| Anatomical Part | 7838 (52%) | 6339 (63%) | Body Part | 10,719 (36%) | 3021 (32%) |
| Symptom Description | 2066 (14%) | 918 (9%) | Symptom | 7831 (26%) | 2311 (24%) |
| Independent Symptom | 3055 (20%) | 1327 (13%) | Diagnosis | 722 (2%) | 553 (6%) |
| Drug | 1005 (7%) | 813 (8%) | Test | 9546 (32%) | 3143 (33%) |
| Operation | 1116 (7%) | 735 (7%) | Treatment | 1048 (4%) | 465 (5%) |
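
The percentages in Table 3 appear to be each entity type's share of its column total, rounded to a whole percent. The table does not state this explicitly, so it is treated as an assumption here. A minimal Python sketch that recomputes the shares for the CCKS 2018 training column (the function name `entity_shares` is hypothetical, introduced only for illustration):

```python
# Counts copied from Table 3, CCKS 2018 training set column.
ccks2018_train = {
    "Anatomical Part": 7838,
    "Symptom Description": 2066,
    "Independent Symptom": 3055,
    "Drug": 1005,
    "Operation": 1116,
}


def entity_shares(counts):
    """Return each entity type's share of the column total as a whole percent.

    Assumption: the table's percentages are count / column total, rounded.
    """
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


shares = entity_shares(ccks2018_train)
for name, pct in shares.items():
    print(f"{name}: {ccks2018_train[name]} ({pct}%)")
```

Under that assumption, the same function can be applied to the other three columns to check their reported percentages.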