TABLE 1.
Dataset / Label | Train #Articles | Train Label (%) | Valid #Articles | Valid Label (%) | Test #Articles | Test Label (%) | All #Articles | All Label (%) |
---|---|---|---|---|---|---|---|---|
LitCovid BioCreative (7 labels) | 24960 | - | 6239 | - | 2500 | - | 33699 | - |
Case Report | 2063 | (8.27%) | 482 | (7.73%) | 197 | (7.88%) | 2742 | (8.14%) |
Diagnosis | 6193 | (24.81%) | 1546 | (24.78%) | 722 | (28.88%) | 8461 | (25.11%) |
Epidemic Forecasting | 645 | (2.58%) | 192 | (3.08%) | 41 | (1.64%) | 878 | (2.61%) |
Mechanism | 4438 | (17.78%) | 1073 | (17.20%) | 567 | (22.68%) | 6078 | (18.04%) |
Prevention | 11102 | (44.48%) | 2750 | (44.08%) | 926 | (37.04%) | 14778 | (43.85%) |
Transmission | 1088 | (4.36%) | 256 | (4.10%) | 128 | (5.12%) | 1472 | (4.37%) |
Treatment | 8717 | (34.92%) | 2207 | (35.37%) | 1035 | (41.40%) | 11959 | (35.49%) |
HoC (10 labels) | 1108 | - | 157 | - | 315 | - | 1580 | - |
Activating invasion & metastasis | 199 | (17.96%) | 35 | (22.29%) | 57 | (18.10%) | 291 | (18.42%) |
Avoiding immune destruction | 77 | (6.95%) | 14 | (8.92%) | 17 | (5.40%) | 108 | (6.84%) |
Cellular energetics | 76 | (6.86%) | 10 | (6.37%) | 19 | (6.03%) | 105 | (6.65%) |
Enabling replicative immortality | 82 | (7.40%) | 15 | (9.55%) | 18 | (5.71%) | 115 | (7.28%) |
Evading growth suppressors | 174 | (15.70%) | 22 | (14.01%) | 46 | (14.60%) | 242 | (15.32%) |
Genomic instability & mutation | 239 | (21.57%) | 24 | (15.29%) | 70 | (22.22%) | 333 | (21.08%) |
Inducing angiogenesis | 97 | (8.75%) | 15 | (9.55%) | 31 | (9.84%) | 143 | (9.05%) |
Resisting cell death | 302 | (27.26%) | 45 | (28.66%) | 83 | (26.35%) | 430 | (27.22%) |
Sustaining proliferative signal | 338 | (30.51%) | 41 | (26.11%) | 83 | (26.35%) | 462 | (29.24%) |
Tumor promoting inflammation | 162 | (14.62%) | 24 | (15.29%) | 54 | (17.14%) | 240 | (15.19%) |
#Articles: the number of articles; Label (%): the proportion of articles carrying a specific label; some label names in the HoC dataset are shortened for presentation purposes.
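
As a rough illustration of how the Label (%) column can be read, the sketch below recomputes it for the LitCovid BioCreative train split from the counts in Table 1. The function name `label_percentage` and the dictionary layout are assumptions made for this example, not part of the original work. Because each article may carry several labels, the percentages within a split are not expected to sum to 100%.

```python
def label_percentage(n_labeled: int, n_articles: int) -> float:
    """Share of articles carrying a label, as a percentage rounded to 2 decimals."""
    return round(100.0 * n_labeled / n_articles, 2)


# Counts copied from the LitCovid BioCreative train column of Table 1.
train_total = 24960
train_counts = {
    "Case Report": 2063,
    "Diagnosis": 6193,
    "Epidemic Forecasting": 645,
    "Mechanism": 4438,
    "Prevention": 11102,
    "Transmission": 1088,
    "Treatment": 8717,
}

# Per-label percentage of train articles (hypothetical helper structure).
train_pct = {name: label_percentage(n, train_total) for name, n in train_counts.items()}

# Total label assignments exceed the article count, which reflects the
# multi-label setting: one article can be counted under several labels.
total_assignments = sum(train_counts.values())
avg_labels_per_article = total_assignments / train_total

if __name__ == "__main__":
    for name, pct in train_pct.items():
        print(f"{name}: {pct:.2f}%")
    print(f"Average labels per article: {avg_labels_per_article:.2f}")
```

The same calculation applies to the other splits and to HoC by substituting the corresponding article totals and label counts.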