Table 1. Summary statistics of the three datasets: the number of training, validation, and test samples and the vocabulary size, the number of categories, the average length, and the number of nodes and edges of each dataset.
| Dataset | Num of samples (Training/Validation/Test/Vocabulary) | Num of categories | Average length | Num of nodes | Num of edges (million) PMI* of words |
|---|---|---|---|---|---|
| IFLYTEK | 12,133/2,599/0/250,862 | 119 | 120 | 265,594 | 26.096 |
| ChnSentiCorp | 9,600/1,200/1,200/58,932 | 2 | 109 | 70,932 | 6.126 |
| Toutiao-S | 15,000/2,500/2,500/36,246 | 5 | 25 | 56,246 | 1.761 |
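
As a quick consistency check on the rows above, the reported node counts can be reproduced by adding the training, validation, and test sample counts to the vocabulary size. The sketch below assumes one node per sample and one node per vocabulary word; that rule is inferred from the numbers and is not stated in the table. The names `TABLE_1` and `node_count` are hypothetical and used only for illustration.

```python
# Rows copied from Table 1:
# (training, validation, test, vocabulary, reported number of nodes)
TABLE_1 = {
    "IFLYTEK": (12_133, 2_599, 0, 250_862, 265_594),
    "ChnSentiCorp": (9_600, 1_200, 1_200, 58_932, 70_932),
    "Toutiao-S": (15_000, 2_500, 2_500, 36_246, 56_246),
}


def node_count(train: int, val: int, test: int, vocab: int) -> int:
    """Assumed rule: one node per sample plus one node per vocabulary word."""
    return train + val + test + vocab


for name, (train, val, test, vocab, reported) in TABLE_1.items():
    computed = node_count(train, val, test, vocab)
    print(f"{name}: computed={computed}, reported={reported}, match={computed == reported}")
```

Under this assumption the computed and reported node counts agree for all three datasets.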