Table 1.
Statistics of the datasets used in our experiments. Each entry is the total number of tokens in the training, validation, or test split of the corresponding dataset.
Dataset | Train | Valid. | Test |
---|---|---|---|
PTB | 887,521 | 70,390 | 78,669 |
Yelp | 3,063,578 | 380,877 | 424,879 |
WikiText-2 | 2,088,628 | 217,646 | 245,569 |