Table 1.
Number of viral sequences of each length generated from viral genomes discovered before January 2014, between January 2014 and May 2015, and after May 2015.
Length | Training (Before 1/2014) | Validation (1/2014–5/2015) | Test (After 5/2015) | Total |
---|---|---|---|---|
150 bp | 505,259 | 164,918 | 355,204 | 1,025,381 |
300 bp | 252,630 | 82,458 | 177,416 | 512,504 |
500 bp | 154,640 | 50,350 | 106,298 | 311,288 |
1000 bp | 77,014 | 25,087 | 52,956 | 155,057 |
3000 bp | 25,263 | 8,246 | 17,385 | 50,894 |
The dataset was partitioned by genome discovery date, and the three resulting parts were used for training, validation, and testing, respectively.
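
The date-based partition described above can be illustrated with a minimal Python sketch. This is not the authors' code: the exact day-level cutoffs, the `assign_split` helper, and the example genome records are all assumptions made for illustration, since the table gives only month-level boundaries.

```python
from datetime import date

# Assumed cutoffs: the table states only month-level boundaries, so the
# exact days used here are an illustrative choice, not taken from the source.
TRAIN_END = date(2014, 1, 1)   # discovered before this date -> training
VALID_END = date(2015, 6, 1)   # otherwise, before this date -> validation


def assign_split(discovered: date) -> str:
    """Return the partition for a genome given its discovery date."""
    if discovered < TRAIN_END:
        return "training"
    if discovered < VALID_END:
        return "validation"
    return "test"


# Hypothetical genome records (identifier, discovery date), for illustration only.
genomes = [
    ("genome_A", date(2012, 7, 15)),
    ("genome_B", date(2014, 9, 3)),
    ("genome_C", date(2016, 2, 20)),
]

# Map each genome identifier to its partition.
splits = {name: assign_split(discovered) for name, discovered in genomes}
```

Splitting by discovery date rather than at random keeps every sequence from a given genome in a single partition, so that test sequences come only from genomes discovered after those used for training and validation.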