Table 2.
Composition of the datasets used in this work
Dataset | Number of different sites | Number of different genomes | Number of (site, genome) pairs |
---|---|---|---|
Experimental dataset | 494ᵃ | 2861ᵇ | 66,704 |
Control dataset 1 | 899 | 3407 | 3,062,893 |
Control dataset 2 | 899 | 4021 | 3,614,879 |
ᵃ R-M systems encoded in the genomes of the known phage hosts recognize 494 of the 899 known RS.
ᵇ Only 2861 of the 3407 phages have known host species with available data on the encoded R-M systems.
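
As a quick consistency check on the counts in Table 2, the short sketch below compares each reported number of (site, genome) pairs with the product of the number of sites and the number of genomes. The numbers are copied from the table; the helper name `pair_coverage` is introduced here for illustration only. For both control datasets the pair count equals the full product (899 × 3407 and 899 × 4021), which suggests that every site was paired with every genome, although the table itself does not state this. The experimental dataset covers only a small fraction of the possible combinations (about 0.047).

```python
# Counts copied from Table 2: (sites, genomes, reported pairs).
datasets = {
    "Experimental dataset": (494, 2861, 66_704),
    "Control dataset 1": (899, 3407, 3_062_893),
    "Control dataset 2": (899, 4021, 3_614_879),
}


def pair_coverage(sites, genomes, pairs):
    """Fraction of all possible (site, genome) combinations that are present.

    Hypothetical helper for this check; it is not part of the original work.
    """
    return pairs / (sites * genomes)


# Coverage of 1.0 means the dataset is the full sites x genomes product.
coverage = {name: pair_coverage(*row) for name, row in datasets.items()}

for name, fraction in coverage.items():
    print(f"{name}: {fraction:.4f}")
```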