Table 1:
Sequence | Group | Length (bp) | Description |
---|---|---|---|
GGA18 | Aves | 11,373,140 | Access. CM000110 – Gallus gallus chromosome 18 |
MGA20 | Aves | 10,730,484 | Access. CM000981 – Meleagris gallopavo isolate NT-WF06-2002-E0010 breed Aviagen turkey brand Nicholas breeding stock chromosome 20 |
GGA14 | Aves | 16,219,308 | Access. CM000106 – G. gallus chromosome 14 |
MGA16 | Aves | 14,878,991 | Access. CM000977 – Meleagris gallopavo isolate NT-WF06-2002-E0010 breed Aviagen turkey brand Nicholas breeding stock chromosome 16 |
HS12 | Mammalia | 133,275,309 | Access. NC_000012 – Homo sapiens chromosome 12, GRCh38.p13 Primary Assembly |
PT12 | Mammalia | 130,995,916 | Access. NC_036891 – Pan troglodytes isolate Yerkes chimp pedigree #C0471 (Clint) chromosome 12 |
PXO99A | Bacteria | 5,238,555 | Access. CP000967 – Xanthomonas oryzae pv. oryzae strain PXO99A. X. oryzae pv. oryzae causes bacterial blight, a major disease of rice (Oryza sativa L.); the PXO99A strain is virulent toward a large number of rice varieties representing diverse genetic sources of resistance [25] |
MAFF 311018 | Bacteria | 4,940,217 | Access. AP008229 – X. oryzae pv. oryzae MAFF 311018 is a Japanese race 1 strain [26] |
ScVII | Fungi | 1,090,940 | Access. NC_001139 – Saccharomyces cerevisiae S288C chromosome VII |
SpVII | Fungi | 1,105,967 | Access. CP020299 – Saccharomyces paradoxus strain UFRJ50816 chromosome VII |
RefS | Synthetic | 1,500 | It consists of 3 segments of 500 bp size. |
TarS | Synthetic | 1,500 | To build TarS, segment I is mutated 2%, II is inversely repeated, and III is duplicated. |
RefM | Synthetic | 100,000 | It has 4 segments of 25 kb size. |
TarM | Synthetic | 100,000 | To build TarM, segment I of RefM is inversely repeated, II is mutated 90%, III is duplicated, and IV is mutated 3% |
RefL | Synthetic | 5,000,000 | It includes 2 segments, 2,500,000 bp each |
TarL | Synthetic | 5,000,000 | Segment I is inversely repeated, and II is mutated 2% for building TarL |
RefXL | Synthetic | 100,000,000 | It is made of 4 segments, 25,000,000 bp each |
TarXL | Synthetic | 100,000,000 | Segment I is mutated 1%, segments II and III are inversely repeated, and segment IV is duplicated to make TarXL |
RefMut | Synthetic | 60,000 | It includes 60 segments of 1 kb size |
TarMut | Synthetic | 60,000 | To build TarMut, the first segment (I) is mutated 1%, the second segment is mutated 2%, the third one is mutated 3%, and so on |
RefComp | Synthetic | 1,000,000 | It consists of 10 segments of 100 kb |
TarComp | Synthetic | 1,000,000 | To build it, the first segment (I) of RefComp is duplicated, and the second, third, and fourth segments are mutated 1%, 2%, and 3%, respectively. Segments V, VI, and VII of RefComp are inversely repeated, then mutated 4%, 5%, and 6%, respectively. Finally, segments VIII, IX, and X are mutated 7%, 8%, and 9%, respectively. |
RefPerm | Synthetic | 3,000,000 | It includes 3 segments of 1 Mb size. In addition to the original sequence, permuted versions are generated with the GOOSE toolkit, using block sizes of 450 kb, 30 kb, 1 kb, and 30 bp. |
TarPerm | Synthetic | 3,000,000 | To build TarPerm, the first segment is mutated 1%, the second segment is inversely repeated, and the third one is mutated 2%. |
The real datasets can be downloaded from NCBI using the accession numbers (Access.) provided in the descriptions.
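
The construction of the synthetic target sequences can be illustrated with a minimal Python sketch for the RefS/TarS pair. It is not the tool used to generate the data in Table 1. It assumes that "mutated" means random base substitutions at the stated rate, that "inversely repeated" means replacing a segment with its reverse complement, and that "duplicated" means copying a segment unchanged. The helper names (`random_dna`, `mutate`, `inverted_repeat`, `build_tar_s`) and the fixed random seed are hypothetical and chosen only for illustration.

```python
import random

# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def random_dna(n, rng):
    """Return a uniformly random DNA string of length n (assumed model for RefS)."""
    return "".join(rng.choice("ACGT") for _ in range(n))


def mutate(seq, rate, rng):
    """Substitute each base by a different base with probability `rate`.

    Assumption: mutations are substitutions only, so the length is preserved.
    """
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def inverted_repeat(seq):
    """Return the reverse complement of seq (assumed meaning of 'inversely repeated')."""
    return seq.translate(COMPLEMENT)[::-1]


def build_tar_s(ref, rng, seg_len=500):
    """Build a TarS-like target from a 3-segment reference, following Table 1:
    segment I mutated 2%, segment II inversely repeated, segment III duplicated."""
    seg1 = ref[0:seg_len]
    seg2 = ref[seg_len:2 * seg_len]
    seg3 = ref[2 * seg_len:3 * seg_len]
    return mutate(seg1, 0.02, rng) + inverted_repeat(seg2) + seg3


rng = random.Random(0)            # fixed seed, only for reproducibility of the sketch
ref_s = random_dna(1500, rng)     # 3 segments of 500 bp
tar_s = build_tar_s(ref_s, rng)
```

The other synthetic pairs in Table 1 would follow the same pattern with different segment counts, segment lengths, and mutation rates; for TarComp, `inverted_repeat` and `mutate` would be applied in sequence to segments V to VII.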