Table 3. Lengths of the initial genomic sequence and of the sequence selected for training after the data pre-processing steps (repeat masking followed by filtering of short contigs); sizes of the initial set of introns mapped by the RNA-Seq read aligner (UnSplicer) to the full genome and of the set of introns mapped to the reduced genome.
Species | Genome size (Mb) | Sequence in training (Mb) | Introns mapped to genome | Introns in training | Introns in training (% of mapped) |
---|---|---|---|---|---|
Aedes aegypti | 1384 | 415 | 57 684 | 55 702 | 96.6 |
Anopheles gambiae | 273 | 201 | 68 827 | 59 698 | 86.7 |
Anopheles stephensi | 208 | 97 | 28 869 | 20 418 | 70.7 |
Culex quinquefasciatus | 528 | 195 | 57 579 | 56 621 | 98.3 |
Drosophila melanogaster | 120 | 97 | 70 077 | 56 678 | 80.9 |
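
The last column can be checked against the two intron counts. The sketch below assumes that the percentage is the number of introns in training divided by the number of introns mapped to the full genome, rounded to one decimal place; the caption does not state this formula explicitly, and the function name `percent_retained` is introduced here only for illustration.

```python
# Rows copied from Table 3:
# (species, introns mapped to genome, introns in training, reported %)
rows = [
    ("Aedes aegypti", 57684, 55702, 96.6),
    ("Anopheles gambiae", 68827, 59698, 86.7),
    ("Anopheles stephensi", 28869, 20418, 70.7),
    ("Culex quinquefasciatus", 57579, 56621, 98.3),
    ("Drosophila melanogaster", 70077, 56678, 80.9),
]


def percent_retained(mapped: int, in_training: int) -> float:
    """Assumed formula: share of mapped introns kept for training, in percent."""
    return round(100.0 * in_training / mapped, 1)


for species, mapped, in_training, reported in rows:
    computed = percent_retained(mapped, in_training)
    print(f"{species}: computed {computed}, reported {reported}")
```

Under this assumption the computed values agree with the reported column for all five species.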