2022 Mar 21;15:8. doi: 10.1186/s13040-022-00291-0

Table 1.

Datasets of ncRNAs and CDS of microalgae

Group of microalgae   Type of sequence   Training dataset   Training dataset after balancing   Test dataset
Diatom                ncRNAs             1234               1125a                              308
                      CDS                356                1125*                              88
Golden algae          ncRNAs             168                1125*                              41
                      CDS                60                 1125*                              15
Green algae           ncRNAs             1973               1125a                              493
                      CDS                6818               1125a                              1704
Cyanobacteria         ncRNAs             13,116             3375a                              3280
                      CDS                5448               3375a                              1363

a Data generated by random selection; * Data generated by SMOTE
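
The balancing step summarized in Table 1 can be sketched as follows. In the table, every class larger than its target is marked as reduced by random selection, and every class smaller than its target is marked as enlarged by SMOTE. The sketch below follows that pattern under stated assumptions: the function names (`random_selection`, `smote_like`, `balance`), the 3-dimensional random feature vectors, and the neighbour count `k=5` are hypothetical and chosen only for illustration. The oversampler is a simplified interpolation in the spirit of SMOTE, not the implementation used by the authors.

```python
import random

# Training-set sizes and balancing targets copied from Table 1.
TABLE_1 = {
    ("Diatom", "ncRNAs"): (1234, 1125),
    ("Diatom", "CDS"): (356, 1125),
    ("Golden algae", "ncRNAs"): (168, 1125),
    ("Golden algae", "CDS"): (60, 1125),
    ("Green algae", "ncRNAs"): (1973, 1125),
    ("Green algae", "CDS"): (6818, 1125),
    ("Cyanobacteria", "ncRNAs"): (13116, 3375),
    ("Cyanobacteria", "CDS"): (5448, 3375),
}


def random_selection(samples, target, rng):
    """Undersample: keep `target` samples drawn without replacement."""
    return rng.sample(samples, target)


def smote_like(samples, target, rng, k=5):
    """Oversample by interpolating between a sample and one of its
    k nearest neighbours (simplified illustration of the SMOTE idea)."""

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    synthetic = []
    while len(samples) + len(synthetic) < target:
        base = rng.choice(samples)
        # k nearest original samples to `base`, excluding `base` itself.
        neighbours = sorted(
            (s for s in samples if s is not base),
            key=lambda s: sq_dist(base, s),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return list(samples) + synthetic


def balance(samples, target, rng):
    """Shrink large classes by random selection, grow small ones by interpolation."""
    if len(samples) >= target:
        return random_selection(samples, target, rng)
    return smote_like(samples, target, rng)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
balanced_sizes = {}
for key, (n_train, target) in TABLE_1.items():
    # Hypothetical feature vectors standing in for real sequence features.
    features = [tuple(rng.random() for _ in range(3)) for _ in range(n_train)]
    balanced_sizes[key] = len(balance(features, target, rng))
```

After the loop, `balanced_sizes` maps each group and sequence type to the size of its balanced training set, which should match the "Training dataset after balancing" column. Because each synthetic point lies on a segment between two existing samples, oversampling in this way does not leave the range spanned by the original data.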