2022 Jul 7;11(7):1024. doi: 10.3390/biology11071024

Table 1.

Selected features (n = 16) identified using the most parsimonious and performant model (prediction accuracy = 90.9%).

| Id | Selected Features | Description |
|---|---|---|
| Genome | bp_genome_total | Genome size |
| Genome | bp_genA | Total number of Adenines (within the genome) |
| Genome | bp_genT | Total number of Thymines (within the genome) |
| Genome | fr_genG | Frequency of Guanines (number of Guanines divided by DNA total length) within the genome |
| Genome | genomic_shannon_score | Shannon’s Entropy of total genome sequence |
| CDS | n_cds_total | Total number of CDS elements (Coding DNA Sequences) |
| CDS | bp_cds_total | Total number of CDS nucleotides |
| CDS | bp_cdsA | Total number of CDS Adenines |
| CDS | bp_cdsG | Total number of CDS Cytosines |
| CDS | bp_cdsT | Total number of CDS Thymines |
| CDS | cds_chargaff_score_ct | Chargaff’s Second Parity rule score of total CDS sequence (ct method) |
| CDS | cds_chargaff_score_pf | Chargaff’s Second Parity rule score of total CDS sequence (pf method) |
| CDS | cds_shannon_score | Shannon’s Entropy value of total CDS sequence |
| tRNA | tRNA_chargaff_score_ct | Chargaff’s Second Parity rule score of total tRNA sequence (ct method) |
| tRNA | tRNA_chargaff_score_pf | Chargaff’s Second Parity rule score of total tRNA sequence (pf method) |
| tRNA | tRNA_shannon_score | Shannon’s Entropy value of total tRNA sequence |
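
As an illustration of how the count, frequency, and entropy features in Table 1 could be derived from a raw sequence, here is a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, and the use of base-2 logarithms and the restriction to the four canonical bases (A, C, G, T) are assumptions made for this example. The Chargaff scores are left out because the table does not define the ct and pf methods.

```python
import math
from collections import Counter


def base_counts(seq):
    """Count A, C, G and T in a DNA string (case-insensitive).

    Assumption: any other symbol (e.g. N) is ignored in the counts.
    """
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def base_frequency(seq, base):
    """Number of occurrences of `base` divided by the total sequence length."""
    if not seq:
        return 0.0
    return base_counts(seq)[base] / len(seq)


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the A/C/G/T composition of `seq`.

    Assumption: base-2 logarithm, probabilities taken over canonical bases only.
    """
    counts = base_counts(seq)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        if n:
            p = n / total
            entropy -= p * math.log2(p)
    return entropy


def genome_features(seq):
    """Build the genome-level features of Table 1 that need only the sequence."""
    counts = base_counts(seq)
    return {
        "bp_genome_total": len(seq),
        "bp_genA": counts["A"],
        "bp_genT": counts["T"],
        "fr_genG": base_frequency(seq, "G"),
        "genomic_shannon_score": shannon_entropy(seq),
    }


if __name__ == "__main__":
    # Toy sequence for demonstration only; not real mitochondrial data.
    print(genome_features("ATGGCGTTAACG"))
```

The same helper functions could be applied to the concatenated CDS or tRNA sequences to obtain the corresponding `cds_*` and `tRNA_*` counts and entropy values, assuming those features are computed over the pooled sequence of each element type as the descriptions suggest.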