2022 Jan 18;12:805656. doi: 10.3389/fgene.2021.805656

TABLE 1.

Four major genomic feature groups characterizing the SNVs in gold standard datasets.

| Feature group | Description | Example |
| --- | --- | --- |
| Structural and genomic context features | Characterizing sequence attributes of the mutation location. These features estimate the disruption to the mutation's surrounding sequence in both coding and non-coding regions | Percentage of GC in a ±75 bp window |
| Epigenetic features | Describing epigenetic changes such as histone modifications and methylation alterations | Maximum H3K4 methylation level from ENCODE |
| Genomic distance features | Measuring the distance between a given SNV and critical functional and structural genomic elements such as transcription start and end sites | Minimum distance to Transcribed Sequence Start (TSS) |
| Genomic conservation features | Measuring the evolutionary conservation at the mutation alignment sites, to help the training models learn the relationship between these measurements and the pathogenicity of the SNVs | Scores from PhastCons and PhyloP |
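
Two of the example features in Table 1 are simple enough to sketch in code. The Python snippet below is a minimal illustration, not the pipeline used to build the gold standard datasets: the function names `gc_percent` and `min_distance_to_tss` are hypothetical, positions are assumed to be 0-based offsets on a single chromosome, the window is clipped at the sequence ends, and ambiguous bases such as `N` are assumed to count toward the window length but not toward GC.

```python
from bisect import bisect_left


def gc_percent(sequence: str, pos: int, flank: int = 75) -> float:
    """Percentage of G/C bases in the window [pos - flank, pos + flank].

    Assumption: pos is a 0-based index into `sequence`; the window is
    clipped where it would run past either end of the sequence.
    """
    if not 0 <= pos < len(sequence):
        raise ValueError("pos must lie inside the sequence")
    start = max(0, pos - flank)
    end = min(len(sequence), pos + flank + 1)
    window = sequence[start:end].upper()
    gc_count = sum(base in "GC" for base in window)
    return 100.0 * gc_count / len(window)


def min_distance_to_tss(pos: int, sorted_tss: list[int]) -> int:
    """Smallest absolute distance from pos to any TSS coordinate.

    Assumption: `sorted_tss` holds TSS coordinates on the same chromosome
    as pos, sorted in ascending order.
    """
    if not sorted_tss:
        raise ValueError("sorted_tss must not be empty")
    i = bisect_left(sorted_tss, pos)
    # Only the nearest neighbour on each side can be the closest TSS.
    neighbours = sorted_tss[max(0, i - 1): i + 1]
    return min(abs(pos - tss) for tss in neighbours)


if __name__ == "__main__":
    toy_sequence = "ACGT" * 50          # made-up 200 bp sequence
    toy_tss = [100, 180, 400]           # made-up TSS coordinates
    print(gc_percent(toy_sequence, 100))
    print(min_distance_to_tss(150, toy_tss))
```

The binary search keeps the distance lookup logarithmic in the number of annotated start sites, which matters when the same calculation is repeated for every SNV in a dataset.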