TABLE 1.
Feature group | Description | Example |
---|---|---|
Structural and genomic context features | Characterizing sequence attributes of the mutation location. These features estimate the disruption to the mutation's surrounding sequence in both coding and non-coding regions | Percentage of GC in a ±75 bp window |
Epigenetic features | Describing epigenetic changes such as histone modifications and methylation alterations | Maximum H3K4 methylation level from ENCODE |
Genomic distance features | Measuring the distance between a given SNV and critical functional and structural genomic elements such as transcription start and end sites | Minimum distance to a transcription start site (TSS) |
Genomic conservation features | Measuring the evolutionary conservation at the mutation alignment sites, so that the trained models can learn the relationship between these measurements and the pathogenicity of the SNVs | Scores from PhastCons and PhyloP |
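
As an illustration of two of the example features in the table, the following is a minimal Python sketch of how a GC percentage in a ±75 bp window and a minimum distance to a TSS might be computed. The function names `gc_percent` and `min_tss_distance`, the 0-based coordinate convention, and the clipping of the window at sequence ends are assumptions made for this sketch, not details taken from the source.

```python
from typing import Sequence


def gc_percent(sequence: str, pos: int, flank: int = 75) -> float:
    """Percentage of G/C bases in the window [pos - flank, pos + flank].

    Assumptions of this sketch: `pos` is a 0-based index into `sequence`,
    and the window is clipped where it would run past either end.
    """
    if not 0 <= pos < len(sequence):
        raise ValueError("pos must lie inside the sequence")
    start = max(0, pos - flank)
    end = min(len(sequence), pos + flank + 1)
    window = sequence[start:end].upper()
    gc_count = sum(base in "GC" for base in window)
    return 100.0 * gc_count / len(window)


def min_tss_distance(pos: int, tss_positions: Sequence[int]) -> int:
    """Smallest absolute distance (in bp) from `pos` to any annotated TSS.

    Assumption: all coordinates are on the same chromosome and use the
    same coordinate system.
    """
    if not tss_positions:
        raise ValueError("at least one TSS position is required")
    return min(abs(pos - tss) for tss in tss_positions)


if __name__ == "__main__":
    # Toy inputs, chosen only to exercise the functions.
    print(gc_percent("AAGCTT", pos=2, flank=2))      # window "AAGCT" -> 40.0
    print(min_tss_distance(150, [100, 180, 400]))    # -> 30
```

In practice the sequence window would come from a reference genome and the TSS positions from a gene annotation; both are replaced here by toy inputs.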