Figure 3.
Variant predicates and genotype classifiers
(A) Variant predicate tests. GPSEA provides predicate functions that test whether a variant, such as c.373G>C (GenBank: NM_181486.4) (p.Gly125Arg) in TBX5 (MIM: 601620), meets a criterion from one of three evidence groups: allele, functional annotation, or protein. For instance, predicates can check whether the variant is a deletion and whether it overlaps a specific exon or a protein region of interest.
(B) Boolean algebra. Variant predicates can be combined using the Boolean operators AND, OR, and NOT to test complex criteria. For instance, a predicate for a point mutation can be formulated as a “missense mutation that affects one reference base and has a change length of zero” (no sequence loss or gain). A predicate for a loss-of-function mutation can be defined as a mutation leading to transcript ablation, a frameshift, introduction of a premature stop codon, or loss of the start codon. A predicate for a structural deletion can test whether the variant is either an imprecise chromosomal deletion or a deletion involving 50 or more base pairs (or another threshold).33
(C) Genotype classifiers. Each classifier splits a cohort into two or more classes to enable genotype-phenotype comparisons. GPSEA ships with five built-in classifiers that assign cohort members to classes by sex, by diagnosis, by a fixed count of alleles of different types (mono-allelic and bi-allelic), or by differing counts of alleles of the same type (allele count).
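The Boolean composition described in (B) and the allele-count idea in (C) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not GPSEA's actual API: the `Variant` record, the `Predicate` wrapper, the effect strings, and the `count_alleles` helper are hypothetical names introduced here only to show how simple tests combine with AND, OR, and NOT.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class Variant:
    """Hypothetical, simplified variant record (not GPSEA's data model)."""
    effects: FrozenSet[str]      # functional annotations (illustrative strings)
    ref_length: int              # number of reference bases affected
    change_length: int           # alt length minus ref length; negative for deletions
    is_imprecise_deletion: bool = False


class Predicate:
    """Wraps a Variant -> bool test and supports & (AND), | (OR), ~ (NOT)."""

    def __init__(self, test: Callable[[Variant], bool]):
        self._test = test

    def __call__(self, variant: Variant) -> bool:
        return self._test(variant)

    def __and__(self, other: "Predicate") -> "Predicate":
        return Predicate(lambda v: self(v) and other(v))

    def __or__(self, other: "Predicate") -> "Predicate":
        return Predicate(lambda v: self(v) or other(v))

    def __invert__(self) -> "Predicate":
        return Predicate(lambda v: not self(v))


def has_effect(effect: str) -> Predicate:
    """Predicate that holds if the variant carries the given annotation."""
    return Predicate(lambda v: effect in v.effects)


# Point mutation: missense AND one reference base AND change length of zero.
point_mutation = (
    has_effect("MISSENSE_VARIANT")
    & Predicate(lambda v: v.ref_length == 1)
    & Predicate(lambda v: v.change_length == 0)
)

# Loss of function: any one of the four listed consequences.
loss_of_function = (
    has_effect("TRANSCRIPT_ABLATION")
    | has_effect("FRAMESHIFT_VARIANT")
    | has_effect("STOP_GAINED")
    | has_effect("START_LOST")
)


def structural_deletion(threshold: int = 50) -> Predicate:
    """Imprecise deletion OR a deletion of at least `threshold` base pairs."""
    return Predicate(lambda v: v.is_imprecise_deletion) | Predicate(
        lambda v: v.change_length <= -threshold
    )


def count_alleles(
    genotype: Sequence[Tuple[Variant, int]], predicate: Predicate
) -> int:
    """Sum the allele counts of the variants that satisfy the predicate.

    `genotype` pairs each variant with its allele count in one individual;
    the returned count can serve as that individual's class label.
    """
    return sum(count for variant, count in genotype if predicate(variant))


# Example records (made-up values for illustration only).
missense_snv = Variant(frozenset({"MISSENSE_VARIANT"}), ref_length=1, change_length=0)
frameshift = Variant(frozenset({"FRAMESHIFT_VARIANT"}), ref_length=1, change_length=1)
large_deletion = Variant(frozenset(), ref_length=121, change_length=-120)
```

In this sketch, `~loss_of_function` would select variants that carry none of the four listed consequences, and `count_alleles(genotype, loss_of_function)` would group individuals by how many loss-of-function alleles they carry.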
