2018 Dec 4;9:1437. doi: 10.3389/fphar.2018.01437

Table 3.

Tools for the prediction of variant effects on splicing, transcript levels or translation.

| Algorithm | Application | Basis of decision | Model training or evaluation | References |
| --- | --- | --- | --- | --- |
| NMD Classifier | NMD | Prediction of NMD for a given transcript based on comparison to the most similar coding transcript | Simulation-based evaluation based on screening artificial transcript structure-altering events | Hsu et al., 2017 |
| NNSplice | Splicing (splice sites) | Sequence splice site analysis using HMM | Distinguish splice site sequences from sequences in the neighborhood of real splice sites | Reese et al., 1997 |
| MaxEntScan | Splicing (splice sites) | Splice site analysis by modeling short sequence motifs using the maximum entropy principle with constraints estimated from available data | 1,821 transcripts unambiguously aligned across the entire coding region, spanning a total of 12,715 introns | Yeo and Burge, 2004 |
| GeneSplicer | Splicing (splice sites) | Splice site prediction using maximal dependence decomposition with the addition of Markov models to capture dependencies among neighboring bases | Annotated genes from the Exon-Intron Database | Pertea et al., 2001 |
| SplicePort | Splicing (splice sites) | Splice site prediction using C-modified least squares learning based on positional and compositional sequence features | Training on 4,000 human pre-mRNA RefSeq sequences and testing on the B2Hum data set | Dogan et al., 2007 |
| Skippy | Splicing (regulatory sequences) | Prediction of variants causing exon skipping, exon inclusion or ectopic splice site activation based on sequence information, proximity to splice junctions and evolutionary constraint of the peri-variant region | Multiple exonic splicing regulatory element datasets as positive data and HapMap variants as splicing-neutral variants | Woolfe et al., 2010 |
| MutPred Splice | Splicing (regulatory sequences) | Prediction of auxiliary splice sequences using multiple variant-, flanking exon- and gene-based features | Splicing variants from HGMD as pathogenic set and non-splicing variants from both HGMD and 1000G as neutral controls | Mort et al., 2014 |
| scSNVEL | Splicing (splice sites) | Ensemble prediction combining 8 algorithms using random forest learning | Splice variants from HGMD, SpliceDisease database and DBASS as pathogenic set and variants not implicated in splicing from both HGMD and 1000G as controls | Jian et al., 2014b |
| SPANR | Splicing (splice sites and splice regulatory sequences) | Integrating 1,393 sequence features from each exon and its neighboring introns and exons to identify splice sites as well as intronic and exonic splice regulators | 10,689 exons that displayed evidence of alternative splicing | Xiong et al., 2015 |
| CryptSplice | Splicing (splice sites) | Prediction of cryptic splice-site activation using an SVM model | Sequences from the annotated NN269 and HS3D splice datasets, with positive sequences at splice sites and control sequences outside splice sites | Lee et al., 2017 |
| Corvelo et al. | Splicing (branch points) | Analysis of splice site sequence conservation and position bias using SVM | A set of 8,156 conserved putative branch point sequences from 7 mammalian species | Corvelo et al., 2010 |
| BPP | Splicing (branch points) | Identification of branch point motifs by integrating information on the branch point sequence and the polypyrimidine tract | Intron sequences longer than 300 nucleotides | Zhang et al., 2017 |
| TurboFold | Splicing (pre-mRNA structure) | Probabilistic method that integrates comparative sequence analyses with thermodynamic folding models | Thorough benchmarking against three methods that estimate base pairing probabilities and eight tools for structural predictions based on known RNA structures | Harmanci et al., 2011 |
| CentroidFold | Splicing (pre-mRNA structure) | RNA secondary structure prediction using the γ-centroid estimator | Validation based on 151 experimentally determined RNA structures | Sato et al., 2009 |
| mrSNP | miRNA binding | miRNA binding energy calculations for reference and variant-containing sequences and report of the binding difference | Evaluation based on variants that map to miRNA targets predicted by TargetScan | Deveci et al., 2014 |
| PinPor | RBP binding | Bayesian network approach that incorporates information about sequence features, stabilization of RNA secondary structure and evolutionary conservation | In-frame indels from HGMD as pathogenic and common indels from 1000G as neutral controls | Zhang et al., 2014 |

HGMD, Human Gene Mutation Database; 1000G, 1000 Genomes Project; DBASS, Database for Aberrant Splice Sites; NMD, nonsense-mediated decay; HMM, hidden Markov model; RBP, RNA binding protein.
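
Several of the splice-site tools in Table 3 score a short sequence window around the splice junction and compare the score of the reference sequence with that of the variant sequence. The sketch below illustrates that general idea with a simple position weight matrix (PWM) and log-odds scoring. It is not the model used by MaxEntScan, NNSplice or any other tool in the table. The donor-site 9-mers, the uniform background and the names `build_pwm`, `score` and `TOY_DONOR_SITES` are assumptions chosen purely for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Return per-position log2-odds scores against a uniform background."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return pwm

def score(pwm, seq):
    """Sum the position-specific log-odds of each base in seq."""
    return sum(col[base] for col, base in zip(pwm, seq))

# Toy, made-up donor-site 9-mers (3 exonic + 6 intronic bases).
# These are NOT real training data; they only make the example runnable.
TOY_DONOR_SITES = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT", "CAGGTGAGG"]

pwm = build_pwm(TOY_DONOR_SITES)

ref = "CAGGTAAGT"
var = "CAGATAAGT"  # hypothetical G>A at the first intronic base (+1)

# A negative delta means the variant window looks less like a donor site.
delta = score(pwm, var) - score(pwm, ref)
print(f"ref={score(pwm, ref):.2f} var={score(pwm, var):.2f} delta={delta:.2f}")
```

A PWM treats positions as independent, which is the limitation that the maximum entropy, maximal dependence decomposition and Markov model approaches listed in the table are designed to address by capturing dependencies between positions.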