2022 Jun 13;9:817517. doi: 10.3389/fmolb.2022.817517

TABLE 2.

Computational tools for ORF prediction and evaluation.

| Name | Characteristics | Website | References |
| --- | --- | --- | --- |
| CPC | Uses sequence features and a support vector machine (SVM) to evaluate the protein-coding potential of transcripts; assesses the extent, quality, and integrity of ORFs | http://cpc.cbi.pku.edu.cn | Kong et al. (2007) |
| sORF finder | Package for identifying sORFs with high coding potential | http://evolver.psc.riken.jp/ | Hanada et al. (2010) |
| PhyloCSF | Analyzes a multi-species nucleotide sequence alignment, using a formal statistical comparison of phylogenetic codon models, to determine whether it is likely to represent a conserved protein-coding region; can delimit likely protein-coding ORFs within transcript models that include untranslated regions | http://compbio.mit.edu/PhyloCSF | Lin et al. (2011) |
| RNAcode | Compares conserved coding and non-coding regions in sequence data and evaluates coding potential; suited to the analysis of sORFs or bifunctional RNAs | http://wash.github.com/rnacode | Washietl et al. (2011) |
| CNCI | Classifies protein-coding and long non-coding transcripts using intrinsic sequence composition (adjacent nucleotide triplets); SVM-based | http://www.bioinfo.org/software/cnci | Sun L et al. (2013) |
| CPAT | Coding potential assessment tool that uses an alignment-free logistic regression model, allowing ORF size and coverage to be assessed | http://code.google.com/p/cpat/ | Wang T et al. (2013) |
| iSeeRNA | Identifies long intergenic non-coding RNA (lincRNA) transcripts in transcriptome sequencing data; SVM-based | http://www.myogenesisdb.org/iSeeRNA | Sun K et al. (2013) |
| PLEK | Efficient alignment-free tool for differentiating coding and non-coding transcripts in RNA-seq transcriptomes of species lacking a reference genome; SVM-based | https://sourceforge.net/projects/plek/files/ | Li et al. (2014) |
| LncRNA-ID | Calculates the coding potential of transcripts from multiple features using a machine learning model (random forest) | https://github.com/zhangy72/LncRNA-ID | Achawanantakun et al. (2015) |
| lncRNA-MFDL | Fuses multiple features and uses deep learning classification algorithms to identify human lncRNAs, quickly distinguishing coding from long non-coding RNAs | http://compgenomics.utsa.edu/lncRNA_MDFL/ | Fan and Zhang (2015) |
| COME | Multi-feature coding potential calculation tool for assessing the coding potential of lncRNAs | https://github.com/lulab/COME | Hu et al. (2017) |
| CPC2 | Fast and accurate coding potential calculator that evaluates ORF features from intrinsic sequence features; SVM-based | http://cpc2.cbi.pku.edu.cn | Kang et al. (2017) |
| CNIT | Identifies protein-coding and long non-coding transcripts based on intrinsic sequence composition (upgraded version of CNCI) | http://cnit.noncode.org/CNIT | Guo et al. (2019) |
| ORF Finder | NCBI software that performs six-frame translation of a nucleotide sequence, allowing all possible ORFs to be inferred | https://www.ncbi.nlm.nih.gov/orffinder/ | Sayers et al. (2021) |
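
To make the six-frame idea behind the ORF Finder entry concrete, and the ORF size and coverage features mentioned for CPAT, the following is a minimal Python sketch. It is not the implementation of either tool. It assumes the standard genetic code with ATG as the only start codon and TAA, TAG, TGA as stop codons, and it ignores ORFs that run off the end of the sequence. The names `find_orfs`, `orf_coverage`, `Orf`, and the `min_codons` parameter are hypothetical and chosen for illustration only.

```python
from typing import List, NamedTuple

# Assumption: standard genetic code, ATG-only starts (real tools offer more options).
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    strand: str    # "+" for the input strand, "-" for its reverse complement
    frame: int     # 0, 1 or 2, relative to the scanned strand
    start: int     # 0-based offset of the start codon on the scanned strand
    end: int       # offset just past the stop codon on the scanned strand
    sequence: str  # nucleotides from the start codon through the stop codon


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 2) -> List[Orf]:
    """Scan all six reading frames for ATG...stop ORFs.

    min_codons counts the start codon plus coding codons, excluding the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # offset of the first in-frame ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append(Orf(strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


def orf_coverage(orf: Orf, transcript_length: int) -> float:
    """Fraction of the transcript occupied by the ORF (stop codon included)."""
    return len(orf.sequence) / transcript_length


if __name__ == "__main__":
    transcript = "CCATGAAATAGCC"
    for orf in find_orfs(transcript):
        print(orf.strand, orf.frame, orf.sequence, len(orf.sequence),
              round(orf_coverage(orf, len(transcript)), 3))
```

For the toy transcript above, the scan reports a single ORF, `ATGAAATAG`, on the forward strand in frame 2. ORF length and coverage computed this way are simple intrinsic features; the classifiers in the table combine features of this kind with others, and their exact definitions may differ from this sketch.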