2021 Feb 2;19:1145–1153. doi: 10.1016/j.csbj.2021.01.041

Table 1.

Sequence-based protein domain identification methods.

| Category | Method | Description | Year | URL | Reference |
|---|---|---|---|---|---|
| Homology-based | CHOP | Search target sequences against PDB, Pfam-A, and SWISS-PROT to find templates. | 2004 | http://www.rostlab.org/services/CHOP/ | [6] |
| | DomPred | Combine homology and secondary structure element alignment to find templates. | 2005 | http://bioinf.cs.ucl.ac.uk/software.html | [15] |
| | SSEP-Domain | Based on secondary structure element alignment and profile-profile alignment. | 2006 | http://www.bio.ifi.lmu.de/SSEP/ | [16] |
| | ThreaDom | Deduce domain boundary locations based on multiple threading alignments. | 2013 | http://zhanglab.ccmb.med.umich.edu/ThreaDom/ | [18] |
| | CLADE | Identify domains with a multi-source strategy that combines multiple HMM profiles. | 2016 | http://www.lcqb.upmc.fr/CLADE | [13] |
| | MetaCLADE | A multi-source domain annotation tool for metagenomic datasets. | 2018 | http://www.lcqb.upmc.fr/metaclade | [14] |
| Ab initio methods | Domain Guess by Size | Detect domain boundaries based on the distributions of chain and domain lengths. | 2000 | | [26] |
| | CHOPnet | Feed-forward neural network that uses amino acid composition, secondary structure, and solvent accessibility as features. | 2004 | | [27] |
| | PPRODO | Feed-forward neural network that uses a position-specific scoring matrix (PSSM) generated by PSI-BLAST as features. | 2005 | http://gene.kias.re.kr/~jlee/pprodo/ | [28] |
| | DOMpro | RNN that uses secondary structure and solvent accessibility as features. | 2005 | http://www.igb.uci.edu/servers/psss.html | [31] |
| | KemaDom | Combine three SVM classifiers that use different features as inputs to predict domain boundaries. | 2006 | http://www.iipl.fudan.edu.cn/lschen/kemadom.htm | [30] |
| | DomainDiscovery | SVM that uses inter-domain linker index, PSSM, secondary structure, and solvent accessibility as features. | 2006 | | [33] |
| | IGRN | An improved general regression network model trained on PSSM, inter-domain linker index, secondary structure, and solvent accessibility. | 2008 | | [32] |
| | DomSVR | The sequence is encoded by physicochemical and biological properties; SVR uses the encoded sequence to predict domain boundaries. | 2010 | | [35] |
| | DoBo | SVM that uses evolutionary domain boundary signals embedded in homologous proteins as input features. | 2011 | http://sysbio.rnet.missouri.edu/dobo/ | [34] |
| | DROP | An SVM that predicts domain linkers using 25 optimal features selected from a set of 3000 features. | 2011 | http://tuat.ac.jp/~domserv/DROP.html | [37] |
| | DomHR | Identify domain boundaries in proteins by defining the edge of domain and boundary regions as a hinge region. | 2013 | http://cal.tongji.edu.cn/domain/ | [39] |
| | PDP-CON | Combine predicted results from six single domain boundary prediction methods. | 2016 | https://cmaterju.org/cmaterbioinfo/ | [38] |
| | ConDo | Use long-range, coevolutionary features to train neural networks. | 2018 | https://github.com/gicsaw/ConDo.git | [40] |
| | DNN-Dom | Combine CNN and BGRU to predict domain boundaries from amino acid composition, PSSM, solvent accessibility, and secondary structure. A balanced Random Forest is used to address the sample imbalance problem. | 2019 | http://isyslab.info/DNN-Dom/ | [42] |
| | DeepDom | Use sequence information encoded by physical–chemical properties to train a bidirectional LSTM model to predict domain boundaries. | 2019 | https://github.com/yuexujiang/DeepDom | [41] |
| | FUpred | Predict protein domain boundaries using predicted contact maps generated by ANN. | 2020 | https://zhanglab.ccmb.med.umich.edu/FUpred | [44] |
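
Several entries in Table 1 combine the outputs of multiple predictors (PDP-CON, for example, combines results from six single methods). The general idea of such a consensus can be illustrated with a minimal per-residue voting sketch. This is not the PDP-CON algorithm or that of any tool listed here; the function name `consensus_boundaries`, the `min_votes` threshold, and the toy predictions are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def consensus_boundaries(
    predictions: Sequence[Sequence[int]], min_votes: int
) -> List[int]:
    """Return 0-based residue indices that at least `min_votes`
    predictors label as a domain boundary (1 = boundary, 0 = not).

    Hypothetical helper: a plain vote count, not any published method.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictors must label the same number of residues")
    # Count, for every residue, how many predictors flagged it.
    votes = [sum(p[i] for p in predictions) for i in range(length)]
    return [i for i, v in enumerate(votes) if v >= min_votes]


# Toy data (assumed): three predictors labelling a 10-residue sequence.
preds = [
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 1, 0],
]
print(consensus_boundaries(preds, min_votes=2))
```

Raising `min_votes` trades recall for precision: only residues that most predictors agree on survive, while a threshold of 1 reduces to the union of all predictions.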