2010 Nov 27;39(Database issue):D730–D735. doi: 10.1093/nar/gkq1229

Table 1.

Sources of DOMINE database contents

| Method/source | Number of DDIs | Description |
| --- | --- | --- |
| iPfam | 4030 | iPfam contains a collection of DDIs that are observed in PDB entries. Data dated 17 February 2007 were used. |
| 3didᵃ | 6066 | 3did is a collection of DDIs in proteins for which high-resolution 3D structures are known. Data downloaded in September 2010 were used. |
| ME | 2391 | ME refers to a Bayesian approach that integrates DDIs predicted using a maximum likelihood estimation approach on yeast, worm, fruit fly and human PPI networks with gene ontology and domain fusion data. |
| RCDP | 960 | The RCDP approach uses sequence coevolution to predict the domain pair that is most likely to mediate a given PPI. Given a PPI, RCDP predicts the domain pair with the highest degree of coevolution to be the mediating domain pair. The set of DDIs predicted from 1180 yeast PPIs (Raghavachari data set) was used. |
| P-value | 596 | P-value refers to the statistical approach that assigns P-values to pairs of SCOP domain superfamilies based on the strength of evidence within a set of PPIs. These P-values for domain pairs were used to predict 705 DDIs between SCOP domains from protein complexes in the Protein Quaternary Structure (PQS) database, which were converted to 596 DDIs between Pfam domains. |
| Fusion | 2768 | DDIs inferred using the domain fusion hypothesis, as reported in the Interdom database (v1.1), were used. |
| DPEA | 1812 | DPEA is a statistical approach to infer DDIs from PPI networks from many organisms. It uses an expectation–maximization algorithm to obtain the probability of interaction for each potentially interacting domain pair, and computes the change in likelihood, expressed as a log odds score, when this domain pair is excluded from the set of potentially interacting domain pairs. DPEA was applied to PPI networks from 69 organisms (Riley data set), and only the DDIs between Pfam-A domains with log odds score ≥3.0 were used. |
| PE | 2588 | PE is an optimization approach based on the assumption that the set of true DDIs is well approximated by the minimum set of DDIs that can justify every PPI in a PPI network. Given a PPI network, the PE approach uses linear programming to compute the LP score for every domain pair that could possibly justify interaction between two proteins, and a P-score to account for false positives in the PPI network. PE was applied to the Riley data set, and only the DDIs between Pfam-A domains with LP score ≥0.5 and P-score ≤0.1 were used. |
| GPEᵇ | 1563 | GPE builds upon the PE approach by unifying domains that always occur together in a protein into a single ‘supra-domain’, and uses the same linear programming framework as PE. GPE was applied to the redefined Riley data set (Guimaraes data set), and only the DDIs between Pfam-A domains with LP score ≥0.60 and pw-score ≤0.01 were used. Supra-domains were expanded back to individual Pfam-A domains. |
| DIPDᵇ | 2157 | DIPD constructs feature vectors for each protein pair within the sets of PPIs (Riley data set) and non-PPIs, and uses a discriminative classifier to identify the minimum set of domain pairs/triplets that can discriminate between PPIs and non-PPIs. Each selected feature (domain pair) is a putative DDI. The sets of predictions on the Raghavachari, Riley and Guimaraes data sets were used. |
| RDFF | 2475 | Chen and Liu's Random Decision Forest Framework (RDFF) approach explores all possible DDIs and predicts PPIs based on protein domains. The decision tree-based model is used to infer DDIs for each correctly predicted PPI. Only the DDIs between Pfam-A domains were used. |
| K-GIDDIᵇ | 386 | K-GIDDI uses gene ontology information to construct an initial DDI network using the top s% of DDIs inferred from cross-species PPI networks, and then expands the DDI network by predicting additional DDIs using a graph theoretical approach based on a parameter b. The latter allows for prediction of DDIs that are otherwise not predictable by methods that rely solely on PPI data. The set of DDIs predicted using s = 10 and b = 50 was used. |
| Insiteᵇ | 2408 | Insite uses a naïve Bayes model to build upon features in DPEA. Its novel formulation of evidence models for PPIs and DDIs helps address noise (false positives) generated by high-throughput assays. |
| DomainGAᵇ | 459 | DomainGA is a genetic algorithm-type machine learning approach based on multi-parameter optimization. It uses the available PPI data to compute scores for domain pairs, which are then used to predict PPIs. A yeast PPI data set was used to identify 867 putative DDIs between domains defined based on information derived from the InterPro database. Only the 459 DDIs between Pfam domains were used. |
| DIMA | 8012 | DIMA predicts DDIs based on phylogenetic profiling of the presence/absence of domains in many organisms. |

ᵃ Updated dataset.

ᵇ New dataset.
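
Several rows of Table 1 state score cutoffs that decided which predicted DDIs were retained (DPEA, PE and GPE). The following is a minimal sketch of how such cutoffs could be applied, not the procedure actually used to build DOMINE. The record layout, the score key names (`log_odds`, `lp`, `p_score`, `pw_score`), the `both_pfam_a` flag and the domain accessions in the usage note are all assumptions made for illustration; only the numeric thresholds come from the table.

```python
# Cutoffs quoted in Table 1; the score key names are hypothetical.
CUTOFFS = {
    "DPEA": lambda s: s["log_odds"] >= 3.0,
    "PE": lambda s: s["lp"] >= 0.5 and s["p_score"] <= 0.1,
    "GPE": lambda s: s["lp"] >= 0.60 and s["pw_score"] <= 0.01,
}


def passes_cutoff(method, scores, both_pfam_a):
    """Return True if a predicted pair meets the Table 1 cutoff for its method.

    both_pfam_a is an assumed flag saying both domains are Pfam-A domains,
    which the table lists as a requirement for DPEA, PE and GPE.
    """
    if method not in CUTOFFS:
        raise KeyError(f"no cutoff defined for method {method!r}")
    return both_pfam_a and CUTOFFS[method](scores)


def filter_predictions(predictions):
    """Keep the predictions that pass their method's cutoff.

    Each prediction is a tuple
    (method, domain_a, domain_b, scores, both_pfam_a).
    A DDI is treated as an unordered pair, so the two domains are sorted
    before being stored.
    """
    kept = set()
    for method, domain_a, domain_b, scores, both_pfam_a in predictions:
        if passes_cutoff(method, scores, both_pfam_a):
            kept.add((method, tuple(sorted((domain_a, domain_b)))))
    return kept
```

As a usage example with made-up accessions, `filter_predictions([("DPEA", "PF00002", "PF00001", {"log_odds": 3.2}, True)])` keeps the pair and stores it as `("DPEA", ("PF00001", "PF00002"))`, whereas the same pair with a log odds score of 2.9 would be dropped.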