2009 Nov 20;10:381. doi: 10.1186/1471-2105-10-381

Table 1.

Performance of HM-SVM versus other methods on all data sets

Data set	Method	Specificity⁺(random)^a	Sensitivity⁺(random)^b	F1	Accuracy	MCC	AUC	Time (s)^c
Hetero-complex I^d	ANN	37.6% (28.1%)	59.4% (16.7%)	46.0%	60.9%	18.9%	64.5%	326
	SVM	38.4% (28.1%)	59.8% (16.8%)	46.8%	61.8%	20.2%	65.4%	179461
	CRF	42.6% (28.1%)	55.2% (15.5%)	48.0%	66.5%	24.4%	65.3%	12151
	HM-SVM	44.9% (28.1%)	56.0% (15.7%)	49.8%	68.3%	27.4%	69.5%	356

Homo-complex I	ANN	39.0% (27.0%)	58.4% (15.8%)	46.6%	63.9%	22.1%	67.0%	586
	SVM	39.6% (27.0%)	61.9% (16.7%)	48.3%	64.2%	24.2%	68.6%	224979
	CRF	45.1% (27.0%)	59.2% (16.0%)	51.2%	69.5%	30.2%	67.6%	16961
	HM-SVM	45.4% (27.0%)	60.0% (16.2%)	51.7%	69.7%	30.9%	72.2%	588

Mix I^e	ANN	40.3% (27.5%)	51.4% (14.1%)	44.7%	65.4%	20.8%	65.8%	1242
	SVM	39.5% (27.5%)	61.5% (16.9%)	48.1%	63.6%	23.3%	67.6%	831579
	CRF	44.3% (27.5%)	57.5% (15.8%)	49.9%	68.4%	28.0%	66.8%	28364
	HM-SVM	45.5% (27.5%)	58.0% (15.9%)	51.0%	69.4%	29.7%	71.2%	891

Hetero-complex II^f	ANN	45.9% (34.9%)	60.5% (21.1%)	52.1%	61.3%	21.3%	65.8%	604
	SVM	47.9% (34.9%)	61.6% (21.5%)	53.9%	63.2%	24.6%	67.7%	160625
	CRF	51.6% (34.9%)	57.6% (20.1%)	54.3%	66.3%	28.0%	67.3%	13441
	HM-SVM	54.0% (34.9%)	56.7% (19.8%)	55.3%	68.0%	30.5%	70.7%	464

Homo-complex II	ANN	43.9% (32.3%)	66.7% (21.5%)	52.8%	61.5%	24.1%	68.1%	856
	SVM	47.1% (32.3%)	63.1% (20.4%)	54.0%	65.2%	27.7%	70.2%	554054
	CRF	52.5% (32.3%)	59.7% (19.3%)	55.9%	69.5%	32.9%	68.7%	18124
	HM-SVM	53.3% (32.3%)	60.1% (19.4%)	56.5%	70.1%	34.0%	73.4%	851

Mix II	ANN	46.5% (33.3%)	53.4% (17.9%)	49.4%	63.7%	21.7%	65.8%	1260
	SVM	47.5% (33.3%)	62.3% (20.8%)	53.9%	64.5%	26.5%	69.2%	1316103
	CRF	52.2% (33.3%)	58.6% (19.5%)	55.2%	68.3%	30.9%	68.1%	856765
	HM-SVM	53.6% (33.3%)	58.6% (19.6%)	56.0%	69.3%	32.6%	72.4%	1320

Specificity⁺ = TP/(TP+FP); Sensitivity⁺ = TP/(TP+FN); F1 = 2 × Specificity⁺ × Sensitivity⁺/(Specificity⁺ + Sensitivity⁺); Accuracy = (TP+TN)/(TP+TN+FP+FN); MCC = (TP × TN − FP × FN)/√[(TP+FP)(TP+FN)(TN+FP)(TN+FN)]; AUC: area under the ROC curve [61]. Here TP is the number of true positives (residues predicted to be interface residues that actually are interface residues); FP is the number of false positives (residues predicted to be interface residues that are in fact not interface residues); TN is the number of true negatives; and FN is the number of false negatives.

^aValues in parentheses are the values expected from random prediction. The specificity⁺ of random prediction is calculated as the total number of interaction-site residues divided by the total number of residues.

^bValues in parentheses are the values expected from random prediction. The sensitivity⁺ of random prediction is calculated as the total number of residues predicted as interaction sites by each method divided by the total number of residues.

^cTotal running time in seconds for 5-fold cross-validation, including training and testing.

^dType I data set, with minor interfaces treated as negative samples.

^eThe mixed data set of hetero-complexes and homo-complexes.

^fType II data set, with minor interfaces treated as positive samples.
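
The metric definitions given under the table can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' evaluation code: the function names `interface_metrics` and `f1_from_rates` and the confusion counts in the usage lines are hypothetical, chosen only to make the arithmetic easy to follow. The last line recomputes F1 from the specificity⁺ and sensitivity⁺ reported for HM-SVM on Hetero-complex I.

```python
import math


def interface_metrics(tp, fp, tn, fn):
    """Compute the table's metrics from residue-level confusion counts.

    Assumes at least one predicted positive and one actual positive,
    so that the specificity+ and sensitivity+ denominators are nonzero.
    """
    specificity_plus = tp / (tp + fp)  # TP/(TP+FP)
    sensitivity_plus = tp / (tp + fn)  # TP/(TP+FN)
    f1 = f1_from_rates(specificity_plus, sensitivity_plus)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # MCC denominator: square root of the product of the four marginal sums.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "specificity_plus": specificity_plus,
        "sensitivity_plus": sensitivity_plus,
        "f1": f1,
        "accuracy": accuracy,
        "mcc": mcc,
    }


def f1_from_rates(specificity_plus, sensitivity_plus):
    """F1 as the harmonic mean of specificity+ and sensitivity+."""
    return (2 * specificity_plus * sensitivity_plus
            / (specificity_plus + sensitivity_plus))


# Hypothetical counts, not taken from the paper.
example = interface_metrics(tp=50, fp=50, tn=150, fn=50)
print(example)

# Reported HM-SVM values on Hetero-complex I: 44.9% and 56.0%.
print(round(f1_from_rates(0.449, 0.560), 3))  # 0.498, matching the 49.8% in the table
```

With the hypothetical counts above, specificity⁺, sensitivity⁺ and F1 are all 0.5, accuracy is 200/300, and MCC is 0.25.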