2018 Dec 10;35(15):2535–2544. doi: 10.1093/bioinformatics/bty1017

Table 1.

Comparison between DeepIsoFun and MILP/iMILP on different expression datasets in terms of AUC and AUPRC values

	AUC			AUPRC
Dataset	DeepIsoFun	MILP	iMILP	DeepIsoFun	MILP	iMILP
Dataset#1	0.742	0.620	0.648	0.368	0.271	0.342
Dataset#2	0.734	0.574	0.635	0.270	0.224	0.235
Dataset#3	0.720	0.540	0.674	0.331	0.294	0.311

Note: Dataset#1 was generated from 1735 RNA-Seq experiments using Kallisto (Bray et al., 2016). Dataset#2 and Dataset#3 were obtained from Eksi et al. (2013) and Li et al. (2014), respectively. The benchmark positive and negative instances of each GO term used in testing were defined following the procedure in Li et al. (2014). Unlabeled instances were ignored in testing. Both Dataset#1 and Dataset#2 were divided by read length to create different ‘study groups’. There are 24, 24 and 29 study groups in Dataset#1, Dataset#2 and Dataset#3, respectively. On average, each study group consists of 71, 16 and 17 SRA experiments in Dataset#1, Dataset#2 and Dataset#3, respectively. As in Li et al. (2014), iMILP employed a selection algorithm to choose a subset of study groups on each dataset to optimize its performance.
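
For readers who want to see how the two metrics in the table behave when unlabeled instances are excluded, the following is a minimal sketch rather than the evaluation code used in the paper. It assumes that unlabeled instances are marked with `None`, that AUC is computed as the pairwise ranking probability, and that AUPRC is approximated by average precision; the function names and the toy labels and scores are hypothetical and are not taken from any of the three datasets.

```python
from typing import List, Optional, Sequence, Tuple


def drop_unlabeled(
    labels: Sequence[Optional[int]], scores: Sequence[float]
) -> Tuple[List[int], List[float]]:
    """Keep only instances labeled 1 (positive) or 0 (negative).

    Assumption: None marks an unlabeled instance, which is ignored in testing.
    """
    kept = [(y, s) for y, s in zip(labels, scores) if y is not None]
    return [y for y, _ in kept], [s for _, s in kept]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tied pair counts as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def auprc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, used here as an approximation of the area under the PR curve.

    Tied scores are ranked in input order (no special tie handling).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(1 for _, y in ranked if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive instance")
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / n_pos


# Hypothetical toy data for one GO term: 1 = positive, 0 = negative, None = unlabeled.
toy_labels = [1, 0, None, 1, 0, None, 0]
toy_scores = [0.9, 0.8, 0.95, 0.7, 0.3, 0.1, 0.2]

eval_labels, eval_scores = drop_unlabeled(toy_labels, toy_scores)
auc_value = auc(eval_labels, eval_scores)
auprc_value = auprc(eval_labels, eval_scores)
print(f"AUC={auc_value:.3f} AUPRC={auprc_value:.3f}")
```

Note that the unlabeled instance with the highest score (0.95) has no effect on either metric, which is the practical consequence of ignoring unlabeled instances in testing.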