2023 Jun 1;9:e1286. doi: 10.7717/peerj-cs.1286

Table 7. Comparison with datasets available in literature.

| Dataset | Number of functions | Evaluated task | Open source |
|---|---|---|---|
| BinKit Kim et al. (2020) | 75,230,573 | | Yes |
| In nomine function Artuso et al. (2021) | 8,861,407 | | Yes |
| α Diff Liu et al. (2018) | 4,979,586 | | Yes |
| BinBench | 4,408,191 | ①, ②, ③, ④, ⑤ | Yes |
| SAFE Massarelli et al. (2019b) | 548,133 ①, 581,640 ②, 1,587,648 ③ | ①, ②, ③ | Yes |
| Graph embedding NNs Massarelli et al. (2019a) | 95,535 ①, 2,040,246 ③ | ①, ③ | Yes |
| Toolchain provenance Rosenblum, Miller & Zhu (2011) | 955,000 | | No |
| Asm2Vec Ding, Fung & Charland (2019) | 139,936 | | No |
| Gemini Xu et al. (2017) | 129,365 | | No |
| Eklavya Chua et al. (2017) | 119,352 | | Yes |
| NERO David, Alon & Yahav (2020) | 67,246 | | Yes |
| Debin He et al. (2018) | 238 | | No |

Note: Evaluated tasks: ①, binary similarity; ②, function search; ③, compiler provenance; ④, function naming; ⑤, signature recovery. Where a dataset lists several function counts, each count is followed by the task it was used for.