2024 Oct 17;10:e2270. doi: 10.7717/peerj-cs.2270

Table 1. The description of the BugHunter dataset before and after preprocessing.

| Project | Total # of instances before prep. | # of faulty instances before prep. | # of non-faulty instances before prep. | Faulty ratio (%) | Imbalance ratio | # of software metrics after prep. | Total # of instances after prep. | # of faulty instances after prep. | # of non-faulty instances after prep. |
|---|---|---|---|---|---|---|---|---|---|
| ceylon-ide-eclipse | 2,087 | 508 | 1,579 | 24.34 | 3.11 | 58 | 2,972 | 1,393 | 1,579 |
| BroadleafCommerce | 4,709 | 1,025 | 3,684 | 21.77 | 3.59 | 61 | 6,824 | 3,140 | 3,684 |
| hazelcast | 32,973 | 12,093 | 20,880 | 36.68 | 1.73 | 61 | 39,923 | 19,043 | 20,880 |
| elasticsearch | 35,862 | 11,950 | 23,912 | 33.32 | 2 | 62 | 45,497 | 21,585 | 23,912 |
| MapDB | 1,456 | 480 | 976 | 32.97 | 2.03 | 59 | 1,842 | 866 | 976 |
| netty | 11,171 | 2,434 | 8,737 | 21.79 | 3.59 | 59 | 16,207 | 7,470 | 8,737 |
| orientdb | 9,445 | 2,589 | 6,856 | 27.41 | 2.65 | 61 | 12,911 | 6,055 | 6,856 |
| neo4j | 7,030 | 1,841 | 5,189 | 26.19 | 2.82 | 59 | 9,704 | 4,515 | 5,189 |
| titan | 785 | 168 | 617 | 21.4 | 3.67 | 61 | 1,147 | 530 | 617 |
| mcMMO | 1,184 | 411 | 773 | 34.71 | 1.88 | 55 | 1,493 | 720 | 773 |
| Android-Universal-Image-Loader | 325 | 103 | 222 | 31.69 | 2.16 | 51 | 415 | 193 | 222 |
| antlr4 | 840 | 102 | 738 | 12.14 | 7.24 | 56 | 1,350 | 612 | 738 |
| junit | 462 | 87 | 375 | 18.83 | 4.31 | 57 | 695 | 320 | 375 |
| mct | 105 | 25 | 80 | 23.81 | 3.2 | 53 | 143 | 63 | 80 |
| oryx | 810 | 77 | 733 | 9.51 | 9.52 | 55 | 1,350 | 617 | 733 |
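
The two derived columns in Table 1 can be reproduced from the raw counts. The sketch below assumes that the faulty ratio is the share of faulty instances in the total (as a percentage) and that the imbalance ratio is the number of non-faulty instances divided by the number of faulty instances. These definitions are inferred from the values in the table, not quoted from the paper, and the function names `faulty_ratio` and `imbalance_ratio` are hypothetical.

```python
def faulty_ratio(faulty: int, total: int) -> float:
    """Percentage of faulty instances (assumed definition: faulty / total * 100)."""
    return round(100.0 * faulty / total, 2)


def imbalance_ratio(non_faulty: int, faulty: int) -> float:
    """Majority-to-minority ratio (assumed definition: non-faulty / faulty)."""
    return round(non_faulty / faulty, 2)


# (total, faulty, non-faulty) counts before preprocessing, copied from Table 1.
before_prep = {
    "ceylon-ide-eclipse": (2087, 508, 1579),
    "antlr4": (840, 102, 738),
    "oryx": (810, 77, 733),
}

for project, (total, faulty, non_faulty) in before_prep.items():
    print(
        project,
        faulty_ratio(faulty, total),
        imbalance_ratio(non_faulty, faulty),
    )
```

Under these assumed definitions, the computed values match the "Faulty ratio (%)" and "Imbalance ratio" columns for the three projects shown, for example 24.34 and 3.11 for ceylon-ide-eclipse.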