Table 1. The description of the BugHunter dataset before and after preprocessing.
| Project | Total # of instances before prep. | # of faulty instances before prep. | # of non-faulty instances before prep. | Faulty ratio (%) | Imb. ratio | # of software metrics after prep. | Total # of instances after prep. | # of faulty instances after prep. | # of non-faulty instances after prep. |
|---|---|---|---|---|---|---|---|---|---|
| ceylon-ide-eclipse | 2,087 | 508 | 1,579 | 24.34 | 3.11 | 58 | 2,972 | 1,393 | 1,579 |
| BroadleafCommerce | 4,709 | 1,025 | 3,684 | 21.77 | 3.59 | 61 | 6,824 | 3,140 | 3,684 |
| hazelcast | 32,973 | 12,093 | 20,880 | 36.68 | 1.73 | 61 | 39,923 | 19,043 | 20,880 |
| elasticsearch | 35,862 | 11,950 | 23,912 | 33.32 | 2.00 | 62 | 45,497 | 21,585 | 23,912 |
| MapDB | 1,456 | 480 | 976 | 32.97 | 2.03 | 59 | 1,842 | 866 | 976 |
| netty | 11,171 | 2,434 | 8,737 | 21.79 | 3.59 | 59 | 16,207 | 7,470 | 8,737 |
| orientdb | 9,445 | 2,589 | 6,856 | 27.41 | 2.65 | 61 | 12,911 | 6,055 | 6,856 |
| neo4j | 7,030 | 1,841 | 5,189 | 26.19 | 2.82 | 59 | 9,704 | 4,515 | 5,189 |
| titan | 785 | 168 | 617 | 21.40 | 3.67 | 61 | 1,147 | 530 | 617 |
| mcMMO | 1,184 | 411 | 773 | 34.71 | 1.88 | 55 | 1,493 | 720 | 773 |
| Android-Universal-Image-Loader | 325 | 103 | 222 | 31.69 | 2.16 | 51 | 415 | 193 | 222 |
| antlr4 | 840 | 102 | 738 | 12.14 | 7.24 | 56 | 1,350 | 612 | 738 |
| junit | 462 | 87 | 375 | 18.83 | 4.31 | 57 | 695 | 320 | 375 |
| mct | 105 | 25 | 80 | 23.81 | 3.20 | 53 | 143 | 63 | 80 |
| oryx | 810 | 77 | 733 | 9.51 | 9.52 | 55 | 1,350 | 617 | 733 |
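
The two derived columns in Table 1 can be reproduced from the raw counts. The sketch below assumes that the faulty ratio is the number of faulty instances divided by the total, expressed as a percentage, and that the imbalance ratio is the number of non-faulty instances divided by the number of faulty ones. These definitions are not stated in the table itself, but they match the reported values for the rows checked here. The function names are illustrative only.

```python
def faulty_ratio(faulty: int, total: int) -> float:
    """Percentage of faulty instances (assumed definition: faulty / total * 100)."""
    return round(100.0 * faulty / total, 2)


def imbalance_ratio(faulty: int, non_faulty: int) -> float:
    """Majority-to-minority ratio (assumed definition: non-faulty / faulty)."""
    return round(non_faulty / faulty, 2)


# (faulty, non-faulty) counts before preprocessing, copied from Table 1.
before_prep = {
    "ceylon-ide-eclipse": (508, 1579),
    "antlr4": (102, 738),
    "oryx": (77, 733),
}

for project, (faulty, non_faulty) in before_prep.items():
    total = faulty + non_faulty
    print(project, faulty_ratio(faulty, total), imbalance_ratio(faulty, non_faulty))
```

Applying the same imbalance ratio to the counts after preprocessing gives values close to 1 (for example, 1,579 / 1,393 for ceylon-ide-eclipse), since only the faulty counts change between the two halves of the table.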