Table 2.
Evaluation benchmark, artificial faults: the Fault Type can be real (R) or artificial (A); the fault Id identifies the subject: MNIST mutants (M), CIFAR mutants (C), Reuters mutants (R), Udacity mutants (U), and Speaker Recognition (S); Source shows the dataset of origin; the models are divided into two groups by task, classification (C) or regression (R).
| Fault Type | Id | SO Post # / Subject | Source | Task | Faults |
|---|---|---|---|---|---|
| A | M1 | MN | DeepCrime | C | Wrong weights initialisation (0) |
| A | M2 | MN | DeepCrime | C | Wrong activation function (7) |
| A | M3 | MN | DeepCrime | C | Wrong learning rate |
| A | C1 | CF10 | DeepCrime | C | Wrong activation function (2) |
| A | C2 | CF10 | DeepCrime | C | Wrong number of epochs |
| A | C3 | CF10 | DeepCrime | C | Wrong weights initialisation (2) |
| A | R1 | RT | DeepCrime | C | Wrong weights regularisation (0) |
| A | R2 | RT | DeepCrime | C | Wrong activation function (2) |
| A | R3 | RT | DeepCrime | C | Wrong learning rate |
| A | R4 | RT | DeepCrime | C | Wrong loss function |
| A | R5 | RT | DeepCrime | C | Wrong optimiser |
| A | R6 | RT | DeepCrime | C | Wrong weights initialisation (0) |
| A | R7 | RT | DeepCrime | C | Wrong activation function (2) |
| A | U1 | UD | DeepCrime | R | Wrong loss function |
| A | U2 | UD | DeepCrime | R | Wrong optimiser |
| A | S1 | SR | DeepCrime | C | Wrong loss function |
| A | S2 | SR | DeepCrime | C | Wrong number of epochs |