Table 2.
Evaluation benchmark, artificial faults: the Fault Type can be real (R) or artificial (A); the fault Id identifies the subject: MNIST mutants (M), CIFAR mutants (C), Reuters mutants (R), Udacity mutants (U), and Speaker Recognition mutants (S); Source shows the dataset of origin; Task indicates whether the model performs classification (C) or regression (R).
Fault Type | Id | SO Post # / Subject | Source | Task | Faults |
---|---|---|---|---|---|
A | M1 | MN | DeepCrime | C | Wrong weights initialisation (0) |
A | M2 | MN | DeepCrime | C | Wrong activation function (7) |
A | M3 | MN | DeepCrime | C | Wrong learning rate |
A | C1 | CF10 | DeepCrime | C | Wrong activation function (2) |
A | C2 | CF10 | DeepCrime | C | Wrong number of epochs |
A | C3 | CF10 | DeepCrime | C | Wrong weights initialisation (2) |
A | R1 | RT | DeepCrime | C | Wrong weights regularisation (0) |
A | R2 | RT | DeepCrime | C | Wrong activation function (2) |
A | R3 | RT | DeepCrime | C | Wrong learning rate |
A | R4 | RT | DeepCrime | C | Wrong loss function |
A | R5 | RT | DeepCrime | C | Wrong optimiser |
A | R6 | RT | DeepCrime | C | Wrong weights initialisation (0) |
A | R7 | RT | DeepCrime | C | Wrong activation function (2) |
A | U1 | UD | DeepCrime | R | Wrong loss function |
A | U2 | UD | DeepCrime | R | Wrong optimiser |
A | S1 | SR | DeepCrime | C | Wrong loss function |
A | S2 | SR | DeepCrime | C | Wrong number of epochs |