| Category | Advantages | Limitations |
|---|---|---|
| Local explanations | Explain the model's behaviour in a local area of interest, producing instance-level explanations. | Explanations do not generalize globally; small perturbations can produce very different explanations; locality is hard to define; some approaches face stability issues. |
| Examples | Representative examples provide insight into the model's internal reasoning; some algorithms uncover the training points that most influenced a given prediction. | Examples require human inspection and do not explicitly state which parts of an example influence the model. |
| Feature relevance | Operate at the instance level, quantifying each feature's contribution to the model's decision; several of the proposed approaches come with appealing theoretical guarantees. | Sensitive when features are highly correlated; exact solutions are often approximated, leading to undesirable side effects such as the feature ordering affecting the outcome. |
| Simplification | Simple surrogate models explain the opaque ones; the resulting explanations, such as rules, are easy to understand. | Surrogates may not approximate the original model well and come with their own limitations. |
| Visualizations | Easier to communicate to a non-technical audience; most approaches are intuitive and straightforward to implement. | There is an upper bound on how many features can be considered at once; humans must inspect the resulting plots to produce explanations. |
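To make these categories concrete, the sketches below use hypothetical scikit-learn setups on synthetic data; the model choices, kernel widths, and sample sizes are illustrative assumptions, not any particular library's recipe. For local explanations, a LIME-style sketch: the black box is queried on perturbations around one instance, and a proximity-weighted linear model summarizes its behaviour in that neighbourhood.

```python
# LIME-style local explanation sketch (assumptions: tabular data, Gaussian
# perturbations, an exponential proximity kernel with a hand-picked width).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

instance = X[0]
# Perturb the instance and query the black box in its neighbourhood.
neighbourhood = instance + rng.normal(scale=0.5, size=(1000, X.shape[1]))
preds = black_box.predict_proba(neighbourhood)[:, 1]

# Weight perturbed samples by proximity to the instance (kernel width is a free choice).
distances = np.linalg.norm(neighbourhood - instance, axis=1)
weights = np.exp(-(distances ** 2) / 0.5)

# The coefficients of the weighted linear model act as the local explanation.
local_model = Ridge(alpha=1.0).fit(neighbourhood, preds, sample_weight=weights)
print("local feature weights:", np.round(local_model.coef_, 3))
```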
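For example-based explanations, a brute-force stand-in for influence-style methods (an assumption-heavy simplification, feasible only for small datasets): retrain with each training point removed and measure how much the prediction for one instance shifts.

```python
# Leave-one-out influence sketch: which training points most affect one prediction?
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
X_train, y_train, x_test = X[:80], y[:80], X[80]

base = LogisticRegression(max_iter=1000).fit(X_train, y_train)
base_pred = base.predict_proba(x_test.reshape(1, -1))[0, 1]

influence = np.zeros(len(X_train))
for i in range(len(X_train)):
    keep = np.arange(len(X_train)) != i
    model_i = LogisticRegression(max_iter=1000).fit(X_train[keep], y_train[keep])
    # How much does removing point i change the prediction for the test instance?
    influence[i] = base_pred - model_i.predict_proba(x_test.reshape(1, -1))[0, 1]

# Training points whose removal changes the prediction the most.
print("most influential training indices:", np.argsort(-np.abs(influence))[:5])
```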
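For feature relevance, a minimal Monte Carlo estimate of per-instance Shapley contributions, with "missing" features replaced by a background average (the background choice and the number of sampled orderings are assumptions); averaging over random orderings is exactly the kind of approximation that the limitations column alludes to.

```python
# Monte Carlo Shapley sketch: average each feature's marginal contribution
# over random feature orderings, using a background mean for absent features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)
background = X.mean(axis=0)   # simple background used in place of "missing" features
instance = X[0]

def value(mask):
    """Model output with unmasked features from the instance, the rest from the background."""
    z = np.where(mask, instance, background)
    return model.predict_proba(z.reshape(1, -1))[0, 1]

n_features = X.shape[1]
contributions = np.zeros(n_features)
n_orderings = 200
for _ in range(n_orderings):
    order = rng.permutation(n_features)
    mask = np.zeros(n_features, dtype=bool)
    prev = value(mask)
    for j in order:
        mask[j] = True
        curr = value(mask)
        contributions[j] += curr - prev
        prev = curr
contributions /= n_orderings
print("estimated Shapley contributions:", np.round(contributions, 3))
```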
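For simplification, a minimal global-surrogate sketch: a shallow decision tree is fitted to the opaque model's predictions, and its fidelity to the black box indicates how far the extracted rules can be trusted.

```python
# Global surrogate sketch: a shallow, readable tree approximates an opaque model.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Train the surrogate on the black box's outputs, not on the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how well the surrogate mimics the black box on the same data.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity: {fidelity:.2f}")
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(5)]))
```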
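For visualizations, a partial dependence sketch (assuming scikit-learn's PartialDependenceDisplay is available): each panel averages the model's response as one or two features vary, which illustrates both the intuitiveness of these plots and the limit on how many features fit in a single view.

```python
# Partial dependence sketch: average model response over one feature,
# another feature, and their two-way interaction.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import PartialDependenceDisplay

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

PartialDependenceDisplay.from_estimator(model, X, features=[0, 1, (0, 1)])
plt.tight_layout()
plt.show()
```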