Published in final edited form in: Proceedings of NAACL 2024, pp. 7193–7210. doi: 10.18653/v1/2024.naacl-long.399

Figure 1: Inherently “interpretable” approaches to prediction.


Typically, ‘interpretable’ models trade off the expressiveness of their intermediate representations against how faithfully those representations reflect the model’s true decision process. Our approach (D) uses highly expressive intermediate representations, abstractive natural language evidence, while remaining fully transparent when aggregating that evidence. See Table 1 for more details.
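As a rough illustration of the pattern the caption describes (expressive natural-language evidence combined by a fully transparent aggregation step), here is a minimal sketch. The `Evidence` structure, its fields, and the weighted-vote `aggregate` function are illustrative assumptions, not the paper's actual method.

```python
# Sketch of "expressive evidence, transparent aggregation".
# All names here (Evidence, aggregate, the example weights) are
# hypothetical, chosen only to illustrate the caption's idea.

from dataclasses import dataclass
from typing import List


@dataclass
class Evidence:
    text: str      # abstractive natural-language evidence (human-readable)
    polarity: int  # +1 supports the positive class, -1 opposes it
    weight: float  # contribution of this piece of evidence, inspectable

def aggregate(evidence: List[Evidence]) -> float:
    """Transparent aggregation: a plain weighted sum of evidence
    polarities, so every step from evidence to decision is auditable."""
    return sum(e.polarity * e.weight for e in evidence)


evidence = [
    Evidence("The review praises the acting at length.", +1, 0.7),
    Evidence("The reviewer calls the plot 'incoherent'.", -1, 0.4),
]
score = aggregate(evidence)
print("prediction:", "positive" if score > 0 else "negative")
```

Because each intermediate item is free-form natural language, the representation stays expressive, while the simple sum keeps the aggregation step itself fully transparent, which is the combination panel (D) claims.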