PLoS ONE. 2019 Aug 20;14(8):e0221152. doi: 10.1371/journal.pone.0221152

Fig 1. Self-attention weights for the classification token of the trained BERT model for a sample post.

Each color represents a different attention head, and the lightness of the color represents the amount of attention. For instance, the figure indicates that nearly all attention heads focus heavily on the term ‘we’.
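A minimal sketch of how such per-head attention weights for the classification token can be extracted, assuming the HuggingFace `transformers` API and a generic `bert-base-uncased` checkpoint in place of the paper's fine-tuned model; the sample text is hypothetical:

```python
import torch
from transformers import BertTokenizer, BertModel

# Hypothetical sample post; the paper's actual data is not reproduced here.
text = "we are in this together"

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions holds one tensor per layer, shaped
# (batch, num_heads, seq_len, seq_len). Row 0 of the last layer is the
# attention paid by the [CLS] token -- the classification token shown in Fig 1.
cls_attention = outputs.attentions[-1][0, :, 0, :]  # (num_heads, seq_len)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for head, weights in enumerate(cls_attention):
    top = weights.argmax().item()
    print(f"head {head}: strongest attention on '{tokens[top]}' "
          f"({weights[top].item():.3f})")
```

Each row of `cls_attention` corresponds to one color in the figure, and the magnitude of each weight corresponds to the lightness of the highlighting over the matching token.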