
Fig 3. Embedding analysis of Waseem-dataset samples before and after fine-tuning.


To investigate the impact of the information encoded in different layers of BERT, sentence embeddings are extracted from all layers of the pre-trained BERT model, both before and after fine-tuning, using the bert-as-service tool. The 768-dimensional embedding vectors of all Waseem-dataset samples are projected into a two-dimensional space using PCA. For the sake of clarity, we only include visualizations of the first 4 layers (1-4), which are close to the word embeddings, and the last 4 layers (9-12), which are close to the training output, of the pre-trained BERT model before and after fine-tuning.
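
A minimal sketch of this extraction-and-projection pipeline is shown below; it is not the authors' exact code. It assumes a bert-as-service server is already running locally for one chosen layer, and the sentence list and class labels are hypothetical placeholders standing in for the Waseem samples.

```python
# Minimal sketch. Assumptions: a bert-as-service server is running locally,
# started for the layer of interest, e.g. the last layer:
#   bert-serving-start -model_dir /path/to/uncased_L-12_H-768_A-12 -pooling_layer -1
# The sentences and labels below are placeholders, not the real dataset.
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from bert_serving.client import BertClient

sentences = ["placeholder tweet one", "placeholder tweet two"]  # Waseem samples (placeholder)
labels = [0, 1]  # class ids, e.g. racism / sexism / neither (placeholder)

bc = BertClient()                  # connects to the running bert-as-service server
embeddings = bc.encode(sentences)  # array of shape (n_samples, 768)

# Project the 768-dimensional sentence embeddings onto their first two
# principal components for a two-dimensional scatter plot.
coords = PCA(n_components=2).fit_transform(embeddings)

plt.scatter(coords[:, 0], coords[:, 1], c=labels, s=5)
plt.title("Sentence embeddings of one BERT layer after PCA")
plt.show()
```

Repeating this with a different -pooling_layer value when starting the server (from -12 for layer 1 up to -1 for layer 12) would produce one such two-dimensional projection per layer, analogous to the per-layer panels in the figure.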