PLoS ONE. 2020 Apr 21;15(4):e0231189. doi: 10.1371/journal.pone.0231189

Fig 8. Word embeddings map words in a corpus of text to vector space.


Linear combinations of dimensions in vector space correlate with the semantic and syntactic roles of the words in the corpus. For illustration purposes, dimension d1 in the figure has a high positive correlation with living beings. A properly tuned word embedding model maps words with similar semantic or syntactic roles to adjacent regions in vector space. This property can be visualized through dimensionality reduction techniques such as t-SNE or PCA (see upper right quadrant of the figure). Cultural concepts also appear in vector space as consistent offsets between the vector representations of words sharing a particular relationship. For instance, in the bottom right of the figure, the dotted vector represents a gender regularity: a consistent offset pointing from masculine to feminine words.
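The offset property described above can be sketched with a toy example. The vectors and dimension meanings below are illustrative assumptions, not values from any trained model; the point is only that a consistent gender offset (woman − man) applied to one word's vector lands near its gendered counterpart, as measured by cosine similarity.

```python
import numpy as np

# Toy 3-dimensional "embeddings" (hand-picked illustrative values, not learned):
# dimension 0 loosely encodes royalty, dimension 1 gender, and
# dimension 2 "living being" (analogous to d1 in the figure).
vectors = {
    "king":  np.array([0.9,  0.8, 1.0]),
    "queen": np.array([0.9, -0.8, 1.0]),
    "man":   np.array([0.1,  0.8, 1.0]),
    "woman": np.array([0.1, -0.8, 1.0]),
    "rock":  np.array([0.0,  0.0, 0.1]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(v, exclude=()):
    """Word whose vector is most cosine-similar to v."""
    return max((w for w in vectors if w not in exclude),
               key=lambda w: cosine(v, vectors[w]))

# The gender offset (woman - man) applied to "king" lands near "queen".
analogy = vectors["king"] - vectors["man"] + vectors["woman"]
print(nearest(analogy, exclude={"king", "man", "woman"}))  # -> queen
```

In a real model the offsets are only approximately consistent, so the analogy query returns the nearest neighbor rather than an exact match.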