Author manuscript; available in PMC: 2019 Nov 1.

Published in final edited form as: J Biomed Inform. 2018 Sep 12;87:12–20. doi: 10.1016/j.jbi.2018.09.008

TABLE IX:

Pearson correlation coefficients between the similarity scores computed from word embeddings of different dimensions (d) and the scores assigned by human experts, on four datasets.

Dataset	EHR (d=20)	EHR (d=60)	EHR (d=100)	MedLit (d=20)	MedLit (d=60)	MedLit (d=100)	GloVe (d=50)	GloVe (d=100)	Google News (d=300)
Pedersen’s	0.390	0.542	0.632	0.304	0.569	0.363	0.334	0.403	0.357
Hliaoutakis’s	0.333	0.417	0.482	0.117	0.311	0.164	0.159	0.247	0.243
MayoSRS	0.192	0.296	0.412	0.177	0.300	0.154	0.001	0.082	0.084
UMNSRS	0.310	0.375	0.440	0.295	0.404	0.396	0.190	0.177	0.154
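
A correlation of the kind reported in each cell above could be computed along the following lines. This is a minimal sketch, not the authors' code: it assumes that the embedding-based similarity of a term pair is the cosine similarity of the two term vectors, and that pairs containing an out-of-vocabulary term are skipped. The names `cosine_similarity`, `pearson`, and `evaluate` are hypothetical, and the toy vectors and ratings are made up for illustration; they are not taken from any of the four datasets.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def evaluate(embeddings, rated_pairs):
    """Correlate embedding similarities with human ratings.

    embeddings:  dict mapping a term to its vector (list of floats).
    rated_pairs: iterable of (term_a, term_b, human_score).
    Pairs with a term missing from `embeddings` are skipped (an assumption).
    """
    model_scores, human_scores = [], []
    for term_a, term_b, human in rated_pairs:
        if term_a in embeddings and term_b in embeddings:
            model_scores.append(
                cosine_similarity(embeddings[term_a], embeddings[term_b])
            )
            human_scores.append(human)
    return pearson(model_scores, human_scores)


if __name__ == "__main__":
    # Made-up 3-dimensional vectors and ratings, for illustration only.
    toy_embeddings = {
        "heart": [0.9, 0.1, 0.0],
        "cardiac": [0.8, 0.2, 0.1],
        "knee": [0.1, 0.9, 0.2],
        "aspirin": [0.0, 0.2, 0.9],
    }
    toy_pairs = [
        ("heart", "cardiac", 4.0),
        ("heart", "knee", 2.0),
        ("heart", "aspirin", 1.5),
        ("knee", "aspirin", 1.0),
    ]
    print(round(evaluate(toy_embeddings, toy_pairs), 3))
```

Running `evaluate` once per embedding model and per dataset would fill one cell of the table at a time; the dimension d enters only through the length of the vectors stored in `embeddings`.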