2018 Jun 27;34(13):i565–i573. doi: 10.1093/bioinformatics/bty273

Table 1.

Precision, Recall and F1 scores using strict tokenwise evaluation for toponym detection where the NER was trained on D_train and tested on D_test

Configuration	Word embedding	P	R	F1
FFNN 1-layer	No pre-training	0.97	0.65	0.779
FFNN 1-layer	Glove	0.89	0.87	0.883
FFNN 1-layer	Wiki-pm-pmc	0.92	0.82	0.878
FFNN 2-layers	Glove	0.92	0.86	0.891
FFNN 2-layers	Wiki-pm-pmc	0.93	0.88	0.906
FFNN 2-layers + features	Glove	0.94	0.87	0.903
FFNN 2-layers + features	Wiki-pm-pmc	0.96	0.86	0.910
Random forest + features	Wiki-pm-pmc	0.82	0.91	0.862
SVM + features	Wiki-pm-pmc	0.83	0.92	0.875

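The highest score for each performance measure (shown in bold in the original table) is P = 0.97 (FFNN 1-layer, no pre-training), R = 0.92 (SVM + features) and F1 = 0.910 (FFNN 2-layers + features, Wiki-pm-pmc).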
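
For readers who want to reproduce scores of this kind, the following is a minimal sketch of token-wise precision, recall and F1, not the authors' evaluation script. It assumes that "strict tokenwise" means every token is compared individually against its gold label, with 1 marking a toponym token and 0 any other token; the function names `tokenwise_prf` and `f1_from_pr` and the toy label sequences are hypothetical.

```python
def tokenwise_prf(gold, pred):
    """Return (precision, recall, F1) for two equal-length 0/1 label sequences.

    Assumption: 1 = token belongs to a toponym, 0 = any other token.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    # Count token-level true positives, false positives and false negatives.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_from_pr(precision, recall)


def f1_from_pr(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up labels, not data from the paper).
gold = [0, 1, 1, 0, 0, 1, 0, 0]
pred = [0, 1, 0, 0, 1, 1, 0, 0]
p, r, f1 = tokenwise_prf(gold, pred)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

Recomputing F1 from the two-decimal P and R values in the table, for example `f1_from_pr(0.96, 0.86)`, only approximates the reported F1, presumably because the published P and R are rounded.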