2020 Jun 11;12223:3–22. doi: 10.1007/978-3-030-52683-2_1

Table 6.

Granular test results from the model with case features and without a lexicon. Scores are reported for each possible label. Label count is the number of instances of a label in the test set, and Prediction count is the number of predictions the model produces for that label.

Label	F1	Recall	Precision	Prediction count	Label count
B-versionEndIncluding	0.7817	0.7817	0.7817	875	875
B-version	0.8573	0.8618	0.8527	2655	2627
B-versionStartIncluding	0.7415	0.7238	0.76	100	105
B-product	0.8711	0.8774	0.8649	4840	4771
O	0.9935	0.9931	0.9938	184649	184768
B-versionEndExcluding	0.7987	0.7922	0.8053	303	308
B-vendor	0.9126	0.8951	0.9308	2715	2823
I-version	0.4396	0.3509	0.5882	34	57
B-versionStartExcluding	0	0	0	2	1
I-product	0.8549	0.8812	0.8302	3787	3568
I-vendor	0.5714	0.5	0.6667	111	148
I-versionEndExcluding	0	0	0	0	1
I-versionEndIncluding	0.2581	0.16	0.6667	6	25
I-versionStartExcluding	0	0	0	0	0
I-versionStartIncluding	0	0	0	0	0
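
The columns above can be related to one another with a short sketch. It assumes token-level scoring, where a prediction counts as correct only when it equals the gold label at the same position, and it assumes that a score with a zero denominator is reported as 0, which is consistent with the all-zero rows but is not stated in the source. The function name `per_label_scores` and the toy sequences are hypothetical and are not taken from the paper's evaluation code.

```python
from collections import Counter


def per_label_scores(gold, predicted):
    """Compute per-label precision, recall, F1 and the two count columns.

    gold and predicted are equal-length lists of label strings.
    Assumption: any score whose denominator is zero is set to 0.0.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    label_count = Counter(gold)            # instances of each label in the test set
    prediction_count = Counter(predicted)  # times the model emitted each label
    correct = Counter(g for g, p in zip(gold, predicted) if g == p)

    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        tp = correct[label]
        n_pred = prediction_count[label]
        n_gold = label_count[label]
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold if n_gold else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {
            "f1": f1,
            "recall": recall,
            "precision": precision,
            "prediction_count": n_pred,
            "label_count": n_gold,
        }
    return scores


# Toy sequences (hypothetical, for illustration only).
gold = ["B-vendor", "B-product", "I-product", "O", "B-version", "O"]
predicted = ["B-vendor", "B-product", "O", "O", "B-version", "B-version"]

scores = per_label_scores(gold, predicted)
for label, row in scores.items():
    print(label, row)
```

In the toy data, `I-product` occurs once in the gold labels but is never predicted, so it gets a prediction count of 0, a label count of 1, and scores of 0, the same pattern as the I-versionEndExcluding row. Under the same formula, the reported I-vendor precision (0.6667) and recall (0.5) give an F1 of about 0.5714, which matches the table.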