Table 10. COSMIC variant extraction coverage when only articles that have variants in the reference set and at least one result identified by the information extraction tool are considered.
The table shows the number of common articles (PMIDs), the number of variants in the reference set (Total), the number of variants matched by the mutation extraction tool (Matched), the proportion of matched variants (Recall), the number of variants matched when the gene is not considered (M NG) and the proportion of matched variants when the gene is not considered (Rec NG). The data sets considered are MEDLINE abstracts (medline), Open Access PMC articles (pmc.ft), PDF articles when no Open Access PMC articles are available (pdf), the PDF representation of all the articles (pdf.all), tables available from the Open Access PMC articles' XML (table), supplementary material (sup) and the combination of all the sources (all). The tools are Extractor of Mutations (EMU), OpenMutationMiner (OMM), MutationFinder (MF), tmVar and SNP Extraction Tool for Human Variations (SETH). Rows with the tool value All show the result when the variants extracted by all the tools are merged.
Data set | Tool | PMIDs | Total | Matched | Recall | M NG | Rec NG |
---|---|---|---|---|---|---|---|
medline | EMU | 128 | 627 | 146 | 0.2329 | 157 | 0.2504 |
medline | OMM | 109 | 483 | 140 | 0.2899 | 147 | 0.3043 |
medline | MF | 104 | 498 | 126 | 0.2530 | 137 | 0.2751 |
medline | SETH | 16 | 80 | 25 | 0.3125 | 26 | 0.3250 |
medline | tmVar | 114 | 515 | 139 | 0.2699 | 145 | 0.2816 |
medline | All | 137 | 676 | 156 | 0.2308 | 169 | 0.2500 |
pmc.ft | EMU | 339 | 31742 | 726 | 0.0229 | 758 | 0.0239 |
pmc.ft | OMM | 299 | 31405 | 697 | 0.0222 | 726 | 0.0231 |
pmc.ft | MF | 282 | 30415 | 632 | 0.0208 | 658 | 0.0216 |
pmc.ft | SETH | 67 | 1425 | 141 | 0.0989 | 148 | 0.1039 |
pmc.ft | tmVar | 308 | 31391 | 655 | 0.0209 | 682 | 0.0217 |
pmc.ft | All | 351 | 31848 | 814 | 0.0256 | 853 | 0.0268 |
pdf | EMU | 61 | 474 | 34 | 0.0717 | 47 | 0.0992 |
pdf | OMM | 8 | 64 | 1 | 0.0156 | 1 | 0.0156 |
pdf | MF | 16 | 131 | 5 | 0.0382 | 6 | 0.0458 |
pdf | SETH | 2 | 10 | 6 | 0.6000 | 6 | 0.6000 |
pdf | tmVar | 32 | 294 | 4 | 0.0136 | 5 | 0.0170 |
pdf | All | 64 | 505 | 34 | 0.0673 | 47 | 0.0931 |
pdf.all | EMU | 439 | 33134 | 1094 | 0.0330 | 1114 | 0.0336 |
pdf.all | OMM | 341 | 32428 | 1132 | 0.0349 | 1137 | 0.0351 |
pdf.all | MF | 330 | 31774 | 989 | 0.0311 | 996 | 0.0313 |
pdf.all | SETH | 83 | 1555 | 246 | 0.1582 | 247 | 0.1588 |
pdf.all | tmVar | 379 | 32745 | 1049 | 0.0320 | 1060 | 0.0324 |
pdf.all | All | 446 | 33295 | 1304 | 0.0392 | 1327 | 0.0399 |
table | EMU | 197 | 1946 | 580 | 0.2980 | 681 | 0.3499 |
table | OMM | 166 | 1765 | 597 | 0.3382 | 699 | 0.3960 |
table | MF | 146 | 1505 | 462 | 0.3070 | 564 | 0.3748 |
table | SETH | 38 | 509 | 179 | 0.3517 | 207 | 0.4067 |
table | tmVar | 90 | 1128 | 176 | 0.1560 | 233 | 0.2066 |
table | All | 211 | 2019 | 694 | 0.3437 | 831 | 0.4116 |
sup | EMU | 77 | 27888 | 19177 | 0.6876 | 19217 | 0.6891 |
sup | OMM | 86 | 31156 | 20054 | 0.6437 | 20116 | 0.6457 |
sup | MF | 76 | 30897 | 1286 | 0.0416 | 1308 | 0.0423 |
sup | SETH | 37 | 28088 | 21052 | 0.7495 | 21089 | 0.7508 |
sup | tmVar | 73 | 30285 | 7763 | 0.2563 | 7782 | 0.2570 |
sup | All | 106 | 31564 | 22756 | 0.7209 | 22829 | 0.7233 |
all | EMU | 450 | 33409 | 20203 | 0.6047 | 20284 | 0.6071 |
all | OMM | 353 | 32887 | 20960 | 0.6373 | 21040 | 0.6398 |
all | MF | 344 | 32623 | 2087 | 0.0640 | 2133 | 0.0654 |
all | SETH | 109 | 29047 | 21335 | 0.7345 | 21379 | 0.7360 |
all | tmVar | 388 | 33008 | 8724 | 0.2643 | 8762 | 0.2655 |
all | All | 458 | 33676 | 23859 | 0.7085 | 23969 | 0.7118 |
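
As a reading aid, the Recall and Rec NG columns can be reproduced from the count columns. The sketch below assumes that both proportions use the Total column as the denominator; this is inferred from the table values and is not stated explicitly in the caption. The `recall` helper is a hypothetical name introduced only for illustration.

```python
def recall(matched: int, total: int) -> float:
    """Proportion of reference-set variants that were matched."""
    if total <= 0:
        raise ValueError("total must be positive")
    return matched / total


# (Matched, Total, M NG) copied from the medline rows of the table.
rows = {
    "EMU": (146, 627, 157),
    "SETH": (25, 80, 26),
}

for tool, (matched, total, matched_ng) in rows.items():
    # Assumption: Rec NG divides M NG by the same Total as Recall.
    print(
        f"{tool}: Recall={recall(matched, total):.4f}, "
        f"Rec NG={recall(matched_ng, total):.4f}"
    )
```

For these two rows the computed values agree with the table to four decimal places (for example, 146 / 627 rounds to 0.2329).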