Table 10. COSMIC variant extraction coverage, restricted to articles that have variants in the reference set and at least one result identified by the information extraction tool.
The table shows the number of common articles (PMIDs), the number of variants in the reference set (Total), the number of variants matched by the mutation extraction tool (Matched), the proportion of matched variants (Recall), the number of variants matched when the gene is not considered (M NG) and the proportion of matched variants when the gene is not considered (Rec NG). The data sets considered are MEDLINE abstracts (medline), Open Access PMC articles (pmc.ft), PDF articles when no Open Access PMC article is available (pdf), the PDF representation of all the articles (pdf.all), tables available from the Open Access PMC articles’ XML (table), supplementary material (sup) and the combination of all the sources (all). The tools are Extractor of Mutations (EMU), OpenMutationMiner (OMM), MutationFinder (MF), tmVar and SNP Extraction Tool for Human Variations (SETH). Rows with the tool value All give the result when the variants extracted by all the tools are merged.
| Data set | Tool | PMIDs | Total | Matched | Recall | M NG | Rec NG | 
|---|---|---|---|---|---|---|---|
| medline | EMU | 128 | 627 | 146 | 0.2329 | 157 | 0.2504 |
| medline | OMM | 109 | 483 | 140 | 0.2899 | 147 | 0.3043 |
| medline | MF | 104 | 498 | 126 | 0.2530 | 137 | 0.2751 |
| medline | SETH | 16 | 80 | 25 | 0.3125 | 26 | 0.3250 |
| medline | tmVar | 114 | 515 | 139 | 0.2699 | 145 | 0.2816 |
| medline | All | 137 | 676 | 156 | 0.2308 | 169 | 0.2500 |
| pmc.ft | EMU | 339 | 31742 | 726 | 0.0229 | 758 | 0.0239 |
| pmc.ft | OMM | 299 | 31405 | 697 | 0.0222 | 726 | 0.0231 |
| pmc.ft | MF | 282 | 30415 | 632 | 0.0208 | 658 | 0.0216 |
| pmc.ft | SETH | 67 | 1425 | 141 | 0.0989 | 148 | 0.1039 |
| pmc.ft | tmVar | 308 | 31391 | 655 | 0.0209 | 682 | 0.0217 |
| pmc.ft | All | 351 | 31848 | 814 | 0.0256 | 853 | 0.0268 |
| pdf | EMU | 61 | 474 | 34 | 0.0717 | 47 | 0.0992 |
| pdf | OMM | 8 | 64 | 1 | 0.0156 | 1 | 0.0156 |
| pdf | MF | 16 | 131 | 5 | 0.0382 | 6 | 0.0458 |
| pdf | SETH | 2 | 10 | 6 | 0.6000 | 6 | 0.6000 |
| pdf | tmVar | 32 | 294 | 4 | 0.0136 | 5 | 0.0170 |
| pdf | All | 64 | 505 | 34 | 0.0673 | 47 | 0.0931 |
| pdf.all | EMU | 439 | 33134 | 1094 | 0.0330 | 1114 | 0.0336 |
| pdf.all | OMM | 341 | 32428 | 1132 | 0.0349 | 1137 | 0.0351 |
| pdf.all | MF | 330 | 31774 | 989 | 0.0311 | 996 | 0.0313 |
| pdf.all | SETH | 83 | 1555 | 246 | 0.1582 | 247 | 0.1588 |
| pdf.all | tmVar | 379 | 32745 | 1049 | 0.0320 | 1060 | 0.0324 |
| pdf.all | All | 446 | 33295 | 1304 | 0.0392 | 1327 | 0.0399 |
| table | EMU | 197 | 1946 | 580 | 0.2980 | 681 | 0.3499 |
| table | OMM | 166 | 1765 | 597 | 0.3382 | 699 | 0.3960 |
| table | MF | 146 | 1505 | 462 | 0.3070 | 564 | 0.3748 |
| table | SETH | 38 | 509 | 179 | 0.3517 | 207 | 0.4067 |
| table | tmVar | 90 | 1128 | 176 | 0.1560 | 233 | 0.2066 |
| table | All | 211 | 2019 | 694 | 0.3437 | 831 | 0.4116 |
| sup | EMU | 77 | 27888 | 19177 | 0.6876 | 19217 | 0.6891 |
| sup | OMM | 86 | 31156 | 20054 | 0.6437 | 20116 | 0.6457 |
| sup | MF | 76 | 30897 | 1286 | 0.0416 | 1308 | 0.0423 |
| sup | SETH | 37 | 28088 | 21052 | 0.7495 | 21089 | 0.7508 |
| sup | tmVar | 73 | 30285 | 7763 | 0.2563 | 7782 | 0.2570 |
| sup | All | 106 | 31564 | 22756 | 0.7209 | 22829 | 0.7233 |
| all | EMU | 450 | 33409 | 20203 | 0.6047 | 20284 | 0.6071 |
| all | OMM | 353 | 32887 | 20960 | 0.6373 | 21040 | 0.6398 |
| all | MF | 344 | 32623 | 2087 | 0.0640 | 2133 | 0.0654 |
| all | SETH | 109 | 29047 | 21335 | 0.7345 | 21379 | 0.7360 |
| all | tmVar | 388 | 33008 | 8724 | 0.2643 | 8762 | 0.2655 |
| all | All | 458 | 33676 | 23859 | 0.7085 | 23969 | 0.7118 |
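
The Recall and Rec NG columns can be reproduced from the counts in the table. The following is a minimal sketch, assuming that each proportion is the matched count divided by Total and rounded to four decimal places; the function name `recall` is hypothetical and is not taken from the original evaluation code.

```python
def recall(matched: int, total: int) -> float:
    """Proportion of reference-set variants that were matched.

    Assumption: the table reports matched / total rounded to four
    decimal places. The function name is illustrative only.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of variants")
    return round(matched / total, 4)


# medline / EMU row: Matched = 146, M NG = 157, Total = 627
medline_emu_recall = recall(146, 627)
medline_emu_recall_ng = recall(157, 627)

# all / All row: Matched = 23859, M NG = 23969, Total = 33676
all_merged_recall = recall(23859, 33676)
all_merged_recall_ng = recall(23969, 33676)
```

Under this assumption, the four values agree with the corresponding Recall and Rec NG entries in the medline/EMU and all/All rows.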