Table 9. InSiGHT variant extraction coverage result.
The table shows the number of variants in the reference set (Total), the number of those variants matched by the mutation extraction tool (Matched), and the proportion of matched variants (Recall = Matched / Total). Relaxing the gene matching (M NG) does not change the results, so those data are not shown. The data sets considered are MEDLINE abstracts (medline), Open Access PMC articles (pmc.ft), PDF articles when no Open Access PMC article is available (pdf), the PDF representation of all the articles (pdf.all), tables available from the Open Access PMC articles’ XML (table), supplementary material (sup), and the combination of all the sources (all). The tools are Extractor of Mutations (EMU), OpenMutationMiner (OMM), MutationFinder (MF), tmVar, and SNP Extraction Tool for Human Variations (SETH). Rows with the tool value All give the result obtained when the variants extracted by all the tools are merged.
Data set | Tool | Total | Matched | Recall |
---|---|---|---|---|
medline | EMU | 252 | 1 | 0.0040 |
medline | OMM | 252 | 4 | 0.0159 |
medline | MF | 252 | 0 | 0.0000 |
medline | SETH | 252 | 1 | 0.0040 |
medline | tmVar | 252 | 4 | 0.0159 |
medline | All | 252 | 5 | 0.0198 |
pmc.ft | EMU | 252 | 23 | 0.0913 |
pmc.ft | OMM | 252 | 5 | 0.0198 |
pmc.ft | MF | 252 | 1 | 0.0040 |
pmc.ft | SETH | 252 | 3 | 0.0119 |
pmc.ft | tmVar | 252 | 22 | 0.0873 |
pmc.ft | All | 252 | 26 | 0.1032 |
pdf | EMU | 252 | 7 | 0.0278 |
pdf | OMM | 252 | 7 | 0.0278 |
pdf | MF | 252 | 7 | 0.0278 |
pdf | SETH | 252 | 0 | 0.0000 |
pdf | tmVar | 252 | 7 | 0.0278 |
pdf | All | 252 | 8 | 0.0317 |
pdf.all | EMU | 252 | 41 | 0.1627 |
pdf.all | OMM | 252 | 43 | 0.1706 |
pdf.all | MF | 252 | 13 | 0.0516 |
pdf.all | SETH | 252 | 33 | 0.1310 |
pdf.all | tmVar | 252 | 64 | 0.2540 |
pdf.all | All | 252 | 85 | 0.3373 |
table | EMU | 252 | 39 | 0.1548 |
table | OMM | 252 | 31 | 0.1230 |
table | MF | 252 | 1 | 0.0040 |
table | SETH | 252 | 36 | 0.1429 |
table | tmVar | 252 | 30 | 0.1190 |
table | All | 252 | 74 | 0.2937 |
sup | EMU | 252 | 88 | 0.3492 |
sup | OMM | 252 | 0 | 0.0000 |
sup | MF | 252 | 0 | 0.0000 |
sup | SETH | 252 | 0 | 0.0000 |
sup | tmVar | 252 | 92 | 0.3651 |
sup | All | 252 | 92 | 0.3651 |
all | All | 252 | 157 | 0.6230 |
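
The relationship between the columns can be checked with a short sketch. It assumes that Recall is simply Matched divided by Total, rounded to four decimals, which is consistent with the values shown. It also assumes that the merged All rows are a set union of the variants found by each tool, so their Matched count should lie between the best single tool and the sum of all the tools. The names `recall` and `union_bounds` are illustrative and do not belong to any of the tools; the counts are copied from the table.

```python
TOTAL = 252  # size of the reference set (Total column)

# Matched counts copied from the table:
# per-tool counts in the order (EMU, OMM, MF, SETH, tmVar), then the merged "All" count.
MATCHED = {
    "medline": ((1, 4, 0, 1, 4), 5),
    "pmc.ft": ((23, 5, 1, 3, 22), 26),
    "pdf": ((7, 7, 7, 0, 7), 8),
    "pdf.all": ((41, 43, 13, 33, 64), 85),
    "table": ((39, 31, 1, 36, 30), 74),
    "sup": ((88, 0, 0, 0, 92), 92),
}


def recall(matched: int, total: int = TOTAL) -> float:
    """Proportion of reference variants that were matched, rounded to 4 decimals."""
    return round(matched / total, 4)


def union_bounds(per_tool, total: int = TOTAL):
    """Bounds on a merged (union) count: at least the best tool, at most the sum of all tools."""
    return max(per_tool), min(sum(per_tool), total)


for name, (per_tool, merged) in MATCHED.items():
    low, high = union_bounds(per_tool)
    # Assumption: the "All" row is a union, so it must fall inside these bounds.
    assert low <= merged <= high
    print(f"{name:8s} All: matched={merged:3d} recall={recall(merged):.4f}")
```

For the sup data set the merged count equals the tmVar count (92), which under the union assumption would mean that every variant EMU matched there was also matched by tmVar.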