Table 11. InSiGHT variant extraction coverage, restricted to articles that have variants in the reference set and for which the information extraction tool identified at least one result.
The table shows the number of common articles (PMIDs), the number of variants in the reference set (Total), the number of those variants matched by the mutation extraction tool (Matched) and the proportion of matched variants (Recall = Matched / Total). Relaxing the gene matching (M NG) does not change the results, so those data are not shown. The data sets considered are MEDLINE abstracts (medline), Open Access PMC articles (pmc.ft), PDF articles when no Open Access PMC article is available (pdf), the PDF representation of all articles (pdf.all), tables available from the XML of Open Access PMC articles (table), supplementary material (sup) and the combination of all sources (all). The tools are Extractor of Mutations (EMU), OpenMutationMiner (OMM), MutationFinder (MF), tmVar and SNP Extraction Tool for Human Variations (SETH). Rows with the tool value All give the result when the variants extracted by all the tools are merged.
Data set | Tool | PMIDs | Total | Matched | Recall |
---|---|---|---|---|---|
medline | EMU | 2 | 2 | 1 | 0.5000 |
medline | OMM | 2 | 16 | 4 | 0.2500 |
medline | MF | 0 | 0 | 0 | 0.0000 |
medline | SETH | 1 | 1 | 1 | 1.0000 |
medline | tmVar | 2 | 16 | 4 | 0.2500 |
medline | All | 3 | 17 | 5 | 0.2941 |
pmc.ft | EMU | 8 | 179 | 23 | 0.1285 |
pmc.ft | OMM | 4 | 148 | 5 | 0.0338 |
pmc.ft | MF | 2 | 132 | 1 | 0.0076 |
pmc.ft | SETH | 2 | 56 | 3 | 0.0536 |
pmc.ft | tmVar | 6 | 149 | 22 | 0.1477 |
pmc.ft | All | 9 | 234 | 26 | 0.1111 |
pdf | EMU | 3 | 18 | 7 | 0.3889 |
pdf | OMM | 2 | 14 | 7 | 0.5000 |
pdf | MF | 2 | 14 | 7 | 0.5000 |
pdf | SETH | 0 | 0 | 0 | 0.0000 |
pdf | tmVar | 3 | 18 | 7 | 0.3889 |
pdf | All | 3 | 18 | 8 | 0.4444 |
pdf.all | EMU | 11 | 251 | 41 | 0.1633 |
pdf.all | OMM | 8 | 241 | 43 | 0.1784 |
pdf.all | MF | 6 | 185 | 13 | 0.0703 |
pdf.all | SETH | 2 | 56 | 33 | 0.5893 |
pdf.all | tmVar | 11 | 251 | 64 | 0.2550 |
pdf.all | All | 11 | 251 | 85 | 0.3386 |
table | EMU | 4 | 197 | 39 | 0.1980 |
table | OMM | 4 | 197 | 31 | 0.1574 |
table | MF | 3 | 142 | 1 | 0.0070 |
table | SETH | 1 | 55 | 36 | 0.6545 |
table | tmVar | 2 | 118 | 30 | 0.2542 |
table | All | 4 | 197 | 74 | 0.3756 |
sup | EMU | 1 | 103 | 88 | 0.8544 |
sup | OMM | 1 | 103 | 0 | 0.0000 |
sup | MF | 1 | 103 | 0 | 0.0000 |
sup | SETH | 0 | 0 | 0 | 0.0000 |
sup | tmVar | 1 | 103 | 92 | 0.8932 |
sup | All | 1 | 103 | 92 | 0.8932 |
all | EMU | 12 | 252 | 127 | 0.5040 |
all | OMM | 8 | 241 | 43 | 0.1784 |
all | MF | 6 | 185 | 13 | 0.0703 |
all | SETH | 2 | 56 | 37 | 0.6607 |
all | tmVar | 11 | 251 | 131 | 0.5219 |
all | All | 12 | 252 | 157 | 0.6230 |
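
The Recall column follows from Matched and Total, and the All rows describe a merge of the variants found by every tool. The minimal Python sketch below illustrates both steps. It is not the authors' code: the function names, the exact-string matching of variants and the example variant strings are assumptions made for illustration only. Returning 0.0 when Total is 0 mirrors the rows in which a tool had no common articles.

```python
def recall(matched: int, total: int) -> float:
    """Proportion of reference variants that were matched.

    Returns 0.0 when there are no reference variants, mirroring the
    table rows where PMIDs and Total are both 0.
    """
    return matched / total if total else 0.0


def merged_recall(reference: set, per_tool: dict) -> float:
    """Recall after merging (set union) the variants from all tools.

    Assumption: variants are compared as exact strings; the matching
    used in the source may involve normalisation not shown here.
    """
    merged = set().union(*per_tool.values()) if per_tool else set()
    return recall(len(merged & reference), len(reference))


# (Matched, Total) pairs taken from the "All" rows of the table.
all_rows = {"medline": (5, 17), "pmc.ft": (26, 234), "all": (157, 252)}
for data_set, (matched, total) in all_rows.items():
    print(f"{data_set}: {recall(matched, total):.4f}")

# Hypothetical variants, for illustration only (not from the reference set).
reference = {"c.1A>G", "c.2T>C", "c.3G>A", "c.4C>T"}
per_tool = {
    "toolA": {"c.1A>G", "c.9A>T"},
    "toolB": {"c.1A>G", "c.2T>C"},
}
print(f"merged: {merged_recall(reference, per_tool):.4f}")
```

Because the merge is a union, the merged recall on a fixed reference set can never fall below that of any single tool. In the table, the All row of a data set may also cover more articles than an individual tool row, so its Total can differ from theirs.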