. 2010 Feb 11;11:85. doi: 10.1186/1471-2105-11-85

Table 4.

Performance evaluation of LINNAEUS species tagging on different evaluation sets

Set                 Level    Main set           TP         FP         FN         Recall    Prec.
NCBI taxonomy       Doc.     MEDLINE            6,888      10,032     (1,807)    0.7922    (0.4071)
NCBI taxonomy       Doc.     PMC OA abs         15         20         (6)        0.7143    (0.4286)
NCBI taxonomy       Doc.     PMC OA full (abs)  16         166        (3)        0.8421    (0.0791)
NCBI taxonomy       Doc.     PMC OA full (all)  22         196        (4)        0.8462    (0.1010)

MeSH                Doc.     MEDLINE            5,073,147  4,577,293  2,315,811  0.6866    0.5257
MeSH                Doc.     PMC OA abs         36,641     49,151     (14,797)   0.7123    (0.4271)
MeSH                Doc.     PMC OA full (abs)  46,484     291,872    (2,219)    0.9544    (0.1374)
MeSH                Doc.     PMC OA full (all)  54,814     346,071    (2,880)    0.9201    (0.1367)

Entrez gene         Doc.     MEDLINE            346,989    171,001    (139,702)  0.7130    (0.6699)
Entrez gene         Doc.     PMC OA abs         6,946      4,110      (2,357)    0.7466    (0.6283)
Entrez gene         Doc.     PMC OA full (abs)  8,184      38,275     (470)      0.9457    (0.1762)
Entrez gene         Doc.     PMC OA full (all)  9,662      42,209     (628)      0.9390    (0.1863)

EMBL                Doc.     MEDLINE            158,462    183,950    (235,745)  0.4020    (0.4627)
EMBL                Doc.     PMC OA abs         4,807      4,360      (7,902)    0.3782    (0.5244)
EMBL                Doc.     PMC OA full (abs)  6,601      34,447     (3,859)    0.6311    (0.1608)
EMBL                Doc.     PMC OA full (all)  9,433      40,212     (5,613)    0.6269    (0.1900)

PMC linkouts        Doc.     MEDLINE            (27,259)   (23,377)   (122,596)  (0.1819)  (0.5383)
PMC linkouts        Doc.     PMC OA abs         (30,315)   (27,192)   (141,735)  (0.1762)  (0.5272)
PMC linkouts        Doc.     PMC OA full (abs)  110,288    156,012    61,656     0.6414    0.4141
PMC linkouts        Doc.     PMC OA full (all)  112,069    163,052    61,671     0.6450    0.4073

Whatizit-Organisms  Doc.     PMC OA abs         64,686     29,222     12,930     0.8334    0.6888
Whatizit-Organisms  Doc.     PMC OA full (abs)  308,410    67,171     100,079    0.7550    0.8211
Whatizit-Organisms  Doc.     PMC OA full (all)  344,445    73,489     109,668    0.7585    0.8242

Whatizit-Organisms  Mention  PMC OA abs         139,077    147,426    39,351     0.7794    0.4854
Whatizit-Organisms  Mention  PMC OA full (xml)  1,164,799  1,596,615  527,284    0.6883    0.4218
Whatizit-Organisms  Mention  PMC OA full (all)  1,304,620  2,398,321  1,133,018  0.5352    0.3523

Manual              Doc.     PMC OA abs         101        0          3          0.9712    1.0
Manual              Doc.     PMC OA full (abs)  421        46         9          0.9791    0.9015
Manual              Doc.     PMC OA full (all)  462        49         9          0.9809    0.9041

Manual              Mention  PMC OA abs         326        3          19         0.9449    0.9909
Manual              Mention  PMC OA full (xml)  3,190      92         222        0.9350    0.9720
Manual              Mention  PMC OA full (all)  3,973      120        241        0.9428    0.9707

Values in parentheses are for comparisons between document sets of different types (for example, evaluation tag sets based on full text compared against species tags generated on abstracts) or for cases where the evaluation set is likely to exclude a large number of species mentions. PMC OA full (all) shows accuracy for all full-text documents. PMC OA full (abs) shows accuracy for all full-text documents with an abstract that can be extracted, allowing comparison of document-level accuracy between full text and abstracts. PMC OA full (xml) shows accuracy for all full-text documents with an XML abstract, allowing comparison of mention-level accuracy between full text and abstracts.
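
The Recall and Prec. columns can be checked against the TP, FP and FN counts. The sketch below assumes the standard definitions, recall = TP / (TP + FN) and precision = TP / (TP + FP). The table does not state these formulas, but the listed values are consistent with them. The function names and the two sample rows (MeSH on MEDLINE, and the manual set at document level on PMC OA abstracts) are chosen here purely for illustration.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of the evaluation set's tags that the tagger recovered."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of the tagger's output that also appears in the evaluation set."""
    return tp / (tp + fp)


# (TP, FP, FN) counts copied from two rows of the table; labels are illustrative.
rows = {
    "MeSH, Doc., MEDLINE": (5_073_147, 4_577_293, 2_315_811),
    "Manual, Doc., PMC OA abs": (101, 0, 3),
}

for label, (tp, fp, fn) in rows.items():
    print(f"{label}: recall={recall(tp, fn):.4f}, precision={precision(tp, fp):.4f}")
```

The same two functions apply to the parenthesized rows. There the caption's caveat governs how to read the result, not how to compute it.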