2006 Sep 26;7(Suppl 2):S3. doi: 10.1186/1471-2105-7-S2-S3

Table 3.

Scalability of MM term evaluation for chemical names when applied to a large corpus, in this case approximately 13.1 million MEDLINE records that contain approximately 7.4 million abstracts. Using these estimates, the overall precision for chemical term entry into the database is 82.7%.

| Cutoff | Sample 1 FP | Sample 2 FP | Sample 3 FP | Avg. Precision | Stdev | # Records | Errors (est.) |
|---|---|---|---|---|---|---|---|
| 1–2 | 42% | 46% | 48% | 54.7% | 3.1% | 203,985 | 92,473 |
| 2–5 | 27% | 25% | 22% | 75.3% | 2.5% | 319,000 | 78,687 |
| 5–10 | 5% | 3% | 5% | 95.7% | 1.2% | 202,655 | 8,782 |
| 10–20 | 2% | 0% | 1% | 99.0% | 1.0% | 164,286 | 1,643 |
| 21+ | 0% | 0% | 0% | 100.0% | 0.0% | 162,728 | - |
| Weighted Average | | | | 82.7% | Total | 1,052,654 | 181,584 |
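
The derived columns can be recomputed from the three false-positive (FP) samples and the record count in each cutoff bin. The following is a minimal sketch, not the authors' code. It assumes that average precision is 100% minus the mean FP rate, that "Stdev" is the sample standard deviation of the three FP percentages, and that estimated errors are the mean FP rate multiplied by the number of records. The names `ROWS` and `summarize` are hypothetical.

```python
from statistics import mean, stdev

# (cutoff label, FP % in samples 1-3, number of records), copied from Table 3
ROWS = [
    ("1-2", (42, 46, 48), 203_985),
    ("2-5", (27, 25, 22), 319_000),
    ("5-10", (5, 3, 5), 202_655),
    ("10-20", (2, 0, 1), 164_286),
    ("21+", (0, 0, 0), 162_728),
]


def summarize(rows):
    """Recompute per-bin precision, spread, and estimated errors,
    plus the record-weighted overall precision (all in percent)."""
    per_bin = []
    total_records = 0
    total_errors = 0.0
    for label, fp_samples, n_records in rows:
        fp_rate = mean(fp_samples) / 100      # mean FP rate as a fraction
        errors = fp_rate * n_records          # assumed: errors scale with bin size
        precision = 100 * (1 - fp_rate)       # assumed: precision = 1 - FP rate
        spread = stdev(fp_samples)            # assumed: sample stdev (n - 1)
        per_bin.append((label, precision, spread, errors))
        total_records += n_records
        total_errors += errors
    overall = 100 * (1 - total_errors / total_records)
    return per_bin, total_records, total_errors, overall


if __name__ == "__main__":
    per_bin, total_records, total_errors, overall = summarize(ROWS)
    for label, precision, spread, errors in per_bin:
        print(f"{label:>6}  {precision:5.1f}%  {spread:3.1f}%  {errors:9.0f}")
    print(f"records={total_records:,} errors={total_errors:,.0f} overall={overall:.1f}%")
```

Under these assumptions the sketch reproduces the tabulated totals. The weighted average is weighted by the number of records in each bin, which is why the large, low-precision 1–2 and 2–5 bins pull the overall figure well below the precision of the higher cutoffs.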