2011 Jun 2;12:223. doi: 10.1186/1471-2105-12-223

Table 7.

Overall accuracy on the data set

Data set                    NB      AEC     JDI     MRD     2-MRD
Abbreviation Set            0.9716  0.9090  -       0.8759  0.8501
Abbreviation Subset         0.9760  0.9218  0.6725  0.8838  0.8725

Term Set                    0.8980  0.7462  -       0.7148  0.6773
Term Subset                 0.8991  0.7448  0.6209  0.7132  0.6609

Term/Abbreviation Set       0.9384  0.8879  -       0.8801  0.9356
Term/Abbreviation Subset    0.9360  0.9026  0.6899  0.8715  0.9350

Overall MSH WSD Set         0.9386  0.8383  -       0.8070  0.7799
Overall MSH WSD Subset      0.9413  0.8448  0.6551  0.8118  0.7837

NLM WSD                     0.8830  0.6836  -       0.6389  0.5500
NLM WSD Subset              0.9063  0.6932  0.7475  0.6526  0.5800

NB stands for Naïve Bayes, AEC for Automatic Extracted Corpus, MRD for Machine Readable Dictionary, 2-MRD for 2nd Order Co-occurrence MRD, and JDI for Journal Descriptor Indexing. "Set" rows cover all the ambiguous words in the category, while "Subset" rows cover only the words that the JDI method can use, so JDI accuracy is reported only for the Subset rows. Results on the NLM WSD set are included as well.