2013 Mar 13;20(5):931–939. doi: 10.1136/amiajnl-2012-001453

Table 3. Out-of-the-box POS tagger performance

| Tagger (algorithm, model) | WSJ corpus (%) | IHC clinical corpus (%) | Pitt clinical corpus (%) | Combined IHC and Pitt clinical corpus (%) |
|---|---|---|---|---|
| OpenNLP tagger (maximum entropy, WSJ) | 97.1 | 87.6 | 82.5 | 84.9 |
| OpenNLP tagger (maximum entropy, Mayo clinical model, cTAKES) | 96.9 | 87.9 | 88.4 | 88.1 |
| Stanford tagger (maximum entropy, bi-directional WSJ) | 97.1 | 85.7 | 86.8 | 86.2 |
| Stanford tagger (maximum entropy, left 3 words WSJ) | 97.1 | 85.7 | 88.6 | 87.2 |
| LBJ tagger (winnow neural network, WSJ) | 97.3 | 87.3 | 81.8 | 84.3 |
| LingPipe tagger (HMM, GENIA) | 78.5 | 81.9 | 81.4 | 81.6 |
| LingPipe tagger (HMM, Medpost) | 74.4 | 83.1 | 82.9 | 83.0 |

cTAKES, clinical text analysis and knowledge extraction system; HMM, hidden Markov model; WSJ, Wall Street Journal.