2013 Mar 13;20(5):931–939. doi: 10.1136/amiajnl-2012-001453

Table 2.

Top-performing taggers with out-of-the-box models

| POS tagger | Algorithm | Model | Training corpus description |
| --- | --- | --- | --- |
| OpenNLP tagger | Maximum entropy | en-pos-maxent.bin | Penn Treebank WSJ |
| OpenNLP tagger | Maximum entropy | postagger.model.bin.gz | Mayo Clinical Model (cTAKES) |
| Stanford tagger | Maximum entropy | english-bidirectional-distsim.tagger | Penn Treebank WSJ |
| Stanford tagger | Maximum entropy | english-left3words-distsim.tagger | Penn Treebank WSJ |
| LBJ tagger | Winnow neural network | N/A | English Penn Treebank WSJ |
| LingPipe tagger | HMM | pos-en-bio-genia.HiddenMarkovModel | GENIA (MEDLINE abstracts w/MeSH terms: human, blood cells, and transcription factors) |
| LingPipe tagger | HMM | pos-en-bio-medpost.HiddenMarkovModel | Medpost (MEDLINE biological abstracts) |

cTAKES, clinical text analysis and knowledge extraction system; HMM, hidden Markov model; WSJ, Wall Street Journal.