Table 2.
POS tagger | Algorithm | Model | Training corpus description |
---|---|---|---|
OpenNLP tagger | Maximum entropy | en-pos-maxent.bin | Penn Treebank WSJ |
OpenNLP tagger | Maximum entropy | postagger.model.bin.gz | Mayo Clinical Model—cTAKES |
Stanford tagger | Maximum entropy | english-bidirectional-distsim.tagger | Penn Treebank WSJ |
Stanford tagger | Maximum entropy | english-left3words-distsim.tagger | Penn Treebank WSJ |
LBJ tagger | Winnow neural network | N/A | English Penn Treebank WSJ |
LingPipe tagger | HMM | pos-en-bio-genia.HiddenMarkovModel | GENIA (MEDLINE abstracts with MeSH terms: human, blood cells, and transcription factors) |
LingPipe tagger | HMM | pos-en-bio-medpost.HiddenMarkovModel | Medpost (MEDLINE biological abstracts) |
cTAKES, clinical text analysis and knowledge extraction system; HMM, hidden Markov model; LBJ, Learning Based Java; MeSH, Medical Subject Headings; POS, part of speech; WSJ, Wall Street Journal.