Table 3.
Tagger | WSJ corpus (%) | IHC clinical corpus (%) | Pitt clinical corpus (%) | Combined IHC and Pitt clinical corpus (%) |
---|---|---|---|---|
OpenNLP tagger (maximum entropy—WSJ) | 97.1 | 87.6 | 82.5 | 84.9 |
OpenNLP tagger (maximum entropy—Mayo clinical model—cTAKES) | 96.9 | 87.9 | 88.4 | 88.1 |
Stanford tagger (maximum entropy—bi-directional WSJ) | 97.1 | 85.7 | 86.8 | 86.2 |
Stanford tagger (maximum entropy—left 3 words WSJ) | 97.1 | 85.7 | 88.6 | 87.2 |
LBJ tagger (winnow neural network—WSJ) | 97.3 | 87.3 | 81.8 | 84.3 |
LingPipe tagger (HMM—GENIA) | 78.5 | 81.9 | 81.4 | 81.6 |
LingPipe tagger (HMM—Medpost) | 74.4 | 83.1 | 82.9 | 83.0 |
cTAKES, clinical text analysis and knowledge extraction system; HMM, hidden Markov model; WSJ, Wall Street Journal.