2013 Aug 1;6(Suppl 1):51–62. doi: 10.4137/BII.S11770

Table 5.

Caramba: best additional groups of patterns at second iteration (training set).

P      R      F      Description
0.7622 0.6904 0.7245 *Lemma, from TreeTagger
0.7684 0.6851 0.7244 *B: Brown Beth_Partners unigrams
0.7589 0.6898 0.7227 *Normalized token
0.7637 0.6857 0.7226 *Specialist Lexicon syntactic category, with normalized token
0.7624 0.6852 0.7217 B: Specialist Lexicon syntactic category, with normalized token
0.7575 0.6876 0.7209 *TreeTagger POS, with normalized token
0.7595 0.6856 0.7206 B: lemma, from TreeTagger
0.7648 0.6811 0.7205 B: Brown UMLS unigrams
0.7640 0.6796 0.7194 *Section identifier
0.7632 0.6799 0.7192 *Digit
0.7578 0.6837 0.7189 B: Charniak-McClosky POS unigrams, bigrams, trigrams
0.7590 0.6805 0.7176 B: TreeTagger POS, with normalized token
0.7570 0.6819 0.7175 UMLS first or two Semantic Types
0.7487 0.6887 0.7175 *Date
0.7561 0.6821 0.7172 *Alphabetic or case
0.7579 0.6802 0.7169 B: Wmatch
0.7627 0.6757 0.7166 B: section identifier
0.7561 0.6810 0.7166 *B: TreeTagger chunk, BIO
0.7607 0.6772 0.7165 B: Wmatch, BIO
0.7522 0.6775 0.7129 B: date
0.7527 0.7119 0.7317 Subset 2: Subset 1 + all of the above
0.7761 0.6957 0.7337 Subset 3: Subset 1 + starred feature groups only

Notes: P: precision; R: recall; F: F-measure. B: bigram of classes. *: feature group retained in Subset 3. Each pattern group is added independently to the pool of Iteration 1 (i.e., Subset 1).
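
The F column can be cross-checked against the P and R columns. The sketch below assumes that F is the balanced F-measure (the harmonic mean of precision and recall); the table does not state this explicitly, but the reported values are consistent with it. The function name `f_measure` and the `ROWS` list are illustrative only, and the three rows are copied from the table above.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F, description) copied from three rows of Table 5.
ROWS = [
    (0.7622, 0.6904, 0.7245, "*Lemma, from TreeTagger"),
    (0.7527, 0.7119, 0.7317, "Subset 2: Subset 1 + all of the above"),
    (0.7761, 0.6957, 0.7337, "Subset 3: Subset 1 + starred feature groups only"),
]

for p, r, reported, description in ROWS:
    recomputed = f_measure(p, r)
    # Allow a small tolerance, since P and R are themselves rounded to four decimals.
    status = "ok" if abs(recomputed - reported) < 1e-4 else "mismatch"
    print(f"{description}: reported F={reported:.4f}, recomputed F={recomputed:.4f} ({status})")
```

The comparison uses a tolerance rather than exact equality because the published P and R values are rounded, so a recomputed F can differ from the reported one in the last digit.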