2013 Aug 1;6(Suppl 1):51–62. doi: 10.4137/BII.S11770

Table 5.

Caramba: best additional groups of patterns at second iteration (training set).

P	R	F	Description
0.7622	0.6904	0.7245	*Lemma, from TreeTagger
0.7684	0.6851	0.7244	*B: Brown Beth_Partners unigrams
0.7589	0.6898	0.7227	*Normalized token
0.7637	0.6857	0.7226	*Specialist Lexicon syntactic category, with normalized token
0.7624	0.6852	0.7217	B: Specialist Lexicon syntactic category, with normalized token
0.7575	0.6876	0.7209	*TreeTagger POS, with normalized token
0.7595	0.6856	0.7206	B: lemma, from TreeTagger
0.7648	0.6811	0.7205	B: Brown UMLS unigrams
0.7640	0.6796	0.7194	*Section identifier
0.7632	0.6799	0.7192	*Digit
0.7578	0.6837	0.7189	B: Charniak-McClosky POS unigrams, bigrams, trigrams
0.7590	0.6805	0.7176	B: TreeTagger POS, with normalized token
0.7570	0.6819	0.7175	UMLS first or two Semantic Types
0.7487	0.6887	0.7175	*Date
0.7561	0.6821	0.7172	*Alphabetic or case
0.7579	0.6802	0.7169	B: Wmatch
0.7627	0.6757	0.7166	B: section identifier
0.7561	0.6810	0.7166	*B: TreeTagger chunk, BIO
0.7607	0.6772	0.7165	B: Wmatch, BIO
0.7522	0.6775	0.7129	B: date
0.7527	0.7119	0.7317	Subset 2: Subset 1 + all of the above
0.7761	0.6957	0.7337	Subset 3: Subset 1 + starred feature groups only

Notes: P: precision; R: recall; F: F-measure. B: bigram of classes. *: feature group retained in Subset 3. Each pattern group is added independently to the pool of Iteration 1 (ie, Subset 1).
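
The F column appears consistent with the balanced F-measure, that is, the harmonic mean of P and R. The table itself does not state the formula, so the following is only a minimal sketch under that assumption; the function name `f_measure` is ours, and the three rows are copied from the table above.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumed definition; the table does not spell out the formula.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F) copied from three rows of Table 5.
rows = {
    "*Lemma, from TreeTagger": (0.7622, 0.6904, 0.7245),
    "Subset 2": (0.7527, 0.7119, 0.7317),
    "Subset 3": (0.7761, 0.6957, 0.7337),
}

for name, (p, r, reported) in rows.items():
    # Compare the recomputed value with the one printed in the table.
    print(f"{name}: computed {f_measure(p, r):.4f}, reported {reported:.4f}")
```

Under this assumption, the recomputed values match the reported F to four decimals for these rows. It also makes the trade-off between the two subsets visible: Subset 2 has the higher recall, while Subset 3 has the higher precision and the slightly higher F.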