Table 4.
Method | IHC corpus: known words (%) | IHC corpus: unknown words (%) | IHC corpus: total (%) | Pitt corpus: known words (%) | Pitt corpus: unknown words (%) | Pitt corpus: total (%) | Combined IHC and Pitt corpora: known words (%) | Combined IHC and Pitt corpora: unknown words (%) | Combined IHC and Pitt corpora: total (%) |
---|---|---|---|---|---|---|---|---|---|
Easy Adapt (source only) | 89.7 | 65.6 | 84.5 | 91.3 | 51.3 | 78.3 | 90.5 | 56.4 | 81.1 |
Easy Adapt (target only) | 87.8 | 70.7 | 85.1 | 91.2 | 74.3 | 89.0 | 89.6 | 74.4 | 87.9 |
Easy Adapt (source+target) | 89.7 | 74.0 | 88.3 | 92.1 | 80.1 | 91.0 | 90.4 | 75.5 | 89.3 |
ClinAdapt—base tagger (target only) | 94.7 | 89.6 | 89.8 | 97.4 | 91.4 | 92.1 | 95.9 | 90.6 | 91.1 |
ClinAdapt (with lexicon) | | | | | | | | | |
Step 1: base tagging (source only) | 89.1 | 80.1 | 87.6 | 85.6 | 62.0 | 82.5 | 87.0 | 68.4 | 84.9 |
Step 2: lexicon | 90.5 | 82.3 | 89.2 | 85.3 | 71.8 | 83.6 | 87.4 | 76.1 | 86.1 |
Step 3: transformation-based learner (target only) | 95.9 | 82.8 | 93.8 | 97.1 | 72.8 | 93.9 | 94.9 | 76.1 | 93.2 |
ClinAdapt (without lexicon) | | | | | | | | | |
Step 1: base tagging (source only) | 89.1 | 80.1 | 87.6 | 85.5 | 62.7 | 82.5 | 87.0 | 67.9 | 84.9 |
Step 2: transformation-based learner (target only) | 95.6 | 80.2 | 93.2 | 97.0 | 63.2 | 92.6 | 94.8 | 68.0 | 91.8 |
IHC, Intermountain Healthcare; Pitt, University of Pittsburgh.