2013 Oct 2;13:112. doi: 10.1186/1472-6947-13-112

Figure 2.

Phases of the Scrubber annotation pipeline. Lexical Phase: split the document into sentences and tag the part of speech of each token. Frequency Phase: calculate term frequency with and without the part-of-speech tag. Dictionary Phase: search for each word/phrase in ten medical dictionaries. Known PHI Phase: match US census names and textual patterns for each PHI type.
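
The four phases in the caption can be illustrated with a minimal Python sketch. This is not the Scrubber implementation: the tagger `toy_pos`, the word sets `MEDICAL_TERMS` and `CENSUS_NAMES`, and the two regular expressions in `PHI_PATTERNS` are hypothetical stand-ins for a trained part-of-speech tagger, the ten medical dictionaries, the US census name lists, and the per-type PHI patterns. It only shows how each phase could contribute features to a token.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the real resources named in the caption.
MEDICAL_TERMS = {"aspirin", "fever"}   # placeholder for ten medical dictionaries
CENSUS_NAMES = {"john", "smith"}       # placeholder for US census name lists
PHI_PATTERNS = {                       # placeholder textual patterns per PHI type
    "date": re.compile(r"\d{1,2}/\d{1,2}/\d{2,4}"),
    "phone": re.compile(r"\d{3}-\d{3}-\d{4}"),
}


def split_sentences(text):
    """Lexical Phase, step 1: naive split after sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_pos(token):
    """Lexical Phase, step 2: crude placeholder, not a real POS tagger."""
    if token[0].isdigit():
        return "CD"
    if token[0].isupper():
        return "NNP"
    return "NN"


def annotate(text):
    # Lexical Phase: sentences, then (token, tag) pairs.
    tokens = []
    for sentence in split_sentences(text):
        for tok in re.findall(r"[A-Za-z]+|\d[\d/\-]*", sentence):
            tokens.append((tok, toy_pos(tok)))

    # Frequency Phase: counts without and with the part-of-speech tag.
    tf_plain = Counter(tok.lower() for tok, _ in tokens)
    tf_tagged = Counter((tok.lower(), pos) for tok, pos in tokens)

    features = []
    for tok, pos in tokens:
        low = tok.lower()
        features.append({
            "token": tok,
            "pos": pos,
            "tf": tf_plain[low],
            "tf_pos": tf_tagged[(low, pos)],
            # Dictionary Phase: lookup in the (placeholder) medical vocabulary.
            "in_dictionary": low in MEDICAL_TERMS,
            # Known PHI Phase: census names and textual patterns.
            "census_name": low in CENSUS_NAMES,
            "phi_patterns": [name for name, pat in PHI_PATTERNS.items()
                             if pat.fullmatch(tok)],
        })
    return features


if __name__ == "__main__":
    sample = "John Smith took aspirin on 01/02/2013. Call 555-123-4567."
    for row in annotate(sample):
        print(row)
```

Each token ends up with one feature record combining all four phases, which is the shape a downstream classifier would typically consume; how the actual pipeline stores and uses these annotations is not specified in the caption.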