2017 Jul 31;7:6918. doi: 10.1038/s41598-017-07111-0

Figure 2.

Illustrated methods of annotation, sub-sequence search, and regular expression induction. EMR narratives are tokenized, annotated, and transformed into text fragments (n-grams) prior to association testing. Syntactically similar n-grams are then (optionally) grouped into regular expressions, with the aim of aggregating conceptually similar features and improving overall recall.
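
The pipeline in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the function names (`tokenize`, `ngrams`, `to_pattern`, `group_by_pattern`), the example notes, and the grouping rule (n-grams that differ only in a numeric token share one regular expression) are all assumptions chosen for illustration. Annotation and association testing are omitted.

```python
import re
from collections import defaultdict

# Assumed pattern for a numeric token (integer or decimal).
NUMBER = r"\d+(?:\.\d+)?"


def tokenize(text):
    """Lowercase the text and split it into word and number tokens (a simplification)."""
    return re.findall(r"[a-z]+|" + NUMBER, text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def to_pattern(gram):
    """Build a regular expression for an n-gram, generalizing numeric tokens.

    Hypothetical grouping rule: two n-grams are 'syntactically similar'
    when they differ only in their numeric tokens.
    """
    parts = [NUMBER if re.fullmatch(NUMBER, tok) else re.escape(tok) for tok in gram]
    return r"\b" + r"\s+".join(parts) + r"\b"


def group_by_pattern(grams):
    """Map each induced regular expression to the set of n-grams it covers."""
    groups = defaultdict(set)
    for gram in grams:
        groups[to_pattern(gram)].add(" ".join(gram))
    return dict(groups)


# Made-up example narratives, not taken from the study data.
notes = [
    "Ejection fraction 35 percent",
    "ejection fraction 40 percent",
]
all_grams = [g for note in notes for g in ngrams(tokenize(note), 3)]
groups = group_by_pattern(all_grams)

for pattern, members in groups.items():
    print(pattern, sorted(members))
```

In this sketch the two trigrams "ejection fraction 35" and "ejection fraction 40" fall under one regular expression, so a downstream association test would see them as a single feature rather than two sparse ones, which is the recall benefit the caption describes.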