2011 Jun 23;6(6):e21132. doi: 10.1371/journal.pone.0021132

Figure 2. Outline of how search terms are mapped against ICD-9 code descriptions, using malignant lymphoma as an example.


First, the search term (S1) is broken down into words, which are matched against the target phrase (S2) using regular expressions. The comparison starts with every word in S1; if there is no match, words are repeatedly removed from the match expression, and the comparison repeated, until only one word remains. If still no match is found, a Levenshtein distance function is used to compare the terms, and if the distance is lower than a threshold the terms are considered to match.
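
The matching procedure described in this legend can be sketched roughly as follows. This is an illustrative reading only, not the authors' implementation: the function names `levenshtein` and `terms_match`, the choice to drop trailing words first, the whole-word regular expressions, and the default threshold of 3 are all assumptions, since the legend does not specify them.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep)
        prev = curr
    return prev[-1]


def terms_match(s1: str, s2: str, threshold: int = 3) -> bool:
    """Hypothetical matcher: regex word matching, then an edit-distance fallback.

    s1 is the search term, s2 the target phrase (e.g. a code description).
    The threshold value is an assumption; the source does not state it here.
    """
    words = s1.lower().split()
    target = s2.lower()
    # Start with every word of s1, then drop trailing words (assumed order)
    # until only one word remains in the match expression.
    for n in range(len(words), 0, -1):
        if all(re.search(r"\b" + re.escape(w) + r"\b", target) for w in words[:n]):
            return True
    # No regex match at any length: fall back to Levenshtein distance.
    return levenshtein(s1.lower(), target) < threshold


if __name__ == "__main__":
    print(terms_match("malignant lymphoma", "Other malignant lymphomas"))
    print(terms_match("lymphoma", "lymphona"))
    print(terms_match("asthma", "fracture of femur"))
```

In this sketch a single remaining word is enough to count as a match, which makes the matcher permissive; a stricter variant could return the number of matched words so that callers can rank candidates instead of accepting the first hit.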