Author manuscript; available in PMC: 2010 Jul 10.
Published in final edited form as: J Neurochem. 2007 Aug 30;103(4):1491–1505. doi: 10.1111/j.1471-4159.2007.04858.x
Get raw text instance:
  ‘…detected ankyrin G, which was…’
Split into words, trim non-alphanumeric characters, discard stop words and single letters:
  ‘detected’, ‘ankyrin’
Match each word as a substring to the name dictionary:
  118 hits for ‘ankyrin’, no hit for ‘detected’
Use rules to generate spelling alternatives:
  …ankyrin-3|ankyrin G… becomes …ankyrin 3|ankyrin-3|ankyrin G|ankyrin-G…
Match each alternative in full length to the raw text:
  ‘ankyrin G’ matches ‘…detected ankyrin G…’
Output the recognized protein name, primary accession number, and PubMed identifier:
  ‘ankyrin G’, Q12955, 12507143
(Text sample from Kretschmer et al. 2002)
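
The steps listed above can be sketched in a few lines of Python. This is only a minimal illustration and not the authors' implementation: the one-entry dictionary, the stop-word list, and the single hyphen/space rule are assumptions made for the example, and the function names are hypothetical. Only the sample text, the accession number Q12955, and the PubMed identifier 12507143 come from the figure.

```python
import re

# Hypothetical one-entry name dictionary: primary accession -> known names.
NAME_DICTIONARY = {"Q12955": ["ankyrin-3", "ankyrin G"]}
# Assumed stop-word list; the real list is not given in the figure.
STOP_WORDS = {"which", "was", "the", "a", "of", "and"}


def candidate_words(raw_text):
    """Split into words, trim non-alphanumeric characters, and discard
    stop words and single letters."""
    words = []
    for token in raw_text.split():
        word = re.sub(r"^[^A-Za-z0-9]+|[^A-Za-z0-9]+$", "", token)
        if len(word) > 1 and word.lower() not in STOP_WORDS:
            words.append(word)
    return words


def spelling_alternatives(name):
    """Assumed rule: hyphens and spaces are interchangeable."""
    return {name, name.replace("-", " "), name.replace(" ", "-")}


def recognize(raw_text, pubmed_id):
    """Return (name, accession, PubMed identifier) for each dictionary name
    whose spelling alternative occurs in full length in the raw text."""
    hits = set()
    for word in candidate_words(raw_text):
        for accession, names in NAME_DICTIONARY.items():
            for name in names:
                # Substring match of the word against the dictionary name.
                if word.lower() not in name.lower():
                    continue
                for alt in spelling_alternatives(name):
                    # Full-length match, not embedded in a longer word.
                    pattern = (r"(?<![A-Za-z0-9])" + re.escape(alt)
                               + r"(?![A-Za-z0-9])")
                    if re.search(pattern, raw_text):
                        hits.add((alt, accession, pubmed_id))
    return sorted(hits)


if __name__ == "__main__":
    print(recognize("...detected ankyrin G, which was...", "12507143"))
```

With the sample text, the word filter keeps only 'detected' and 'ankyrin', and the full-length match returns 'ankyrin G' with Q12955 and 12507143, as in the figure. A real dictionary would hold many entries per word (118 for 'ankyrin' in the figure), so an index from words to entries would likely replace the nested loops.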