Table 1:
NLP Stage | Definition | Example |
---|---|---|
Tokenization | Breaking text down into its components | The kidney helps the body maintain homeostasis. --> The \| kidney \| helps \| the \| body \| maintain \| homeostasis. |
Stop-word removal | Removing common words (e.g. “the”, “a”, “and”) that carry little information | the \| kidney \| helps \| the \| body \| maintain \| homeostasis --> kidney \| helps \| body \| maintain \| homeostasis |
Part-of-speech tagging | Assigning a grammatical role to each word in a sentence. The roles are generally: noun, pronoun, adjective, verb, adverb, preposition, conjunction, interjection | kidney: noun; helps: verb; body: noun; maintain: verb; homeostasis: noun |
Stemming/lemmatization | Reducing inflected or derived words to their stem or base form | kidney: kidney; helps: help; body: body; maintain: maintain; homeostasis: homeostasis |
Named-entity recognition | Identifying and locating named entities such as people, organizations, and locations | Belding Hibbard Scribner (Person) was an American (Location) physician and a pioneer in kidney dialysis. |
Negation detection | Determining the presence or absence of a finding | Mrs. Nephron did not (negation detection) require dialysis. |
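
Several of the stages in Table 1 can be illustrated with a minimal Python sketch that uses only the standard library. Everything here is an assumption made for illustration: the stop-word list, the negation triggers, the suffix rule, and the function names (`tokenize`, `remove_stop_words`, `stem`, `has_negation`) are hypothetical and far simpler than what clinical NLP toolkits do. Unlike the table, this tokenizer also lowercases the text and drops punctuation. Part-of-speech tagging and named-entity recognition are omitted because they usually depend on trained models.

```python
import re

# Assumption: a tiny illustrative stop-word list, not a standard one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to"}

# Assumption: a handful of negation triggers chosen for this example.
NEGATION_TRIGGERS = {"not", "no", "without", "denies"}


def tokenize(text):
    """Tokenization: split text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Stop-word removal: drop tokens found in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Naive stemming: strip one trailing 's' from longer words.

    Words ending in 'ss' or 'is' (e.g. 'homeostasis') are left alone.
    """
    if len(token) > 3 and token.endswith("s") and not token.endswith(("ss", "is")):
        return token[:-1]
    return token


def has_negation(tokens):
    """Negation detection: report whether any trigger word is present."""
    return any(t in NEGATION_TRIGGERS for t in tokens)


tokens = tokenize("The kidney helps the body maintain homeostasis.")
content_tokens = remove_stop_words(tokens)
stems = [stem(t) for t in content_tokens]
negated = has_negation(tokenize("Mrs. Nephron did not require dialysis."))
```

With the example sentences from the table, `stems` holds `kidney`, `help`, `body`, `maintain`, `homeostasis`, and `negated` is `True`. A trigger-word check like this only flags that a negation cue occurs somewhere in the sentence; it does not determine which finding the cue applies to.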