Skip to main content
. Author manuscript; available in PMC: 2023 Sep 1.
Published in final edited form as: Adv Chronic Kidney Dis. 2022 Sep;29(5):465–471. doi: 10.1053/j.ackd.2022.07.001

Table 1:

Common preprocessing stages of natural language processing with definition and examples.

NLP Stage Definition Example
Tokenization Breaking text down into its components The kidney helps the body maintain homeostasis. -->
The | kidney | helps | the | body | maintain | homeostasis.
Remove stop words Removing common words (e.g. “the”, “a”, “and”) that do not provide information the | kidney | helps | the | body | maintain | homeostasis --> | kidney | helps | body | maintain | homeostasis
Part of speech tagging Assigning a grammatical role to a word used in a sentence. These are generally: noun, pronoun, adjective, verb, adverb, preposition, conjunction, interjection kidney: noun
helps: verb
body: noun
maintain: verb
homeostasis: noun
Stemming/lem matization Reducing inflected or derived words into their stem words or base words kidney: kidney
helps: help
body: body
maintain: maintain
homeostasis: homeostasis
Named-entity recognition Identify and locate named entities such as names, organization, and locations. Belding Hibbard Scribner (Person) was an American (Location) physician and a pioneer in kidney dialysis.
Negation detection The task of determining the presence of absence of a finding. Mrs. Nephron did not (negation detection) require dialysis.