2020 Aug 12;22(8):e17478. doi: 10.2196/17478

Table 2.

Description of preprocessing steps and options used in traditional classifiers.

Preprocessing steps	Descriptions	Options^a
placeholder_remove	Remove textual placeholders such as _mention_, _hashtag_, _unicode_, and _url_	True, false
emoji_remove	Remove textual descriptions that denote emojis	True, false
negation_expand	Expand negative contractions, for example, “don’t” is expanded to “do not” and “can’t” is expanded to “cannot”	True, false
punctuation_remove	Remove all punctuation symbols	True, false
digits_remove	Remove all numeric digits (0-9)	True, false
negation_mark	Mark words that occur between a negation trigger and a punctuation mark with the NEG prefix [28]	True, false
normalize	Reduce any run of more than 2 identical consecutive characters to 2 characters, for example, “happppy” is reduced to “happy”	True, false
stemming	Reduce inflection in words (eg, troubled, troubles) to their root form (eg, trouble) using the Porter Stemmer [29]	True, false
stopwords_remove	Remove common words such as “the,” “a,” “on,” “is,” and “all” that are listed in the Natural Language Toolkit English stop words list [30]	True, false
lowercase	Change the case of all characters to lowercase	True, false

^aIf the option for a step is set to true, the corresponding preprocessing step will be applied in the preprocessing pipeline; if the option is set to false, the corresponding preprocessing step will be skipped in the pipeline.
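
The true/false options in Table 2 can be read as switches on a preprocessing pipeline. The following is a minimal sketch of that idea in Python, not the authors' implementation. It covers only 7 of the 10 steps (emoji_remove, negation_mark, and stemming are omitted). The function name preprocess, the default option values, the order in which steps are applied, the regular expressions, and the short contraction and stop word lists are all assumptions made for illustration; the study itself uses the Natural Language Toolkit English stop words list.

```python
import re
import string

# Hypothetical defaults; Table 2 only states that each option is true or false.
DEFAULT_OPTIONS = {
    "placeholder_remove": True,
    "negation_expand": True,
    "normalize": True,
    "lowercase": True,
    "digits_remove": True,
    "punctuation_remove": True,
    "stopwords_remove": False,
}

# Tiny illustrative subset, not the NLTK English stop words list.
STOPWORDS = {"the", "a", "on", "is", "all"}

# Illustrative irregular contractions; regular "n't" forms are handled below.
CONTRACTIONS = {"can't": "cannot", "won't": "will not"}


def preprocess(text, options=None):
    """Apply each enabled step in turn; steps set to False are skipped."""
    opts = {**DEFAULT_OPTIONS, **(options or {})}

    if opts["placeholder_remove"]:
        # Drop the textual placeholders _mention_, _hashtag_, _unicode_, _url_.
        text = re.sub(r"_(?:mention|hashtag|unicode|url)_", " ", text)

    if opts["negation_expand"]:
        # Treat curly apostrophes like straight ones before matching.
        text = text.replace("\u2019", "'")
        for short, full in CONTRACTIONS.items():
            text = re.sub(re.escape(short), full, text, flags=re.IGNORECASE)
        # "don't" -> "do not", "isn't" -> "is not", and so on.
        text = re.sub(r"n't\b", " not", text)

    if opts["normalize"]:
        # Collapse runs of 3 or more identical characters to exactly 2.
        text = re.sub(r"(.)\1{2,}", r"\1\1", text)

    if opts["lowercase"]:
        text = text.lower()

    if opts["digits_remove"]:
        text = re.sub(r"[0-9]", "", text)

    if opts["punctuation_remove"]:
        text = text.translate(str.maketrans("", "", string.punctuation))

    tokens = text.split()
    if opts["stopwords_remove"]:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS]

    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess("_mention_ I don't feel happppy, 2 days!!! _url_"))
    print(preprocess("The mood is low on all days", {"stopwords_remove": True}))
```

Passing a partial dictionary such as {"stopwords_remove": True} overrides only that option and leaves the others at their assumed defaults, which makes it straightforward to evaluate a classifier over different combinations of steps. Because the steps interact (for example, removing punctuation before expanding contractions would prevent “don’t” from being matched), the order chosen here is one plausible arrangement rather than a documented one.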