2012 Nov-Dec;19(6):1011–1018. doi: 10.1136/amiajnl-2012-000881

Table 1.

VaeTM components and their functions

Components and subcomponents	Functions
1. Pre-processor	Prepares the text for the main processing.
1.1 Sentence tokenizer	Splits text into sentences, using the period as the delimiter.
1.2 Word tokenizer	Splits each sentence into tokens.
1.3 Normalizer	Removes punctuation marks and converts text to lowercase (1st normalization step).
	Removes the tokens tagged as ‘Unimportant’ after their tagging by the semantic tagger (2nd normalization step).
	Removes any irrelevant tagged token that interrupts the otherwise contiguous tokens of a feature (3rd normalization step).
2. VAERS dictionary	Includes 55 000 entries (each entry consists of a term and its tag, which corresponds to a semantic type).
3. Semantic tagger	Tags the tokens based on the dictionary entries.
4. Grammar rules	Define the relationships between tags (ie, the semantic types).
5. Rule-based parser	Parses the text by executing the grammar rules after: (1) the 2nd normalization step, and (2) the 3rd normalization step.
6. Features extractor	Extracts the predefined features.

VAERS, vaccine adverse event reporting system; VaeTM, vaccine adverse event text mining.
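
The first stages listed in Table 1 (pre-processor, dictionary lookup, semantic tagger, and the 2nd normalization step) can be illustrated with a minimal sketch. This is not the VaeTM implementation. The function names, the toy dictionary, the tag names other than 'Unimportant', the single-token lookup, and the 'Unknown' tag for out-of-dictionary tokens are all assumptions made for illustration. The grammar rules, the 3rd normalization step, and the features extractor are not shown.

```python
import string

# Hypothetical stand-in for the VAERS dictionary (term -> semantic tag).
# Only the 'Unimportant' tag is named in Table 1; the others are invented.
TOY_DICTIONARY = {
    "fever": "Symptom",
    "rash": "Symptom",
    "vaccination": "Vaccine",
    "the": "Unimportant",
    "patient": "Unimportant",
    "developed": "Unimportant",
    "and": "Unimportant",
    "after": "Unimportant",
}


def split_sentences(text):
    """Component 1.1: split text into sentences on the period."""
    return [s.strip() for s in text.split(".") if s.strip()]


def split_words(sentence):
    """Component 1.2: split a sentence into whitespace-delimited tokens."""
    return sentence.split()


def normalize(tokens):
    """1st normalization step: remove punctuation marks and lowercase."""
    table = str.maketrans("", "", string.punctuation)
    cleaned = (token.translate(table).lower() for token in tokens)
    return [token for token in cleaned if token]


def tag_tokens(tokens, dictionary):
    """Component 3: tag each token by dictionary lookup.

    Assumption: tokens missing from the dictionary get the tag 'Unknown'.
    """
    return [(token, dictionary.get(token, "Unknown")) for token in tokens]


def drop_unimportant(tagged_tokens):
    """2nd normalization step: remove tokens tagged 'Unimportant'."""
    return [(tok, tag) for tok, tag in tagged_tokens if tag != "Unimportant"]


def preprocess_and_tag(text, dictionary):
    """Run components 1 to 3 plus the 2nd normalization step per sentence."""
    results = []
    for sentence in split_sentences(text):
        tokens = normalize(split_words(sentence))
        results.append(drop_unimportant(tag_tokens(tokens, dictionary)))
    return results


if __name__ == "__main__":
    report = "The patient developed fever and rash. Rash after vaccination!"
    for tagged_sentence in preprocess_and_tag(report, TOY_DICTIONARY):
        print(tagged_sentence)
```

In this sketch each sentence is reduced to a list of (token, tag) pairs, which is the kind of input a rule-based parser could then match against grammar rules defined over the tags.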