2012 Nov-Dec;19(6):1011–1018. doi: 10.1136/amiajnl-2012-000881

Table 1.

VaeTM components and their functions

| Components and subcomponents | Functions |
| --- | --- |
| 1. Pre-processor | Prepares the text for the main processing. |
| 1.1 Sentence tokenizer | Splits text into sentences, using the period as the delimiter. |
| 1.2 Word tokenizer | Splits each sentence into tokens. |
| 1.3 Normalizer | Removes punctuation marks and converts text to lowercase (1st normalization step). |
| | Removes the tokens tagged as ‘Unimportant’ after their tagging by the semantic tagger (2nd normalization step). |
| | Removes an irrelevant tagged token that disrupts the contiguous tokens of a feature (3rd normalization step). |
| 2. VAERS dictionary | Includes 55 000 entries (each entry consists of a term and its tag, which corresponds to a semantic type). |
| 3. Semantic tagger | Tags the tokens based on the dictionary entries. |
| 4. Grammar rules | Define the relationships between tags (ie, the semantic types). |
| 5. Rule-based parser | Parses the text by executing the grammar rules after (1) the 2nd normalization step and (2) the 3rd normalization step. |
| 6. Features extractor | Extracts the predefined features. |

VAERS, vaccine adverse event reporting system; VaeTM, vaccine adverse event text mining.
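
The first rows of Table 1 can be illustrated with a minimal Python sketch. It is not the VaeTM implementation: the three-entry dictionary, the tag names (`SYMPTOM`, `VACCINE`), the function names, and the choice to tag every out-of-dictionary token as ‘Unimportant’ are all assumptions made for illustration. The sketch covers sentence and word tokenization, the 1st normalization step, dictionary-based tagging, and the 2nd normalization step; the 3rd normalization step, the grammar rules, the parser, and the features extractor are not modelled.

```python
import re

# Hypothetical toy dictionary: term -> semantic tag.
# The VAERS dictionary described in Table 1 holds about 55 000 entries.
TOY_DICTIONARY = {
    "fever": "SYMPTOM",
    "rash": "SYMPTOM",
    "vaccination": "VACCINE",
}

UNIMPORTANT = "Unimportant"


def split_sentences(text):
    """Sentence tokenizer: split on periods and drop empty pieces."""
    return [s.strip() for s in text.split(".") if s.strip()]


def split_words(sentence):
    """Word tokenizer: split a sentence on whitespace."""
    return sentence.split()


def normalize(token):
    """1st normalization step: strip punctuation and lowercase."""
    return re.sub(r"[^\w\s]", "", token).lower()


def tag_tokens(tokens, dictionary):
    """Semantic tagger: look each token up in the dictionary.

    Assumption: tokens missing from the dictionary are tagged 'Unimportant'.
    """
    return [(t, dictionary.get(t, UNIMPORTANT)) for t in tokens]


def drop_unimportant(tagged):
    """2nd normalization step: remove tokens tagged 'Unimportant'."""
    return [(t, tag) for t, tag in tagged if tag != UNIMPORTANT]


def preprocess_and_tag(text, dictionary=TOY_DICTIONARY):
    """Run the sketched steps on each sentence of the text."""
    results = []
    for sentence in split_sentences(text):
        tokens = [normalize(w) for w in split_words(sentence)]
        tokens = [t for t in tokens if t]  # drop tokens that were only punctuation
        results.append(drop_unimportant(tag_tokens(tokens, dictionary)))
    return results


if __name__ == "__main__":
    report = "Patient developed Fever, and rash after vaccination. No prior history."
    for tagged_sentence in preprocess_and_tag(report):
        print(tagged_sentence)
```

In this sketch the second sentence yields an empty list, because none of its tokens appear in the toy dictionary. A rule-based parser would then operate on the remaining tag sequences, for example `SYMPTOM SYMPTOM VACCINE` for the first sentence.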