Table 1.
Feature | Description | DocTimeRel | Event-Event Relations | Event-Time Relations |
---|---|---|---|---|
Tokens | The first and the last word of each concept, all words covered by a concept as a bag, bag-of-words around each concept for a window of [−3, 3], bag-of-words between 2 concepts, and the number of words between 2 concepts (for the THYME corpus, the headword event is expanded to the immediately enclosing NP and the NP becomes the anchor for the token features) | ✓ | ✓ | ✓ |
Part-of-speech tags | The Penn Treebank POS tags of each concept as a bag | ✓ | ||
Event attributes | All event-related attributes such as polarity, modality, and type. Note that DocTimeRel is also an event attribute, and is used for reasoning on the within-sentence relations. | ✓ | ✓ | ✓ |
UMLS feature | UMLS semantic types of each concept as features | ✓ | ✓ | |
Dependency path | The dependency path between 2 concepts and the number of dependency nodes in-between | ✓ | ||
Overlapped head | Whether the 2 concepts share the same headword | ✓ | ||
Temporal attributes | The class type of a time expression, eg, “Date,” “Time,” “Duration,” etc. | ✓ | ||
Special words | Any special words from the time lexicon developed by the NRCC [24] that appear in the concepts or in the context between them | ✓ | ||
Nearest flag | Whether the event-time pair in question is the closest among all pairs in the same sentence | ✓ | ||
Conjunction feature | Whether there is any conjunction word between the arguments | ✓ | ||
Nearby verb’s part-of-speech tag | The Penn Treebank POS tags of the verbs within the same sentence | ✓ | ||
Section ID | The header of the section containing the target concept | ✓ | ||
Closest verb | The tokens and Penn Treebank POS tags of the closest verb to the target concept within the same sentence | ✓ | ||
TimeX feature | The tokens and attributes of the closest time expression in the same sentence | ✓ | ||
Abbreviations: DocTimeRel, document time relation; NP, noun phrase; THYME, Temporal Histories of Your Medical Events; POS, part-of-speech; UMLS, Unified Medical Language System; NRCC, National Research Council Canada; ID, identifier.
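
As an illustration of the Tokens row in Table 1, the following is a minimal sketch of how such features might be extracted for a pair of concepts. It is not the authors' implementation. The function name `token_features`, the half-open token-span representation, and the feature-name strings are all hypothetical. The sketch also assumes that the [−3, 3] window excludes the concept's own tokens, and it omits the THYME-specific expansion of the headword event to its enclosing NP.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a concept is a half-open token span [start, end).
Span = Tuple[int, int]


def token_features(tokens: List[str], a: Span, b: Span,
                   window: int = 3) -> Dict[str, float]:
    """Sparse token features for a pair of concepts (illustrative sketch only)."""
    # Order the two concepts by their position in the sentence.
    if a[0] > b[0]:
        a, b = b, a
    feats: Dict[str, float] = {}
    for name, (start, end) in (("arg1", a), ("arg2", b)):
        # First and last word of the concept.
        feats[f"{name}_first={tokens[start].lower()}"] = 1.0
        feats[f"{name}_last={tokens[end - 1].lower()}"] = 1.0
        # All words covered by the concept, as a bag.
        for tok in tokens[start:end]:
            feats[f"{name}_bag={tok.lower()}"] = 1.0
        # Bag-of-words in a window of up to 3 tokens on each side
        # (assumption: the concept's own tokens are excluded).
        left = tokens[max(0, start - window):start]
        right = tokens[end:end + window]
        for tok in left + right:
            feats[f"{name}_ctx={tok.lower()}"] = 1.0
    # Bag-of-words between the 2 concepts and the number of words between them.
    between = tokens[a[1]:b[0]]
    for tok in between:
        feats[f"between_bag={tok.lower()}"] = 1.0
    feats["between_count"] = float(len(between))
    return feats


# Made-up example sentence with an event ("colonoscopy") and a time expression ("March 3").
tokens = "The patient underwent a colonoscopy on March 3 .".split()
feats = token_features(tokens, a=(4, 5), b=(6, 8))
print(sorted(feats))
```

The dictionary of indicator features can be passed to any sparse vectorizer. Keeping the two arguments' features under separate prefixes (`arg1_`, `arg2_`) is one design choice among several; a shared prefix would halve the feature space at the cost of losing argument order.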