Table 1.
System | Linguistic filtering | Statistical filtering | Machine learning | Context | Description |
---|---|---|---|---|---|
NEURAL: Frantzi (1995) | ✓ | ✓ | | | Morphosyntactic patterns, list of suffixes, frequency, mutual information (Medicine); 70% recall |
CLARIT: Evans Evans (1996) | ✓ | ✓ | | | NP parsers, statistical disambiguation, sub-compound generation; 240 MB News corpus; 82% recall |
TerMine: Frantzi et al. (2000) | ✓ | ✓ | | ✓ | POS tagger; context defining words in the corpus; 75% precision within top 25% of terms |
OntoLearn: Navigli and Velardi (2004) | ✓ | ✓ | ✓ | ✓ | Comprehensive system including term and definition extraction and disambiguation; Tourism domain: 0.80 precision, 0.55 recall (estimated) |
Text2Onto: Cimiano and Völker (2005) | ✓ | ✓ | ✓ | ✓ | Framework for ontology learning, algorithms for term and relation extraction |
Lee et al. (2006) | ✓ | ✓ | | | Dependency parsing for relationship extraction for sub-units of GO concepts; low precision; 3.5% added (recall) |
Wermter and Hahn (2006) | ✓ | ✓ | | | Comparison of statistics with filtering by frequency or linguistic information |
All methods use linguistic filtering, most use statistical filtering, and some use context information. Quality is reported in terms of precision and recall (see ‘Methods’ section).