Table 1.
Comparative analysis of MeSH vs. IPC. The hierarchical structures are similar, but MeSH terms are shorter and more likely to occur in text. The average number of MeSH annotations per document (9) far exceeds the average number of classes per patent (2).
Property | MeSH | IPC |
---|---|---|
number of hierarchy entries | 54095 | 69487 |
number of unique entries | 26581 | 69487 |
number of hierarchy levels | 13 | 14 |
average string length of main labels/class definitions (characters) | 18 | 50 |
string length of longest main label/class definition (characters) | 104 | 596 |
string length of shortest main label/class definition (characters) | 2 | 3 |
average number of synonyms | 8 | 0 |
occurrence of class labels in text | frequent | very rare |
average number of annotations per document | 9 | 2 |
number of unique annotations | 25646 | 56599 |
proportion of documents with multiple annotations | 86% | 53% |
proportion of documents with related annotations (same hierarchy tree) | 81% | 46% |
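
As a rough illustration of how the per-document rows of Table 1 could be derived, the following minimal sketch computes the average number of annotations, the number of unique annotations, and the two proportions from a mapping of documents to annotation codes. The function name `annotation_stats`, the `tree_of` helper, and the toy data are hypothetical and not taken from the source; in particular, treating the first character of a code as its hierarchy tree is a simplifying assumption, not the authors' definition of "related".

```python
from collections import Counter
from statistics import mean


def annotation_stats(docs, tree_of=lambda code: code[0]):
    """Summarise per-document annotation statistics.

    docs:    dict mapping a document id to a list of annotation codes.
    tree_of: hypothetical helper mapping a code to its hierarchy tree;
             assumed here to be the first character of the code.
    """
    if not docs:
        raise ValueError("docs must not be empty")

    n_docs = len(docs)
    unique = set()
    multiple = 0  # documents with more than one distinct annotation
    related = 0   # documents with >= 2 annotations in the same tree

    for codes in docs.values():
        distinct = set(codes)
        unique |= distinct
        if len(distinct) > 1:
            multiple += 1
        per_tree = Counter(tree_of(code) for code in distinct)
        if any(count > 1 for count in per_tree.values()):
            related += 1

    return {
        "avg_annotations": mean(len(set(c)) for c in docs.values()),
        "unique_annotations": len(unique),
        "share_multiple": multiple / n_docs,
        "share_related": related / n_docs,
    }


# Toy data (invented for illustration only), using IPC-like codes.
toy_docs = {
    "doc1": ["A61K", "A61P"],
    "doc2": ["G06F"],
    "doc3": ["C07D", "H04L"],
    "doc4": ["B01J", "B01D", "C02F"],
}
stats = annotation_stats(toy_docs)
print(stats)
```

With real data, `tree_of` would need to reflect the actual hierarchy (for example, a MeSH descriptor can appear at several positions in the hierarchy, which is why the table lists more hierarchy entries than unique entries for MeSH).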