Table 2. Characteristics of the cluster solutions for the nine similarity approaches.
Approach | # Articles Covered | % Coverage | # Clusters | Max Cluster Size |
tf-idf MeSH | 2,062,642 | 95.77% | 24,708 | 1,517 |
LSA MeSH | 2,115,440 | 98.22% | 25,287 | 1,021 |
BM25 MeSH | 2,011,339 | 93.39% | 26,864 | 1,015 |
SOM MeSH | 2,153,169 | 99.97% | 29,941 | 3,576 |
tf-idf TA | 1,796,349 | 83.41% | 21,388 | 657 |
LSA TA | 1,958,125 | 90.92% | 23,831 | 1,827 |
BM25 TA | 2,022,694 | 93.91% | 28,858 | 764 |
Topics TA | 2,033,221 | 94.40% | 24,163 | 1,422 |
PMRA | 2,029,564 | 94.23% | 28,963 | 921 |