Table 5. Twitter dataset area under the recall curves (Fig 12).
Bold entries indicate the best performing method. Minhash similarity (MS) is the best method in 8 cases, Agglomerative Clustering (AC) in 8 cases and Personalised PageRank (PPR) in none. A perfect community detector would score 0.5.
Tags | PPR | MS | AC |
---|---|---|---|
travel | 0.186 | 0.240 | 0.230 |
airline | 0.040 | 0.151 | 0.180 |
hotel brand | 0.160 | 0.294 | 0.285 |
cosmetics | 0.055 | 0.086 | 0.143 |
food and drink | 0.072 | 0.099 | 0.082 |
electronics | 0.035 | 0.069 | 0.059 |
alcohol | 0.069 | 0.199 | 0.229 |
model | 0.078 | 0.110 | 0.109 |
mixed martial arts | 0.317 | 0.363 | 0.386 |
cycling | 0.278 | 0.330 | 0.445 |
athletics | 0.219 | 0.285 | 0.365 |
adult actor | 0.269 | 0.347 | 0.397 |
american football | 0.240 | 0.371 | 0.240 |
baseball | 0.203 | 0.379 | 0.378 |
basketball | 0.252 | 0.380 | 0.353 |
football | 0.202 | 0.233 | 0.212 |