Table 2.
Performance of SVM source classifiers on binary classification for each source category.
| Type of classification | Precision | Recall | F-score |
| --- | --- | --- | --- |
| A. Approach 1, using short URLs (N=1000) | | | |
| Media | 0.8873 | 0.8278 | 0.8477 |
| Retail | 0.8723 | 0.7913 | 0.8117 |
| Personal | 0.8755 | 0.7976 | 0.8200 |
| B. Approach 2, using unshortened URLs (N=1000) | | | |
| Media | 0.8958 | 0.8639 | 0.8769 |
| Retail | 0.8881 | 0.8155 | 0.8357 |
| Personal | 0.9020 | 0.8572 | 0.8736 |
| C. P values calculated using t test to assess statistical significance of differences in classifier performance (F-scores) | | | |
| Approach 1 | Personal vs Media, P=.094; Personal vs Retail, P=.27; Media vs Retail, P=.07 | | |
| Approach 2 | Personal vs Media, P=.38; Personal vs Retail, P=.03^a; Media vs Retail, P=.01^a | | |
| Approach 1 vs 2 | Personal1 vs Personal2, P=.001^a; Retail1 vs Retail2, P=.004^a; Media1 vs Media2, P=.049^a | | |

^a Values that show statistically significant differences.
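
For reference, the precision, recall, and F-score metrics reported in Table 2 can be computed from confusion counts as in the minimal Python sketch below. The function name `precision_recall_f1` and the per-fold counts are hypothetical and are not data from the study; the averaging over folds is shown only as an assumption about one common evaluation setup, since the table does not state how the scores were aggregated.

```python
from statistics import mean


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive,
    false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical per-fold confusion counts (tp, fp, fn); NOT from the study.
folds = [(80, 10, 20), (90, 5, 10), (60, 20, 40)]
scores = [precision_recall_f1(*fold) for fold in folds]

# Average each metric over the folds.
mean_p = mean(s[0] for s in scores)
mean_r = mean(s[1] for s in scores)
mean_f = mean(s[2] for s in scores)

# F1 recomputed from the averaged precision and recall, for comparison.
f_of_means = 2 * mean_p * mean_r / (mean_p + mean_r)

print(f"mean precision={mean_p:.4f}, mean recall={mean_r:.4f}")
print(f"mean of per-fold F1={mean_f:.4f}, F1 of means={f_of_means:.4f}")
```

When metrics are averaged over several folds or runs, the mean of the per-fold F-scores generally differs from the harmonic mean of the averaged precision and recall, so a reported F-score need not equal the value recomputed from the other two columns.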