Table 1. Feature statistics of our corpus of 811,683,028 tweets.
| #Unique features | | | | |
| User | Hashtag | Mention | Location | Term |
| 85,794,831 | 13,607,023 | 46,391,269 | 18,244,772 | 16,212,640 |
| Feature usage in #Tweets | | | | |
| Feature | Max | Avg | Median | Most frequent |
| User | 10,196 | 8.67 | 2 | running_status |
| Hashtag | 1,653,159 | 13.91 | 1 | #retweet |
| Mention | 6,291 | 1.26 | 1 | tweet_all_time |
| Location | 10,848,224 | 9,562.34 | 130 | london |
| Term | 241,896,559 | 492.37 | 1 | rt |
| Feature usage by #Users | | | | |
| Hashtag | 592,363 | 10.08 | 1 | #retweet |
| Mention | 26,293 | 5.44 | 1 | dimensionist |
| Location | 739,120 | 641.5 | 2 | london |
| Term | 1,799,385 | 6,616.65 | 1 | rt |
| Feature using #Hashtags | | | | |
| User | 18,167 | 2 | 0 | daily_astrodata |
| Location | 2,440,969 | 1,837.79 | 21 | uk |
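
Summary statistics of the kind reported in Table 1 (Max, Avg, Median, Most frequent) can be obtained by counting, for each feature value, the number of distinct tweets (or users) in which it occurs, and then summarizing those counts. The following is a minimal sketch in Python, not the procedure used to build the table: the input format of `(container_id, feature_value)` pairs, the function name `usage_stats`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def usage_stats(pairs):
    """Summarize in how many distinct containers (e.g. tweets) each feature value occurs.

    pairs: iterable of (container_id, feature_value) tuples (hypothetical input format).
    Returns the number of unique values, the max/avg/median usage count,
    and the most frequently used value.
    """
    containers = defaultdict(set)
    for container_id, value in pairs:
        # A set ensures each container is counted once per feature value.
        containers[value].add(container_id)

    counts = {value: len(ids) for value, ids in containers.items()}
    most_frequent = max(counts, key=counts.get)
    usage = list(counts.values())
    return {
        "unique": len(counts),
        "max": counts[most_frequent],
        "avg": mean(usage),
        "median": median(usage),
        "most_frequent": most_frequent,
    }


# Toy data (illustrative only): hashtags observed in four tweets.
toy = [
    (1, "#retweet"), (2, "#retweet"), (2, "#news"),
    (3, "#retweet"), (4, "#uk"), (4, "#uk"),
]
stats = usage_stats(toy)
```

Passing `(user_id, hashtag)` pairs instead of `(tweet_id, hashtag)` pairs would give the per-user variant of the same statistics. For a corpus of the size in Table 1, the in-memory sets would likely have to be replaced by a streaming or distributed count.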