Table 1. Corpus statistics for the MLT corpus, by year.
Year | Tweets | Words | Users |
---|---|---|---|
2006 | 8 | 135 | 7 |
2007 △ | 819 | 12,872 | 468 |
2008 △ | 5,903 | 96,665 | 3,551 |
2009 △ | 67,834 | 1,141,748 | 38,908 |
2010 △ | 142,509 | 2,310,289 | 76,713 |
2011 △ | 306,389 | 4,760,881 | 167,471 |
2012 △ | 427,428 | 6,296,131 | 241,584 |
2013 △ | 446,505 | 6,630,105 | 249,388 |
2014 ▿ | 345,150 | 5,254,932 | 190,181 |
2015 ▿ | 315,128 | 4,847,984 | 177,482 |
2016 ▿ | 240,793 | 3,741,744 | 132,867 |
2017 △ | 288,779 | 4,870,311 | 141,049 |
2018 △ | 292,966 | 6,863,834 | 143,607 |
Total | 2,880,211 | 46,827,631 | 1,226,109 |
The Users column counts distinct users. Because the same user may be active in more than one year, the per-year counts overlap, so their sum exceeds the total number of distinct users given in the bottom row.
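The arithmetic behind this note can be checked with a short sketch. The per-year figures and the reported total are copied from Table 1. The variable names and the toy user IDs in the second half are hypothetical and serve only to illustrate how counting distinct users per year differs from counting them over the whole corpus.

```python
# Per-year distinct user counts, copied from the Users column of Table 1.
users_per_year = {
    2006: 7, 2007: 468, 2008: 3_551, 2009: 38_908, 2010: 76_713,
    2011: 167_471, 2012: 241_584, 2013: 249_388, 2014: 190_181,
    2015: 177_482, 2016: 132_867, 2017: 141_049, 2018: 143_607,
}
reported_total = 1_226_109  # bottom row of Table 1

# Summing the yearly counts counts a user once for every year they were active.
naive_sum = sum(users_per_year.values())   # 1563276
# Number of user-year entries beyond each user's first active year.
overlap = naive_sum - reported_total       # 337167

# Toy illustration with hypothetical user IDs (not taken from the corpus).
toy = {2017: {"a", "b", "c"}, 2018: {"b", "c", "d"}}
toy_sum = sum(len(ids) for ids in toy.values())   # 6: "b" and "c" counted twice
toy_distinct = len(set().union(*toy.values()))    # 4: each user counted once

print(naive_sum, overlap, toy_sum, toy_distinct)
```

The difference of 337,167 is the number of user-year entries contributed by users who appear in more than one year. It is not the number of multi-year users, since a user active in three years contributes two such entries.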