Table 2.
Attribute | Top datasets (n = 20) | Bottom datasets (n = 20) | p |
---|---|---|---|
Number of associated publications | 14 (9–22) | 2 (1–3) | < 0.001 |
Number of task types | 2 (1–5) | 1 (1–2) | 0.007 |
Number of sub-benchmarks | 2 (1–8) | 1 (1–1) | 0.015 |
Dedicated leaderboard | 35% | 0% | 0.002 |
Proposed as part of competition | 10% | 15% | 0.322 |
Number of institutions | 2 (1–8) | 1 (1–6) | 0.310 |
First/last author affiliated with top company/university | 50% | 20% | 0.024 |
Datasets were sampled from NLP and computer vision datasets with first reported results in Papers With Code in 2018. Popularity was assessed by the number of publications that report benchmark results based on a dataset and are captured in the Papers With Code repository. Numeric attributes are reported as fractions or median values printed in bold. For median values, minimum and maximum values are shown in brackets.