2016 Jan 5;3:150075. doi: 10.1038/sdata.2015.75

Table 1. Comparison Chart of Quantitative Datasets for Studying Historical Information.

| Dataset | Metric(s) | Size | Domains | Includes domain classification | Time period | # of Countries | # of Languages |
| --- | --- | --- | --- | --- | --- | --- | --- |
| UNESCO Cultural Statistics [17] | Economic productivity, cultural participation | 52 countries (feature films), 20 countries (employment survey) | Arts, Design, Media | Yes (7 core domain headings) | 2000–present | >50 | 2,471 |
| World Values Survey (http://www.worldvaluessurvey.org/) | Values | 85,000 respondents | Religion, Institutions | N/A | 1981–present | 87 | 20 |
| Culturomics [6] | n-grams | 5,195,769 books | Arts, Politics, Sciences | N/A | 1800–2000 | Not reported | 7 |
| Human Accomplishment [5] | Individuals and events | 3,869 significant individuals, 1,560 significant events | Arts & Sciences | Yes (9 within Sciences, 4 within Arts) | 800 BC–1950 | 30* countries & regions | 6* languages |
| Who’s Bigger [8] | Entities (individuals, events, places, things) | 843,790 individuals | All domains known in English Wikipedia | Yes (based on English Wikipedia categories) | 2686 BC–2010 | >200* | 1 (English) |
| Popescu & Grefenstette [7] | Individuals, places | 500,896 names | All domains known in English Wikipedia | Yes (based on English Wikipedia categories) | <600 BC–2009 | >200* | 7 |
| Networked Framework of Cultural History [9] | Individuals & places | Multiple datasets (FB, AKL, ULAN, WCEN, Google Ngrams) | All domains | Yes; categories per dataset: AKL: 5, ULAN: 7, FB: 6 | 1–2012 | >200* | 2 (mainly English, German) |
| Pantheon 1.0 | Individuals, places | 11,341 notable individuals | All domains | Yes (88 occupations, 27 industries, 8 domains) | 4000 BC–2010 | 194 | 277 |

*estimated (exact numbers not reported).