2011 Apr 12;366(1567):1101–1107. doi: 10.1098/rstb.2010.0315

Table 2.

Sources of the corpora.

| Language | Corpus and source |
| --- | --- |
| Basque | Twentieth Century Corpus of Basque, http://www.uzei.com/ |
| Chilean Spanish | Scott Sadowsky, LIFCACH, http://www2.udec.cl/~ssadowsky |
| Chinese (Mandarin) | Lancaster Corpus of Mandarin Chinese, http://corpus.leeds.ac.uk |
| Czech | Czech National Corpus, http://ucnk.ff.cuni.cz/english/kdejsme.php |
| English | BNC, http://www.natcorp.ox.ac.uk |
| Estonian | Corpus of Written Estonian, http://www.cl.ut.ee/korpused |
| Finnish | Parole Corpus of Finnish, http://kaino.kotus.fi/sanat/taajuuslista/parole_5000.html |
| French | Frantext, http://www.atilf.fr/frantext.htm |
| Greek | HNC, http://hnc.ilsp.gr/en |
| Māori | Māori Broadcasting Corpus, Boyce, M. T. 2006 A corpus of modern spoken Māori. Unpublished PhD thesis available in the library at Victoria University of Wellington. |
| Polish | Polish National Corpus, http://nkjp.pl |
| Portuguese | Mark Davies, http://www.corpusdoportugues.org/x.asp |
| Russian | Sharoff, S. Corpus linguistics around the world (eds Archer, D., Wilson, A. & Rayson, P.), pp. 167–180 (Rodopi, Amsterdam, 2005), http://www.ruscorpora.ru/ |
| Spanish | Mark Davies, http://www.corpusdelespanol.org |
| Swahili | Helsinki Corpus of Swahili, http://www.aakkl.helsinki.fi/cameel/corpus |
| Tok Pisin | Slone Wantok Corpus, http://www.tokpisin.org |
| Turkish | METU Turkish Corpus, http://www.ii.metu.edu.tr/corpus |