2016 Sep 23;14:363–370. doi: 10.1016/j.csbj.2016.09.002

Table 1.

The three corpora used in this analysis. Due to technical limitations, fewer documents were labelled with diseases (nd) than the total number of documents in each corpus (nt).

Corpus                             nt         nd
Clinical studies (a)               177,609    147,235
Modelling literature (b)           215,097    85,676
Positive control literature (c)    687        244

nt total number of documents; nd number of documents labelled with a disease.

(a) clinicaltrials.gov.

(b) Medline (text mining query for models).

(c) BioModels database.
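The gap between nt and nd is easier to compare across corpora as a labelled fraction, nd / nt. The following is a minimal sketch of that arithmetic using only the counts in Table 1; the names `CORPORA`, `labelled_fraction`, and `coverage` are illustrative assumptions and are not taken from the original analysis.

```python
# Counts copied from Table 1 as (n_t, n_d) pairs.
# The dictionary and function names are hypothetical, for illustration only.
CORPORA = {
    "Clinical studies": (177_609, 147_235),
    "Modelling literature": (215_097, 85_676),
    "Positive control literature": (687, 244),
}


def labelled_fraction(n_total: int, n_labelled: int) -> float:
    """Return the share of documents labelled with a disease (n_d / n_t)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_labelled <= n_total:
        raise ValueError("n_labelled must lie between 0 and n_total")
    return n_labelled / n_total


# Labelled fraction per corpus.
coverage = {
    name: labelled_fraction(n_t, n_d) for name, (n_t, n_d) in CORPORA.items()
}

for name, fraction in coverage.items():
    print(f"{name}: {fraction:.1%} of documents labelled with a disease")
```

On these counts, the clinical studies corpus has the largest labelled share, while well under half of the documents in the modelling literature and positive control corpora carry a disease label.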