PLoS One. 2013 Apr 17;8(4):e55814. doi: 10.1371/journal.pone.0055814

Table 3. Overview of extraction statistics.

                        Abstracts   Full text   Total    Entrez Gene     Ensembl Genomes
Articles                6.4M        384K        6.8M
Sentences               54.8M       66.9M       121.6M
Gene/protein mentions   43.3M       33.3M       76.5M    28.8M (37.6%)   47.9M (62.6%)
Events                  23.5M       16.7M       40.2M    16.3M (40.5%)   26.0M (64.7%)

Extraction statistics for PubMed abstracts and PubMed Central full texts with at least one identified gene/protein mention. The last two columns state the number of mentions/events that could be fully normalized to Entrez Gene identifiers or Ensembl Genomes families.
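
The percentages in the last two columns appear to be shares of the Total column for the same row. A minimal sketch of that arithmetic follows, assuming this reading is correct and using the rounded figures from the table (in millions); the variable and function names are illustrative only. The two shares in a row are computed independently and need not sum to 100%.

```python
# Figures in millions, copied from the rounded values in Table 3.
totals = {"mentions": 76.5, "events": 40.2}
normalized = {
    ("mentions", "Entrez Gene"): 28.8,
    ("mentions", "Ensembl Genomes"): 47.9,
    ("events", "Entrez Gene"): 16.3,
    ("events", "Ensembl Genomes"): 26.0,
}


def share(kind, resource):
    """Percentage of the row total that was normalized to `resource`."""
    return round(100 * normalized[(kind, resource)] / totals[kind], 1)


# Recompute every percentage reported in the last two columns.
shares = {key: share(*key) for key in normalized}

for (kind, resource), pct in shares.items():
    print(f"{kind:8s} {resource:16s} {pct}%")
```

With the rounded inputs, the recomputed shares match the percentages printed in the table.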