Table 2: Manually curated resources used to construct EMCON.
Seven resources that manually link GeneIDs to PubMed articles were integrated to serve as the initial dataset for building EMCON. Together they contribute over 1.2 million unique articles to the naive GeneID-MeSH network, covering over 7 million GeneIDs across more than 14K species.
| Gene and gene product databases | Number of articles | Number of GeneIDs in articles | Number of species across GeneIDs |
|---|---|---|---|
| gene2pubmed [33] | 1,062,713 | 5,565,651 | 12,782 |
| Gene Reference into Function (GeneRIF) [45] | 705,441 | 90,329 | 1,913 |
| Comparative Toxicogenomics Database (CTD) [36] | 58,180 | 43,298 | 76 |
| Universal Protein Resource (UniProt/Swiss-Prot) [37] | 950,989 | 5,156,248 | 12,555 |
| Reactome [38] | 15,650 | 11,110 | 9 |
| Rat Genome Database (RGD) [40] | 834,585 | 87,874 | 7 |
| Mouse Genome Informatics (MGI) [43] | 181,519 | 42,020 | 1 |
| Total Unique | 1,238,879 | 7,074,406 | 14,126 |
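
The "Total Unique" row is smaller than the column sums because the same article or GeneID can appear in several resources; the per-resource article counts alone add up to roughly 3.8 million, against 1,238,879 unique articles. The following is a minimal sketch of that deduplication, assuming each resource can be reduced to a set of (PMID, GeneID) pairs. The toy data, the `annotations` mapping, and the `unique_counts` helper are hypothetical illustrations, not the pipeline used to build EMCON.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy data: each resource maps to a set of (PMID, GeneID) pairs.
# Real resources would be parsed from their respective download files.
annotations: Dict[str, Set[Tuple[int, int]]] = {
    "gene2pubmed": {(101, 7157), (102, 672), (103, 7157)},
    "generif": {(101, 7157), (104, 1956)},
    "MGI": {(103, 22059), (105, 22059)},
}


def unique_counts(resources: Dict[str, Set[Tuple[int, int]]]) -> Tuple[int, int]:
    """Return (unique articles, unique GeneIDs) across all resources."""
    # Union the pair sets so overlapping annotations are counted once.
    pairs = set().union(*resources.values())
    articles = {pmid for pmid, _ in pairs}
    genes = {gene_id for _, gene_id in pairs}
    return len(articles), len(genes)


n_articles, n_genes = unique_counts(annotations)

# Naive sum of per-resource article counts, which double-counts overlaps.
summed_articles = sum(len({pmid for pmid, _ in s}) for s in annotations.values())

print(n_articles, n_genes, summed_articles)
```

In this toy example the summed per-resource article count exceeds the unique count, which mirrors the relationship between the individual rows and the "Total Unique" row in the table.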