Table 6. Database profiling.
Graph entity | Type | Counts |
---|---|---|
PUBLICATION | Node | 30,419,056 |
AUTHOR | Node | 8,331,251 |
GENE | Node | 19,082 |
DISEASE | Node | 4,818 |
MESH | Node | 29,133 |
CITED_BY | Relationship | 173,572,773 |
PUBLISHED | Relationship | 121,879,576 |
GENE_PMID | Relationship | 9,656,712 |
DISEASE_PMID | Relationship | 39,605,276 |
MESH_PMID | Relationship | 279,331,447 |
We loaded the PubMed 2019 baseline into Neo4j, an open-source graph database management system. We introduced four node types (PUBLICATION, AUTHOR, GENE and DISEASE) and four edge types: PUBLISHED, from AUTHOR to PUBLICATION; CITED_BY, between PUBLICATION nodes; GENE_PMID_ASSOCIATION, from GENE to PUBLICATION; and DISEASE_PMID_ASSOCIATION, from DISEASE to PUBLICATION. Furthermore, PUBLICATION nodes have the following attributes: PMID, TITLE, ABSTRACT, AFFILIATIONS, IS_REVIEW, IS_CLINICAL_TRIAL, BIG_PHARMA, MEDIUM_PHARMA and DATE. The database is accessible at: https://mega.nz/file/4E8QjCaQ#oqtm7jof-lsG7ySget8uakh7m26bDLo1HrPu3mtdAV8.
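As an illustration of how counts like those in Table 6 could be obtained, the sketch below generates one Cypher counting query per graph entity. It is a minimal example, not the procedure used to produce the table. The entity names are copied from Table 6, whereas the text above uses longer relationship names (for example GENE_PMID_ASSOCIATION), so the labels actually stored in the database are an assumption here. The helper names `node_count_query`, `relationship_count_query` and `profiling_queries` are hypothetical.

```python
from typing import Dict, List

# Entity names as listed in Table 6. Assumption: the labels and relationship
# types stored in the actual database may be spelled differently.
NODE_LABELS: List[str] = ["PUBLICATION", "AUTHOR", "GENE", "DISEASE", "MESH"]
RELATIONSHIP_TYPES: List[str] = [
    "CITED_BY",
    "PUBLISHED",
    "GENE_PMID",
    "DISEASE_PMID",
    "MESH_PMID",
]


def node_count_query(label: str) -> str:
    """Return a Cypher query that counts all nodes with the given label."""
    return f"MATCH (n:{label}) RETURN count(n) AS total"


def relationship_count_query(rel_type: str) -> str:
    """Return a Cypher query that counts all relationships of the given type."""
    return f"MATCH ()-[r:{rel_type}]->() RETURN count(r) AS total"


def profiling_queries() -> Dict[str, str]:
    """Map every entity name to the Cypher query that counts it."""
    queries = {label: node_count_query(label) for label in NODE_LABELS}
    for rel_type in RELATIONSHIP_TYPES:
        queries[rel_type] = relationship_count_query(rel_type)
    return queries


if __name__ == "__main__":
    # Print the queries; running them requires a live Neo4j instance.
    for entity, query in profiling_queries().items():
        print(f"{entity}: {query}")
```

Each generated string can be pasted into the Neo4j Browser or submitted through a client driver. On a graph of this size, the counts may take some time to compute.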