2021 Aug 3;11:15747. doi: 10.1038/s41598-021-94897-9

Table 6.

Database profiling.

Graph entity    Type            Counts
PUBLICATION     Node            30,419,056
AUTHOR          Node            8,331,251
GENE            Node            19,082
DISEASE         Node            4,818
MESH            Node            29,133
CITED_BY        Relationship    173,572,773
PUBLISHED       Relationship    121,879,576
GENE_PMID       Relationship    9,656,712
DISEASE_PMID    Relationship    39,605,276
MESH_PMID       Relationship    279,331,447

We loaded the PubMed 2019 baseline into Neo4j, an open-source graph database management system. We introduced four node types (PUBLICATION, AUTHOR, GENE and DISEASE) and four edge types (PUBLISHED, from AUTHOR to PUBLICATION; CITED_BY, between PUBLICATION nodes; GENE_PMID_ASSOCIATION, from GENE to PUBLICATION; and DISEASE_PMID_ASSOCIATION, from DISEASE to PUBLICATION). PUBLICATION nodes have the following attributes: PMID, TITLE, ABSTRACT, AFFILIATIONS, IS_REVIEW, IS_CLINICAL_TRIAL, BIG_PHARMA, MEDIUM_PHARMA and DATE. The database is accessible at: https://mega.nz/file/4E8QjCaQ#oqtm7jof-lsG7ySget8uakh7m26bDLo1HrPu3mtdAV8.
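As an illustration of how the profiling counts in the table might be reproduced against this schema, the following minimal Python sketch builds Cypher query strings for counting nodes and relationships and for listing the authors of one publication. It only constructs the queries and does not connect to a database. The helper names (`node_count_query`, `rel_count_query`, `authors_of_publication`) are hypothetical, and the labels and relationship types are taken from the description above; the table uses the shorter names GENE_PMID and DISEASE_PMID and also lists MESH and MESH_PMID, so the exact names stored in the database are an assumption to verify.

```python
import re

# Labels and relationship types as named in the text. The table abbreviates
# two of them (GENE_PMID, DISEASE_PMID) and also lists MESH and MESH_PMID,
# so check the names in the actual database before relying on these.
NODE_LABELS = ("PUBLICATION", "AUTHOR", "GENE", "DISEASE")
REL_TYPES = (
    "PUBLISHED",
    "CITED_BY",
    "GENE_PMID_ASSOCIATION",
    "DISEASE_PMID_ASSOCIATION",
)

# Labels cannot be passed as Cypher parameters, so they are interpolated
# into the query text; restrict them to plain identifiers first.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _checked(name):
    """Return name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def node_count_query(label):
    """Cypher text that counts all nodes carrying the given label."""
    return f"MATCH (n:{_checked(label)}) RETURN count(n) AS n"


def rel_count_query(rel_type):
    """Cypher text that counts all relationships of the given type."""
    return f"MATCH ()-[r:{_checked(rel_type)}]->() RETURN count(r) AS n"


def authors_of_publication(pmid):
    """Cypher text and parameters listing AUTHOR nodes linked to one PMID.

    Follows the PUBLISHED edge from AUTHOR to PUBLICATION described in the
    text. Whether PMID is stored as an integer or a string is not stated,
    so the value is passed through as a parameter unchanged.
    """
    query = (
        "MATCH (a:AUTHOR)-[:PUBLISHED]->(p:PUBLICATION {PMID: $pmid}) "
        "RETURN a"
    )
    return query, {"pmid": pmid}


if __name__ == "__main__":
    for label in NODE_LABELS:
        print(node_count_query(label))
    for rel_type in REL_TYPES:
        print(rel_count_query(rel_type))
    print(authors_of_publication(12345))  # 12345 is a placeholder PMID
```

Each returned string could be passed, together with its parameters, to a Neo4j client session to run against a local copy of the database. On a graph of this size, the relationship counts in particular may take some time to compute.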