2023 Sep 22;110(10):1661–1672. doi: 10.1016/j.ajhg.2023.08.018

Figure 4.

Comparison of PARMESAN’s gene-gene and drug-gene relationship predictions to manually curated relationships

We compared all of PARMESAN’s predictions to manually curated databases of drug-gene and gene-gene relationships. The accuracy, or “percent consistent,” has the same definition as it does in Figure 2A. We generated predictions from five knowledge bases: PARMESAN’s extractions from PubMed, PARMESAN’s extractions from PubMed Central, PARMESAN’s combined extractions from PubMed and PubMed Central, SemMedDB’s extractions, and the combined extractions from PARMESAN (PubMed and PubMed Central) and SemMedDB.

(A) The drug-gene relationship predictions were compared to the relationships presented in DGIdb. For each value of n (X axis), we took the top n predictions and measured how consistently their directionality agreed with DGIdb (Y axis). For example, PARMESAN (using PubMed alone) generated 453,892 predictions with scores above 2. Among the 255 predictions that scored above 2 and overlapped with DGIdb, 204 (80%) matched the directionality displayed by DGIdb. Therefore, the orange “PARMESAN (PubMed)” line contains the point at X = 453,892, Y = 0.8. The best predictions came from combining the extractions from PARMESAN and SemMedDB, although in this trial, the difference from using PARMESAN alone was not statistically significant.
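
The top-n calculation described in (A) can be illustrated with a minimal sketch. It is not PARMESAN's code: the function name, the direction labels, and the toy data are hypothetical, and it assumes each prediction is a (pair, direction, score) record and that the curated database maps each pair to a single direction.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]                  # (drug, gene); hypothetical representation
Prediction = Tuple[Pair, str, float]    # (pair, predicted direction, score)


def percent_consistent_at_top_n(
    predictions: List[Prediction],
    curated: Dict[Pair, str],
    n: int,
) -> Optional[float]:
    """Fraction of the top-n predictions whose direction matches the curated one.

    Only predictions whose pair also appears in the curated database are counted.
    Returns None when none of the top-n predictions overlap with it.
    """
    # Rank by score, highest first, and keep the top n.
    top = sorted(predictions, key=lambda p: p[2], reverse=True)[:n]
    # Keep only predictions that can be checked against the curated database.
    overlapping = [(pair, direction) for pair, direction, _ in top if pair in curated]
    if not overlapping:
        return None
    matches = sum(1 for pair, direction in overlapping if curated[pair] == direction)
    return matches / len(overlapping)


# Toy data (invented for illustration only).
predictions = [
    (("drugA", "GENE1"), "inhibits", 5.0),
    (("drugB", "GENE2"), "activates", 4.0),
    (("drugC", "GENE3"), "inhibits", 3.0),
    (("drugD", "GENE4"), "activates", 2.5),
    (("drugE", "GENE5"), "inhibits", 1.0),
]
curated = {
    ("drugA", "GENE1"): "inhibits",
    ("drugC", "GENE3"): "activates",
    ("drugE", "GENE5"): "inhibits",
}

# Among the top 4, two pairs overlap with the curated set and one of them matches.
top4 = percent_consistent_at_top_n(predictions, curated, 4)

# The example from the caption: 204 matches among 255 overlapping predictions.
caption_example = 204 / 255
```

With the numbers given in the caption, 204 / 255 gives the Y value of 0.8 plotted at X = 453,892.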

(B) Gene-gene relationship predictions were compared to the gene-gene relationships presented in Reactome. This panel is formatted in the same way as (A). All prediction sets demonstrated increased accuracy with higher scores. In this setting, the combination of PARMESAN and SemMedDB showed the best predictive ability. Its differences from the other knowledge bases tested were all statistically significant.

(C) We compared PARMESAN’s genetic modifier predictions (using extractions from PubMed and PubMed Central combined) for ATXN1 and MAPT to corresponding modifier screens, and the consistent predictions outnumbered the contradicted ones at higher score thresholds.