# Supplementary file - Comparison FuzzyWuzzy, MetaNetX

In the furtherAnalyses folder of the GPRuler GitHub repository we implemented the script MetaNetX_compounds.py (usage: *MetaNetX_compounds.py testModel*). The *testModel* parameter is chosen among: recon if the pipeline starts from the Recon 3D model, y7 if the pipeline starts from the Yeast 7 model, y8 if the pipeline starts from the Yeast 8 model, hmr if the pipeline starts from the HMRcore model. For every ground truth model, the script produces a tab-delimited file that includes, for each metabolite:

- the metabolite (Name column);
- a list of the identifiers inferred by our methodology, as described in the Methods section of the manuscript (Identifiers_classic column);
- a list of the identifiers inferred by the FuzzyWuzzy package alone (Identifiers_fuzzy column);
- the combination of the two approaches (Identifiers column);
- the corresponding identifiers in the MetaNetX.org MNXref namespace (ID column);
- the list of identifiers from our methodology and the FuzzyWuzzy package, processed for comparison with those from MetaNetX (fuzzy column);
- the list of identifiers from MetaNetX, processed for comparison with those from our methodology and the FuzzyWuzzy package (metaNet column).

This processing of identifiers is necessary for two reasons. First, identifiers retrieved from MetaCyc are enclosed in pipe symbols (|), which have already been removed from the corresponding MetaCyc IDs in MetaNetX. Second, in MetaNetX the IDs from the ChEBI database are preceded by the "CHEBI:" or "chebi:" string, and those from the KEGG and MetaCyc databases are preceded by the "kegg.compound:" and "metacyc.compound:" strings, respectively. We removed these strings in order to compare the IDs with those from our methodology.
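The identifier processing described above can be sketched as follows. This is a minimal illustration and not the code of MetaNetX_compounds.py: the function name `normalize_identifier` and the `PREFIXES` tuple are hypothetical, and the sketch assumes that prefixes are matched case-insensitively and that pipe symbols only occur at the two ends of an identifier.

```python
# Hypothetical helper: brings identifiers from both sources to a comparable form.
# Assumption: these are the only prefixes that need stripping (those named in the text).
PREFIXES = ("chebi:", "kegg.compound:", "metacyc.compound:")


def normalize_identifier(identifier):
    """Strip a known database prefix and any enclosing pipe symbols."""
    cleaned = identifier.strip()
    lowered = cleaned.lower()  # lets "CHEBI:" and "chebi:" match the same prefix
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # MetaCyc identifiers may be enclosed in pipes, e.g. |WATER|
    return cleaned.strip("|")


def normalize_all(identifiers):
    """Return the set of normalized identifiers for one metabolite."""
    return {normalize_identifier(i) for i in identifiers}
```

With this sketch, an identifier list such as `["|WATER|", "C00001"]` and one such as `["metacyc.compound:WATER", "kegg.compound:C00001", "CHEBI:15377"]` reduce to sets that can be compared directly.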
The output files of the MetaNetX_compounds.py script for the 4 tested models are included in the same furtherAnalyses folder and are named as follows: hmrCore_comparisonMetaNetX_FuzzyWuzzy.csv, recon3D_comparisonMetaNetX_FuzzyWuzzy.csv, yeast7_comparisonMetaNetX_FuzzyWuzzy.csv, yeast8_comparisonMetaNetX_FuzzyWuzzy.csv. We observed that both strategies, the one we developed and describe in the manuscript and MetaNetX, are needed to obtain the best outcome. To make this clearer, we report below quantitative data on the different scenarios we observed:

**HMRcore**

| Scenario | Number of metabolites |
|---|---|
| Case 1: common identifiers for the same metabolite, in addition to identifiers provided by our methodology not included in MetaNetX output and vice versa | 36 |
| Case 2: common identifiers for the same metabolite, in addition to identifiers provided only by our methodology and not included in MetaNetX output | 54 |
| Case 3: common identifiers for the same metabolite, in addition to identifiers provided only by MetaNetX and not included in our methodology output | 41 |
| Case 4: perfect match between identifiers for the same metabolite provided by MetaNetX and our methodology | 19 |
| Case 5: no intersection between identifiers for the same metabolite provided by MetaNetX and our methodology | 5 |
| Case 6: identifiers for the same metabolite provided only by our methodology | 93 |
| Case 7: identifiers for the same metabolite provided only by MetaNetX | 0 |

**Recon3D**

| Scenario | Number of metabolites |
|---|---|
| Case 1: common identifiers for the same metabolite, in addition to identifiers provided by our methodology not included in MetaNetX output and vice versa | 227 |
| Case 2: common identifiers for the same metabolite, in addition to identifiers provided only by our methodology and not included in MetaNetX output | 469 |
| Case 3: common identifiers for the same metabolite, in addition to identifiers provided only by MetaNetX and not included in our methodology output | 396 |
| Case 4: perfect match between identifiers for the same metabolite provided by MetaNetX and our methodology | 276 |
| Case 5: no intersection between identifiers for the same metabolite provided by MetaNetX and our methodology | 36 |
| Case 6: identifiers for the same metabolite provided only by our methodology | 1582 |
| Case 7: identifiers for the same metabolite provided only by MetaNetX | 2 |

**Yeast 7**

| Scenario | Number of metabolites |
|---|---|
| Case 1: common identifiers for the same metabolite, in addition to identifiers provided by our methodology not included in MetaNetX output and vice versa | 172 |
| Case 2: common identifiers for the same metabolite, in addition to identifiers provided only by our methodology and not included in MetaNetX output | 190 |
| Case 3: common identifiers for the same metabolite, in addition to identifiers provided only by MetaNetX and not included in our methodology output | 244 |
| Case 4: perfect match between identifiers for the same metabolite provided by MetaNetX and our methodology | 78 |
| Case 5: no intersection between identifiers for the same metabolite provided by MetaNetX and our methodology | 18 |
| Case 6: identifiers for the same metabolite provided only by our methodology | 197 |
| Case 7: identifiers for the same metabolite provided only by MetaNetX | 6 |

**Yeast 8**

| Scenario | Number of metabolites |
|---|---|
| Case 1: common identifiers for the same metabolite, in addition to identifiers provided by our methodology not included in MetaNetX output and vice versa | 222 |
| Case 2: common identifiers for the same metabolite, in addition to identifiers provided only by our methodology and not included in MetaNetX output | 243 |
| Case 3: common identifiers for the same metabolite, in addition to identifiers provided only by MetaNetX and not included in our methodology output | 334 |
| Case 4: perfect match between identifiers for the same metabolite provided by MetaNetX and our methodology | 106 |
| Case 5: no intersection between identifiers for the same metabolite provided by MetaNetX and our methodology | 20 |
| Case 6: identifiers for the same metabolite provided only by our methodology | 300 |
| Case 7: identifiers for the same metabolite provided only by MetaNetX | 6 |

In addition to the above, MetaNetX proved more sensitive than our methodology to how metabolite names are written. Our methodology mitigates the effect of cases such as the metabolite 3-Oxohexanoyl-ACP, which is not found in the MetaNetX database because it is stored there as 3-Oxohexanoyl-[acp]. We further discuss in the Discussion section of the manuscript how including MetaNetX in a future version of GPRuler will certainly improve the tool.
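The seven scenarios counted in the tables above can be read as set comparisons between the two processed identifier lists of each metabolite (fuzzy and metaNet columns). The sketch below is one possible reading of the case definitions, not the script used to produce the counts: the names `classify_case` and `tally_cases` are hypothetical, and returning `None` for a metabolite with no identifiers from either source is an assumption, since the tables do not list such a case.

```python
from collections import Counter


def classify_case(ours, metanetx):
    """Return the scenario number (1-7) for one metabolite, or None if both sets are empty."""
    ours, metanetx = set(ours), set(metanetx)
    if not ours and not metanetx:
        return None  # assumption: not covered by the seven cases
    if not metanetx:
        return 6  # identifiers only from our methodology
    if not ours:
        return 7  # identifiers only from MetaNetX
    if not ours & metanetx:
        return 5  # both sources give identifiers, none in common
    only_ours = ours - metanetx
    only_metanetx = metanetx - ours
    if only_ours and only_metanetx:
        return 1  # common identifiers plus extras on both sides
    if only_ours:
        return 2  # common identifiers plus extras only from our methodology
    if only_metanetx:
        return 3  # common identifiers plus extras only from MetaNetX
    return 4  # perfect match


def tally_cases(pairs):
    """Count how many metabolites fall in each scenario."""
    return Counter(classify_case(ours, metanetx) for ours, metanetx in pairs)
```

For example, `classify_case({"C00001", "15377"}, {"C00001", "WATER"})` falls under Case 1, because the two sets share one identifier and each also holds one that the other lacks.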