2021 Oct 25;11:734416. doi: 10.3389/fcimb.2021.734416

Table 2.

Prediction results (first four numeric columns), given as the number of well-predicted metabolites based on Spearman correlation, for all metabolites to be predicted.

	Training (ENVIM)	Training (MelonnPan)	Testing (ENVIM)	Testing (MelonnPan)	Predictable metabolites (defined by MelonnPan)
ZOE 2.0 (NM = 503)
DNA only	356 (71%)	63 (13%)	124 (25%)	47 (9%)	70
RNA only	409 (81%)	157 (31%)	106 (21%)	68 (14%)	163
Both DNA and RNA	423 (84%)	146 (29%)	110 (22%)	73 (15%)	154
Mallick cohort (NM = 466)
DNA only	408 (88%)	239 (51%)	225 (48%)	178 (38%)	249
Lloyd-Price cohort (NM = 522)
DNA only	501 (96%)	271 (52%)	322 (62%)	193 (37%)	305
RNA only	521 (100%)	298 (57%)	393 (75%)	236 (45%)	318
Both DNA and RNA	518 (99%)	306 (59%)	381 (73%)	232 (44%)	323

Under the “well-prediction” criterion, defined as a Spearman correlation ≥0.3 between the observed and predicted values of a metabolite, the table reports the number of well-predicted metabolites for each prediction method (ENVIM and MelonnPan), dataset, and modality (DNA only, RNA only, or both DNA and RNA). NM is the number of metabolites to be predicted. Percentages in parentheses are the number of well-predicted metabolites divided by the total number of metabolites (NM) to be predicted in each study. The Mallick cohort has only metagenomics data available. The last column gives the number of “predictable metabolites” as defined by MelonnPan (see also the Figure 2 legend). Bold in the testing-result columns marks the highest number of well-predicted metabolites among the three modalities (DNA only, RNA only, both DNA and RNA) in the ZOE 2.0 and Lloyd-Price cohorts.
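As an illustration of how the counts and percentages in this table can be derived, the following is a minimal sketch, not the authors' implementation. It assumes that observed and predicted abundances are available per metabolite as equal-length lists; the function names (`average_ranks`, `spearman`, `count_well_predicted`, `percent_of_nm`) and the toy data are hypothetical. Spearman correlation is computed here as the Pearson correlation of average ranks.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (vx * vy) ** 0.5


def percent_of_nm(count, nm):
    """Percentage of well-predicted metabolites out of NM, rounded."""
    return round(100 * count / nm)


def count_well_predicted(observed, predicted, threshold=0.3):
    """Count metabolites with Spearman correlation >= threshold.

    `observed` and `predicted` map metabolite name -> list of values
    (hypothetical input layout). Returns (count, percentage of NM).
    """
    n_well = sum(
        1
        for name in observed
        if spearman(observed[name], predicted[name]) >= threshold
    )
    return n_well, percent_of_nm(n_well, len(observed))


# Toy data (made up for illustration only): two metabolites, five samples.
observed = {"met_a": [1, 2, 3, 4, 5], "met_b": [1, 2, 3, 4, 5]}
predicted = {"met_a": [1.1, 1.9, 3.2, 3.8, 5.5], "met_b": [5, 4, 3, 2, 1]}
print(count_well_predicted(observed, predicted))
```

The same percentage helper reproduces the table's parenthesized values, for example `percent_of_nm(124, 503)` for the ZOE 2.0 DNA-only testing result with ENVIM.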