Table 1.
Effect of missing spectra in the model input.
| metric | statistic | full dataset | 4 spectra | 3 spectra | 2 spectra | 1 spectrum |
|---|---|---|---|---|---|---|
| # test cases | | 1000 | 413 | 65 | 483 | 39 |
| Avg. MW | | 275.3 | 287.5 | 242.6 | 267.4 | 300.3 |
| Avg. SMILES length | | 34.5 | 37.0 | 28.5 | 32.5 | 43.6 |
| correct molecules (↑) | (%) | 7.0 | 9.2 | 15.2 | 4.1 | 5.1 |
| correct formulas (↑) | (%) | 39.3 | 45.1 | 46.9 | 34.8 | 20.5 |
| DMW% (↓) | Min | 2.3 | 1.6 | 0.5 | 2.4 | 9.5 |
| DMW% (↓) | Avg | 6.3 | 5.5 | 3.9 | 6.6 | 14.6 |
| DMF% (↓) | Min | 9.2 | 6.5 | 8.1 | 10.8 | 21.1 |
| DMF% (↓) | Avg | 21.7 | 17.8 | 24.5 | 24.0 | 32.9 |
| Fngpcosine (↑) | Max | 0.53 | 0.56 | 0.57 | 0.50 | 0.45 |
| Fngpcosine (↑) | Avg | 0.36 | 0.39 | 0.38 | 0.34 | 0.31 |
| MCSratio (↑) | Max | 0.68 | 0.70 | 0.72 | 0.66 | 0.57 |
| MCSratio (↑) | Avg | 0.51 | 0.53 | 0.55 | 0.50 | 0.43 |
| (↑) | Max | 0.55 | 0.58 | 0.60 | 0.53 | 0.44 |
| (↑) | Avg | 0.38 | 0.39 | 0.41 | 0.36 | 0.30 |
| MCScoef (↑) | Max | 0.71 | 0.73 | 0.74 | 0.69 | 0.63 |
| MCScoef (↑) | Avg | 0.54 | 0.55 | 0.58 | 0.53 | 0.48 |
Evaluation metrics for the entire test set and for the test-set partitions in which all 4, only 3, only 2, or only 1 spectrum is available. The arrows indicate the desired direction for each metric (↑: higher is better; ↓: lower is better).
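The four partitions together cover the full test set (413 + 65 + 483 + 39 = 1000), so the full-dataset column should be close to the case-count-weighted mean of the partition columns. The short sketch below checks this for the two percentage rows. It assumes these percentages are per-case averages; the variable and function names are illustrative and are not taken from the original work.

```python
# Partition sizes and per-partition values copied from Table 1.
counts = {"4 spectra": 413, "3 spectra": 65, "2 spectra": 483, "1 spectrum": 39}
correct_molecules = {"4 spectra": 9.2, "3 spectra": 15.2, "2 spectra": 4.1, "1 spectrum": 5.1}
correct_formulas = {"4 spectra": 45.1, "3 spectra": 46.9, "2 spectra": 34.8, "1 spectrum": 20.5}


def weighted_mean(values, weights):
    """Mean of `values` weighted by the number of test cases per partition."""
    total = sum(weights.values())
    return sum(values[key] * weights[key] for key in weights) / total


# Assumption: each percentage is an average over the cases in its partition,
# so weighting by partition size should recover the full-dataset value.
molecules_full = weighted_mean(correct_molecules, counts)
formulas_full = weighted_mean(correct_formulas, counts)

print(round(molecules_full, 1), round(formulas_full, 1))  # 7.0 39.3
```

Both results agree with the full-dataset column (7.0 and 39.3) to within rounding. The weighting also shows that the overall figures are dominated by the 4-spectra and 2-spectra partitions, which together account for 896 of the 1000 test cases.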