2023 Jun 23;6:132. doi: 10.1038/s42004-023-00932-3

Table 1.

Effect of missing spectra in the model input.

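| metric | statistic | full dataset | 4 spectra | 3 spectra | 2 spectra | 1 spectrum |
|---|---|---|---|---|---|---|
| # test cases | | 1000 | 413 | 65 | 483 | 39 |
| Avg. MW | | 275.3 | 287.5 | 242.6 | 267.4 | 300.3 |
| Avg. SMILES length | | 34.5 | 37.0 | 28.5 | 32.5 | 43.6 |
| correct molecules (↑) (%) | | 7.0 | 9.2 | 15.2 | 4.1 | 5.1 |
| correct formulas (↑) (%) | | 39.3 | 45.1 | 46.9 | 34.8 | 20.5 |
| DMW% (↓) | Min | 2.3 | 1.6 | 0.5 | 2.4 | 9.5 |
| | Avg | 6.3 | 5.5 | 3.9 | 6.6 | 14.6 |
| DMF% (↓) | Min | 9.2 | 6.5 | 8.1 | 10.8 | 21.1 |
| | Avg | 21.7 | 17.8 | 24.5 | 24.0 | 32.9 |
| Fngpcosine (↑) | Max | 0.53 | 0.56 | 0.57 | 0.50 | 0.45 |
| | Avg | 0.36 | 0.39 | 0.38 | 0.34 | 0.31 |
| MCSratio (↑) | Max | 0.68 | 0.70 | 0.72 | 0.66 | 0.57 |
| | Avg | 0.51 | 0.53 | 0.55 | 0.50 | 0.43 |
| MCStan (↑) | Max | 0.55 | 0.58 | 0.60 | 0.53 | 0.44 |
| | Avg | 0.38 | 0.39 | 0.41 | 0.36 | 0.30 |
| MCScoef (↑) | Max | 0.71 | 0.73 | 0.74 | 0.69 | 0.63 |
| | Avg | 0.54 | 0.55 | 0.58 | 0.53 | 0.48 |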

Evaluation metrics for the entire test set and for the test-set partitions in which all 4, only 3, only 2, or only 1 spectrum is available. The arrows show the desired trend for each metric.
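
As a consistency check on the table, the full-dataset percentages should equal the size-weighted average of the four partitions, assuming the partitions are disjoint and together cover the whole test set. The sketch below uses only numbers copied from Table 1. The helper name `weighted_average` and the dictionary layout are illustrative choices, not part of the original work.

```python
# Partition sizes and per-partition percentages copied from Table 1.
sizes = {"4 spectra": 413, "3 spectra": 65, "2 spectra": 483, "1 spectrum": 39}
correct_molecules = {"4 spectra": 9.2, "3 spectra": 15.2, "2 spectra": 4.1, "1 spectrum": 5.1}
correct_formulas = {"4 spectra": 45.1, "3 spectra": 46.9, "2 spectra": 34.8, "1 spectrum": 20.5}


def weighted_average(values, weights):
    """Average of per-partition values, weighted by partition size (hypothetical helper)."""
    total = sum(weights.values())
    return sum(values[name] * weights[name] for name in weights) / total


total_cases = sum(sizes.values())
mol = weighted_average(correct_molecules, sizes)
form = weighted_average(correct_formulas, sizes)

# Compare against the "full dataset" column: 1000 cases, 7.0 %, 39.3 %.
print(total_cases, round(mol, 1), round(form, 1))  # prints: 1000 7.0 39.3
```

The recomputed values match the full-dataset column to one decimal place, which is consistent with reading the four partition columns as a complete split of the 1000 test cases. The same check does not apply to the Min and Max rows, since those are not averages over test cases.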