Table 2. Score with the BLEU Cumulative 4-Gram Metric Based on the Characteristics of the Molecules and Their Annotations<sup>a</sup>

| Number of Terms (term decomposition) | 0–10 | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 |
|---|---|---|---|---|---|---|
| 4-gram score | 0.59 | 0.73 | 0.78 | 0.79 | 0.75 | 0.66 |

| Atom Height Difference (distance) | <0.5 Å | <1.0 Å | <1.5 Å | ≥1.5 Å |
|---|---|---|---|---|
| 4-gram score | 0.79 | 0.62 | 0.62 | 0.50 |

<sup>a</sup>The top section groups the scores by the number of terms into which the IUPAC name is decomposed. The bottom section groups the test set scores by the maximum difference in height among the atoms in the molecule (excluding hydrogens).
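The grouping described in the footnote can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. All function names are hypothetical, and several details are assumptions the table does not settle: the BLEU variant (single reference, uniform weights over 1- to 4-grams, standard brevity penalty, no smoothing), the bin boundaries (half-open, so a name with exactly 10 terms falls in 10–20), and the reading of the height columns as consecutive rather than cumulative ranges.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(reference, candidate):
    """Cumulative 4-gram BLEU for one reference/candidate pair.

    Assumption: uniform weights (0.25 each), no smoothing, so any n-gram
    order with zero matches gives a score of 0.
    """
    if not candidate:
        return 0.0
    log_sum = 0.0
    for n in range(1, 5):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0
        log_sum += 0.25 * math.log(matched / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_sum)


def height_span(atoms):
    """Maximum height difference among non-hydrogen atoms.

    `atoms` is a list of (element, height_in_angstrom) pairs (hypothetical format).
    """
    heights = [h for element, h in atoms if element != "H"]
    return max(heights) - min(heights) if heights else 0.0


def term_bin(n_terms, width=10):
    """Label for the term-count bin; assumes half-open bins [lo, lo + width)."""
    lo = (n_terms // width) * width
    return f"{lo}–{lo + width}"


def height_bin(span):
    """Label for the height bin; assumes consecutive, non-cumulative ranges."""
    for edge in (0.5, 1.0, 1.5):
        if span < edge:
            return f"<{edge} Å"
    return "≥1.5 Å"


def mean_by_bin(samples, key):
    """Average the per-sample 'score' within each bin returned by `key`."""
    groups = defaultdict(list)
    for sample in samples:
        groups[key(sample)].append(sample["score"])
    return {label: sum(v) / len(v) for label, v in groups.items()}


# Toy usage with made-up tokens (not data from the paper).
reference = ["2", "-", "methyl", "propan", "-", "1", "-", "ol"]
toy_samples = [{"n_terms": len(reference), "score": bleu4(reference, reference)}]
per_bin = mean_by_bin(toy_samples, key=lambda s: term_bin(s["n_terms"]))
```

Whether scores are averaged per sample within each bin, as here, or computed at the corpus level per bin is not stated in the table, so the averaging step is likewise an assumption.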