Figure 1.

Each pie chart represents one substance group. The number in the center denotes the group size, i.e., the number of substances in the chemical group; green denotes substances with an available structure and black denotes substances without one. The groups are arranged according to the percentage of substances with an available structure. Groups labelled in red comprised fewer than 10 substances with an available structure and were pooled into a separate class termed “miscellaneous chemistry” for modelling purposes. All other groups were included in the model as their own class and had structural coverage ranging from 28% to 100%, with a median of 80%.