Figure 2. Schematic representation of the reference dataset creation process for the Cu case study.
First, the atomic structure data (atom types, charges, and positions) are converted to the corresponding bispectra. The bispectra of different atoms are then combined to form the features for learning. For intra-atomic Hamiltonian elements (E), the bispectra of the individual atoms constitute the feature vector. For inter-atomic Hamiltonian elements (V), the bispectra are divided by the inter-atomic distances (R) to form the feature vector. The feature vectors, together with their corresponding reference output data, constitute the reference dataset.
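
The feature construction described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bispectra are assumed to be precomputed and supplied as an array, the function names `intra_atomic_features` and `inter_atomic_features` are hypothetical, and the caption does not state how the bispectra of the two atoms in a pair are combined, so concatenation is assumed here purely for illustration.

```python
import numpy as np


def intra_atomic_features(bispectra):
    """Features for intra-atomic elements (E): one row per atom,
    equal to that atom's own bispectrum."""
    return np.asarray(bispectra, dtype=float)


def inter_atomic_features(bispectra, positions, pairs):
    """Features for inter-atomic elements (V): the bispectra of the two
    atoms in each pair, divided by the inter-atomic distance R.

    Assumption: the two bispectra are concatenated before the division.
    The caption only says the bispectra are divided by R.
    """
    b = np.asarray(bispectra, dtype=float)
    pos = np.asarray(positions, dtype=float)
    features = []
    for i, j in pairs:
        r = np.linalg.norm(pos[i] - pos[j])  # inter-atomic distance R
        features.append(np.concatenate([b[i], b[j]]) / r)
    return np.array(features)


# Toy input: two atoms with made-up 2-component bispectra, 2 length units apart.
bispectra = [[1.0, 2.0], [3.0, 4.0]]
positions = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]

X_E = intra_atomic_features(bispectra)
X_V = inter_atomic_features(bispectra, positions, pairs=[(0, 1)])
```

Pairing each row of `X_E` or `X_V` with the corresponding reference Hamiltonian element would then give the reference dataset in the sense of the caption.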