Overview of central concepts in Gaussian process regression (GPR)
machine-learning models of atomistic properties. Left: The models
discussed in the present review are based on atomistic structure,
and they therefore require a suitable representation of atomic environments
up to a cutoff radius. The neighborhood is “encoded” in
a descriptor vector, ξ, and a kernel function, k, evaluates the similarity of two atomic
environments. Center: In the regression task, the goal is to infer
an unknown function from a limited number of observations or input
data (section 2). In GPR, the result is a function with quantifiable
uncertainty. Right:
Applications of GPR. There are two main classes within the scope of
the present review. The first class of applications is the fitting
of atomic properties (section 3): these can be scalar, such as the isotropic chemical shift
in NMR, δiso, or vectors or higher-order tensors,
such as the polarizability, α. The second class
of applications is the construction of interatomic potentials or force
fields (section 4), which describe atomic energies, εi, as well as
interatomic forces, Fi. All these properties are fitted
as functions of the descriptor, ξ. The drawings
on the left are adapted from ref (18). Adapted by permission of The Royal Society
of Chemistry. Copyright 2020 The Royal Society of Chemistry.
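
The regression step summarized in the center panel can be illustrated with a minimal sketch. It is not the implementation used in the works reviewed here. It assumes a squared-exponential kernel, a one-dimensional toy "descriptor", and hypothetical names (`sq_exp_kernel`, `gpr_predict`, `length_scale`, `noise`) chosen only for illustration. It shows how a kernel k compares descriptor vectors ξ and how GPR returns both a predicted value and an uncertainty.

```python
import numpy as np

def sq_exp_kernel(xi_a, xi_b, length_scale=1.0):
    """Similarity k(xi_a, xi_b) between two sets of descriptor vectors.

    Assumption: a squared-exponential kernel, chosen here for simplicity.
    """
    a = np.atleast_2d(xi_a)
    b = np.atleast_2d(xi_b)
    # Pairwise squared Euclidean distances between descriptor vectors.
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gpr_predict(xi_train, y_train, xi_test, noise=1e-2, length_scale=1.0):
    """Return the GPR predictive mean and standard deviation at xi_test."""
    K = sq_exp_kernel(xi_train, xi_train, length_scale)
    K_s = sq_exp_kernel(xi_test, xi_train, length_scale)
    K_ss = sq_exp_kernel(xi_test, xi_test, length_scale)
    # Cholesky factorization of the regularized kernel matrix.
    L = np.linalg.cholesky(K + noise**2 * np.eye(len(K)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Toy data: a 1D "descriptor" and a scalar property to be learned.
xi_train = np.linspace(0.0, 5.0, 8)[:, None]
y_train = np.sin(xi_train).ravel()

# One test point inside the sampled region, one far outside it.
xi_test = np.array([[2.5], [20.0]])
mean, std = gpr_predict(xi_train, y_train, xi_test)
print("predicted mean:", mean)
print("predicted std: ", std)
```

In this sketch, the predicted standard deviation is small near the training data and approaches the prior value far from it, which is the sense in which the regression result carries a quantifiable uncertainty.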