Table 1. Glossary of Technical Terms and Concepts Relevant to GPR
Covariance | A measure of the strength of the statistical correlation between two data values, y(x) and y(x′), usually expressed as a function of the distance between x and x′. Uncorrelated data lead to zero covariance. |
Descriptor | In the context of regression, descriptors (sometimes called “features”) encode the independent variables into a vector, x, on which the modeled variable, y, depends. |
Hyperparameter | A global parameter of an ML model that controls the behavior of the fit. Distinct from the potentially very large number of “free parameters” that are determined when the model is fitted to the data. Hyperparameters are estimated from experience or iteratively optimized using data. |
Kernel | A similarity measure between two data points, normally denoted k(x, x′). Used to construct models of covariance. |
Overfitting | A fit that is accurate for the input data but has uncontrolled errors elsewhere (typically because it has not been regularized appropriately). |
Prior | A formal quantification, as a probability distribution, of our initial knowledge or assumption about the behavior of the model, before the model is fitted to any data. |
Regularity | Here, we take a function to be regular if all of its derivatives have moderate bounds. Loosely interchangeable with “degree of smoothness”. |
Regularization | Techniques to enforce the regularity of fitted functions. In the context of GPR, this is achieved by penalizing solutions that have large basis coefficient values. The magnitude of the regularization may be taken to correspond to the “expected error” of the fit. |
Sparsity | In the context of GPR, a sparse model is one in which there are far fewer kernel basis functions than input data points, and the locations of these basis functions (which we call the representative set) need not coincide with the input data locations. |
Underfitting | A fit that does not reach the accuracy, on either the training or the test data, that could be achieved with a better choice of hyperparameters. |
These definitions do not yet refer to physical properties, but they will be used in subsequent sections. For a comprehensive introduction to GPR, we refer the reader to ref (39).
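To see how several of the glossary entries relate to one another (kernel, hyperparameter, regularization, sparsity), a minimal numerical sketch may help. It is an illustration under stated assumptions rather than the method of any particular code: it assumes one-dimensional descriptors, a squared-exponential kernel, synthetic noisy sine data, and a subset-of-regressors form for the sparse fit. All function and variable names (`kernel`, `fit_full`, `fit_sparse`, `x_rep`, and so on) are hypothetical.

```python
import numpy as np


def kernel(xa, xb, length_scale):
    """Squared-exponential kernel k(x, x') between two sets of 1D descriptors."""
    diff = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def fit_full(x, y, sigma, length_scale):
    """Full GPR: one basis function per data point, c = (K + sigma^2 I)^-1 y."""
    K = kernel(x, x, length_scale)
    return np.linalg.solve(K + sigma**2 * np.eye(len(x)), y)


def fit_sparse(x, y, x_rep, sigma, length_scale):
    """Sparse GPR (assumed subset-of-regressors form), basis functions at x_rep."""
    K_mm = kernel(x_rep, x_rep, length_scale)
    K_mn = kernel(x_rep, x, length_scale)
    return np.linalg.solve(K_mn @ K_mn.T + sigma**2 * K_mm, K_mn @ y)


def predict(x_new, x_basis, coeffs, length_scale):
    """Model prediction: a sum of kernel basis functions weighted by coeffs."""
    return kernel(x_new, x_basis, length_scale) @ coeffs


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic data (an assumption for illustration): a noisy sine curve.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 6.0, 40)
y_train = np.sin(x_train) + 0.05 * rng.normal(size=x_train.size)
x_test = np.linspace(0.2, 5.8, 50)

# Hyperparameters: kernel length scale and regularization strength.
LENGTH_SCALE = 1.0
SIGMA = 0.05

# Full model: 40 basis functions, located at the training points.
c_full = fit_full(x_train, y_train, SIGMA, LENGTH_SCALE)
rmse_full = rmse(predict(x_test, x_train, c_full, LENGTH_SCALE), np.sin(x_test))

# Sparse model: 8 basis functions at a representative set that need not
# coincide with the training locations.
x_rep = np.linspace(0.0, 6.0, 8)
c_sparse = fit_sparse(x_train, y_train, x_rep, SIGMA, LENGTH_SCALE)
rmse_sparse = rmse(predict(x_test, x_rep, c_sparse, LENGTH_SCALE), np.sin(x_test))

# Stronger regularization penalizes large basis coefficients.
c_strong = fit_full(x_train, y_train, 10 * SIGMA, LENGTH_SCALE)

print("RMSE, full model:  ", rmse_full)
print("RMSE, sparse model:", rmse_sparse)
print("coefficient norm, sigma = 0.05:", np.linalg.norm(c_full))
print("coefficient norm, sigma = 0.5: ", np.linalg.norm(c_strong))
```

In this sketch the sparse model uses five times fewer basis functions than the full one, and raising the regularization strength shrinks the norm of the fitted coefficient vector, which is the sense in which regularization penalizes large basis coefficients.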