2022 May 26;48(5):949–957. doi: 10.1093/schbul/sbac038

Table 1.

Glossary of technical terms

Artificial Intelligence (AI): AI is a general term for computer systems that exhibit intelligent behavior and are able to learn, explain, and advise their users.
Machine Learning (ML): ML is a subset of AI that harnesses statistical algorithms to learn features of data and the associated importance of each (or simply the associated importance of user-defined features). Once the features and their weights have been learned and the hyperparameters have been set, the model can predict some outcome or clinical classification on new, unseen data.
Natural Language Processing (NLP): NLP is another type of AI that incorporates both statistical and linguistic knowledge to understand human language.
Features: Measurable properties of the data (eg, the number of words spoken is a feature that can be measured from natural language data).
Hyperparameters: Parameters that are set before the final training of a model rather than learned during the process (eg, the number of training iterations).
Classification: A category of ML models that are trained to predict a category. It can be binary (eg, mentally ill or healthy) or multiclass (eg, classifying speech as “schizophrenic-like”, “manic-like”, or non-disordered).
Regression: A category of ML models that are trained to predict a numerical, continuous-valued output (eg, a clinical rating).
Neural Network: A type of ML model that is a system of nodes arranged in layers. Each node applies a nonlinear function to its inputs, with parameters learned from the training data, and when all nodes are combined, a categorical or real-valued output can be predicted. Modern neural networks have hundreds to thousands of nodes and layers and are trained on large datasets.
K-Means Clustering: An unsupervised method of partitioning a dataset into K clusters with the goal of minimizing within-cluster variance. Each data point belongs to a single cluster, determined by its nearest cluster mean (centroid).
Edge Case: Any situation that occurs near a decision boundary or at the extremes of the inputs, that is an exception to a learned rule, or that may otherwise require additional or special handling.
Overfitting: A situation where a model too closely fits its training data. This is an issue because the model may learn spurious correlations in the data rather than a generalized solution to the problem itself.
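
As an illustration of the K-means entry, the following is a minimal Python sketch and not a reference implementation. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The function name `k_means`, the choice to seed the centroids with the first K points, and the toy data are assumptions made for this example only; practical implementations usually choose starting centroids at random and stop once the assignments no longer change. In the glossary's terms, `k` and `n_iterations` are hyperparameters, and each coordinate of a point is a feature.

```python
from math import dist


def k_means(points, k, n_iterations=10):
    """Partition `points` (a list of equal-length tuples) into k clusters."""
    # Assumption: seed the centroids with the first k points so that the
    # sketch is deterministic; real implementations usually seed randomly.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical toy data: two invented features per sample, two obvious groups.
samples = [
    (1.0, 1.0), (8.0, 8.0),
    (1.2, 0.8), (8.2, 7.9),
    (0.9, 1.1), (7.8, 8.1),
]
labels, centroids = k_means(samples, k=2)
```

With this toy input, the points near (1, 1) receive one label and the points near (8, 8) receive the other. Because the result depends on the starting centroids, a different ordering of the data can lead to a different partition.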