Table 1.
Glossary of technical terms
Term | Definition |
---|---|
Artificial Intelligence (AI) | AI is a general term for computer systems that exhibit intelligent behavior and are able to learn, explain, and advise their users. |
Machine Learning (ML) | ML is a subset of AI that harnesses statistical algorithms to learn features of data and the associated importance of each (or simply the associated importance of user-defined features). Once the features and their weights, as well as any hyperparameters, are set, the model can predict an outcome or clinical classification on new, unseen data. |
Natural Language Processing (NLP) | NLP is another type of AI that incorporates both statistical and linguistic knowledge to understand human language. |
Features | A measurable property of the data (eg, the number of words spoken is a feature that can be measured from natural language data). |
Hyperparameters | Parameters that are set before the final training of a model rather than learned during training (eg, the number of training iterations). |
Classification | A category of ML models that are trained to predict a categorical label. It can be binary (eg, mentally ill or healthy) or multiclass (eg, classifying speech as “schizophrenic-like”, “manic-like”, or non-disordered). |
Regression | A category of ML models that are trained to predict a numerical, continuous-valued output (eg, a clinical rating). |
Neural Network | A type of ML model that is a system of nodes arranged in layers. Each node learns a nonlinear function of its inputs, and when all nodes are combined, a categorical or real-valued output can be predicted. Modern neural networks have hundreds to thousands of nodes and layers and are trained on large datasets. |
K-Means Clustering | An unsupervised method of partitioning a dataset into K clusters with the goal of minimizing within-cluster variance. Each data point belongs to a single cluster determined by its nearest mean or centroid. |
Edge Case | Any situation that occurs near a decision boundary or at the extremes of the inputs, that is an exception to a learned rule, or that otherwise requires additional or special handling. |
Overfitting | A situation where a model fits its training data too closely. This is an issue because the model may learn spurious correlations in the data rather than a generalized solution to the problem itself. |
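
To make the K-Means Clustering entry concrete, the following is a minimal sketch of the procedure in plain Python, not a reference implementation. The function names (`sq_dist`, `assign`, `kmeans`), the `max_iter` and `seed` parameters, and the toy two-dimensional feature vectors are all assumptions introduced here for illustration. Each point is assigned to its nearest centroid, each centroid is then moved to the mean of its assigned points, and the two steps repeat until the centroids stop changing.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, max_iter=100, seed=0):
    """Partition `points` into `k` clusters (illustrative sketch only)."""
    rng = random.Random(seed)
    # Initialize centroids as k distinct points drawn from the data.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # Move the centroid to the mean of its assigned points.
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Centroids did not move, so assignments are stable.
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical toy data: two well-separated groups of 2-D feature vectors.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Because the method is unsupervised, the cluster indices in `labels` carry no inherent meaning; only the grouping of points matters. In practice the result can depend on the initial centroids, so the procedure is often run several times with different seeds.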