2023 Mar 17;227(Suppl 1):S48–S57. doi: 10.1093/infdis/jiac293

Table 1.

Machine Learning Approaches for Clinical Phenotyping

Type of Machine Learning | Algorithms | Description
Supervised | | Machine learning methods that train predictive models on labeled data sets.
Categorical classification | Logistic regression, decision trees, support vector machine, k-nearest neighbors | Prediction of categorical outcomes using linear or nonlinear combinations of input data.
Continuous prediction | Linear regression, ridge/LASSO models, support vector machine, Gaussian process regression | Prediction of continuous outcomes. Input data can be continuous or categorical.
Unsupervised | | Machine learning methods that identify patterns in unlabeled data sets.
Clustering | k-means, fuzzy c-means, expectation maximization, DBSCAN, Gaussian mixture model, hierarchical methods (divisive, agglomerative) | Machine learning approaches that group unlabeled data based on similar patterns of features to identify latent classes and predict class membership of new data.
Dimensionality reduction | Principal component analysis, factor analysis, linear discriminant analysis, backward feature elimination, random forests | Methods that transform data sets with many features into lower-dimensional forms by selecting important features or combining features to capture variance in the data set while preserving data relationships as much as possible.
Deep learning | Multilayer perceptron, convolutional neural networks, recurrent neural networks, autoencoders | Machine learning methods that train artificial neural networks for representation learning. They use layers of nodes and edges, resembling a simplified layered biological neural network, to learn patterns or associations in large data sets by generating predictions from the input data and comparing them with ground truth annotations. The activation of a node or “neuron” depends on a weighted combination of inputs from the previous layer.
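
The last row's statement that a node's activation depends on a weighted combination of inputs from the previous layer can be illustrated with a minimal sketch. This is not taken from the article: the function names (`sigmoid`, `layer_forward`), the choice of a logistic activation, the bias terms, and all numeric values are assumptions made purely for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute the activations of one fully connected layer.

    weights[j][i] is the (hypothetical) weight on the edge from
    input i in the previous layer to node j in this layer.
    """
    activations = []
    for node_weights, bias in zip(weights, biases):
        # Weighted combination of the previous layer's outputs, plus a bias.
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        activations.append(sigmoid(z))
    return activations


# Illustrative values only; they do not come from any data set.
previous_layer = [0.5, -1.0, 2.0]
weights = [
    [0.2, 0.4, -0.1],   # edges into node 0
    [-0.3, 0.1, 0.5],   # edges into node 1
]
biases = [0.0, 0.1]

hidden = layer_forward(previous_layer, weights, biases)
print(hidden)
```

In a trained network the weights and biases would be adjusted by comparing predictions with ground truth annotations, as the table describes; here they are fixed by hand so that only the forward computation is shown.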