|
Linear regression: this method can be used to model the relationship between transcriptomic features and a continuous outcome, such as a disease activity score or a surrogate marker of disease activity, e.g. blood lipid levels |
Inouye et al. (2010) |
|
Logistic regression: this method can be used to model the relationship between transcriptomic features and a binary outcome, such as the presence or absence of a disease. For example, Chanrion et al. used logistic regression to predict recurrence of tamoxifen-treated primary breast cancer based on gene expression |
Chanrion et al. (2008); Chadeau-Hyam et al. (2013) |
|
Support vector machines (SVMs): an SVM is a kernel-based machine learning algorithm that can be used for classification or regression with transcriptomic features |
Huang et al. (2018) |
|
Random forest: a random forest is an ensemble machine learning method that builds many decision trees and combines their predictions, by majority vote for classification or by averaging for regression, into a final prediction |
Boulesteix et al. (2012) |
|
Gradient boosting: this is an ensemble method that trains decision trees sequentially, with each new tree fitted to correct the errors of the trees before it |
Ma et al. (2020) |
|
Feedforward neural networks: a feedforward neural network consists of an input layer, one or more hidden layers and an output layer. The input layer takes the transcriptomic features as input, and the output layer produces a prediction, e.g. of disease activity |
Yu et al. (2019) |
|
Convolutional neural networks (CNNs): a CNN is a type of neural network that is particularly well suited to analysing images and other grid-structured data, such as genomic data |
Yu et al. (2019) |
|
Recurrent neural networks (RNNs): an RNN is a type of neural network designed to handle sequential data, such as time series. RNNs have been applied to transcriptomic data for tasks such as gene expression prediction and disease diagnosis |
Yu et al. (2019) |
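
To make the logistic regression entry above more concrete, the following is a minimal sketch of fitting a binary classifier to expression-like features. It is not the procedure used in any of the cited studies: the data are simulated, and the function names (`simulate_expression`, `fit_logistic`, `predict_proba`), the learning rate, the number of epochs and the train/test split are all assumptions chosen for illustration. In practice a tested library implementation with regularisation and cross-validation would normally be preferred.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate_expression(n_samples=200, n_genes=5, seed=0):
    """Hypothetical toy data: only gene 0 is shifted upwards in cases (label 1)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate controls (0) and cases (1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        row[0] += 3.0 * label  # assumed effect size, for illustration only
        X.append(row)
        y.append(label)
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: predicted probability minus label
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Predicted probability of label 1 for each sample."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


X, y = simulate_expression()
X_train, y_train = X[:150], y[:150]  # simple hold-out split
X_test, y_test = X[150:], y[150:]
w, b = fit_logistic(X_train, y_train)
probs = predict_proba(X_test, w, b)
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The fitted weights indicate how strongly each simulated gene contributes to the predicted log-odds of the outcome. Real transcriptomic data usually has far more genes than samples, so a penalised variant (for example lasso or elastic net) is generally needed to avoid overfitting.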