The benefits and pitfalls of machine learning for biomarker discovery

2023 Jul 27;394(1):17–31. doi: 10.1007/s00441-023-03816-z

Method	References
Linear regression: this method can be used to model the relationship between transcriptomic features and a continuous outcome, such as a disease activity score or a surrogate marker of disease activity (e.g. blood lipid levels)	Inouye et al. (2010)
Logistic regression: this method can be used to model the relationship between transcriptomic features and a binary outcome, such as the presence or absence of a disease. For example, Chanrion et al. used logistic regression to predict recurrence of tamoxifen-treated primary breast cancer based on gene expression	Chanrion et al. (2008); Chadeau-Hyam et al. (2013)
Support vector machines (SVM): SVM is a kernel-based machine learning algorithm that can be used for classification or regression tasks with transcriptomic features	Huang et al. (2018)
Random forest: random forest is an ensemble machine learning method that creates many decision trees and combines their predictions to make a final prediction. It can be used for both regression and classification problems	Boulesteix et al. (2012)
Gradient boosting: this is an ensemble method that trains decision trees sequentially, with each new tree fitted to correct the errors of the trees built before it	Ma et al. (2020)
Feedforward neural networks: this is a type of neural network that consists of an input layer, one or more hidden layers and an output layer. The input layer takes the transcriptomic features as input, and the output layer produces a prediction, e.g. of disease activity	Yu et al. (2019)
Convolutional neural networks (CNNs): CNNs are neural networks that are particularly well suited for analysing images and other grid-structured data, such as genomic data	Yu et al. (2019)
Recurrent neural networks (RNNs): RNNs are neural networks designed to handle sequential data, such as time series. They have been applied to transcriptomic data for tasks such as gene expression prediction and disease diagnosis	Yu et al. (2019)
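
To make the logistic regression entry in the table above more concrete, the following is a minimal sketch of a binary classifier trained on transcriptomic-style features. It is not the procedure used in any of the cited studies. The data are synthetic, the function names (`make_synthetic_expression`, `fit_logistic`, `predict`) are hypothetical, and the choice of two informative "genes", the learning rate and the L2 penalty are assumptions made purely for illustration. It uses only the Python standard library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_expression(n_samples=200, n_genes=5, seed=0):
    # Simulated, standardised expression values (an assumption, not real data).
    # Only the first two "genes" influence the binary outcome.
    rng = random.Random(seed)
    true_w = [2.0, -2.0] + [0.0] * (n_genes - 2)
    X, y = [], []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        p = sigmoid(sum(w * v for w, v in zip(true_w, x)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


def fit_logistic(X, y, lr=0.1, epochs=500, l2=0.01):
    # Batch gradient descent on the L2-penalised logistic loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    # Classify as 1 when the predicted probability is at least 0.5.
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


# Usage: fit on 150 samples, evaluate on the 50 held-out samples.
X, y = make_synthetic_expression()
w, b = fit_logistic(X[:150], y[:150])
held_out = predict(X[150:], w, b)
accuracy = sum(p == t for p, t in zip(held_out, y[150:])) / len(held_out)
print("coefficients:", [round(wj, 2) for wj in w])
print("held-out accuracy:", round(accuracy, 2))
```

Evaluating on held-out samples rather than the training set matters for biomarker work, where the number of features typically far exceeds the number of samples and in-sample accuracy is optimistic. In practice an established library implementation would usually be preferred over hand-written gradient descent.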