2020 Feb 26;10:3485. doi: 10.1038/s41598-020-60595-1

Table 1.

Summary of six studies for predicting AD using blood gene expression data.

Study	Data source / number of AD and CN samples (training and test datasets)	Feature selection method (data used for feature selection)	Classification method	Number of selected features	Performance
Booij et al.¹⁵	Not publicly available (Norway)/126 AD and 126 CN (Randomly dividing all data into training and test datasets by 3:1 ratio)	Jack-knife (training data)	PLSR	1239 genes	ACC: 0.87 AUC: 0.94
Lunnon et al.¹⁶	ANM/104 AD and 104 CN (Randomly dividing AD and CN data into training and test datasets by 3:1 ratio)	t-test RF with Meng score and backward elimination (training data)	RF	50 probes	ACC: 0.75
Sood et al.¹⁷	ANM1 and ANM2/49 AD and 64 CN, 40 AD and 71 CN (LOOCV)	Bayesian statistic (ULSAM Ageing data GEO:GSE60862)	kNN	150 probes	AUC: 0.73 (ANM1) AUC: 0.66 (ANM2)
Voyle et al.¹⁸	ANM1 and ANM2 + DCR/100 AD and 107 CN, 118 AD and 118 CN (ANM1 for training, ANM2 + DCR for test)	RFE and pickSizeTolerance (training data)	RF	13 probes (12 genes)	ACC: 0.657 AUC: 0.724
Li et al.¹⁹	ANM1 and ANM2/145 AD and 104 CN, 140 AD and 135 CN (ANM1 for training, ANM2 for test and vice versa)	Ref-REO (Training data)	Not described	1,145 gene pairs (ANM1: training data) 1,249 gene pairs (ANM2: training data)	AUC: 0.733 (ANM2: test set) AUC: 0.775 (ANM1: test set)
Li et al.²⁰	ANM1 and ANM2/143 AD and 104 CN, 102 AD and 78 CN (ANM1 for training, ANM2 for test and vice versa)	LASSO regression (ANM1 and ANM2)	Majority voting of SVM, RR and RF	6 genes (Full6set)	AUC: 0.866 (ANM2: test set) AUC: 0.864 (ANM1: test set)

AD: Alzheimer’s disease; CN: healthy control; PLSR: partial least squares regression; ACC: accuracy; AUC: area under the curve; ANM: AddNeuroMed; RF: random forest; kNN: k-nearest neighbors; RFE: recursive feature elimination; pickSizeTolerance: a function in the caret package²⁹; ULSAM: the Uppsala Longitudinal Study of Adult Men; LOOCV: leave-one-out cross-validation; LASSO: least absolute shrinkage and selection operator; SVM: support vector machine; RR: logistic ridge regression.
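
The two performance measures reported in the table, ACC and AUC, can be illustrated with a minimal Python sketch. It is not taken from any of the six studies. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions chosen for illustration only. AUC is computed here in its rank-based form, as the probability that a randomly chosen AD sample receives a higher score than a randomly chosen CN sample.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label.

    labels: 1 for AD, 0 for CN. The 0.5 threshold is an illustrative assumption.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels, scores):
    """Rank-based AUC: share of (AD, CN) pairs in which the AD sample
    has the higher score. Tied pairs count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one AD and one CN sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three AD samples followed by three CN samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

acc_value = accuracy(labels, scores)  # 4 of 6 samples classified correctly
auc_value = auc(labels, scores)       # 7 of 9 AD/CN pairs ranked correctly
```

ACC depends on the chosen threshold, whereas AUC depends only on how the scores rank the samples. For that reason the two measures reported for a study in the table need not move together.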