Table 1.
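Top SNPs | Random forest accuracy (%) | Random forest accuracy STD (±) | Random forest AUC | Random forest AUC STD (±) | XGBoost accuracy (%) | XGBoost accuracy STD (±) | XGBoost AUC | XGBoost AUC STD (±) | CNN accuracy (%) | CNN accuracy STD (±) | CNN AUC | CNN AUC STD (±) |
---|---|---|---|---|---|---|---|---|---|---|---|---|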
100 | 66.46 | 7.79 | 0.7137 | 0.0576 | 70.24 | 2.80 | 0.7266 | 0.0281 | 68.29 | 2.87 | 0.7216 | 0.0603 |
200 | 67.18 | 3.88 | 0.7175 | 0.0386 | 67.99 | 1.52 | 0.7166 | 0.0245 | 69.52 | 5.08 | 0.7182 | 0.0494 |
300 | 66.26 | 3.80 | 0.7098 | 0.0377 | 68.20 | 3.32 | 0.7029 | 0.0272 | 70.64 | 2.20 | 0.7250 | 0.0585 |
400 | 67.58 | 4.67 | 0.7074 | 0.0428 | 69.42 | 3.43 | 0.7177 | 0.0234 | 67.99 | 4.65 | 0.7167 | 0.0412 |
500 | 67.59 | 7.79 | 0.7111 | 0.0457 | 71.05 | 2.56 | 0.7381 | 0.0325 | 71.56 | 6.58 | 0.7411 | 0.0614 |
1000 | 68.31 | 5.22 | 0.7178 | 0.0445 | 73.08 | 2.89 | 0.7407 | 0.0372 | 73.91 | 3.87 | 0.7741 | 0.0444 |
2000 | 68.70 | 3.13 | 0.7372 | 0.0424 | 72.48 | 2.61 | 0.7509 | 0.0365 | 73.29 | 2.77 | 0.7782 | 0.0409 |
3000 | 67.78 | 3.59 | 0.7282 | 0.0351 | 69.62 | 4.27 | 0.7376 | 0.0328 | 73.80 | 2.40 | 0.7862 | 0.0282 |
4000 | 68.19 | 4.69 | 0.7263 | 0.0469 | 71.15 | 4.07 | 0.7412 | 0.0368 | 75.02 | 3.17 | 0.8157 | 0.0261 |
5000 | 66.25 | 5.41 | 0.7105 | 0.0399 | 70.74 | 3.14 | 0.7330 | 0.0305 | 73.19 | 4.72 | 0.8003 | 0.0506 |
10 000 | 66.26 | 5.59 | 0.6919 | 0.0528 | 69.63 | 3.27 | 0.7248 | 0.0211 | 71.05 | 6.57 | 0.7083 | 0.1424 |
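Notes: For each number of top SNPs selected by phenotype influence score, the table reports the accuracy (%) and AUC of 10-fold cross-validation for AD classification, each with its standard deviation (STD). Our CNN-based approach yielded the highest accuracy and AUC, 75.02% and 0.8157 respectively, with 4000 SNPs. The CNN models outperformed the two traditional machine learning models, random forest and XGBoost, on both metrics in most settings, including every setting from 500 to 5000 SNPs; XGBoost scored higher on both metrics at 100 and 400 SNPs and on AUC at 10 000 SNPs.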
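
As an illustration of how per-fold results could be condensed into the mean and STD columns above, the following is a minimal sketch, not the authors' pipeline. It assumes the STD is the sample standard deviation over the 10 folds (the source does not say which estimator was used), and the helper names `roc_auc` and `summarize_folds` as well as the fold values are hypothetical.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample is scored
    above a random negative sample (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_folds(values):
    """Return (mean, sample standard deviation) of per-fold metric values.
    Assumption: sample STD (n - 1 denominator); the paper may differ."""
    return mean(values), stdev(values)


# Hypothetical per-fold accuracies (%), NOT taken from the paper.
fold_accuracy = [70.0, 72.0, 74.0, 76.0, 78.0, 70.0, 72.0, 74.0, 76.0, 78.0]
acc_mean, acc_std = summarize_folds(fold_accuracy)
print(f"accuracy = {acc_mean:.2f} ± {acc_std:.2f}")

# AUC for one hypothetical fold: true labels and predicted scores.
fold_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"fold AUC = {fold_auc:.4f}")
```

In practice the same summary would be computed once per model and per SNP count, with the AUC evaluated on each held-out fold before averaging.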