Table 3.
Prediction performance for combinations of data sources using the bag of features + Node2vec and random forest algorithms.
| Feature types | AUROC<sup>a</sup> (%): base feature set | AUROC<sup>a</sup> (%): with genetic information |
|---|---|---|
| 1 feature type | | |
| G<sup>b</sup> | 73.12 | 90.89 |
| D<sup>c</sup> | 65.01 | 88.37 |
| H<sup>d</sup> | 91.00 | 95.80 |
| L<sup>e</sup> | 72.83 | 89.94 |
| M<sup>f</sup> | 73.21 | 90.92 |
| 2 feature types | | |
| DH | 91.55 | 96.09 |
| DL | 77.09 | 90.88 |
| DM | 91.30 | 95.92 |
| HL | 71.53 | 89.02 |
| MH | 91.22 | 95.75 |
| ML | 91.98 | 96.01 |
| 3 feature types | | |
| DHL | 76.76 | 91.28 |
| DMH | 91.76 | 96.56 |
| DML | 91.43 | 95.76 |
| MHL | 91.74 | 96.19 |
| 4 feature types | | |
| DMHL | 73.12 | 90.89 |
<sup>a</sup>AUROC: area under the receiver operating characteristic curve.
<sup>b</sup>G: genetic information.
<sup>c</sup>D: diagnosis.
<sup>d</sup>H: family historical records.
<sup>e</sup>L: lab test.
<sup>f</sup>M: medication.