Table 3.
Performance of model-based reasoning methods (using Word2Vec) combined with rule-based reasoning methods for syndrome pattern diagnosis of lung disease on the test and external data sets.
| Model | Data set | Accuracy, mean (95% CI) | Precision, mean (95% CI) | Recall, mean (95% CI) | F1 score, mean (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Word2Vec + RFᵃ | Test | 0.9131 (0.8990-0.9261) | 0.9934 (0.9814-0.9983) | 0.9628 (0.9538-0.9748) | 0.9774 (0.9644-0.9864) |
| Word2Vec + RFᵃ | External | 0.9040 (0.8903-0.9180) | 0.9657 (0.9547-0.9747) | 0.9580 (0.9501-0.9721) | 0.9617 (0.9477-0.9697) |
| Word2Vec + XGBoostᵇ | Test | 0.7703 (0.7583-0.7803) | 0.9666 (0.9556-0.9786) | 0.9044 (0.8924-0.9144) | 0.9333 (0.9233-0.9433) |
| Word2Vec + XGBoostᵇ | External | 0.7980 (0.7871-0.8112) | 0.9702 (0.9582-0.9812) | 0.9227 (0.9137-0.9337) | 0.9444 (0.9364-0.9544) |
| Word2Vec + KNNᶜ | Test | 0.8414 (0.8324-0.8534) | 0.9380 (0.9270-0.9502) | 0.9254 (0.9164-0.9334) | 0.9312 (0.9202-0.9432) |
| Word2Vec + KNNᶜ | External | 0.8521 (0.8403-0.8612) | 0.9441 (0.9321-0.9571) | 0.9373 (0.9263-0.9473) | 0.9446 (0.9306-0.9556) |
| Word2Vec + MLPᵈ | Test | 0.9052 (0.8930-0.9181) | 0.9751 (0.9621-0.9830) | 0.9758 (0.9678-0.9858) | 0.9752 (0.9652-0.9862) |
| Word2Vec + MLPᵈ | External | 0.9021 (0.8940-0.9151) | 0.9791 (0.9671-0.9911) | 0.9780 (0.9660-0.9904) | 0.9784 (0.9704-0.9904) |
| Word2Vec + CNNᵉ | Test | 0.9229 (0.9099-0.9319) | 0.9884 (0.9744-0.9964) | 0.9679 (0.9589-0.9809) | 0.9778 (0.9698-0.9888) |
| Word2Vec + CNNᵉ | External | 0.9160 (0.9030-0.9261) | 0.9765 (0.9655-0.9885) | 0.9662 (0.9582-0.9782) | 0.9698 (0.9608-0.9778) |
ᵃRF: random forest.
ᵇXGBoost: extreme gradient boosting.
ᶜKNN: K nearest neighbor.
ᵈMLP: multilayer perceptron.
ᵉCNN: convolutional neural network.