Table 5.
Ranking of symptom checkers and physicians (denoted as MD1, MD2, and MD3) from best performing to worst performing on each metric.
| Metrics | Descending order (best to worst) | Symptom checkers: values, range (%) | Symptom checkers: values, SD (%) | Doctors: values, range (%) | Doctors: values, SD (%) |
|---|---|---|---|---|---|
| M1% | MD3, Avey, MD2, Ada, MD1, K Health, Buoy, WebMD, and Babylon | 65.3 | 21 | 22.8 | 9 |
| M3% | MD3, Avey, Ada, MD2, MD1, WebMD, Buoy, K Health, and Babylon | 84.8 | 27 | 26.2 | 11 |
| M5% | Avey, MD3, Ada, MD2, MD1, WebMD, K Health, Buoy, and Babylon | 87.2 | 27 | 25.8 | 11 |
| Average recall | Avey, Ada, MD3, WebMD, MD1 and MD2 (a tie), K Health, Buoy, and Babylon | 70.9 | 22 | 16.1 | 8 |
| Average precision | MD3, MD2, MD1, Ada, Avey, K Health, Buoy, WebMD, and Babylon | 40.6 | 13 | 19.5 | 8 |
| Average F1-measure | MD3, Avey, MD2, Ada, MD1, K Health, Buoy and WebMD (a tie), and Babylon | 32.9 | 16 | 15.3 | 6 |
| Average NDCG<sup>a</sup> | Avey, MD3, Ada, MD2, MD1, WebMD, K Health, Buoy, and Babylon | 74.2 | 23 | 21.3 | 9 |
<sup>a</sup>NDCG: normalized discounted cumulative gain.
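
As a rough illustration of how a descending order with ties, a range, and an SD of this kind could be derived from per-system scores, the following minimal sketch may help. The function names `rank_descending` and `spread` and the scores are hypothetical and are not taken from the study. The use of the population SD is an assumption, as the table does not state which SD formula was applied.

```python
from statistics import pstdev


def rank_descending(scores):
    """Group names from best to worst score; names with equal scores share a group (a tie)."""
    groups = {}
    for name, value in scores.items():
        groups.setdefault(value, []).append(name)
    return [sorted(groups[value]) for value in sorted(groups, reverse=True)]


def spread(values):
    """Return (range, SD) of the values; population SD is an assumption."""
    return max(values) - min(values), pstdev(values)


# Hypothetical scores in percent, NOT the study's data.
example = {"A": 70.0, "B": 50.0, "C": 50.0, "D": 30.0}

order = rank_descending(example)
value_range, sd = spread(list(example.values()))

print(order)        # best-to-worst groups; B and C tie
print(value_range)  # difference between the best and worst score
print(round(sd, 1))
```

A larger range or SD in one group than in the other would indicate that performance varies more widely among the members of that group.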