Table 2.
ML techniques for Long COVID diagnosis.
| Study | Input data | AI method | Task | Output (%) |
|---|---|---|---|---|
| Jha et al. (2023) | 1,175 EHR & HRCT | Optimized XGBoost | Binary classification of pulmonary fibrosis | Accuracy 99.37, precision 99.54 |
| Pfaff et al. (2022) | N3C repository | XGBoost | Binary classification of Long COVID | AUC: all patients 92, hospitalized 90, non-hospitalized 85 |
| Jiang et al. (2022) | N3C repository | XGBoost, CNN, LSTM | Binary classification of Long COVID | AUC: XGBoost 82.2, CNN 61.64, LSTM 59.94 |
| Hill et al. (2022) | N3C repository | XGBoost, Random forest | Risk factors associated with Long COVID | AUC: XGBoost 73, Random forest 69 |
| Gupta et al. (2022) | 180 questionnaires | Stacking ensemble technique | Binary classification of heart diseases | Accuracy 93.23, precision 95.248 |
| Sudre et al. (2021) | 2,149 self-reported health status and symptoms | Random forest | Binary classification of short and Long COVID | AUC 75.9 |
| Patel et al. (2023) | Expression of 2,925 unique blood proteins | Random forest, NLP | Identification of blood proteins for Long COVID detection | AUC 100, accuracy 100, F1-score 100 |
| Patterson et al. (2021) | Immunologic profiles from 224 individuals | Random forest | Classification of healthy, mild-moderate, severe and Long COVID patients | Multi-class: accuracy 80, F1-score 63; Long COVID: accuracy 96, F1-score 95; Severe: accuracy 95, F1-score 94 |
| Sengupta et al. (2022) | N3C repository | BiLSTM with 1D CNN model | Binary classification of Long COVID | Accuracy 70.48 |
| Subramanian et al. (2022) | 925 HRCT | VGG-16, ResNet-50, U-Net | Binary classification of Long COVID | Accuracy from 97.132 to 99.4 |
| Binka et al. (2022) | 26,730 health administrative data | Elastic Net regression | Binary classification of Long COVID | AUC 93, sensitivity 86, specificity 86 |
| Moreno-Pérez et al. (2021) | 277 patients' demographics and comorbidities | Multiple logistic regression | Risk factors associated with Long COVID | Cumulative Incidence Value 95 |
| Zhang et al. (2023) | 34,605 EHR | Topic modeling, clustering | Derive Long COVID subphenotypes | Four Long COVID subphenotypes |
The first part of the table (8 rows) refers to ensemble learning, the second part (2 rows) to deep learning, and the last two parts (2 rows and 1 row) to regression models and other approaches, respectively. All reported measures keep the same number of decimal digits as in the original papers.
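As a reading aid for the Output column, the threshold-based measures reported above (accuracy, precision, sensitivity, specificity, F1-score) can all be derived from a binary confusion matrix. The following is a minimal sketch, not taken from any of the cited studies: the function name `classification_metrics` and the counts are hypothetical, chosen only for illustration. AUC is not covered here, since it requires the ranked prediction scores rather than a single confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification measures as percentages.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)      # positive predictive value
    sensitivity = tp / (tp + fn)    # recall, true positive rate
    specificity = tn / (tn + fp)    # true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    values = {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }
    # Express as percentages, as in the Output (%) column of the table.
    return {name: round(100 * value, 2) for name, value in values.items()}


# Hypothetical counts for illustration only (not data from any study).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

Because the studies in the table report different subsets of these measures on different cohorts, the values are not directly comparable across rows.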