2024 Jan 11;6:1292466. doi: 10.3389/frai.2023.1292466

Table 2.

ML techniques for Long COVID diagnosis.

Study	Input data	AI method	Task	Output (%)
Jha et al. (2023)	1,175 EHR & HRCT	Optimized XGBoost	Binary classification of pulmonary fibrosis	Accuracy 99.37, precision 99.54
Pfaff et al. (2022)	N3C repository	XGBoost	Binary classification of Long COVID	AUC: all patients 92, hospitalized 90, non-hospitalized 85
Jiang et al. (2022)	N3C repository	XGBoost, CNN, LSTM	Binary classification of Long COVID	AUC: XGBoost 82.2, CNN 61.64, LSTM 59.94
Hill et al. (2022)	N3C repository	XGBoost, Random forest	Risk factors associated with Long COVID	AUC: XGBoost 73, Random forest 69
Gupta et al. (2022)	180 questionnaires	Stacking ensemble technique	Binary classification of heart diseases	Accuracy 93.23, precision 95.248
Sudre et al. (2021)	2,149 self-reported health status and symptoms	Random forest	Binary classification of short and Long COVID	AUC 75.9
Patel et al. (2023)	Expression of 2,925 unique blood proteins	Random forest, NLP	Identification of blood proteins for Long COVID detection	AUC 100, accuracy 100, F1-score 100
Patterson et al. (2021)	Immunologic profiles from 224 individuals	Random forest	Classification of healthy, mild-moderate, severe and Long COVID patients	Multi-class: accuracy 80, F1-score 63; Long COVID: accuracy 96, F1-score 95; Severe: accuracy 95, F1-score 94
Sengupta et al. (2022)	N3C repository	BiLSTM with 1D CNN model	Binary classification of Long COVID	Accuracy 70.48
Subramanian et al. (2022)	925 HRCT	VGG-16, ResNet-50, U-Net	Binary classification of Long COVID	Accuracy from 97.132 to 99.4
Binka et al. (2022)	26,730 health administrative data	Elastic Net regression	Binary classification of Long COVID	AUC 93, sensitivity 86, specificity 86
Moreno-Pérez et al. (2021)	277 patients' demographics and comorbidities	Multiple logistic regression	Risk factors associated with Long COVID	Cumulative Incidence Value 95
Zhang et al. (2023)	34,605 EHR	Topic modeling, clustering	Derive Long COVID subphenotypes	Four Long COVID subphenotypes

The first part of the table (8 rows) refers to ensemble learning, the second part (2 rows) to deep learning, and the last two parts (2 rows and 1 row) to regression models and other approaches, respectively. All reported measures keep the same number of decimal digits as in the original papers.
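For readers less familiar with the measures in the Output column, the following is a minimal sketch of how accuracy, precision, sensitivity, specificity, F1-score and AUC can be computed for a binary classifier and expressed as percentages. The labels, scores, function names and the 0.5 decision threshold are illustrative assumptions; none of them comes from the studies in the table, which may have used different definitions or tooling.

```python
# Toy ground-truth labels (1 = Long COVID, 0 = not) and model scores.
# These values are invented for illustration only, not taken from any study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]


def confusion_counts(labels, preds):
    """Return (tp, tn, fp, fn) for binary labels and predictions."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def threshold_metrics(labels, scores, threshold=0.5):
    """Threshold-dependent metrics, returned as percentages."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp, tn, fp, fn = confusion_counts(labels, preds)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / len(labels)
    return {
        "accuracy": 100 * accuracy,
        "precision": 100 * precision,
        "sensitivity": 100 * sensitivity,
        "specificity": 100 * specificity,
        "f1": 100 * f1,
    }


def auc_percent(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair; the result is a percentage.
    """
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return 100 * wins / (len(pos) * len(neg))


metrics = threshold_metrics(y_true, y_score)
auc = auc_percent(y_true, y_score)
print(metrics)
print("AUC:", auc)
```

Note that AUC does not depend on the decision threshold, whereas accuracy, precision, sensitivity, specificity and F1-score do, which is one reason values of different measures in the table are not directly comparable across studies.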