Table 2.
Reference | Year | Topic | Data and Cohort | Recording Device | ML Models Used | Data Processing Methods | KPIs |
---|---|---|---|---|---|---|---|
[43] | 2018 | Symptom Identification: Wheeze | Private-255 breathing cycles, 50 patients | Smartphone | SVM | Bag-of-Words To Features | Acc: 75.21% |
[44] | 2020 | Disease Identification: Bronchitis, Pneumonia | Private-739 recordings | Various Microphones | K-NN | EMD, MFCCs, GTCC | Acc: 99% |
[45] | 2021 | Disease Identification: Bronchial Asthma | Private-952 recordings | High-end Microphones | NN, RF | Spectral Bandwidth, Spectral Centroid, ZCR, Spectral Roll-Off, Chromacity | Sens: 89.3% Spec: 86% Acc: 88% Youden’s Index: 0.753 |
[46] | 2019 | Disease Identification: COPD | Private-55 recordings | Stethoscope | Fine Gaussian SVM | Statistical Features, MFCCs | Acc: 100% |
[47] | 2018 | Disease Identification: Asthma, COPD | Private-80 normal, 80 COPD, and 80 asthma recordings | Stethoscope | ANN | PSD Extracted Features, Feature Selection (ANOVA) | Acc: 60% Spec: 54.2% |
[48] | 2022 | Disease Identification: COVID-19 | Coswara-120 recordings from COVID-19 patients, 120 recordings from Healthy patients | Various Microphones | Neural Network | Statistical and CNN-BiLSTM Extracted Features | Acc: 100% (shallow recordings), 88.89% (deep recordings) |
[49] | 2021 | Disease Identification: COVID-19 | COVID-19 Sounds-141 recordings | High-end Microphones | VGGish | Spectral Centroid, MFCCs, Roll-off Frequency, ZCR | ROC-AUC: 80% Prec: 69% Recall: 69% |
[50] | 2022 | Symptom Identification: Wheeze, Crackle | Respiratory Sounds Database (RSDB) and private-943 recordings | Various Microphones | ResNet | Padding, STFT, Spectrum Correlation, Log-Mel Spectrograms, Normalization | Sens: 76.33% Spec: 78.86% |
[51] | 2021 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | ANN, SVM, RF | Time Statistics and Frequency features | Acc: NN: 73%, RF: 73%, SVM: 78.3% |
[52] | 2019 | Symptom Identification: Wheeze, Crackle | Private-21 normal samples, 12 wheezes, and 35 crackles | Stethoscope | SVM | CWT, Gaussian Filter, Average Power, Stacked Autoencoder | Acc: 86.51% |
[53] | 2019 | Symptom Identification: Wheeze, Crackle | RSDB’s stethoscope recordings-834 recordings | Stethoscope | ResNet | Optimized S-transform | Sens: 96.27% Spec: 100% Acc: 98.79% |
[54] | 2022 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | VGG-16 | Fluid-Solid Modeling, Recording Simulation, Downsampling, Feature Extraction | Sens: 28%, Spec: 81% |
[55] | 2020 | Disease Identification: Bronchiectasis, Bronchiolitis, COPD, Pneumonia, URTI, Healthy | RSDB-920 recordings | Various Microphones | RF | Resampling, Windows, Filtering, EMD, Features | Acc: 88%, Prec: 91%, Recall: 87%, Spec: 97% |
[47] | 2020 | Symptom Identification: Wheeze, Crackle | Private-705 lung sounds (240 crackle, 260 rhonchi, and 205 normal) | Stethoscope | SVM, NN, K-Nearest Neighbors (K-NN) | CWT | Acc: 90.71%, Sens: 91.19%, Spec: 95.20% |
[56] | 2021 | Symptom Identification: Crackle, Normal, Stridor, Wheeze | Private-600 recordings | Stethoscope | SVM, K-NN | Filtering, Amplification, Dimensionality Reduction, MFCCs, NLM Filter | Acc: SVM: 92%, K-NN: 97% |
[57] | 2021 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | 2D CNN | RMS Norm, Peak Norm, EBU Norm, Data Augmentation | Acc: 88% |
[58] | 2019 | Symptom Identification: Wheeze, Crackle | Private-384 recordings | Stethoscope | VGGish-BiGRU | Spectrograms | Acc: 87.41% |
[59] | 2017 | Symptom Identification: Wheeze, Crackle | Private-60 recordings | Stethoscope | Gaussian Mixture Model | MFCCs | Acc: 98.4% |
[60] | 2017 | Symptom Identification: Wheeze, Crackle | Private-recordings containing 11 crackles, 3 wheezes, 4 stridors, 2 squawks, 2 rhonchi, and 29 normal sounds | Digital stethoscope | MLP | EMD, IMF, Spectrum, Feature Extraction | Acc: Crackles 92.16%, Wheeze 95%, Stridor 95.77%, Squawk 99.14%, Normal 88.36%, AVG 94.82% |
[61] | 2021 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | VGG-16 | Resampling, Windows, Filtering, Mel spectrogram (Mel, Harmonic, Percussive, Derivative) | Acc: Wheeze 89.00%, Rhonchi 68.00%, Crackles 90.00% |
[62] | 2020 | Symptom Identification: Wheeze, Crackle | RSDB-920 recordings | Various Microphones | ResNet | Resampling, Windows, Filtering, Data Augmentation, Mel-spectrogram, Device Specific Features | 80/40 Split 4 class (per device): Spec: 83.3%, Sens: 53.7%, Score: 68.5% |
[63] | 2021 | Symptom Identification: Crackle, Wheeze – Disease Identification: Asthma, Cystic Fibrosis | Private-Recordings from 95 patients | Various Microphones | N/A | N/A | 85% agreement (k = 0.35 (95% CI 0.26-0.44)) between conventional and smartphone auscultation |
[64] | 2021 | Symptom Identification: Wheeze, Crackle, Other | RSDB-920 recordings | Various Microphones | LDA, SVM with Radial Basis Function (SVMrbf), Random Undersampling Boosted trees (RUSBoost), CNNs | Spectrogram, Mel-spectrogram, Scalogram, Feature Extraction | Acc: 99.6% |
[65] | 2022 | Symptom Identification: Wheeze, Crackle, Normal | RSDB-920 recordings | Various Microphones | Hybrid CNN-LSTM | Feature Extraction | Sens: 52.78% Spec: 84.26% F1: 68.52% Acc: 76.39% |
Note. ML models: SVM = Support Vector Machine; K-NN = K-Nearest Neighbors; RF = Random Forest; ANN = Artificial Neural Network; NN = Neural Network; CNN = Convolutional Neural Network; MLP = Multilayer Perceptron; RUSBoost = Random Undersampling Boosted trees; LSTM = Long Short-Term Memory; LDA = Linear Discriminant Analysis. Data Processing Methods: EMD = Empirical Mode Decomposition; MFCC = Mel-Frequency Cepstral Coefficient; GTCC = Gammatone Cepstral Coefficient; ZCR = Zero-Crossing Rate; PSD = Power Spectral Density; STFT = Short-Time Fourier Transform; CWT = Continuous Wavelet Transform; S-transform = Stockwell Transform; NLM = Non-Local Means; RMS = Root Mean Square. Metrics: Acc = Accuracy; Sens = Sensitivity; Spec = Specificity; AUC = Area Under Curve; IoU = Intersection over Union; ROC Curve = Receiver Operating Characteristic Curve; AROC = Area under the ROC curve.
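
The KPIs in Table 2 can be compared more easily with their standard definitions in view. The following is a minimal sketch, assuming a binary (abnormal vs. normal) classification and the usual confusion-matrix definitions of sensitivity, specificity, accuracy, and Youden's index. The `ConfusionCounts` container, the function names, and the example counts are hypothetical and introduced only for illustration; they are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for this sketch)."""
    tp: int  # abnormal recordings correctly flagged
    fn: int  # abnormal recordings missed
    tn: int  # normal recordings correctly passed
    fp: int  # normal recordings wrongly flagged


def sensitivity(c: ConfusionCounts) -> float:
    """Sens = TP / (TP + FN), also called recall or true-positive rate."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Spec = TN / (TN + FP), the true-negative rate."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Acc = (TP + TN) / all recordings."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def youden_from_rates(sens: float, spec: float) -> float:
    """Youden's index J = Sens + Spec - 1, with both rates given as fractions."""
    return sens + spec - 1.0


def youden_index(c: ConfusionCounts) -> float:
    """Youden's index computed directly from confusion-matrix counts."""
    return youden_from_rates(sensitivity(c), specificity(c))


if __name__ == "__main__":
    # Illustrative counts only; not data from any study in Table 2.
    counts = ConfusionCounts(tp=90, fn=10, tn=80, fp=20)
    print(f"Sens: {sensitivity(counts):.3f}")
    print(f"Spec: {specificity(counts):.3f}")
    print(f"Acc:  {accuracy(counts):.3f}")
    print(f"J:    {youden_index(counts):.3f}")

    # Consistency check against the rates reported in the row for [45].
    print(f"J for Sens=89.3%, Spec=86%: {youden_from_rates(0.893, 0.86):.3f}")
```

Applied to the sensitivity (89.3%) and specificity (86%) reported for [45], `youden_from_rates` gives 0.753, matching the Youden's index listed in that row. Accuracy alone can be misleading on imbalanced cohorts, so rows that report only Acc are harder to compare than rows that report Sens and Spec together.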