Table 5. Data, features, and methods of analysis.
Ref | Year | Data Set: Training | Data Set: Validation | Data Set: Test | Data Set: Total | Features | Method | Performance |
---|---|---|---|---|---|---|---|---|
[37] | 2016 | 70 Rec, 20 W, 50 N | 25 Rec, 7 W, 18 N | 39 Rec, 10 W, 29 N | 95 Rec | Spectral features (PSD mean, harmonics) | SVM, LRM | 71.4% Se, 88.9% Sp, for SVM on validation set at Rec level |
[38] | 2016 | 5-fold CV | 227 Rec | Denoising autoencoders | SVM | 90% Se, 64% Sp for W Rec level and 90% Se, 44% Sp for C Rec level | ||
[39] | 2016 | N/A | 112 Rec | 112 Rec | Rule-based Seg selection, Power Ratio | Threshold | 90% Se, 90.48% Sp at Rec level | |
[40] | 2016 | N/M | 3036 Seg | MFCC | GMM | 88.1% Se, 99.5% Sp at Seg level | ||
[41] | 2016 | 65% | 10-fold CV | 35% | 870 Ev | Ensemble Empirical Mode Decomposition and Instantaneous Frequency | SVM | 94.2% Se, 96.1% Sp, for SVM on best iteration of test set at Ev level |
[42] | 2016 | 10-fold CV | LOOCV | 400 Ev | Musical features, wavelet-based, Teager energy, entropy | LRM | 76 ± 23% Se, 77 ± 22% PPV at Seg level | |
[43] | 2016 | LOOCV | 3120 Rec | MFCC | HMM | Best Acc at Seg level 82.82%, average Acc of 87.7% at Rec level | ||
[44] | 2016 | 219 Ev, 71 N, 39 FC, 39 CC, 35 mono W, 35 poly W | 40 holdout CV | 99 Ev, 31 N, 18 FC, 18 CC, 16 mono W, 16 poly W | 318 Ev | Higher Order Statistics (Cumulants) | GA + k-NN and NB | 94.6% Overall Acc on test set at Ev level |
[45] | 2016 | LOOCV | 72 Ev | LFCC, MFCC, IMFCC, and LPCC | MLP | 97.83% best Overall Acc using MFCC at Ev level | ||
[46] | 2016 | LOOCV | 600 Ev | Energy of High Q-Factor Wavelet coefficients | k-NN, SVM | 95.17% average Acc for SVM at Ev level | ||
[47] | 2015 | LOOCV | 57 Rec | Peak to mean ratio, expected number of false positives | Threshold+SVM | 86% Acc on Rec level | ||
[48] | 2015 | 20 Rec | - | Multiple sets | > 20 Rec | 13 MFCC each with first and second derivatives | k-NN | Performance of 6 different types of test reported as Acc |
[49] | 2015 | 23 Rec, 13 W, 10 N | - | 35 Rec, 19 W, 16 N | 58 Rec | Duration, frequency range, area, power, and slope of spectrum | BPNN | 94.6% Se, 100% Sp at Rec level |
[50] | 2015 | N/A | 45 Rec | 45 Rec | Entropy-based Features | Threshold | 99% Acc Stridor, 70% Acc W, 87% Acc C, 99% Acc N, at Rec level | |
[51] | 2015 | 41 Rec | 41 Rec | Spectral features | GMM | 92.85% Se, 100% Sp at Rec level | ||
[52] | 2015 | LOOCV | 130 Rec | MFCC, correlation score with other auscultation point and other Seg | HMM | Best Acc of 92.26% at Ev level and best Acc of 91% at Rec level | ||
[53] | 2015 | 21 Rec, 5 W, 21 Non-W | 20%-80% Train Validation Set repeated 20 times | LOOCV | 45 Rec | MFCC, Kurtosis, Entropy | 2 SVM + Threshold | 97.68% Reliability (TPR × TNR) using MFCC at Seg level |
[54] | 2015 | 10-fold CV | 113 Ev | Musical features and spectrogram signature | LRM, RF | 90.9% ± 2% Se, 99.4% ± 1% Sp for RF at Seg level | ||
[55] | 2015 | 70% of data | 15% of data | 15% of data | 28 Rec | Averaged Power Spectrum | ANN | 97.8% Se, 100% Sp on test set at Ev level |
[56] | 2015 | N/A | 24 Rec | 24 Rec | Fractal Dimension, CORSA criterion for Crackle | Threshold | Average Se of 89 ± 10%, PPV of 95 ± 11% at Ev level for different Rec | |
[57] | 2015 | LOOCV | 40 Rec | AR Model | GMM, SVM | 90% best total Acc for GMM on Rec level | ||
[58] | 2015 | LTOCV | 1188 Seg | MFCC, WPT, FT | C-Weighted SVM | 81.5 ± 10% Se, 82.6 ± 7% Sp for MFCC features on Seg level | ||
[59] | 2015 | N/M | 231 Ev | Quartile Frequency Ratios, Mean Crossing Irregularity | SVM, k-NN, NB | 75.78% best Overall Acc for k-NN at Ev level | ||
[60] | 2015 | LOOCV | 230 Rec | MFCC | Subject adaptation HMM | 89.4% Se, 80.9% Sp at Ev level and 90.4% Se, 78.3% Sp at Rec level | ||
[61] | 2015 | 10-fold CV | 260 Seg | Audio Spectral Envelope and Tonality Index | SVM | 93% Overall Acc at Seg level | ||
[62] | 2015 | N/A | 100 Ev, 50 C, 50 N | 100 Ev | Mathematical morphology | Threshold | 86% Se, 92% Sp at Ev level | |
[63] | 2014 | N/M | Delay Coordinate | Threshold | 98.39% Acc at Ev level | |||
[64] | 2014 | 5-fold CV | 60 Vol | Frequency ratio, average instantaneous frequency, eigenvalues | SVM | Individual Acc reported for all cases of one-versus-one and one-versus-all for all features at Rec level | ||
[65] | 2014 | LOOCV | 578 Ev | Instantaneous Kurtosis, Discriminating Function, Sample Entropy | SVM | 97.7% Mean Acc (Inhale), 98.8% Mean Acc (exhale) at Ev level | ||
[66] | 2014 | 371 Ev | 371 Rec | Centroid, time duration, slope, and area ratio of spectrum | SVM | 88.7% Se, 93.9% Sp at Rec level | ||
[67] | 2014 | LOOCV | 2 Rec | Teager energy, wavelet, fractal dimension, empirical mode decomposition, entropy, and GARCH process | LRM | MCC of 80% at Seg level | ||
[68] | 2014 | 5-fold CV | 120 Ev | Lacunarity, sample entropy, skewness, and kurtosis | SVM, ELM | 86.30% Se, 86.90% Sp for ELM at Ev level | ||
[69] | 2014 | LOOCV | 13 Ev | MFCC | MLP | 100% Acc W, 75% Acc C, 80% Acc N at Ev level | ||
[70] | 2014 | 10-fold CV | 68 Rec | MFCC | SVM, k-NN | 100% Acc N, 100% Acc AOP, 96% Acc PP for k-NN at Rec level | ||
[71] | 2014 | 60 Ev | 14 Ev | 18 Ev | 92 Ev | Wavelet packet transform | ANN | 98.89% best average Acc for Symlet-10 wavelet base at Ev level on test set |
[72] | 2013 | 75%-25% Train Validation Set repeated 6 times | 345 Rec | Spectrogram evaluation for W, db5 Wavelet degree of similarity for C | ANN | 80% Se, 67% Sp at Rec level | ||
[73] | 2013 | N/A | 6 Ev | 6 Ev | Time Frequency Analysis and Wavelet Packet Decomposition | Threshold | All Ws detected | |
[74] | 2013 | N/A | 40 Rec | 40 Rec | Time Frequency Analysis | Threshold | 99.2% Se, 72.5% Sp at Ev level | |
[75] | 2013 | 60%-40% Train Validation Set repeated 25 times | 68 Rec | MFCC | SVM | 94.11% Acc N, 92.31% Acc AOP, 88% Acc PP, for SVM at Rec level | ||
[76] | 2013 | 2000 Seg, 1000 N, 1000 C | 2000 Seg, 1000 N, 1000 C | 2000 Seg, 1000 N, 1000 C | 6000 Seg | Time Frequency Analysis (Spectrogram), Time Scale Analysis (Wavelet) | SVM, MLP, k-NN | 97.5% Overall Acc rate for SVM using Time Frequency Analysis at Seg level |
[77] | 2013 | N/A | 59 Rec | 59 Rec | Correlation Coefficient | Threshold | 88% Se, 94% Sp at Rec level | |
[78] | 2012 | 10-fold CV | 28 Rec | Cortical Model | SVM | 89.44% Se, 80.50% Sp at Rec level | ||
[79] | 2012 | LOOCV | 126 Rec, 723 Ev | Power, spectral features, and duration distribution | HMM | 88.7% Se, 91.5% Sp at Ev level and 87% Se, 81% Sp at Rec level | ||
[80] | 2012 | N/A | 47 Rec | 47 Rec | Local similarity measure using Mutual Information, Weighted cepstral features | Threshold | High Acc for local similarity measure and separability index of 1 for weighted cepstral | |
[81] | 2012 | N/A | 180 Seg | 180 Seg | fractional Hilbert transform | Threshold | Acc of 90.5% at Seg level | |
[82] | 2012 | N/A | 33 C Ev | 33 Ev | fractional Hilbert transform and correlation coefficient | Threshold | Se 94.28%, PPV 97.05% at Ev level | |
[83] | 2012 | N/A | 26 Rec, 13 N, 13 W | 26 Rec | LPC prediction error ratio | Threshold | 70.9% Se, 98.6% Sp at Ev level | |
[84] | 2012 | N/A | 433 Seg | 433 Seg | Abnormality level | Threshold | 84.5% Acc at Seg level | |
[85] | 2012 | 50%-50% Train Validation Set repeated 100 times | 689 Ev | Multi-scale PCA (Wavelet) | Empirical Classification | 97.3% ± 2.7% Overall Acc for N vs CAS, 98.34% Overall Acc for N vs CAS+DAS at Ev level | ||
[86] | 2011 | LOOCV | 585 Ev | Temporal-Spectral Dominance spectrogram | k-NN | 92.4% ± 2.9% Overall Acc at Ev level | ||
[87] | 2010 | LOOCV | 4-7 Rec Each | MFCC | GMM | 52.5% Overall Acc on validation | ||
[88] | 2010 | N/A | 21 Vol, 393 W Ev | 393 Ev | Continuous Wavelet Transform | Mann-Whitney U Test | Significance test for features | |
[89] | 2009 | LOOCV | 492 Seg | Kurtosis, Renyi entropy, frequency power ratio, Mean crossing irregularity | FDA | 93.5% Overall Acc at Seg level | ||
[90] | 2009 | LOOCV | 2807 Seg | Fourier Transform, LPC, Wavelet Transform, MFCC | VQ, GMM, ANN | 94.6% Se, 91.9% Sp for GMM using MFCC at Seg level | ||
[91] | 2009 | 180 Ev | - | 180 Ev | 360 Ev | averaged power spectrum | MLP, GAL, ISNN | Overall Acc of 98% for ISNN at Ev level |
[92] | 2009 | 75%-25% train-test split repeated 200 times | 362 Ev | Lacunarity | Discriminant Analysis | 99.75% maximum mean Acc at Seg level | ||
[93] | 2009 | LOOCV | 1544 Ev | MFCC | HMM | 93.2% Se, 64.8% Sp at Ev level | ||
[94] | 2009 | 40 Ev, 20 W, 20 N | - | 28 Rec, 112 Ev, 40 W, 72 N | 152 Ev | Amplitude and Frequency of largest edge of pre-processed spectrogram | MLP | 86.1% Se, 82.5% Sp on test set at Ev level |
[95] | 2009 | N/A | 17 Rec | 17 Rec | Entropy-based features | Threshold | 84.4% Se, 80% Sp at Rec level | |
[96] | 2008 | 40 Vol | LOOCV | 25 Vol | 65 Vol | AR Coefficients | k-NN, Minimum Distance-based | 92% Se, 100% Sp using k-NN on test set at Rec level |
[97] | 2008 | N/A | 40 Ev | 40 Ev | Peak selection based on time duration | Threshold | 84% Se, 86% Sp at Ev level | |
[98] | 2008 | N/A | 186 Ev | 186 Ev | Distortion in Histogram of Sample Entropy | Threshold | 97.9% Acc Expiration, 85.3% Acc Inspiration at Ev level | |
[99] | 2007 | N/M | 870 Ev | MFCC | GMM | Acc 94.9% at Seg level | ||
[100] | 2007 | N/A | 18 Rec | 182 C Ev | Fractal Dimension | Threshold | 92.9% Se, 94.4% PPV at Ev level, 93.9% best Acc for classification | |
[101] | 2007 | 3 Vol, 85 W Ev | - | 10 Vol, 337 W Ev | 422 W Ev | Peak selection based on local maxima, coexistence, continuity, grouping | Threshold | Se 95.5 ± 4.8%, Sp 93.7 ± 9.3% at Ev level on test set |
[102] | 2005 | 50%-50% train-test Seg from same Ev split | 57 Vol | AR parameters and Cepstral Coefficients | MLP | 10-20% average misclassification error on test set at Ev level for cepstral features | ||
[103] | 2005 | N/A | 16 Vol | 16 Vol | spectrogram image | Edge Detection | Se and Sp above 89% | |
[104] | 2005 | 912 Seg | 114 Seg | 114 Seg | 1140 Seg | multi-variate AR model | BPNN | 80.7% Se, 84.21% Sp at Seg level on validation set |
[105] | 2005 | 160 Ev, 80 CC, 80 FC | - | 231 Ev, 158 CC, 73 FC | 391 Ev | wavelet network | Discriminant Function | 84% and 70% Acc for FC and CC respectively on test set at Ev level |
[106] | 2004 | N/A | 31 Vol | 31 Vol | energy | Threshold | 100% Se and Sp for a high airflow and 71% Se, 88.2% Sp for low airflow, at Ev level | |
[107] | 2000 | 1253 Ev, 509 Ab, 744 N | repeated 5 times | 1195 Ev, 530 Ab, 665 N | 2448 Ev | averaged power spectrum | BPNN | Best Se 59%, 81% Sp for recorded sound and Se 87%, 95% Sp for CD data at Ev level for Ab vs N respiratory sound classification |
[108] | 1997 | N/A | 2 Rec | 2 Rec | Matched wavelet | Threshold | Detection Acc of 99.8% and classification Acc of almost 100% at Seg level | |
[109] | 1997 | LOOCV | 69 Vol | AR model, crackle parameters | k-NN, multinomial, voting | Overall Acc of 71.07% at Rec level to classify pathology | ||
[110] | 1996 | 50%-50% training-test split | 13 Vol | Wavelet packet decomposition | LVQ (ANN Variant) | 59% Se, 24% PPV for FC, 19% Se, 6% PPV for CC, and 58% Se, 18% PPV for W at Seg level | ||
[111] | 1995 | 242 Seg, 128 W, 114 N | - | 2 test sets: 233 Seg, 107 W, 126 N, and 235 Seg, 140 W, 95 N | 710 Seg | Power spectrum | BPNN, RBF, SOM, LVQ | Overall Acc of 93% and 96% on the two sets by using LVQ at Seg level |
[112] | 1992 | N/A | 9 Vol | 9 Vol | Energy envelope, Crackle characteristics | Threshold, Hierarchical clustering | 100% Acc on classifying FC vs CC at Ev level | |
[113] | 1984 | 42 Ev, 6 for each type | - | 105 Ev, 10-15 for each type | 147 Ev | LPC | Clustering (Minimum Distance) | Overall Acc of 95.24% at Ev level |
Rec: Recording, Ev: Event, Seg: Segment
W: Wheeze, C: Crackle, FC: Fine Crackle, CC: Coarse Crackle, N: Normal, Ab: Abnormal, Vol: Volunteer
CV: Cross-Validation, Se: Sensitivity, Sp: Specificity, PPV: Positive Predictive Value, Acc: Accuracy
N/A: Not Applicable, N/M: Not Mentioned