| Kethireddy, Kadiri & Gangashetty (2020) | Common Voice | 8 | 81.26% | Raw wave | CNN | CNN applied directly to the raw audio waveform |
| Ubale et al. (2019) | TOEFL | 11 | 86.05% | Raw wave, I-vector | CNN, Attentive pooling, PLDA | CNN applied to raw waveform before weighted global averaging and fusion with PLDA using I-vector |
| Ubale, Qian & Evanini (2018) | TOEFL | 11 | 83.32% | Log Filter-Bank, I-vector | RNN, Attention, CNN, PLDA | Fusion of RNN and CNN applied to Log Filter-Bank features, and PLDA applied to I-vector |
| Jiao et al. (2016b) | TOEFL | 11 | 51.92% | MFCC, LT vector | RNN, DNN | Fusion of RNN applied to MFCC sequence, and DNN applied to a statistically modelled long-term vector |
| Shivakumar, Chakravarthula & Georgiou (2016) | TOEFL | 11 | 79.93% | I-vector | PLDA | PLDA applied to I-vector |
| Rizwan & Anderson (2018) | TIMIT | 7 | 77.88% | MFCC, deltas | ELM | ELM applied to the combination of MFCC and delta features |
| Ge (2015) | FAE | 7 | 54.00% | PLP, PCA, HLDA | UBM-GMM | Universal Background GMM model applied to PLP features compressed using PCA and HLDA |
| Brown (2018) | AISEB | 4 | 86.70% | MFCC, ACCDIST | SVM | SVM applied to distance matrix among vowel acoustic features |
| Najafian & Russell (2020) | ABI | 4 | 84.87% | PPRLM, I-vector | SVM | Fusion of classification using I-vector and Phonotactic features |
| De Marco & Cox (2013) | ABI | 4 | 81.05% | I-vector projections | LDA | LDA used to project I-vectors in lower dimensions before classification |