PeerJ Comput Sci. 2024 Apr 16;10:e1984. doi: 10.7717/peerj-cs.1984

Table 4. Benchmark models for speech accent classification.

| Reference | Dataset | Accents | Accuracy | Features | Classifier | Remarks |
|---|---|---|---|---|---|---|
| Kethireddy, Kadiri & Gangashetty (2020) | Common Voice | 8 | 81.26% | Raw wave | CNN | CNN applied directly to the raw audio waveform |
| Ubale et al. (2019) | TOEFL | 11 | 86.05% | Raw wave, I-vector | CNN, Attentive pooling, PLDA | CNN applied to raw waveform before weighted global averaging and fusion with PLDA using I-vector |
| Ubale, Qian & Evanini (2018) | TOEFL | 11 | 83.32% | Log Filter-Bank, I-vector | RNN, Attention, CNN, PLDA | Fusion of RNN and CNN applied to Log Filter-Bank features, and PLDA applied to I-vector |
| Jiao et al. (2016b) | TOEFL | 11 | 51.92% | MFCC, LT vector | RNN, DNN | Fusion of RNN applied to MFCC sequence, and DNN applied to a statistically modelled long-term vector |
| Shivakumar, Chakravarthula & Georgiou (2016) | TOEFL | 11 | 79.93% | I-vector | PLDA | PLDA applied to I-vector |
| Rizwan & Anderson (2018) | TIMIT | 7 | 77.88% | MFCC, deltas | ELM | ELM applied to the combination of MFCC and delta features |
| Ge (2015) | FAE | 7 | 54.00% | PLP, PCA, HLDA | UBM-GMM | Universal Background GMM model applied to PLP features compressed using PCA and HLDA |
| Brown (2018) | AISEB | 4 | 86.70% | MFCC, ACCDIST | SVM | SVM applied to distance matrix among vowel acoustic features |
| Najafian & Russell (2020) | ABI | 4 | 84.87% | PPRLM, I-vector | SVM | Fusion of classification using I-vector and Phonotactic features |
| De Marco & Cox (2013) | ABI | 4 | 81.05% | I-vector projections | LDA | LDA used to project I-vectors in lower dimensions before classification |