2024 Apr 16;10:e1984. doi: 10.7717/peerj-cs.1984

Table 4. Benchmark models for speech accent classification.

Reference	Dataset	Accents	Accuracy	Features	Classifier	Remarks
Kethireddy, Kadiri & Gangashetty (2020)	Common Voice	8	81.26%	Raw wave	CNN	CNN applied directly to the raw audio waveform
Ubale et al. (2019)	TOEFL	11	86.05%	Raw wave, I-vector	CNN, Attentive pooling, PLDA	CNN applied to the raw waveform, followed by weighted global averaging and fusion with PLDA applied to I-vector
Ubale, Qian & Evanini (2018)	TOEFL	11	83.32%	Log Filter-Bank, I-vector	RNN, Attention, CNN, PLDA	Fusion of RNN and CNN applied to Log Filter-Bank features, and PLDA applied to I-vector
Jiao et al. (2016b)	TOEFL	11	51.92%	MFCC, LT vector	RNN, DNN	Fusion of RNN applied to MFCC sequence, and DNN applied to a statistically modelled long-term vector
Shivakumar, Chakravarthula & Georgiou (2016)	TOEFL	11	79.93%	I-vector	PLDA	PLDA applied to I-vector
Rizwan & Anderson (2018)	TIMIT	7	77.88%	MFCC, deltas	ELM	ELM applied to the combination of MFCC and delta features
Ge (2015)	FAE	7	54.00%	PLP, PCA, HLDA	UBM-GMM	Universal background model GMM (UBM-GMM) applied to PLP features reduced in dimension using PCA and HLDA
Brown (2018)	AISEB	4	86.70%	MFCC, ACCDIST	SVM	SVM applied to distance matrix among vowel acoustic features
Najafian & Russell (2020)	ABI	4	84.87%	PPRLM, I-vector	SVM	Fusion of classifiers using I-vector and phonotactic (PPRLM) features
De Marco & Cox (2013)	ABI	4	81.05%	I-vector projections	LDA	LDA used to project I-vectors into a lower-dimensional space before classification
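
Several entries in Table 4 pair MFCCs with delta features (for example, the TIMIT row). As a rough illustration of what that combination involves, the sketch below computes regression-based delta coefficients and appends them to each frame. It is a minimal sketch under stated assumptions, not the implementation used in any cited work: the function names `delta_features` and `append_deltas`, the window `width` of 2, and the edge-frame padding are all illustrative choices, and the toy input stands in for real MFCC frames.

```python
from typing import List, Sequence


def delta_features(frames: Sequence[Sequence[float]], width: int = 2) -> List[List[float]]:
    """Regression-based delta coefficients for a sequence of feature frames.

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames requested past either end are replaced by the nearest edge frame
    (an assumed padding scheme; other schemes are possible).
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    num_frames = len(frames)
    if num_frames == 0:
        return []
    denom = 2.0 * sum(n * n for n in range(1, width + 1))

    def frame_at(t: int) -> Sequence[float]:
        # Clamp the index so out-of-range positions repeat the edge frame.
        return frames[min(max(t, 0), num_frames - 1)]

    deltas: List[List[float]] = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = sum(
                n * (frame_at(t + n)[k] - frame_at(t - n)[k])
                for n in range(1, width + 1)
            )
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def append_deltas(frames: Sequence[Sequence[float]], width: int = 2) -> List[List[float]]:
    """Concatenate each frame with its delta coefficients."""
    return [list(f) + d for f, d in zip(frames, delta_features(frames, width))]


# Toy one-coefficient track standing in for MFCC frames (hypothetical data).
toy = [[0.0], [1.0], [2.0], [3.0], [4.0]]
combined = append_deltas(toy)
```

Each row of `combined` holds the original coefficient followed by its delta, so the feature dimension doubles. For a linearly rising track the interior delta equals the slope, while frames near the edges are damped by the padding.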