2020 Jun 18;13(4):326–339. doi: 10.21053/ceo.2020.00654

Table 2.

AI techniques used for voice-based analysis

| Study | Analysis modality | Objective | AI technique | Validation method | No. of samples in the training dataset | No. of samples in the testing dataset | Best result |
|---|---|---|---|---|---|---|---|
| [58] | CI | Noise reduction | NC+DDAE | Hold-out | 120 Utterances | 200 Utterances | Accuracy: 99.5% |
| [59] | CI | Segregated speech from background noise | DNN | Hold-out | 560×50 Mixtures for each noise type and SNR | 160 Noise segments from original unperturbed noise | Hit ratio: 84%; false alarm: 7% |
| [60] | CI | Improved pitch perception | ANN | Hold-out | 1,500 Pitch pairs | 10% of the training material | Accuracy: 95% |
| [61] | CI | Predicted speech recognition and QoL outcomes | k-NN, DT | 10-CV | A total of 29 patients, including 48% unilateral CI users and 51% bimodal CI users | | Accuracy: 81% |
| [62] | CI | Noise reduction | DDAE | Hold-out | 12,600 Utterances | 900 Noisy utterances | Accuracy: 36.2% |
| [63] | CI | Improved speech intelligibility in unknown noisy environments | DNN | Hold-out | 640,000 Mixtures of sentences and noises | - | Accuracy: 90.4% |
| [64] | CI | Modeling electrode-to-nerve interface | ANN | Hold-out | 360 Sets of fiber activation patterns per electrode | 40 Sets of fiber activation patterns per electrode | - |
| [65] | CI | Provided digital signal processing plug-in for CI | WNN | Hold-out | 120 Consonants and vowels, sampled at 16 kHz; half of the data was used as the training set and the rest as the testing set | | SNR: 2.496; MSE: 0.086; LLR: 2.323 |
| [66] | CI | Assessed disyllabic speech test performance in CI | k-NN | - | 60 Patients | - | Accuracy: 90.83% |
| [67] | Acoustic signals | Voice disorders detection | CNN | 10-CV | 451 Images from 10 healthy adults and 70 adults with voice disorders | | Accuracy: 90% |
| [68] | Dysphonic symptoms | Voice disorders detection | ANN | Repeated hold-out | 100 Cases of neoplasm, 508 cases of benign phonotraumatic, 153 cases of vocal palsy | | Accuracy: 83% |
| [69] | Pathological voice | Voice disorders detection | DNN, SVM, GMM | 5-CV | 60 Normal voice samples and 402 pathological voice samples | | Accuracy: 94.26% |
| [70] | Acoustic signal | Hot potato voice detection | SVM | Hold-out | 2,200 Synthetic voice samples | 12 HPV samples from real patients | Accuracy: 88.3% |
| [71] | SEMG signals | Voice restoration for laryngectomy patients | XGBoost | Hold-out | 75 Utterances using 7 SEMG sensors | - | Accuracy: 86.4% |

AI, artificial intelligence; CI, cochlear implant; NC, noise classifier; DDAE, deep denoising autoencoder; DNN, deep neural network; SNR, signal-to-noise ratio; ANN, artificial neural network; QoL, quality of life; k-NN, k-nearest neighbors; DT, decision tree; CV, cross-validation; WNN, wavelet neural network; MSE, mean square error; LLR, log-likelihood ratio; CNN, convolutional neural network; GMM, Gaussian mixture model; SVM, support vector machine; HPV, hot potato voice; SEMG, surface electromyographic.
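
The validation methods listed in Table 2 differ in how samples are assigned to training and testing. The following Python is a minimal sketch of the two main schemes, a single hold-out split and k-fold cross-validation (the "10-CV" and "5-CV" entries). It is not taken from any of the cited studies: the function names `hold_out_split` and `k_fold_splits`, the fixed seed, and the sample counts in the usage lines are all assumptions made for illustration.

```python
import random


def hold_out_split(n_samples, test_fraction=0.1, seed=0):
    """Return (train_idx, test_idx) for one hold-out split (hypothetical helper)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    # The first n_test shuffled indices are held out; the rest are used for training.
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield k (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Illustrative sample counts only.
train, test = hold_out_split(1500, test_fraction=0.1)
folds = list(k_fold_splits(29, k=10))
```

With hold-out, the reported score depends on a single partition. With k-fold cross-validation, every sample serves as test data once, which is one reason it is often preferred for small cohorts.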