Table IV.
Comparison of Recurrent Neural Network Models in Speech Processing
| Architecture | Application | Contribution | Limitations |
|---|---|---|---|
| Amodei et al. [159] Gated Recurrent Unit Network | English and Chinese speech recognition | Optimized speech recognition using Gated Recurrent Units to achieve near human-level accuracy | Deployment requires a GPU server |
| Weston et al. [134] Memory Network | Answering questions about simple text stories | Integration of a long-term memory component (readable and writable) within a neural network architecture | Evaluated only on relatively simple questions and input stories |
| Wu et al. [136] Deep LSTM | Language translation (e.g. English-to-French) | Multi-layer LSTM encoder-decoder with an attention mechanism | Challenging translation cases and multi-sentence input yet to be tested |
| Karpathy et al. [137] CNN/RNN Fusion | Labeling images and image regions | Hybrid CNN-RNN model that generates natural language descriptions of images | Requires a fixed image size; the CNN and RNN must be trained separately |
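To make the gated architectures in the table concrete, the following is a minimal NumPy sketch of one forward step of a Gated Recurrent Unit cell (the building block used by Amodei et al. [159]), following the standard update-gate / reset-gate formulation. The parameter names, toy dimensions, and random weights are illustrative assumptions, not taken from any cited model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h_prev, params):
    """One forward step of a GRU cell (standard formulation; toy sketch)."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)  # candidate state
    return (1 - z) * h_prev + z * h_tilde               # interpolated new state

# Toy dimensions (assumed for illustration): 4-dim input, 3-dim hidden state.
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
params = (
    rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h),
    rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h),
    rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)), np.zeros(d_h),
)

h = np.zeros(d_h)
for x in rng.standard_normal((5, d_in)):  # unroll over a 5-step input sequence
    h = gru_cell(x, h, params)
```

Because the new state is a convex combination of the previous state and a tanh-bounded candidate, every hidden activation stays in (-1, 1), which is part of what makes gated units easier to train over long sequences than plain recurrent cells.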