Table 1.
Year | Reference | Front-End Method | Back-End Method | Database | Recognition Task | Rec. Rate (%) |
---|---|---|---|---|---|---|
2016 | Assael et al. [15] | 3D-CNN | Bi-GRU | GRID | Sentences | 95.20 |
2016 | Chung and Zisserman [22] | VGG-M | LSTM | OuluVS2 | Phrases | 31.90 |
2016 | Chung and Zisserman [22] | SyncNet | LSTM | OuluVS2 | Phrases | 94.10 |
2016 | Chung and Zisserman [23] | CNN | | LRW | Words | 61.10 |
2016 | Chung and Zisserman [23] | CNN | | OuluVS | Phrases | 91.40 |
2016 | Chung and Zisserman [23] | CNN | | OuluVS2 | Phrases | 93.20 |
2016 | Wand et al. [19] | Eigenlips | SVM | GRID | Phrases | 69.50 |
2016 | Wand et al. [19] | HOG | SVM | GRID | Phrases | 71.20 |
2016 | Wand et al. [19] | Feed-forward | LSTM | GRID | Phrases | 79.50 |
2017 | Chung and Zisserman [24] | CNN | LSTM + attention | OuluVS2 | Phrases | 91.10 |
2017 | Chung and Zisserman [24] | CNN | LSTM + attention | MV-LRS | Sentences | 43.60 |
2017 | Chung et al. [16] | CNN | LSTM + attention | LRW | Words | 76.20 |
2017 | Chung et al. [16] | CNN | LSTM + attention | GRID | Phrases | 97.00 |
2017 | Chung et al. [16] | CNN | LSTM + attention | LRS | Sentences | 49.80 |
2017 | Petridis et al. [25] | Autoencoder | Bi-LSTM | OuluVS2 | Phrases | 94.70 |
2017 | Stafylakis and Tzimiropoulos [26] | 3D-CNN + ResNet | Bi-LSTM | LRW | Words | 83.00 |
2018 | Fung and Mak [9] | 3D-CNN | Bi-LSTM | OuluVS2 | Phrases | 87.60 |
2018 | Petridis et al. [10] | 3D-CNN + ResNet | Bi-GRU | LRW | Words | 82.00 |
2018 | Wand et al. [20] | Feed-forward | LSTM | GRID | Phrases | 84.70 |
2018 | Xu et al. [21] | 3D-CNN + highway | Bi-GRU + attention | GRID | Phrases | 97.10 |
2019 | Weng [27] | Two-Stream 3D-CNN | Bi-GRU | LRW | Words | 82.07 |