Table 7: Adaptation in extremely low-data scenarios
Adaptation Data | Model (training) | WER |
---|---|---|
35 minutes | 1 layer | 36.47% |
35 minutes | 1 layer (disjoint) | 34.13% |
35 minutes | 2 layers (simultaneous) | 35.73% |
35 minutes | 2 layers (disjoint) | 35.04% |
45 minutes | 1 layer | 35.23% |
45 minutes | 1 layer (disjoint) | 33.62% |
45 minutes | 2 layers (simultaneous) | 35.13% |
45 minutes | 2 layers (disjoint) | 34.33% |
2 hours | 1 layer | 33.25% |
2 hours | 1 layer (disjoint) | 33.62% |
2 hours | 2 layers (simultaneous) | 32.35% |
2 hours | 2 layers (disjoint) | 32.94% |