| Model | d | # Params | Train acc. (%) | Test acc. (%) |
|---|---|---|---|---|
| Classifier with handcrafted features (Bowman et al., 2015a) | - | - | 99.7 | 78.2 |

| Model | d | # Params | Train acc. (%) | Test acc. (%) |
|---|---|---|---|---|
| LSTM encoders (Bowman et al., 2015a) | 300 | 3.0M | 83.9 | 80.6 |
| Dependency Tree CNN encoders (Mou et al., 2016) | 300 | 3.5M | 83.3 | 82.1 |
| NTI-SLSTM (Ours) | 300 | 3.3M | 83.9 | 82.4 |
| SPINN-PI encoders (Bowman et al., 2016) | 300 | 3.7M | 89.2 | 83.2 |
| NTI-SLSTM-LSTM (Ours) | 300 | 4.0M | 82.5 | 83.4 |

| Model | d | # Params | Train acc. (%) | Test acc. (%) |
|---|---|---|---|---|
| LSTM attention (Rocktäschel et al., 2016) | 100 | 242K | 85.4 | 82.3 |
| LSTM word-by-word attention (Rocktäschel et al., 2016) | 100 | 250K | 85.3 | 83.5 |
| NTI-SLSTM node-by-node global attention (Ours) | 300 | 3.5M | 85.0 | 84.2 |
| NTI-SLSTM node-by-node tree attention (Ours) | 300 | 3.5M | 86.0 | 84.3 |
| NTI-SLSTM-LSTM node-by-node tree attention (Ours) | 300 | 4.2M | 88.1 | 85.7 |
| NTI-SLSTM-LSTM node-by-node global attention (Ours) | 300 | 4.2M | 87.6 | 85.9 |
| mLSTM word-by-word attention (Wang and Jiang, 2016) | 300 | 1.9M | 92.0 | 86.1 |
| LSTMN with deep attention fusion (Cheng et al., 2016) | 450 | 3.4M | 88.5 | 86.3 |
| Tree matching NTI-SLSTM-LSTM tree attention (Ours) | 300 | 3.2M | 87.3 | 86.4 |
| Decomposable Attention Model (Parikh et al., 2016) | 200 | 580K | 90.5 | 86.8 |
| Tree matching NTI-SLSTM-LSTM global attention (Ours) | 300 | 3.2M | 87.6 | 87.1 |
| Full tree matching NTI-SLSTM-LSTM global attention (Ours) | 300 | 3.2M | 88.5 | 87.3 |