Author manuscript; available in PMC: 2016 Dec 19.
Published in final edited form as: Proc Conf Empir Methods Nat Lang Process. 2016 Nov;2016:648–657. doi: 10.18653/v1/d16-1062

Table 5.

Theta scores and area under curve percentiles for LSTM trained on SNLI and tested on GSIRT. We also report the accuracy for the same LSTM tested on all SNLI quality control items (see Section 3.1). All performance is based on binary classification for each label.

Item Set          Theta Score   Percentile   Test Acc.

5GS
  Entailment        −0.133        44.83%       96.5%
  Contradiction      1.539        93.82%       87.9%
  Neutral            0.423        66.28%       88%

4GS
  Contradiction      1.777        96.25%       78.9%
  Neutral            0.441        67%          83%
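
The relationship between the Theta Score and Percentile columns can be illustrated with a short sketch. It assumes, as is conventional in IRT but not stated in this table, that ability (theta) follows a standard normal distribution, so the percentile is the area under the standard normal curve to the left of theta. The function name `theta_to_percentile` and the dictionary layout are hypothetical and chosen only for illustration. Under this assumption the computed values land close to the reported percentiles but do not match them exactly, possibly because the reported theta scores are rounded.

```python
import math


def theta_to_percentile(theta: float) -> float:
    """Percent of a standard normal population with ability below theta.

    Assumption: theta ~ N(0, 1); the table itself does not state this.
    """
    return 100.0 * 0.5 * (1.0 + math.erf(theta / math.sqrt(2.0)))


# Theta scores copied from Table 5, keyed by (item set, label).
table5_thetas = {
    ("5GS", "Entailment"): -0.133,
    ("5GS", "Contradiction"): 1.539,
    ("5GS", "Neutral"): 0.423,
    ("4GS", "Contradiction"): 1.777,
    ("4GS", "Neutral"): 0.441,
}

for (item_set, label), theta in table5_thetas.items():
    pct = theta_to_percentile(theta)
    print(f"{item_set} {label}: theta={theta:+.3f} -> {pct:.2f}th percentile")
```

Read this way, a theta of 0 corresponds to the 50th percentile, so the LSTM scores slightly below the average of the reference population on 5GS Entailment and well above it on both Contradiction sets.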