Table 9.
| Scheme | Category | ALL | A | B | C | D | E | F | G | H | I | J | K |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S1 | OBJ | .90 | .89 | .87 | .92 | .90 | .90 | .91 | .91 | .91 | .92 | .91 | .88 |
| | METH | .80 | .81 | .80 | .80 | .79 | .81 | .79 | .80 | .80 | .80 | .81 | .81 |
| | RES | .88 | .90 | .88 | .90 | .88 | .90 | .88 | .88 | .88 | .89 | .89 | .90 |
| | CON | .86 | .85 | .82 | .87 | .88 | .90 | .90 | .88 | .89 | .88 | .88 | .90 |
| S2 | BKG | .91 | .94 | .90 | .90 | .93 | .94 | .94 | .91 | .93 | .94 | .92 | .94 |
| | OBJ | .72 | .78 | .84 | .78 | .83 | .88 | .84 | .81 | .83 | .84 | .78 | .83 |
| | METH | .81 | .83 | .80 | .81 | .80 | .85 | .80 | .78 | .81 | .81 | .82 | .83 |
| | RES | .88 | .90 | .88 | .89 | .88 | .91 | .89 | .89 | .90 | .90 | .90 | .89 |
| | CON | .84 | .83 | .77 | .83 | .86 | .88 | .86 | .87 | .88 | .89 | .88 | .81 |
| | REL | - | - | - | - | - | - | - | - | - | - | - | - |
| | FUT | - | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| S3 | HYP | - | - | - | - | - | - | - | - | - | - | - | - |
| | MOT | .82 | .84 | .80 | .76 | .82 | .82 | .83 | .78 | .83 | .83 | .83 | .83 |
| | BKG | .59 | .60 | .60 | .54 | .67 | .62 | .62 | .59 | .61 | .61 | .62 | .61 |
| | GOAL | .62 | .67 | .67 | .62 | .71 | .62 | .67 | .43 | .67 | .67 | .67 | .62 |
| | OBJT | .88 | .85 | .83 | .74 | .83 | .85 | .83 | .74 | .83 | .83 | .83 | .85 |
| | EXP | .72 | .68 | .72 | .53 | .65 | .70 | .72 | .73 | .74 | .74 | .72 | .68 |
| | MOD | - | - | - | - | - | - | - | - | - | - | - | - |
| | METH | .87 | .86 | .87 | .66 | .85 | .89 | .87 | .88 | .86 | .86 | .87 | .86 |
| | OBS | .82 | .81 | .84 | .72 | .80 | .82 | .81 | .80 | .82 | .82 | .81 | .81 |
| | RES | .87 | .87 | .88 | .74 | .87 | .86 | .87 | .86 | .87 | .87 | .87 | .88 |
| | CON | .88 | .88 | .82 | .88 | .83 | .87 | .87 | .84 | .87 | .88 | .87 | .86 |
A-K: History, Location, Word, Bi-gram, Verb, Verb Class, POS, GR, Subj, Obj, Voice
FUT in S2 scores 1.0, probably because the amount of training data for this category is just right and the model does not overfit it. We make this assumption because almost all categories score 1.0 on the training data, whereas FUT is the only category that also scores 1.0 on the test data.
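The overfitting argument can be made concrete by looking at the gap between training and test scores per category. The following is a minimal sketch, not the procedure used in the experiments: the test scores are the S2 values from column A of Table 9, while the training scores are a placeholder of 1.0 for every category (the text only says that "almost all" categories reach 1.0 on the training data, so the exact values are an assumption). The helper name `generalization_gap` is hypothetical.

```python
# Test-set scores for S2, copied from column A of Table 9.
test_scores = {
    "BKG": 0.94, "OBJ": 0.78, "METH": 0.83,
    "RES": 0.90, "CON": 0.83, "FUT": 1.0,
}

# ASSUMPTION: per-category training scores are not reported in the text.
# 1.0 is used as a placeholder, following the statement that almost all
# categories reach 1.0 on the training data.
train_scores = {category: 1.0 for category in test_scores}


def generalization_gap(train, test):
    """Return the train-minus-test score for each category (two decimals)."""
    return {category: round(train[category] - test[category], 2)
            for category in test}


gaps = generalization_gap(train_scores, test_scores)

# Categories whose test score matches the training score show no sign
# of overfitting under this simple check.
no_gap = [category for category, gap in gaps.items() if gap == 0.0]

for category, gap in gaps.items():
    print(f"{category}: gap = {gap:.2f}")
print("No train/test gap:", no_gap)
```

Under these assumed training scores, FUT is the only category with a zero gap, which is the pattern the paragraph above describes. A large gap for a category is only suggestive of overfitting; it does not by itself establish the cause.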