Table 3.
Setting | # patterns | Avg. pattern length | Development (per split): Precision | Development (per split): Recall | Development (per split): F1 | Test: Precision | Test: Recall | Test: F1 |
---|---|---|---|---|---|---|---|---|
Baseline | 590 | 8.93 | 24.7 | 49.2 | 32.9 | 17.2 | 43.9 | 24.8 |
Split 1 | 50 | 5.34 | 65.6 | 51.8 | 57.9 | 64.7 | 42.7 | 51.4 |
Split 2 | 50 | 4.86 | 78.1 | 52.3 | 62.6 | 63.0 | 37.8 | 47.3 |
Split 3 | 60 | 4.68 | 67.6 | 52.9 | 59.3 | 60.9 | 42.5 | 50.1 |
Split 4 | 40 | 5.02 | 67.7 | 49.5 | 57.2 | 66.6 | 36.7 | 47.3 |
Split 5 | 50 | 4.80 | 63.7 | 48.7 | 55.2 | 64.2 | 40.7 | 49.8 |
Union of patterns | 104 | 5.65 | | | | 58.2 | 46.8 | 51.9 |
Best 90 | 90 | 5.66 | | | | 59.7 | 45.1 | 51.4 |
Best 80 | 80 | 5.75 | | | | 64.8 | 37.7 | 47.6 |
Best 70 | 70 | 6.01 | | | | 69.4 | 26.7 | 38.6 |
Best 60 | 60 | 6.17 | | | | 60.0 | 10.0 | 17.1 |
Results of the winner of the shared task [21] | | | | | | 78.5 | 69.8 | 73.9 |
The splits are defined in the text, in the Results section (Evaluation of Test Data).
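
The F1 columns in Table 3 appear to follow the usual definition, the harmonic mean of precision and recall; the table itself does not state this, so it is an assumption here. The short sketch below recomputes F1 for a few rows copied from the table. The function name `f1_score` and the row labels are illustrative only. Because the reported precision and recall are already rounded to one decimal, a recomputed value can differ from the reported one by 0.1 in some rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from rows of Table 3
rows = {
    "Baseline (development)": (24.7, 49.2, 32.9),
    "Split 1 (test)": (64.7, 42.7, 51.4),
    "Winner of the shared task": (78.5, 69.8, 73.9),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed {f1_score(p, r):.1f}, reported {reported}")
```

A check of this kind is a quick way to catch transcription errors when copying results tables.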