Table 2.
Lift results for the full data set. m is the number of top sentences from each cluster to be manually reviewed.
| lift@m | Number of clusters | Batch percentile 1% (approximately 100 sentences) | Batch percentile 10% (approximately 1000 sentences) | Batch percentile 20% | Batch percentile 40% |
|---|---|---|---|---|---|
| lift@5 | 200 | 1.36 | 1.36<sup>a</sup> | 1.29 | 1.17 |
| lift@10 | 130 | 1.23 | 1.31 | 1.3 | 1.17 |
| lift@15 | 100 | 1.49 | 1.27 | 1.22 | 1.16 |
<sup>a</sup> The best-performing set of parameters for a given batch percentile is italicized.