Table 7.
Topic 200: keyword frequency rank
| # | Term | Term count | Percentage | Rank |
|---|---|---|---|---|
| 1 | Lupus | 869 | 23.80% | 1 |
| 2 | Diseas | 753 | 20.70% | 2 |
| 3 | Activ | 496 | 13.60% | 3 |
| 4 | Associ | 476 | 13.10% | 4 |
| 5 | Serum | 294 | 8.10% | 5 |
| 6 | High | 274 | 7.50% | 6 |
| 7 | Protein | 195 | 5.40% | 7 |
| 8 | Express | 179 | 4.90% | 8 |
| 9 | Chang | 108 | 3.00% | 9 |
The table gives the original keyword information for Topic 200: (1) terms are extracted with stemming; (2) term counts are taken from the first-round retrieval, i.e. the top 1000 passages retrieved by the baseline; (3) each percentage is the term's count as a share of the total count of the 9 terms; (4) ranks are assigned by term count in descending order; (5) the parameters for this baseline are (k1, b) = (2.0, 0.4) with the paragraph-based index.
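The Percentage and Rank columns can be reproduced from the term counts alone. The following is a minimal sketch, assuming that each percentage is the term count divided by the sum of the 9 counts and rounded to one decimal place; the function name `frequency_table` is hypothetical and not taken from the original system, and ties between counts (which do not occur in this table) are not handled specially.

```python
# Term counts copied from Table 7 (stemmed terms, top 1000 baseline passages).
term_counts = {
    "Lupus": 869,
    "Diseas": 753,
    "Activ": 496,
    "Associ": 476,
    "Serum": 294,
    "High": 274,
    "Protein": 195,
    "Express": 179,
    "Chang": 108,
}


def frequency_table(counts):
    """Return (rank, term, count, percentage) rows, highest count first.

    Assumption: percentage = 100 * count / sum of all listed counts,
    rounded to one decimal place.
    """
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, term, count, round(100.0 * count / total, 1))
        for rank, (term, count) in enumerate(ordered, start=1)
    ]


rows = frequency_table(term_counts)
for rank, term, count, pct in rows:
    print(f"{rank} {term:<8} {count:>4} {pct:.1f}%")
```

Under this assumption the 9 counts sum to 3644, and the computed shares agree with the Percentage column of the table.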