Table 6.
Step | Presence of ... | Example | P | R | F | Modified |
0 | | | 0.770 | 0.673 | 0.718 | 0 |
1 | Gene chromosome location | 3p11-3p12.1 | 0.772 | 0.673 | 0.719 | 34 |
2 | Single, short lowercase word | heme | 0.778 | 0.672 | 0.721 | 112 |
3 | Strings of only numbers &/or punct | 9+/-76 | 0.779 | 0.672 | 0.722 | 206 |
4 | Extra preceding words | protein SNF to SNF | 0.790 | 0.681 | 0.731 | 225ᵃ |
5 | Extra trailing words | SNF protein to SNF | 0.812 | 0.723 | 0.765 | 419ᵃ |
6 | Amino acids | Ser-119 | 0.815 | 0.723 | 0.766 | 460 |
7 | Protein families | Bcl-2 family proteins | 0.816 | 0.722 | 0.766 | 701 |
8 | Protein domains, motifs, fusion | SNH domain | 0.828 | 0.722 | 0.771 | 883 |
9 | Nonhuman keywords | rat IFN gamma | 0.829 | 0.725 | 0.774 | 1,086ᵃ |
Results depicted here are from the development dataset. Step 0 indicates performance before application of any rules. At each step, the rules of preceding steps are also applied. Modified refers to the cumulative number of gene mentions removed or altered. ᵃ Rules 4 and 5 result in modification of gene mentions only. Rule 9 can result in either modification or removal of gene mentions. All other rules result in removal of gene mentions. GN, gene normalization.
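
As a consistency check on the table, F can be recomputed from P and R. The sketch below assumes that P, R and F denote precision, recall and the balanced F-measure (the harmonic mean of P and R), which the table itself does not spell out. The names `f_measure`, `ROWS` and `mismatches` are illustrative and not taken from the source.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (step, P, R, F) copied from Table 6
ROWS = [
    (0, 0.770, 0.673, 0.718),
    (1, 0.772, 0.673, 0.719),
    (2, 0.778, 0.672, 0.721),
    (3, 0.779, 0.672, 0.722),
    (4, 0.790, 0.681, 0.731),
    (5, 0.812, 0.723, 0.765),
    (6, 0.815, 0.723, 0.766),
    (7, 0.816, 0.722, 0.766),
    (8, 0.828, 0.722, 0.771),
    (9, 0.829, 0.725, 0.774),
]


def mismatches(rows, tol=0.001):
    """Return the steps whose reported F differs from the recomputed F by more than tol."""
    return [step for step, p, r, f in rows if abs(f_measure(p, r) - f) > tol]


if __name__ == "__main__":
    print(mismatches(ROWS))
```

The tolerance allows for the fact that P and R are themselves rounded to three decimals. Under the assumed definition, every row of the table agrees with its reported F within that tolerance.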