Table 6. Macrofactors selected by the multilevel factor elimination (MFE) algorithm in the Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT) model across different datasets.
| MFE algorithm | Revised JNLPBA^a dataset | BC5CDR^b dataset | AnatEM^c dataset |
| --- | --- | --- | --- |
| Input | sLen^d, eLen^e, eNum^f, eDen^g, elCon^h, and tEWC^i | sLen, eLen, eNum, eDen, elCon, and tEWC | sLen, eLen, eNum, eDen, elCon, and tEWC |
| Layer 1 | eLen, eNum, elCon, and tEWC | eLen, eNum, eDen, and tEWC | sLen, eLen, eNum, and eDen |
| Layer 2 | eLen and eNum | eNum and tEWC | eLen and eNum |
| Layer 3 | eNum | eNum | eNum |
^a JNLPBA: Joint Workshop on Natural Language Processing in Biomedicine and its Applications.
^b BC5CDR: BioCreative V Chemical Disease Relation.
^c AnatEM: Anatomical Entity Mention.
^d sLen: sentence length.
^e eLen: entity phrase length.
^f eNum: number of entity words in each entity phrase.
^g eDen: entity density.
^h elCon: entity label consistency.
^i tEWC: total entity word count in each entity type.
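The six macrofactors footnoted above are surface statistics of annotated sentences. The sketch below is a minimal, illustrative way to compute token-based versions of them from a single BIO-tagged sentence; the precise definitions used by the MFE algorithm (for example, whether eLen is measured in characters or tokens, and how elCon and tEWC are aggregated over a corpus rather than a sentence) are assumptions here, not the paper's implementation.

```python
# Hedged sketch: per-sentence macrofactors from BIO tags.
# The exact formulas are assumptions, not the MFE paper's implementation.
from collections import Counter, defaultdict

def extract_entities(tokens, tags):
    """Group BIO tags into (entity_type, [entity_tokens]) spans."""
    spans, current_type, current = [], None, []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((current_type, current))
            current_type, current = tag[2:], [tok]
        elif tag.startswith("I-") and current:
            current.append(tok)
        else:
            if current:
                spans.append((current_type, current))
            current_type, current = None, []
    if current:
        spans.append((current_type, current))
    return spans

def macrofactors(tokens, tags, label_history=None):
    """Return assumed versions of the six macrofactors for one sentence.

    label_history maps an entity surface form to a Counter of labels seen
    across the corpus; it feeds the label-consistency estimate (elCon).
    """
    spans = extract_entities(tokens, tags)
    s_len = len(tokens)                                         # sLen: sentence length in tokens
    e_num = [len(words) for _, words in spans]                  # eNum: words per entity phrase
    e_len = [sum(len(w) for w in words) for _, words in spans]  # eLen: phrase length in characters
    e_den = sum(e_num) / s_len if s_len else 0.0                # eDen: entity-token density

    # elCon: share of corpus-wide mentions of each surface form that carry
    # the label observed here (1.0 when the form has not been seen before).
    el_con = []
    for etype, words in spans:
        counts = (label_history or {}).get(" ".join(words), Counter())
        total = sum(counts.values())
        el_con.append(counts[etype] / total if total else 1.0)

    # tEWC: total entity word count per entity type (here, within the sentence).
    t_ewc = defaultdict(int)
    for etype, words in spans:
        t_ewc[etype] += len(words)

    return {"sLen": s_len, "eLen": e_len, "eNum": e_num,
            "eDen": e_den, "elCon": el_con, "tEWC": dict(t_ewc)}

# Toy BC5CDR-style sentence with hypothetical tags.
tokens = ["Cisplatin", "induced", "acute", "renal", "failure", "."]
tags = ["B-Chemical", "O", "B-Disease", "I-Disease", "I-Disease", "O"]
print(macrofactors(tokens, tags))
```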