2008 Jul 1;24(13):i286–i294. doi: 10.1093/bioinformatics/btn183

Table 1.

Categories of predicates on observed tokens

| Predicate | Example | Predicate | Example | Predicate | Example |
|---|---|---|---|---|---|
| Word | proteins | Hyphen | - | Nucleoside | Thymine |
| StemmedWord | protein | BackSlash | / | Nucleotide | ATP |
| PartOfSpeech | NN | OpenSquare | [ | Roman | I, II, XI |
| InitCap | Kinase | CloseSquare | ] | MorphologyTypeI | p53→p* |
| EndCap | kappaB | Colon | : | MorphologyTypeII | p53→a1 |
| AllCaps | SOX | SemiColon | ; | MorphologyTypeIII | GnRH→AaAA |
| LowerCase | interleukin | Percent | % | WordLength | 1, 2, 3-5, 6+ |
| MixCase | RalGDS | OpenParen | ( | N-grams(2-4) | p53→{p5, 53} |
| SingleCap | kDa | CloseParen | ) | ATCGUsequence | ATCGU |
| TwoCap | IL | Comma | , | Greek | alpha |
| ThreeCap | CSF | FullStop | . | NucleicAcid | cDNA |
| MoreCap | RESULT | Apostrophe | ' | AminoAcidLong | tyrosine |
| SingleDigit | 1 | QuotationMark | ‘,’ | AminoAcidShort | Ser |
| TwoDigit | 22 | Star | * | AminoAcid+Position | Ser150 |
| FourDigit | 1983 | Equal | = | | |
| MoreDigit | 513256 | Plus | + | | |
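
To make a few of the predicates in Table 1 concrete, the following Python sketch computes some of them for a single token. The table gives only one example per predicate, so the rules here are assumptions inferred from those examples: MorphologyTypeI is taken to collapse digit runs to `*`, MorphologyTypeII to collapse letter runs to `a` and digit runs to `1`, and MorphologyTypeIII to map each character to `A`, `a` or `1` without collapsing. All function names are hypothetical and are not taken from the paper.

```python
import re


def morphology_type1(token):
    # Assumed rule: collapse every run of digits to "*" (p53 -> p*).
    return re.sub(r"\d+", "*", token)


def morphology_type2(token):
    # Assumed rule: letter runs become "a", digit runs become "1" (p53 -> a1).
    shape = re.sub(r"[A-Za-z]+", "a", token)
    return re.sub(r"\d+", "1", shape)


def morphology_type3(token):
    # Assumed rule: per-character shape without collapsing (GnRH -> AaAA).
    # Mapping digits to "1" is a guess; the table shows no digit example.
    out = []
    for ch in token:
        if ch.isupper():
            out.append("A")
        elif ch.islower():
            out.append("a")
        elif ch.isdigit():
            out.append("1")
        else:
            out.append(ch)
    return "".join(out)


def char_ngrams(token, lo=2, hi=4):
    # Character n-grams for n = lo..hi, shortest first.
    return [token[i:i + n]
            for n in range(lo, hi + 1)
            for i in range(len(token) - n + 1)]


def length_bucket(n):
    # WordLength buckets as listed in the table: 1, 2, 3-5, 6+.
    if n <= 2:
        return str(n)
    return "3-5" if n <= 5 else "6+"


def token_predicates(token):
    # A small subset of the table's predicates for one non-empty token.
    return {
        "InitCap": token[0].isupper(),
        "EndCap": token[-1].isupper(),
        "AllCaps": token.isalpha() and token.isupper(),
        "MorphologyTypeI": morphology_type1(token),
        "MorphologyTypeII": morphology_type2(token),
        "MorphologyTypeIII": morphology_type3(token),
        "WordLength": length_bucket(len(token)),
        "N-grams(2-4)": char_ngrams(token),
    }


if __name__ == "__main__":
    for tok in ["p53", "GnRH", "kappaB"]:
        print(tok, token_predicates(tok))
```

For `p53` the n-gram function also returns the trigram `p53`; the table's example lists only the two bigrams.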