2015 Jan 19;7(Suppl 1):S8. doi: 10.1186/1758-2946-7-S1-S8

Table 2.

Orthographic features used in our system.

Feature        Regular Expression
ALLCAPS        ^[A-Z]+$
INITCAP        ^[A-Z].*
HASCAP         ^.*[A-Z].*$
SINGLECAP      ^[A-Z]$
PUNCTATION     ^[,;:\'\"]$
INITDIGIT      ^[0-9].*
SINGLEDIGIT    ^[0-9]$
ALPHANUM       .*[A-Za-z].*[0-9].*|.*[0-9].*[A-Za-z].*
CAPSMIX        .*[A-Z].*[a-z].*|.*[a-z].*[A-Z].*
MANY_NUM       ^[0-9]{1,2}(,[0-9]{1,2})+$
REAL_NUM       ^-?[0-9]+[\.][0-9]+$
INDASH         ^([\w+][\-]+)+\w+$
HASDIGIT       .*[0-9].*
IS_DASH        ^[-]+$
ROMAN          ^[IVXDLCM]+$
END_PUNC       ^[.?!]$
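
As an illustration of how the features in Table 2 might be applied to a single token, the following is a minimal sketch in Python. The regular expressions are transcribed verbatim from the table. Everything else is an assumption made for the example and is not taken from the source: the names `ORTHOGRAPHIC_PATTERNS` and `orthographic_features`, the use of Python's `re` module, and the choice to require that a pattern match the whole token (`re.fullmatch`).

```python
import re

# Patterns copied verbatim from Table 2. The dictionary name and the
# ordering are illustrative choices, not part of the original system.
ORTHOGRAPHIC_PATTERNS = {
    "ALLCAPS": r"^[A-Z]+$",
    "INITCAP": r"^[A-Z].*",
    "HASCAP": r"^.*[A-Z].*$",
    "SINGLECAP": r"^[A-Z]$",
    "PUNCTATION": r"^[,;:\'\"]$",
    "INITDIGIT": r"^[0-9].*",
    "SINGLEDIGIT": r"^[0-9]$",
    "ALPHANUM": r".*[A-Za-z].*[0-9].*|.*[0-9].*[A-Za-z].*",
    "CAPSMIX": r".*[A-Z].*[a-z].*|.*[a-z].*[A-Z].*",
    "MANY_NUM": r"^[0-9]{1,2}(,[0-9]{1,2})+$",
    "REAL_NUM": r"^-?[0-9]+[\.][0-9]+$",
    "INDASH": r"^([\w+][\-]+)+\w+$",
    "HASDIGIT": r".*[0-9].*",
    "IS_DASH": r"^[-]+$",
    "ROMAN": r"^[IVXDLCM]+$",
    "END_PUNC": r"^[.?!]$",
}

# Compile once so that repeated lookups over many tokens stay cheap.
_COMPILED = {name: re.compile(rx) for name, rx in ORTHOGRAPHIC_PATTERNS.items()}


def orthographic_features(token):
    """Return the names of all features whose pattern matches the whole token.

    Assumption: whole-token matching via fullmatch; the source does not
    state which matching mode the original system used.
    """
    return [name for name, rx in _COMPILED.items() if rx.fullmatch(token)]


if __name__ == "__main__":
    for tok in ["IV", "3.14", "Bcl-2", "-", "?"]:
        print(tok, orthographic_features(tok))
```

A token usually fires several features at once, so the function returns a list rather than a single label; in a sequence-labelling setting each returned name would typically become one binary feature for that token.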