Table 2. Rules for the strict filtering procedure
| No. | Rule | Examples |
|---|---|---|
| 1 | SpecialSymbolsRule: True if a token contains at least one special symbol other than . - , / : ( ) [ ] + = @ ® | SIZE(**), SELECTIVITY%, NIMG_650, H2S↔35SCAT, 1AUDAE_AM, ΔGADS, H0 ≦−8.2 |
| 2 | StopListRule: True if a token is in the stop list (Table 1) | LITERATURE, VIEWPOINT, PERCENT, PRESENT, IMPORTANCE, FUNDAMENTAL, CONCLUSION, TYPICALLY, EXAMPLE, INTRODUCTION |
| Regular-expression rules: True if a token matches at least one of the regular expressions in the following list | | |
| 3 | 4DigitRule: True if a token contains four or more digits in succession | FQM-3994, RYC-2008-03387, 20000H-1, MAT2010-21147, CO(0001)-CARBIDE, CO(111)/CO(0001), RU(0001) ELECTRODE |
| 4 | 3DigitRule: True if a token contains three digits in succession | 215KMTA, 220ML, 148H-1, CU2O(111), AU{111}-CEO2{100}, MGO/AG(100) |
| | 2DigitRule: True if a token begins with one or two digits | 12C16O-13C16O, 31P{1H}, 2-PROPANOL, 2-METHYL-1-BUTENE, 3-METHYL-1,3-BUTADIENE, 15 %H3PW12O40/TIO2 |
| 5 | UnitsRule: True if a token ends with a string from the dictionary of measurement units (Table 1) | KJMOL-1, MMOL.MIN-1, KJ.MOL-1, G.GZEOLITE-1.H-1, CM3.MIN-1.G-1 |
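
The rules in Table 2 could be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `STOP_LIST`, `UNIT_SUFFIXES`, `fired_rules` and `is_filtered` are hypothetical. The stop list and the unit dictionary are small stand-ins for the Table 1 dictionaries, which are not reproduced here. The sketch also assumes that tokens are compared in upper case and that ASCII letters, digits and spaces count as ordinary characters, so that any other character outside the allowed punctuation is treated as a special symbol.

```python
import re

# Hypothetical stand-ins for the Table 1 dictionaries (assumption).
STOP_LIST = {"LITERATURE", "VIEWPOINT", "PERCENT", "CONCLUSION", "EXAMPLE"}
UNIT_SUFFIXES = ("MOL-1", "MIN-1", "H-1", "G-1")

# Punctuation that rule 1 explicitly allows.
ALLOWED_SYMBOLS = set(".-,/:()[]+=@®")
# Assumption: ASCII upper-case letters, digits and spaces are ordinary.
ORDINARY = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ")

# Rules 3-4 and the 2-digit rule, written as regular expressions.
REGEX_RULES = {
    "4DigitRule": re.compile(r"[0-9]{4,}"),    # four or more digits in a row
    "3DigitRule": re.compile(r"[0-9]{3}"),     # three digits in a row
    "2DigitRule": re.compile(r"^[0-9]{1,2}"),  # token starts with a digit
}


def fired_rules(token):
    """Return the names of all rules that evaluate to True for the token."""
    token = token.upper()
    fired = []
    if any(ch not in ORDINARY and ch not in ALLOWED_SYMBOLS for ch in token):
        fired.append("SpecialSymbolsRule")
    if token in STOP_LIST:
        fired.append("StopListRule")
    for name, pattern in REGEX_RULES.items():
        if pattern.search(token):
            fired.append(name)
    if token.endswith(UNIT_SUFFIXES):
        fired.append("UnitsRule")
    return fired


def is_filtered(token):
    """A token is rejected if at least one rule fires."""
    return bool(fired_rules(token))
```

Calling `fired_rules("KJ.MOL-1")` reports only the units rule, while a token such as `"FQM-3994"` triggers both the four-digit and the three-digit pattern, because a run of four digits also contains three digits in succession. Since a token is rejected as soon as any rule fires, this overlap does not change the outcome of `is_filtered`.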