Table 3. Summary of the top 20 features from the ERα model, with their corresponding SMARTS patterns and substructure descriptions.
The top features were taken from the feature importance plot of the RF model.
| Features | SMARTS pattern | Substructure description |
|---|---|---|
| PubChemFP199 | >= 4 any ring size 6 | Greater than or equal to 4 six-membered rings |
| PubChemFP193 | >= 3 saturated or aromatic carbon-only ring size 6 | Greater than or equal to 3 saturated or aromatic carbon-only six-membered rings |
| PubChemFP714 | Cc1ccc(O)cc1 | 4-methylphenol |
| PubChemFP2 | >= 16 H | Greater than or equal to sixteen hydrogen atoms |
| PubChemFP345 | C(~C)(~H)(~N) | Ethylamine |
| PubChemFP697 | C-C-C-C-C-C(C)-C | 2-methylheptane |
| PubChemFP777 | CC1CCC(O)CC1 | 4-methylcyclohexanol |
| PubChemFP540 | C-N-C-[#1] | C-N-C fragment with a hydrogen on the terminal carbon (e.g., dimethylamine) |
| PubChemFP259 | >= 3 aromatic rings | Greater than or equal to 3 aromatic rings |
| PubChemFP804 | OC1CC(S)CCC1 | 3-sulfanylcyclohexanol |
| PubChemFP12 | >= 16 C | Greater than or equal to sixteen carbon atoms |
| PubChemFP365 | C(~H)(~N) | Methanamine |
| PubChemFP453 | N(-C)(=C) | N-methylmethanimine |
| PubChemFP391 | N(~C)(~C)(~C) | N,N-dimethylmethanamine |
| PubChemFP741 | Oc1cc(S)ccc1 | 3-sulfanylphenol |
| PubChemFP696 | C-C-C-C-C-C-C-C | Octane |
| PubChemFP622 | O=C-O-C:C | Aryl ester (e.g., phenyl formate) |
| PubChemFP418 | C=N | Methanimine |
| PubChemFP532 | S-C:C-[#1] | Sulfur on an aromatic carbon adjacent to an aromatic C-H (e.g., benzenethiol) |
| PubChemFP500 | C-S-C:C | Aryl sulfide (e.g., methylsulfanylbenzene) |
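
The count-based entries in Table 3 (PubChemFP2 and PubChemFP12) are simple thresholds on element counts, so they can be illustrated without a cheminformatics toolkit. The sketch below is only an illustration under stated assumptions: it reads plain molecular formula strings rather than structures, and the helper names `element_counts`, `count_bits`, and `COUNT_BITS` are hypothetical, not part of any fingerprint software. Estradiol (C18H24O2) and phenol (C6H6O) serve as example inputs.

```python
import re

# Hypothetical subset of the count-based bits in Table 3: (element, minimum count).
COUNT_BITS = {
    "PubChemFP2": ("H", 16),   # >= 16 H
    "PubChemFP12": ("C", 16),  # >= 16 C
}


def element_counts(formula):
    """Count atoms per element in a simple formula such as 'C18H24O2'.

    Assumes no parentheses, charges, or isotope labels.
    """
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def count_bits(formula):
    """Return True/False for each count-based bit defined in COUNT_BITS."""
    counts = element_counts(formula)
    return {
        bit: counts.get(element, 0) >= minimum
        for bit, (element, minimum) in COUNT_BITS.items()
    }


estradiol_bits = count_bits("C18H24O2")  # estradiol
phenol_bits = count_bits("C6H6O")        # phenol

print(estradiol_bits)
print(phenol_bits)
```

Under these assumptions both bits are set for estradiol and neither is set for phenol. The substructure bits in the table (for example `Cc1ccc(O)cc1`) would instead require SMARTS matching against the molecular graph, which a formula string cannot capture.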