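Table 2. Performance summary of the classification models for ERα and ERβ. Ac: accuracy (%); Sn: sensitivity (%); Sp: specificity (%); MCC: Matthews correlation coefficient. The suffixes Tr, CV, and Test denote the training set, 5-fold cross-validation, and the testing set, respectively.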
| Fingerprint | Training set | | | | 5-fold cross-validation | | | | Testing set | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | AcTr | SnTr | SpTr | MCCTr | AcCV | SnCV | SpCV | MCCCV | AcTest | SnTest | SpTest | MCCTest |
| **ERα** | | | | | | | | | | | | |
| 2D Atom Pairs | 96.78 | 95.86 | 97.90 | 0.94 | 83.90 | 84.14 | 83.61 | 0.68 | 84.73 | 98.57 | 68.85 | 0.72 |
| 2D Atom Pairs Count | 100.00 | 100.00 | 100.00 | 1.00 | 87.31 | 87.50 | 87.08 | 0.74 | 97.71 | 98.57 | 96.72 | 0.95 |
| E-state | 89.01 | 89.47 | 88.48 | 0.78 | 83.33 | 84.95 | 81.53 | 0.67 | 90.08 | 94.29 | 85.25 | 0.80 |
| CDK | 96.59 | 96.82 | 96.33 | 0.93 | 88.83 | 87.84 | 90.09 | 0.78 | 93.89 | 100.00 | 86.89 | 0.88 |
| CDK Extended | 99.24 | 99.29 | 99.18 | 0.98 | 89.39 | 89.00 | 89.87 | 0.79 | 94.66 | 98.57 | 90.16 | 0.89 |
| CDK Graph Only | 97.34 | 97.19 | 97.53 | 0.95 | 86.93 | 86.64 | 87.29 | 0.74 | 86.26 | 98.57 | 72.13 | 0.74 |
| Klekota-Roth | 93.56 | 95.60 | 91.37 | 0.87 | 86.93 | 89.34 | 84.38 | 0.74 | 90.84 | 88.57 | 93.44 | 0.82 |
| Klekota-Roth Count | 95.83 | 98.15 | 93.39 | 0.92 | 88.83 | 91.18 | 86.33 | 0.78 | 86.26 | 81.43 | 91.80 | 0.73 |
| MACCS | 96.40 | 96.15 | 96.69 | 0.93 | 84.66 | 85.31 | 83.88 | 0.69 | 96.95 | 95.71 | 98.36 | 0.94 |
| PubChem | 96.40 | 96.15 | 96.69 | 0.93 | 87.69 | 86.09 | 89.82 | 0.75 | 94.66 | 100.00 | 88.52 | 0.90 |
| Substructure | 92.99 | 93.93 | 91.94 | 0.86 | 83.52 | 86.57 | 80.38 | 0.67 | 93.89 | 97.14 | 90.16 | 0.88 |
| Substructure Count | 91.28 | 93.09 | 89.33 | 0.83 | 82.95 | 85.87 | 79.92 | 0.66 | 93.89 | 97.14 | 90.16 | 0.88 |
| **ERβ** | | | | | | | | | | | | |
| 2D Atom Pairs | 96.68 | 96.32 | 98.18 | 0.90 | 86.01 | 87.53 | 77.11 | 0.55 | 88.73 | 99.10 | 51.61 | 0.65 |
| 2D Atom Pairs Count | 99.65 | 99.55 | 100.00 | 0.99 | 88.46 | 90.11 | 80.41 | 0.64 | 92.96 | 100.00 | 67.74 | 0.79 |
| E-state | 94.06 | 94.03 | 94.17 | 0.82 | 87.24 | 89.45 | 76.53 | 0.60 | 90.14 | 98.20 | 61.29 | 0.69 |
| CDK | 99.13 | 99.11 | 99.18 | 0.97 | 90.03 | 91.14 | 84.69 | 0.69 | 95.77 | 98.20 | 87.10 | 0.87 |
| CDK Extended | 98.95 | 98.89 | 99.17 | 0.97 | 90.73 | 92.09 | 84.62 | 0.72 | 94.37 | 99.10 | 77.42 | 0.83 |
| CDK Graph Only | 96.85 | 96.53 | 98.20 | 0.91 | 87.41 | 89.31 | 77.89 | 0.61 | 91.55 | 98.20 | 67.74 | 0.74 |
| Klekota-Roth | 98.25 | 98.23 | 98.32 | 0.95 | 89.16 | 90.87 | 81.19 | 0.66 | 94.37 | 99.10 | 77.42 | 0.83 |
| Klekota-Roth Count | 98.95 | 98.89 | 99.17 | 0.97 | 90.38 | 91.53 | 85.00 | 0.70 | 94.37 | 99.10 | 77.42 | 0.83 |
| MACCS | 99.48 | 99.78 | 98.41 | 0.98 | 88.81 | 91.18 | 78.50 | 0.66 | 95.07 | 99.10 | 80.65 | 0.85 |
| PubChem | 98.25 | 98.02 | 99.15 | 0.95 | 90.38 | 91.53 | 85.00 | 0.70 | 92.25 | 99.10 | 67.74 | 0.76 |
| Substructure | 95.10 | 94.67 | 97.09 | 0.85 | 87.94 | 89.71 | 79.17 | 0.62 | 94.37 | 99.10 | 77.42 | 0.83 |
| Substructure Count | 99.30 | 99.33 | 99.19 | 0.98 | 89.34 | 91.24 | 80.77 | 0.67 | 95.77 | 99.10 | 83.87 | 0.87 |
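
The four statistics reported in Table 2 can all be derived from a binary confusion matrix. The snippet below is a minimal sketch of that calculation using the standard definitions of accuracy, sensitivity, specificity, and the Matthews correlation coefficient. The function name `classification_metrics` is hypothetical, and the confusion-matrix counts in the usage example are not given in the source: they are an assumption, back-calculated so that they reproduce the testing-set row for the ERα 2D Atom Pairs model.

```python
import math


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return Ac, Sn, Sp (as percentages) and MCC from confusion-matrix counts."""
    total = tp + fn + tn + fp
    ac = 100.0 * (tp + tn) / total   # accuracy: fraction of all compounds classified correctly
    sn = 100.0 * tp / (tp + fn)      # sensitivity: fraction of positives recovered
    sp = 100.0 * tn / (tn + fp)      # specificity: fraction of negatives recovered
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Ac": ac, "Sn": sn, "Sp": sp, "MCC": mcc}


# Assumed counts (inferred from the reported percentages, not stated in the source):
# 70 positives with 69 predicted correctly, 61 negatives with 42 predicted correctly.
m = classification_metrics(tp=69, fn=1, tn=42, fp=19)
print(f"Ac={m['Ac']:.2f} Sn={m['Sn']:.2f} Sp={m['Sp']:.2f} MCC={m['MCC']:.2f}")
# prints: Ac=84.73 Sn=98.57 Sp=68.85 MCC=0.72
```

With these assumed counts the model misclassifies 19 of 61 negatives but only 1 of 70 positives, which illustrates why a high Sn together with a low Sp pulls MCC well below what the accuracy alone would suggest.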