Table 1.
Average AUC values using different prediction models and different training set sizes
Models \ Training set size | 1% | 5% | 10% | 25% |
---|---|---|---|---|
NCI antiviral dataset | | | | |
MCS-based | 57.9 (3.0) | 64.0 (2.4) | 67.0 (1.3) | 70.0 (0.9) |
AP-based | 58.2 (3.1) | 63.7 (1.8) | 65.8 (1.8) | 68.9 (1.5) |
Hybrid | 61.3 (3.4) | 66.7 (1.9) | 69.2 (1.3) | 71.6 (1.2) |
NCI anticancer dataset | | | | |
MCS-based | 60.3 (2.8) | 65.4 (1.8) | 68.0 (1.7) | 70.9 (1.3) |
AP-based | 59.3 (3.3) | 65.2 (1.8) | 67.8 (1.7) | 70.9 (1.8) |
Hybrid | 62.7 (3.2) | 69.2 (1.8) | 71.8 (1.4) | 74.8 (1.2) |
The MCS-based model uses the absolute MCS sizes to represent a chemical structure as a vector. The AP-based model uses the AP-based similarity, and the hybrid model concatenates the vectors from both previous models. SDs are given in parentheses.
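
The concatenation step of the hybrid model can be illustrated with a minimal sketch. It assumes that each compound has already been compared against the same set of reference compounds, giving one vector of absolute MCS sizes and one vector of AP-based similarities. The function name `hybrid_vector` and all numeric values are hypothetical and serve only as illustration; they are not taken from the study.

```python
from typing import List, Sequence


def hybrid_vector(mcs_sizes: Sequence[float],
                  ap_similarities: Sequence[float]) -> List[float]:
    """Concatenate an MCS-based vector and an AP-based vector.

    Both inputs are assumed to be precomputed for one compound
    against the same reference compounds (an assumption of this sketch).
    """
    return list(mcs_sizes) + list(ap_similarities)


# Made-up values for one compound against three reference compounds.
mcs_vec = [12, 7, 9]          # absolute MCS sizes (hypothetical)
ap_vec = [0.41, 0.18, 0.33]   # AP-based similarities (hypothetical)

hybrid = hybrid_vector(mcs_vec, ap_vec)
print(len(hybrid))  # the hybrid vector has one entry per input entry
```

In this sketch the hybrid vector is simply twice as long as either input, so any classifier that accepts the individual vectors can be trained on the concatenated one without further changes.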