Table 2. Performance of the gene signature obtained during the formulation and training of the cumulus cell support vector machine on the external validation data set.
Training set | # genes | Classification model | Part 1 accuracy (%): Overall | Part 1 accuracy (%): LB | Part 1 accuracy (%): NP | Part 2 accuracy (%): Overall | Part 2 accuracy (%): LB | Part 2 accuracy (%): NP | Part 3 accuracy (%): Overall | Part 3 accuracy (%): LB | Part 3 accuracy (%): NP |
---|---|---|---|---|---|---|---|---|---|---|---|
PLIER above bg | 25 | SVM linear, cost 1 | 57 | 64 | 50 | 67 | 61 | 72 | 96 | 85 | 100 |
PLIER above bg | 25 | SVM linear, cost 2 | 62 | 64 | 60 | 75 | 67 | 83 | 88 | 91 | 85 |
PLIER above bg | 25 | SVM linear, cost 10 | 57 | 64 | 50 | 58 | 61 | 56 | 92 | 100 | 85 |
The table shows the classification accuracy of the binary classifier built to distinguish live birth (LB) from no pregnancy (NP) on the three parts of the external cumulus expression data set (GEO accessions GSE37110, GSE37116 and GSE37117), using a linear support vector machine classifier with three settings of the cost parameter (1, 2 and 10). Cost = 2 classifies the external data best on parts 1 and 2 (overall accuracy 62% and 75%), whereas cost = 1 gives the highest overall accuracy on part 3 (96%).
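
As a reading aid for the Overall, LB and NP columns, the following minimal sketch shows one way such figures can be computed from true and predicted labels. It assumes that the LB and NP columns report the percentage of samples in each true class that were classified correctly; the source does not state this explicitly. The function name `accuracy_report` and the example labels are hypothetical and are not taken from the GEO data sets.

```python
from collections import Counter


def accuracy_report(y_true, y_pred, classes=("LB", "NP")):
    """Return overall and per-class accuracy as rounded percentages.

    Per-class accuracy is taken here to be the share of samples of a given
    true class that were predicted correctly (an assumption, see lead-in).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = Counter(y_true)  # number of samples per true class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    report = {"Overall": round(100 * sum(correct.values()) / len(y_true))}
    for c in classes:
        # None marks a class with no samples in this part of the data.
        report[c] = round(100 * correct[c] / total[c]) if total[c] else None
    return report


# Hypothetical labels, invented for illustration only.
y_true = ["LB", "LB", "LB", "LB", "NP", "NP", "NP", "NP", "NP", "NP"]
y_pred = ["LB", "LB", "LB", "NP", "NP", "NP", "NP", "LB", "LB", "NP"]

report = accuracy_report(y_true, y_pred)
print(report)  # {'Overall': 70, 'LB': 75, 'NP': 67}
```

In this toy example, 3 of 4 LB samples and 4 of 6 NP samples are classified correctly. Under this definition the overall figure is weighted by class size, so it need not equal the mean of the two per-class values.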