Table 4. Statistics of the Models Used in the Various External Validation Applications.
statistic | learning curve external validation | | | model ensemble external validation | | |
---|---|---|---|---|---|---|
model | 30% model 1 | 50% model 1 | 70% model 1 | model 1 | model 3 | consensus |
active data points (training) | 1336 | 2310 | 3222 | 4549 | 4531 | 4843 |
inactive data points (training) | 1271 | 2207 | 3103 | 4502 | 4580 | 10588 |
active data points (validation) | 3210 | 2239 | 1327 | 1206 | 1224 | 912 |
inactive data points (validation) | 3205 | 2295 | 1337 | 26399 | 26321 | 20313 |
OoB sensitivity | 0.89 | 0.90 | 0.92 | 0.92 | 0.92 | n/a |
OoB specificity | 0.88 | 0.89 | 0.90 | 0.91 | 0.91 | n/a |
OoB ROC AUC | 0.94 | 0.96 | 0.96 | 0.97 | 0.97 | n/a |
ExtVal sensitivity | 0.89 | 0.91 | 0.90 | 0.88 | 0.90 | 0.91 |
ExtVal specificity | 0.88 | 0.90 | 0.91 | 0.94 | 0.95 | 0.94 |
ExtVal MCC | 0.77 | 0.81 | 0.81 | 0.57 | 0.62 | 0.58 |
ExtVal ROC AUC | 0.94 | 0.96 | 0.96 | 0.97 | 0.97 | 0.97 |
Overview of representative models created in the external validation. Shown are one model from each learning curve setting (30%, 50%, 70%), 2 of the 5 models created for ensemble model screening (model 1 and model 3), and the consensus model used for prospective application. Abbreviations: External Validation (ExtVal), Out-of-Bag (OoB), Matthews Correlation Coefficient (MCC, see main text for details), Receiver Operating Characteristic (ROC), Area Under the Curve (AUC), not applicable (n/a). Sensitivity is defined as True Positives divided by the sum of True Positives and False Negatives; specificity is defined as True Negatives divided by the sum of True Negatives and False Positives. No OoB parameters are reported for the consensus application because this method combines 5 separately OoB-validated models, of which 2 (model 1 and model 3) are shown.
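
The definitions above, together with the class sizes in the table, are enough to see why the ExtVal MCC of the ensemble models is lower than that of the learning curve models despite similar sensitivity and specificity. The following is a minimal sketch, not the authors' code: the function names `confusion_from_rates` and `mcc` are hypothetical, the MCC uses the standard confusion-matrix formula, and the counts are only approximations reconstructed from the rounded rates reported in the table.

```python
import math


def confusion_from_rates(sensitivity, specificity, n_active, n_inactive):
    """Approximate (TP, FN, TN, FP) from rounded rates and class sizes.

    Assumption: the table reports rates rounded to two decimals, so the
    counts returned here are estimates, not the true confusion matrix.
    """
    tp = round(sensitivity * n_active)      # sensitivity = TP / (TP + FN)
    fn = n_active - tp
    tn = round(specificity * n_inactive)    # specificity = TN / (TN + FP)
    fp = n_inactive - tn
    return tp, fn, tn, fp


def mcc(tp, fn, tn, fp):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when a marginal sum is zero and the MCC is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Learning curve "30% model 1": nearly balanced validation set.
balanced = mcc(*confusion_from_rates(0.89, 0.88, 3210, 3205))

# Model ensemble "model 1": about 22 inactives per active in validation.
imbalanced = mcc(*confusion_from_rates(0.88, 0.94, 1206, 26399))

print(f"balanced validation set:   MCC = {balanced:.2f}")
print(f"imbalanced validation set: MCC = {imbalanced:.2f}")
```

Under these assumptions the two estimates round to 0.77 and 0.57, in line with the tabulated ExtVal MCC values. In the imbalanced case even a 6% false positive rate on 26399 inactives yields more false positives than there are true positives, which lowers the MCC while leaving sensitivity and specificity largely unchanged.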