2017 Nov 26;57(12):2976–2985. doi: 10.1021/acs.jcim.7b00338

Table 4. Statistics of the Models Used in the Various External Validation Applications (a).

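| Statistic | Learning curve external validation: 30% model 1 | Learning curve external validation: 50% model 1 | Learning curve external validation: 70% model 1 | Model ensemble external validation: model 1 | Model ensemble external validation: model 3 | Model ensemble external validation: consensus |
|---|---|---|---|---|---|---|
| active data points (training) | 1336 | 2310 | 3222 | 4549 | 4531 | 4843 |
| inactive data points (training) | 1271 | 2207 | 3103 | 4502 | 4580 | 10588 |
| active data points (validation) | 3210 | 2239 | 1327 | 1206 | 1224 | 912 |
| inactive data points (validation) | 3205 | 2295 | 1337 | 26399 | 26321 | 20313 |
| OoB sensitivity | 0.89 | 0.90 | 0.92 | 0.92 | 0.92 | n/a |
| OoB specificity | 0.88 | 0.89 | 0.90 | 0.91 | 0.91 | n/a |
| OoB ROC AUC | 0.94 | 0.96 | 0.96 | 0.97 | 0.97 | n/a |
| ExtVal sensitivity | 0.89 | 0.91 | 0.90 | 0.88 | 0.90 | 0.91 |
| ExtVal specificity | 0.88 | 0.90 | 0.91 | 0.94 | 0.95 | 0.94 |
| ExtVal MCC | 0.77 | 0.81 | 0.81 | 0.57 | 0.62 | 0.58 |
| ExtVal ROC AUC | 0.94 | 0.96 | 0.96 | 0.97 | 0.97 | 0.97 |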
(a) Overview of representative models created in the external validation. Shown are one model from each learning curve setting (30%, 50%, 70%), 2 of the 5 models created for ensemble model screening (model 1 and model 3), and the consensus model used for prospective application. Abbreviations: External Validation (ExtVal), Out-of-Bag (OoB), Matthews Correlation Coefficient (MCC, see main text for details), Receiver Operating Characteristic (ROC), Area Under the Curve (AUC). Sensitivity is defined as True Positives divided by the sum of True Positives and False Negatives; specificity is defined as True Negatives divided by the sum of True Negatives and False Positives. No OoB parameters are given for the consensus application because this method consists of 5 separate OoB-validated models, of which 2 are shown.
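The definitions in the footnote can be illustrated with a minimal Python sketch. The function names are hypothetical and not taken from the article. The MCC expression is the standard confusion-matrix form, assumed here to match the one given in the main text. The helper `mcc_from_rates` reconstructs approximate confusion counts from the rounded sensitivity, specificity, and validation set sizes in the table, so its result is only an estimate.

```python
import math


def sensitivity(tp, fn):
    """True Positives / (True Positives + False Negatives)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True Negatives / (True Negatives + False Positives)."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Standard Matthews Correlation Coefficient from confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when any marginal is empty and the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mcc_from_rates(sens, spec, n_active, n_inactive):
    """Estimate MCC from rounded rates and class sizes (approximation)."""
    tp = sens * n_active        # estimated true positives
    fn = n_active - tp          # estimated false negatives
    tn = spec * n_inactive      # estimated true negatives
    fp = n_inactive - tn        # estimated false positives
    return mcc(tp, tn, fp, fn)


if __name__ == "__main__":
    # Ensemble model 1: 1206 actives vs 26399 inactives in validation.
    print(round(mcc_from_rates(0.88, 0.94, 1206, 26399), 2))
    # 30% learning curve model: 3210 actives vs 3205 inactives.
    print(round(mcc_from_rates(0.89, 0.88, 3210, 3205), 2))
```

Under these assumptions the estimates come out at approximately 0.57 for ensemble model 1 and 0.77 for the 30% learning curve model, close to the tabulated ExtVal MCC values. The sketch also suggests why MCC drops for the ensemble models despite similar sensitivity and specificity: with far more inactives than actives in the validation set, even a small false positive rate yields many false positives relative to true positives.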