. 2012 Nov 3;2012:164–169.

Table 2.

Performance comparison of the Hosmer-Lemeshow (HL) test and our new calibration test from first principles (FP). In every setting the null hypothesis that the model is correct is false; the degree of incorrectness is given by the angle between the correct and incorrect β vectors (left portion) or by the increase in slope (right portion) of the logistic regression model. AUC denotes the area under the ROC curve of the incorrect model. The p-value is for the comparison of the HL and FP type II errors (* denotes the test with significantly lower type II error at α level 0.05).

                     angle of incorrect model      increase in slope
                     10°      20°      30°         10%      20%      30%
dim = 5:
  average AUC        0.855    0.837    0.808       0.862    0.862    0.862
  HL type II error   83.2%*   48.1%    2.8%        67.8%    28.4%    4.9%
  FP type II error   87.5%    41.7%*   2.4%        69.0%    27.4%    5.2%
  p-value            0.0065   0.004    0.574       0.5638   0.6181   0.7593

dim = 10:
  average AUC        0.856    0.836    0.807       0.862    0.862    0.862
  HL type II error   87.5%    44.7%    2.2%        67.1%    26.8%    4.9%
  FP type II error   87.7%    39.2%*   1.7%        68.1%    26.2%    5.4%
  p-value            0.892    0.0127   0.4188      0.6328   0.7611   0.613

dim = 20:
  average AUC        0.855    0.835    0.805       0.861    0.861    0.861
  HL type II error   84.8%    44.4%    1.8%        68.0%    25.8%    5.1%
  FP type II error   86.6%    38.4%*   1.9%        67.2%    27.5%    6.6%
  p-value            0.2503   0.0065   0.868       0.7023   0.3899   0.1530
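
For readers unfamiliar with the HL baseline in this table, the following is a minimal sketch of the standard Hosmer-Lemeshow statistic, not the code used to produce the results above. The function names `hosmer_lemeshow` and `chi2_sf_even`, the choice of 10 equal-size groups, and the use of groups − 2 degrees of freedom are assumptions made for illustration; the FP test is not sketched because this table does not define it.

```python
import math


def hosmer_lemeshow(probs, outcomes, groups=10):
    """Return (HL statistic, degrees of freedom) for predicted probabilities.

    Assumption: equal-size groups over the sorted predictions and
    df = groups - 2, the conventional choice for the HL test.
    """
    # Sort by predicted probability only, so ties are not ordered by outcome.
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / len(chunk)
        if 0.0 < mean_p < 1.0:                # skip degenerate bins
            stat += (observed - expected) ** 2 / (len(chunk) * mean_p * (1 - mean_p))
    return stat, groups - 2


def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable, closed form for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Toy data: every prediction is 0.5 and exactly half the outcomes are 1.
    probs = [0.5] * 100
    outcomes = [i % 2 for i in range(100)]
    stat, df = hosmer_lemeshow(probs, outcomes)
    print(stat, df, chi2_sf_even(stat, df))
```

A type II error in the table corresponds to this p-value staying at or above 0.05 even though the evaluated model is incorrect.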