2016 Sep 24;41(1):30–43. doi: 10.1177/0146621616668015

Table 2.

Statistical Power and Type I Error Rate of Different Inferential Methods for Uniform DIF.

$n_R$	$n_F$	Effect size	Focal mean	S&R	LRT	PLRT	BLRT1	BLRT2
50	50	0.0	0	0.038^*	0.050^***	0.045^***	0.047^***	0.048^***
150	50	0.0	0	0.030^*	0.041^**	0.038^*	0.036^*	0.041^**
100	100	0.0	0	0.038^*	0.049^***	0.042^**	0.045^***	0.048^***
450	50	0.0	0	0.041^**	0.056^**	0.050^***	0.049^***	0.050^***
250	250	0.0	0	0.052^***	0.056^**	0.053^***	0.057^**	0.056^**
500	500	0.0	0	0.048^***	0.048^***	0.048^***	0.049^***	0.051^***
50	50	0.4	0	0.119	0.144	0.133	0.134	0.141
150	50	0.4	0	0.182	0.190	0.186	0.179	0.185
100	100	0.4	0	0.228	0.247	0.231	0.233	0.239
250	250	0.4	0	0.552	0.559	0.554	0.552	0.554
450	50	0.4	0	0.219	0.231	0.233	0.217	0.228
500	500	0.4	0	0.841	0.843	0.841	0.840	0.841
50	50	0.6	0	0.252	0.310	0.277	0.282	0.302
150	50	0.6	0	0.413	0.420	0.421	0.401	0.403
100	100	0.6	0	0.535	0.552	0.544	0.537	0.547
450	50	0.6	0	0.529	0.513	0.538	0.492	0.510
250	250	0.6	0	0.930	0.937	0.932	0.935	0.935
500	500	0.6	0	0.999	0.999	0.999	0.999	0.999
50	50	0.0	−1	0.039^*	0.055^***	0.044^**	0.051^***	0.043^**
150	50	0.0	−1	0.044^**	0.053^***	0.049^***	0.051^***	0.052^***
100	100	0.0	−1	0.041^**	0.053^***	0.047^***	0.048^***	0.045^***
450	50	0.0	−1	0.034^*	0.041^**	0.037^*	0.037^*	0.040^**
250	250	0.0	−1	0.064^*	0.067^*	0.064^*	0.061^*	0.058^**
500	500	0.0	−1	0.055^***	0.059^**	0.054^***	0.058^**	0.054^***
50	50	0.4	−1	0.110	0.125	0.117	0.108	0.118
150	50	0.4	−1	0.151	0.159	0.158	0.153	0.155
100	100	0.4	−1	0.241	0.249	0.240	0.231	0.242
450	50	0.4	−1	0.196	0.212	0.211	0.196	0.209
250	250	0.4	−1	0.514	0.511	0.514	0.501	0.504
500	500	0.4	−1	0.791	0.802	0.800	0.793	0.800
50	50	0.6	−1	0.244	0.268	0.257	0.248	0.256
150	50	0.6	−1	0.365	0.372	0.372	0.353	0.368
100	100	0.6	−1	0.503	0.504	0.504	0.493	0.497
450	50	0.6	−1	0.416	0.417	0.422	0.396	0.411
250	250	0.6	−1	0.875	0.877	0.875	0.868	0.870
500	500	0.6	−1	0.991	0.991	0.991	0.991	0.991

Note. $n_R$ and $n_F$ are the sample sizes of the reference and focal groups, respectively. S&R is the test statistic proposed by Swaminathan and Rogers (1990). BLRT1 and BLRT2 used bootstrap samples of size 1,000 and 10,000, respectively. The markers *, **, and *** denote Type I error rates that are liberally [0.025, 0.075], moderately [0.040, 0.060], and strictly [0.045, 0.055] robust, respectively (Bradley, 1978). Within each simulation condition, the highest statistical power is highlighted in bold. DIF = differential item functioning; S&R = Swaminathan & Rogers; LRT = likelihood ratio test; PLRT = penalized likelihood ratio test; BLRT = bootstrap likelihood ratio test.
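The robustness markers in the note can be reproduced mechanically from the reported rates. The following is a minimal sketch, assuming the three intervals are closed (endpoints included) and that each rate receives the marker of the narrowest interval that contains it; the function name `bradley_marker` is hypothetical and does not come from the article.

```python
# Hypothetical helper: assign the robustness marker used in Table 2
# to an empirical Type I error rate, using the intervals from the note.
# Assumption: intervals are closed, and the narrowest matching one wins.
BANDS = [
    ("***", 0.045, 0.055),  # strictly robust
    ("**", 0.040, 0.060),   # moderately robust
    ("*", 0.025, 0.075),    # liberally robust
]


def bradley_marker(rate):
    """Return '***', '**', '*', or '' (not robust) for a Type I error rate."""
    for marker, lower, upper in BANDS:  # ordered from narrowest to widest
        if lower <= rate <= upper:
            return marker
    return ""


# First row of the table (n_R = 50, n_F = 50, effect size 0.0, focal mean 0):
# S&R, LRT, PLRT, BLRT1, BLRT2
first_row = [0.038, 0.050, 0.045, 0.047, 0.048]
print([bradley_marker(r) for r in first_row])
# prints ['*', '***', '***', '***', '***']
```

Applied to the null-effect rows (effect size 0.0), this rule yields the same markers as those printed in the table, for example ** for 0.056 and * for 0.064.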