2016 Sep 24;41(1):30–43. doi: 10.1177/0146621616668015

Table 3.

Statistical Power and Type I Error Rate of Different Inferential Methods for Non-Uniform DIF.

$n_R$	$n_F$	Effect size	Focal mean	S&R	LRT	PLRT	BLRT1	BLRT2
50	50	0.0	0	0.029^*	0.043^**	0.033^*	0.042^**	0.044^**
150	50	0.0	0	0.042^**	0.053^***	0.047^***	0.045^***	0.048^***
100	100	0.0	0	0.057^**	0.062^*	0.060^**	0.061^*	0.058^**
450	50	0.0	0	0.032^*	0.050^***	0.045^***	0.047^***	0.056^**
250	250	0.0	0	0.039^*	0.040^**	0.040^**	0.041^**	0.044^**
500	500	0.0	0	0.053^***	0.053^***	0.053^***	0.053^***	0.052^***
50	50	0.4	0	0.053	0.072	0.062	0.064	0.068
150	50	0.4	0	0.083	0.094	0.089	0.082	0.090
100	100	0.4	0	0.103	0.110	0.108	0.106	0.106
450	50	0.4	0	0.092	0.099	0.104	0.093	0.096
250	250	0.4	0	0.141	0.149	0.144	0.146	0.146
500	500	0.4	0	0.291	0.298	0.293	0.290	0.293
50	50	0.6	0	0.064	0.107	0.079	0.093	0.103
150	50	0.6	0	0.151	0.150	0.156	0.141	0.150
100	100	0.6	0	0.172	0.200	0.189	0.195	0.194
450	50	0.6	0	0.225	0.235	0.211	0.197	0.209
250	250	0.6	0	0.457	0.470	0.463	0.464	0.467
500	500	0.6	0	0.734	0.739	0.736	0.732	0.735
50	50	0.0	−1	0.045^***	0.060^**	0.052^***	0.053^***	0.054^***
150	50	0.0	−1	0.046^***	0.053^***	0.051^***	0.049^***	0.052^***
100	100	0.0	−1	0.047^***	0.054^***	0.052^***	0.051^***	0.052^***
450	50	0.0	−1	0.041^**	0.049^***	0.048^***	0.042^**	0.043^**
250	250	0.0	−1	0.046^***	0.053^***	0.049^***	0.053^***	0.054^***
500	500	0.0	−1	0.054^***	0.056^**	0.054^***	0.052^***	0.053^***
50	50	0.4	−1	0.062	0.080	0.071	0.072	0.078
150	50	0.4	−1	0.112	0.125	0.117	0.117	0.122
100	100	0.4	−1	0.099	0.112	0.101	0.108	0.110
450	50	0.4	−1	0.119	0.125	0.127	0.120	0.121
250	250	0.4	−1	0.164	0.171	0.166	0.169	0.169
500	500	0.4	−1	0.326	0.334	0.328	0.321	0.328
50	50	0.6	−1	0.098	0.137	0.114	0.120	0.125
150	50	0.6	−1	0.192	0.195	0.195	0.184	0.190
100	100	0.6	−1	0.198	0.232	0.209	0.213	0.219
450	50	0.6	−1	0.274	0.289	0.288	0.259	0.274
250	250	0.6	−1	0.514	0.533	0.517	0.527	0.530
500	500	0.6	−1	0.805	0.813	0.809	0.806	0.811

Note. $n_R$ and $n_F$ represent the sample sizes in the reference and focal groups, respectively. S&R represents the test statistic proposed by Swaminathan and Rogers (1990). BLRT1 and BLRT2 used bootstrap samples of size 1,000 and 10,000, respectively. The symbols *, **, and *** mark Type I error rates that are liberally [0.025, 0.075], moderately [0.040, 0.060], and strictly [0.045, 0.055] robust, respectively (Bradley, 1978). The number in bold represents the highest statistical power within each simulation condition. DIF = differential item functioning; S&R = Swaminathan & Rogers; LRT = likelihood ratio test; PLRT = penalized likelihood ratio test; BLRT = bootstrap likelihood ratio test.
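
The robustness labels in the note can be reproduced from the three intervals alone. The following is a minimal sketch, not part of the original analysis: the function name `bradley_stars` is hypothetical, the intervals are copied from the note, and it assumes that each rate receives the label of the strictest interval that contains it, which is consistent with the entries in the table.

```python
def bradley_stars(rate: float) -> str:
    """Label an empirical Type I error rate using the intervals in the table note.

    Returns '***' (strictly robust), '**' (moderately robust),
    '*' (liberally robust), or '' if the rate falls outside all intervals.
    """
    # Intervals from the note, ordered from strictest to most liberal.
    criteria = [
        ("***", 0.045, 0.055),
        ("**", 0.040, 0.060),
        ("*", 0.025, 0.075),
    ]
    tol = 1e-9  # guard against floating-point noise at the interval bounds
    for stars, lower, upper in criteria:
        if lower - tol <= rate <= upper + tol:
            return stars
    return ""


# First null condition in the table (n_R = 50, n_F = 50, focal mean 0):
# S&R, LRT, PLRT, BLRT1, BLRT2
row = [0.029, 0.043, 0.033, 0.042, 0.044]
labels = [bradley_stars(r) for r in row]
print(labels)
```

Applied to that row, the sketch yields the same markers as the table (*, **, *, **, **). The nested intervals are checked strictest first because every strictly robust rate also satisfies the moderate and liberal criteria.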