Applied Psychological Measurement. 2017 Oct 31;42(2):169–172. doi: 10.1177/0146621617726791

A Note on N in Bayesian Information Criterion for Item Response Models

Sun-Joo Cho1, Paul De Boeck2,3
PMCID: PMC5978647  PMID: 29881118

Abstract

This brief report derives the N in the penalty term of Schwarz's (1978) Bayesian information criterion (BIC) for two-parameter logistic item response models. The results show that N is the number of persons for fixed item models, whereas it is the total number of observations (the number of persons times the number of items) for random item models. Given these results, the authors recommend that researchers calculate the BIC themselves, or validate the BIC value reported in software output, instead of accepting the output value without a further check of the implicit assumptions made by the software.

Keywords: item response theory, Bayesian information criterion, information function


It is common to use information criteria such as Schwarz's (1978) Bayesian information criterion (BIC) in model selection for item response models, as exhibited by their automatic computation in software packages. However, software packages differ in the formula they use for the BIC of item response models. The discrepancies concern the sample size term N in the penalty term of the BIC. As an example, for the one-parameter item response model with a random person effect and a fixed item effect using (an approximate) marginal maximum likelihood estimation, N is the number of persons in flexMIRT (Cai, 2015), Mplus (Muthén & Muthén, 1998-2015), and SAS NLMIXED (e.g., Nandakumar & Hotchkiss, 2012, for item response models), whereas it is the total number of observations (i.e., the number of persons × the number of items) in the glmer function of the lme4 R package (Bates, Mächler, Bolker, & Walker, 2015). In the glmer function, N is also the total number of observations for the one-parameter item response model with a random person effect and a random item effect; this model cannot be fit using flexMIRT, SAS NLMIXED, or Mplus.1
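To make the size of this discrepancy concrete, the following sketch computes the BIC penalty under both conventions for the same fitted model. The numbers (J, I, K, and the log-likelihood) are purely illustrative and are not taken from any of the packages cited above.

```python
import math

# Hypothetical fitted one-parameter model: J persons, I items,
# K estimated parameters, and a maximized marginal log-likelihood.
# All values are illustrative placeholders.
J, I, K = 500, 20, 21       # e.g., 20 item locations + 1 person variance
loglik = -5234.7

bic_persons = -2 * loglik + K * math.log(J)            # N = number of persons
bic_observations = -2 * loglik + K * math.log(J * I)   # N = persons x items

print(f"BIC with N = J     : {bic_persons:.1f}")
print(f"BIC with N = J x I : {bic_observations:.1f}")
print(f"Penalty difference : {K * math.log(I):.1f}")   # equals K * log(I)
```

The two conventions differ by K log(I) in the penalty, which can be large enough to reverse a model comparison when the candidate models differ in K.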

In spite of the popularity of the BIC, to the authors' knowledge, no derivation is available for the N to be used in the BIC of item response models. Different calculations of the BIC can result in different model selections, and the same calculation of the BIC can be misleading if it is correct for one model but not for another. The aim of this brief report is to provide such a derivation, based on previous derivations of the BIC in general (e.g., Kass & Raftery, 1995, p. 779). The report focuses on two-parameter logistic (2PL) item response models for binary responses, but the derivation below also applies to the one-parameter and three-parameter item response models for binary responses.

Schwarz (1978) derived the BIC as an asymptotic approximation of the Bayesian posterior probability of a candidate model M:

\[
\log P(M \mid \mathbf{y}) = \log P(\mathbf{y} \mid \hat{\boldsymbol{\vartheta}}, M) - \frac{K}{2}\log N + O(N^{-1}), \tag{1}
\]

where y is the data, ϑ is the set of parameters, P(y | ϑ̂, M) is the likelihood of the data y evaluated at the parameter estimates ϑ̂, K is the number of parameters, and N is the sample size. For a large sample size, the term of order O(N^{-1}) in Equation 1 can be dropped. Below, the derivation of N is presented for (a) the 2PL item response model with a random person effect and a fixed item effect and (b) the 2PL item response model with a random person effect and a random item effect. In the derivations, the notation for the candidate model M as it appears in Equation 1 is omitted for simplicity.
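For reference, multiplying Equation 1 by −2 and dropping the error term yields the BIC in its familiar form (a standard restatement; the report itself works on the log scale of Equation 1):

\[
\mathrm{BIC} = -2\log P(\mathbf{y} \mid \hat{\boldsymbol{\vartheta}}, M) + K\log N \approx -2\log P(M \mid \mathbf{y}).
\]

The question addressed below is thus which count, persons or observations, plays the role of N in this penalty.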

Random Person Effect and Fixed Item Effect

The 2PL item response model can be specified as

\[
\operatorname{logit}\!\left[P(y_{ji}=1 \mid \theta_j)\right] = \alpha_i(\theta_j - \beta_i), \tag{2}
\]

where j is an index for persons (j = 1, …, J), i is an index for items (i = 1, …, I), y_{ji} is an item response, α_i is an item discrimination parameter, β_i is an item location parameter, and θ_j is a latent variable. To identify the model, a standard normal distribution can be set on θ_j. For a sample of J independent persons, the marginal probability of the data, y = [y_1, …, y_j, …, y_J], is

\[
P(\mathbf{y}) = \prod_{j=1}^{J} \int_{\theta} \left[\, \prod_{i=1}^{I} P(y_{ji}=1 \mid \theta_j; \hat{\boldsymbol{\vartheta}}_i)^{y_{ji}} \, P(y_{ji}=0 \mid \theta_j; \hat{\boldsymbol{\vartheta}}_i)^{1-y_{ji}} \right] P(\theta_j)\, d\theta_j, \tag{3}
\]

where ϑ_i = [α_i, β_i] and P(θ_j) is a prior distribution for θ_j.
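As an illustration of the marginal likelihood in Equation 3, the following Python sketch approximates it by Gauss-Hermite quadrature and then computes the BIC with N = J, anticipating the result derived below. The data are simulated, the item parameters are evaluated at their generating values rather than at maximum likelihood estimates, and the quadrature settings are illustrative; this is a minimal sketch, not the estimation routine of any package cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 2PL data for illustration: J persons, I items.
J, I = 300, 10
theta = rng.standard_normal(J)        # person effects, N(0, 1)
alpha = rng.lognormal(0.0, 0.3, I)    # item discriminations
beta = rng.standard_normal(I)         # item locations
p = 1.0 / (1.0 + np.exp(-alpha * (theta[:, None] - beta[None, :])))
y = (rng.uniform(size=(J, I)) < p).astype(int)

def marginal_loglik_2pl(y, alpha, beta, n_nodes=41):
    """Marginal log-likelihood of Equation 3: theta_j is integrated out
    against a standard normal prior by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)   # weights for the N(0, 1) density
    # Item response probabilities at each quadrature node: (nodes, items).
    p_node = 1.0 / (1.0 + np.exp(-alpha[None, :] * (nodes[:, None] - beta[None, :])))
    # Log-likelihood of each person's response pattern at each node: (persons, nodes).
    loglik_node = y @ np.log(p_node).T + (1 - y) @ np.log(1 - p_node).T
    person_lik = np.exp(loglik_node) @ weights
    return float(np.sum(np.log(person_lik)))

loglik = marginal_loglik_2pl(y, alpha, beta)      # in practice, evaluate at the ML estimates
K = 2 * I                                         # alpha_i and beta_i for every item
bic_fixed_items = -2.0 * loglik + K * np.log(J)   # N = J, the number of persons
print(f"marginal log-likelihood: {loglik:.1f}")
print(f"BIC with N = J: {bic_fixed_items:.1f}")
```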

The marginal probability of the data y can be approximated using a Laplace approximation (Tierney & Kadane, 1986):

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \frac{K}{2}\log(2\pi) - \frac{1}{2}\log|\mathbf{A}| + O(I^{-1}), \tag{4}
\]

where θ = [θ_1, …, θ_J] and A is the expected information matrix of the item parameter estimates ϑ̂ = [α̂, β̂].

For a sample of J independent persons, the matrix A in Equation 4 approaches J · I_j(ϑ), where J is the number of persons and I_j(ϑ) is the Fisher information (or expected information) of the item parameter estimates for a person j (Thissen & Wainer, 1982, p. 411):

\[
\mathbf{A} = J \cdot \mathbf{I}_j(\boldsymbol{\vartheta}) \approx J \cdot E\!\left[\ddot{P}_j(\hat{\boldsymbol{\vartheta}})\right], \tag{5}
\]

where P̈_j(ϑ) is minus the second derivative of the loglikelihood function (called the Hessian matrix or the observed information). The matrix I_j(ϑ) is a (K × K) matrix (e.g., a 2I × 2I matrix for the 2PL item response model). Thus, A ≈ J · I_j(ϑ) implies |A| ≈ J^K |I_j(ϑ)|.
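Taking logarithms of this determinant relation makes the origin of the penalty term explicit (a standard identity, |cM| = c^K |M| for a K × K matrix M):

\[
-\frac{1}{2}\log|\mathbf{A}| \approx -\frac{1}{2}\log\!\left(J^{K}\,|\mathbf{I}_j(\boldsymbol{\vartheta})|\right) = -\frac{K}{2}\log(J) - \frac{1}{2}\log|\mathbf{I}_j(\boldsymbol{\vartheta})|.
\]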

Inserting |A| ≈ J^K |I_j(ϑ)| into Equation 4, the approximation can be written as

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \frac{K}{2}\log(2\pi) - \frac{K}{2}\log(J) - \frac{1}{2}\log|\mathbf{I}_j| + O(I^{-1}). \tag{6}
\]

In Equation 6, the terms log P(θ), (K/2)log(2π), and (1/2)log|I_j| are of order O(1). Ignoring these O(1) terms, Equation 6 becomes

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) - \frac{K}{2}\log(J) + O(I^{-1}). \tag{7}
\]

As presented in Equation 7, the N in the BIC is J, the number of persons.
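On the more familiar −2 log-likelihood scale, and with K = 2I item parameters in the 2PL model, Equation 7 corresponds to the following BIC (a restatement of Equation 7, assuming the reported log-likelihood is the marginal log-likelihood of Equation 3):

\[
\mathrm{BIC}_{\text{fixed items}} = -2\log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) + 2I\log(J).
\]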

Random Person Effect and Random Item Effect

Item parameters in Equation 2 can be random effects in specialized measurement problems (see De Boeck, 2008, for a discussion). For a sample of J independent persons and a sample of I independent items, the marginal probability of the data, y = [y_{11}, …, y_{ji}, …, y_{JI}], is

\[
P(\mathbf{y}) = \prod_{j=1}^{J} \prod_{i=1}^{I} \int_{\theta}\int_{\xi} \left[ P(y_{ji}=1 \mid \theta_j, \xi_i; \hat{\boldsymbol{\vartheta}})^{y_{ji}} \left[1 - P(y_{ji}=1 \mid \theta_j, \xi_i; \hat{\boldsymbol{\vartheta}})\right]^{1-y_{ji}} \right] P(\theta_j)\, P(\xi_i)\, d\theta_j\, d\xi_i, \tag{8}
\]

where ξ_i = [α_i, β_i] and P(ξ_i) is a prior distribution of the item parameters. When item parameters are random, a log-normal distribution can be imposed on the item discrimination parameters (α_i ∼ logN(0, σ_α)) and a normal distribution can be imposed on the item location parameters (β_i ∼ N(0, σ_β)). In such a case, there are three parameters, ϑ = [μ, σ_α, σ_β], where μ is a grand mean across persons and items, σ_α is the variance of the random item discrimination parameters, and σ_β is the variance of the random item location parameters.
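A minimal simulation sketch of this random-item specification, assuming the priors just described, may help fix ideas. The sample sizes and variances are illustrative, and the grand mean μ is set to 0 for simplicity; because σ_α and σ_β are variances, their square roots enter as the scale of the generating distributions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes and variance components (not taken from the text).
J, I = 300, 40
sigma_alpha, sigma_beta = 0.4, 1.0    # variances of log(alpha_i) and of beta_i

theta = rng.standard_normal(J)                                        # theta_j ~ N(0, 1)
alpha = rng.lognormal(mean=0.0, sigma=np.sqrt(sigma_alpha), size=I)   # alpha_i ~ logN(0, sigma_alpha)
beta = rng.normal(loc=0.0, scale=np.sqrt(sigma_beta), size=I)         # beta_i ~ N(0, sigma_beta)

# Grand mean mu is set to 0 here for simplicity.
logits = alpha[None, :] * (theta[:, None] - beta[None, :])
y = (rng.uniform(size=(J, I)) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

print(y.shape)   # (J, I): the J x I observations that enter the BIC penalty below
```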

Using the Laplace approximation, the marginal probability of the data y can be approximated as follows:

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}, \boldsymbol{\xi}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \log P(\boldsymbol{\xi}) + \frac{K}{2}\log(2\pi) - \frac{1}{2}\log|\mathbf{A}| + O\!\left((JI)^{-1}\right), \tag{9}
\]

where ξ = [ξ_1, …, ξ_i, …, ξ_I] and A is the expected information matrix of the parameter estimates ϑ̂ = [μ̂, σ̂_α, σ̂_β].

For a sample of independent pairs of a person j and an item i, A in Equation 9 approaches (JI) · I_ji(ϑ), where J is the number of persons, I is the number of items, and I_ji(ϑ) is the Fisher information of the parameter estimates for a person j and an item i:

\[
\mathbf{A} = (JI) \cdot \mathbf{I}_{ji}(\boldsymbol{\vartheta}) \approx (JI) \cdot E\!\left[\ddot{P}_{ji}(\hat{\boldsymbol{\vartheta}})\right], \tag{10}
\]

where P̈_{ji}(ϑ̂) is minus the second derivative of the loglikelihood function. Because I_{ji}(ϑ) is a (K × K) matrix (i.e., K = 3 for ϑ = [μ, σ_α, σ_β]), A ≈ (JI) · I_{ji}(ϑ) implies |A| ≈ (JI)^K |I_{ji}(ϑ)|.

With |A| ≈ (JI)^K |I_{ji}(ϑ)|, the approximation in Equation 9 can be rewritten as

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}, \boldsymbol{\xi}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \log P(\boldsymbol{\xi}) + \frac{K}{2}\log(2\pi) - \frac{K}{2}\log(JI) - \frac{1}{2}\log|\mathbf{I}_{ji}| + O\!\left((JI)^{-1}\right), \tag{11}
\]

where log P(θ), log P(ξ), (K/2)log(2π), and (1/2)log|I_{ji}| are all terms of order O(1). Ignoring the O(1) terms, Equation 11 results in the following equation:

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}, \boldsymbol{\xi}; \hat{\boldsymbol{\vartheta}}) - \frac{K}{2}\log(JI) + O\!\left((JI)^{-1}\right). \tag{12}
\]

It follows from Equation 12 that N is JI for item response models with random item parameters.
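Putting the two results together, the recommended check amounts to recomputing the BIC with the N that matches the model at hand rather than accepting a package default. A minimal sketch follows; the log-likelihood values, J, and I are placeholders, and in practice they come from the fitted models.

```python
import math

def bic(loglik: float, k: int, n: int) -> float:
    """BIC = -2 * log-likelihood + k * log(n), following Equations 7 and 12."""
    return -2.0 * loglik + k * math.log(n)

J, I = 500, 20   # persons, items (placeholders)

# Fixed-item 2PL: K = 2I item parameters, N = J (Equation 7).
bic_fixed = bic(loglik=-5180.3, k=2 * I, n=J)

# Random-item model: K = 3 for [mu, sigma_alpha, sigma_beta], N = J * I (Equation 12).
bic_random = bic(loglik=-5235.9, k=3, n=J * I)

print(f"fixed-item model BIC : {bic_fixed:.1f}")
print(f"random-item model BIC: {bic_random:.1f}")
```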

1.

In Mplus, the one-parameter item response model with a random person effect and a random item effect can be fit using Bayesian analysis so that a likelihood-based Bayesian information criterion (BIC) is not available.

Footnotes

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

  1. Bates D., Mächler M., Bolker B., Walker S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1-48.
  2. Cai L. (2015). flexMIRT version 3.0: A numerical engine for flexible multilevel and multidimensional item analysis and test scoring [Computer software]. Raleigh-Durham, NC: Vector Psychometric Group.
  3. De Boeck P. (2008). Random item IRT models. Psychometrika, 73, 533-559.
  4. Kass R. E., Raftery A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.
  5. Muthén L. K., Muthén B. O. (1998-2015). Mplus user's guide (7th ed.). Los Angeles, CA: Author.
  6. Nandakumar R., Hotchkiss L. (2012). PROC NLMIXED: For estimating parameters of IRT models. Applied Psychological Measurement, 38, 404-405.
  7. Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.
  8. Thissen D., Wainer H. (1982). Some standard errors in item response theory. Psychometrika, 47, 397-412.
  9. Tierney L., Kadane J. B. (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81, 82-86.
