Applied Psychological Measurement. 2017 Oct 31;42(2):169–172. doi: 10.1177/0146621617726791

A Note on N in Bayesian Information Criterion for Item Response Models

Sun-Joo Cho1, Paul De Boeck2,3
PMCID: PMC5978647  PMID: 29881118

Abstract

This brief report derives the N in the penalty term of Schwarz's (1978) Bayesian information criterion (BIC) for two-parameter logistic item response models. The results show that N is the number of persons for fixed item models, whereas it is the total number of observations (the number of persons times the number of items) for random item models. Given these results, the authors recommend that researchers calculate the BIC themselves, or validate the BIC value reported in software output, instead of accepting the output value without a further check of the implicit assumptions made by the software.

Keywords: item response theory, Bayesian information criterion, information function


It is common to use information criteria such as Schwarz's (1978) Bayesian information criterion (BIC) in model selection for item response models, as exhibited by their automatic computation in software packages. However, software packages differ in the formula they use for the BIC of item response models. The discrepancies concern the sample size term N in the penalty term of the BIC. As an example, for the one-parameter item response model with a random person effect and a fixed item effect using (an approximate) marginal maximum likelihood estimation, N is the number of persons in flexMIRT (Cai, 2015), Mplus (Muthén & Muthén, 1998-2015), and SAS NLMIXED (e.g., Nandakumar & Hotchkiss, 2012, for item response models), whereas it is the total number of observations (i.e., the number of persons × the number of items) in the glmer function of the lme4 R package (Bates, Mächler, Bolker, & Walker, 2015). In the glmer function, N is also the total number of observations for the one-parameter item response model with a random person effect and a random item effect; this model cannot be fit using flexMIRT, SAS NLMIXED, or Mplus.1
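To make the size of this discrepancy concrete, the following sketch computes the BIC penalty under both conventions for the same fitted model. The numbers (J, I, K, and the log-likelihood) are purely illustrative and are not taken from any of the packages cited above.

```python
import math

# Hypothetical fitted one-parameter model: J persons, I items,
# K estimated parameters, and a maximized marginal log-likelihood.
# All values are illustrative placeholders.
J, I, K = 500, 20, 21       # e.g., 20 item locations + 1 person variance
loglik = -5234.7

bic_persons = -2 * loglik + K * math.log(J)            # N = number of persons
bic_observations = -2 * loglik + K * math.log(J * I)   # N = persons x items

print(f"BIC with N = J     : {bic_persons:.1f}")
print(f"BIC with N = J x I : {bic_observations:.1f}")
print(f"Penalty difference : {K * math.log(I):.1f}")   # equals K * log(I)
```

The two conventions differ by K log(I) in the penalty, which can be large enough to reverse a model comparison when the candidate models differ in K.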

In spite of the popularity of the BIC, to the authors' knowledge, no derivation is available for the N to be used in the BIC of item response models. Different calculations of the BIC can result in different model selections, and the same calculation of the BIC can be misleading if it is correct for one model but not for another. The aim of this brief report is to provide such a derivation, based on previous derivations of the BIC in general (e.g., Kass & Raftery, 1995, p. 779). The report focuses on two-parameter logistic (2PL) item response models for binary responses, but the derivation below also applies to the one-parameter and three-parameter item response models for binary responses.

Schwarz (1978) derived the BIC as an asymptotic approximation of the Bayesian posterior probability of a candidate model M:

\[
\log P(M \mid \mathbf{y}) = \log P(\mathbf{y} \mid \hat{\boldsymbol{\vartheta}}, M) - \frac{K}{2}\log N + O(N^{-1}), \tag{1}
\]

where y is the data, ϑ is the set of parameters, P(y | ϑ̂, M) is the likelihood of the data y evaluated at the parameter estimates ϑ̂, K is the number of parameters, and N is the sample size. For a large sample size, the term of order O(N^{-1}) in Equation 1 can be dropped. Below, the derivation of N is presented for (a) the 2PL item response model with a random person effect and a fixed item effect and (b) the 2PL item response model with a random person effect and a random item effect. In the derivations, the notation for the candidate model M as it appears in Equation 1 is omitted for simplicity.
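For reference, multiplying Equation 1 by −2 and dropping the error term yields the BIC in its familiar form (a standard restatement; the report itself works on the log scale of Equation 1):

\[
\mathrm{BIC} = -2\log P(\mathbf{y} \mid \hat{\boldsymbol{\vartheta}}, M) + K\log N \approx -2\log P(M \mid \mathbf{y}).
\]

The question addressed below is thus which count, persons or observations, plays the role of N in this penalty.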

Random Person Effect and Fixed Item Effect

The 2PL item response model can be specified as

\[
\operatorname{logit}\!\left[P(y_{ji}=1 \mid \theta_j)\right] = \alpha_i(\theta_j - \beta_i), \tag{2}
\]

where j is an index for persons (j = 1, …, J), i is an index for items (i = 1, …, I), y_{ji} is an item response, α_i is an item discrimination parameter, β_i is an item location parameter, and θ_j is a latent variable. To identify the model, a standard normal distribution can be set on θ_j. For a sample of J independent persons, the marginal probability of the data, y = [y_1, …, y_j, …, y_J], is

\[
P(\mathbf{y}) = \prod_{j=1}^{J} \int_{\theta} \left[\, \prod_{i=1}^{I} P(y_{ji}=1 \mid \theta_j; \hat{\boldsymbol{\vartheta}}_i)^{y_{ji}} \, P(y_{ji}=0 \mid \theta_j; \hat{\boldsymbol{\vartheta}}_i)^{1-y_{ji}} \right] P(\theta_j)\, d\theta_j, \tag{3}
\]

where ϑ_i = [α_i, β_i] and P(θ_j) is a prior distribution for θ_j.
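As an illustration of the marginal likelihood in Equation 3, the following Python sketch approximates it by Gauss-Hermite quadrature and then computes the BIC with N = J, anticipating the result derived below. The data are simulated, the item parameters are evaluated at their generating values rather than at maximum likelihood estimates, and the quadrature settings are illustrative; this is a minimal sketch, not the estimation routine of any package cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated 2PL data for illustration: J persons, I items.
J, I = 300, 10
theta = rng.standard_normal(J)        # person effects, N(0, 1)
alpha = rng.lognormal(0.0, 0.3, I)    # item discriminations
beta = rng.standard_normal(I)         # item locations
p = 1.0 / (1.0 + np.exp(-alpha * (theta[:, None] - beta[None, :])))
y = (rng.uniform(size=(J, I)) < p).astype(int)

def marginal_loglik_2pl(y, alpha, beta, n_nodes=41):
    """Marginal log-likelihood of Equation 3: theta_j is integrated out
    against a standard normal prior by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)   # weights for the N(0, 1) density
    # Item response probabilities at each quadrature node: (nodes, items).
    p_node = 1.0 / (1.0 + np.exp(-alpha[None, :] * (nodes[:, None] - beta[None, :])))
    # Log-likelihood of each person's response pattern at each node: (persons, nodes).
    loglik_node = y @ np.log(p_node).T + (1 - y) @ np.log(1 - p_node).T
    person_lik = np.exp(loglik_node) @ weights
    return float(np.sum(np.log(person_lik)))

loglik = marginal_loglik_2pl(y, alpha, beta)      # in practice, evaluate at the ML estimates
K = 2 * I                                         # alpha_i and beta_i for every item
bic_fixed_items = -2.0 * loglik + K * np.log(J)   # N = J, the number of persons
print(f"marginal log-likelihood: {loglik:.1f}")
print(f"BIC with N = J: {bic_fixed_items:.1f}")
```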

The marginal probability of the data y can be approximated using a Laplace approximation (Tierney & Kadane, 1986):

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \frac{K}{2}\log(2\pi) - \frac{1}{2}\log|\mathbf{A}| + O(I^{-1}), \tag{4}
\]

where θ = [θ_1, …, θ_J] and A is the expected information matrix of the item parameter estimates ϑ̂ = [α̂, β̂].

For a sample of J independent persons, the matrix A in Equation 4 approaches J · I_j(ϑ), where J is the number of persons and I_j(ϑ) is the Fisher information (or expected information) of the item parameter estimates for a person j (Thissen & Wainer, 1982, p. 411):

\[
\mathbf{A} = J \cdot \mathbf{I}_j(\boldsymbol{\vartheta}) \approx J \cdot E\!\left[\ddot{P}_j(\hat{\boldsymbol{\vartheta}})\right], \tag{5}
\]

where P̈_j(ϑ) is minus the second derivative of the loglikelihood function (called the Hessian matrix or the observed information). The matrix I_j(ϑ) is a (K × K) matrix (e.g., a 2I × 2I matrix for the 2PL item response model). Thus, A ≈ J · I_j(ϑ) implies |A| ≈ J^K |I_j(ϑ)|.
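Taking logarithms of this determinant relation makes the origin of the penalty term explicit (a standard identity, |cM| = c^K |M| for a K × K matrix M):

\[
-\frac{1}{2}\log|\mathbf{A}| \approx -\frac{1}{2}\log\!\left(J^{K}\,|\mathbf{I}_j(\boldsymbol{\vartheta})|\right) = -\frac{K}{2}\log(J) - \frac{1}{2}\log|\mathbf{I}_j(\boldsymbol{\vartheta})|.
\]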

Inserting |A| ≈ J^K |I_j(ϑ)| into Equation 4, the approximation can be written as

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \frac{K}{2}\log(2\pi) - \frac{K}{2}\log(J) - \frac{1}{2}\log|\mathbf{I}_j| + O(I^{-1}). \tag{6}
\]

In Equation 6, the terms log P(θ), (K/2)log(2π), and (1/2)log|I_j| are of order O(1). Ignoring these O(1) terms, Equation 6 becomes

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) - \frac{K}{2}\log(J) + O(I^{-1}). \tag{7}
\]

As presented in Equation 7, the N in the BIC is J, the number of persons.
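On the more familiar −2 log-likelihood scale, and with K = 2I item parameters in the 2PL model, Equation 7 corresponds to the following BIC (a restatement of Equation 7, assuming the reported log-likelihood is the marginal log-likelihood of Equation 3):

\[
\mathrm{BIC}_{\text{fixed items}} = -2\log P(\mathbf{y} \mid \boldsymbol{\theta}; \hat{\boldsymbol{\vartheta}}) + 2I\log(J).
\]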

Random Person Effect and Random Item Effect

Item parameters in Equation 2 can be random effects in specialized measurement problems (see De Boeck, 2008, for a discussion). For a sample of J independent persons and a sample of I independent items, the marginal probability of the data, y = [y_{11}, …, y_{ji}, …, y_{JI}], is

\[
P(\mathbf{y}) = \prod_{j=1}^{J} \prod_{i=1}^{I} \int_{\theta}\int_{\xi} \left[ P(y_{ji}=1 \mid \theta_j, \xi_i; \hat{\boldsymbol{\vartheta}})^{y_{ji}} \left[1 - P(y_{ji}=1 \mid \theta_j, \xi_i; \hat{\boldsymbol{\vartheta}})\right]^{1-y_{ji}} \right] P(\theta_j)\, P(\xi_i)\, d\theta_j\, d\xi_i, \tag{8}
\]

where ξ_i = [α_i, β_i] and P(ξ_i) is a prior distribution of the item parameters. When item parameters are random, a log-normal distribution can be imposed on the item discrimination parameters (α_i ∼ logN(0, σ_α)) and a normal distribution can be imposed on the item location parameters (β_i ∼ N(0, σ_β)). In such a case, there are three parameters, ϑ = [μ, σ_α, σ_β], where μ is a grand mean across persons and items, σ_α is the variance of the random item discrimination parameters, and σ_β is the variance of the random item location parameters.
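A minimal simulation sketch of this random-item specification, assuming the priors just described, may help fix ideas. The sample sizes and variances are illustrative, and the grand mean μ is set to 0 for simplicity; because σ_α and σ_β are variances, their square roots enter as the scale of the generating distributions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes and variance components (not taken from the text).
J, I = 300, 40
sigma_alpha, sigma_beta = 0.4, 1.0    # variances of log(alpha_i) and of beta_i

theta = rng.standard_normal(J)                                        # theta_j ~ N(0, 1)
alpha = rng.lognormal(mean=0.0, sigma=np.sqrt(sigma_alpha), size=I)   # alpha_i ~ logN(0, sigma_alpha)
beta = rng.normal(loc=0.0, scale=np.sqrt(sigma_beta), size=I)         # beta_i ~ N(0, sigma_beta)

# Grand mean mu is set to 0 here for simplicity.
logits = alpha[None, :] * (theta[:, None] - beta[None, :])
y = (rng.uniform(size=(J, I)) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

print(y.shape)   # (J, I): the J x I observations that enter the BIC penalty below
```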

Using the Laplace approximation, the marginal probability of the data y can be approximated as follows:

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}, \boldsymbol{\xi}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \log P(\boldsymbol{\xi}) + \frac{K}{2}\log(2\pi) - \frac{1}{2}\log|\mathbf{A}| + O\!\left((JI)^{-1}\right), \tag{9}
\]

where ξ = [ξ_1, …, ξ_i, …, ξ_I] and A is the expected information matrix of the parameter estimates ϑ̂ = [μ̂, σ̂_α, σ̂_β].

For a sample of independent pairs of a person j and an item i, A in Equation 9 approaches (JI) · I_ji(ϑ), where J is the number of persons, I is the number of items, and I_ji(ϑ) is the Fisher information of the parameter estimates for a person j and an item i:

\[
\mathbf{A} = (JI) \cdot \mathbf{I}_{ji}(\boldsymbol{\vartheta}) \approx (JI) \cdot E\!\left[\ddot{P}_{ji}(\hat{\boldsymbol{\vartheta}})\right], \tag{10}
\]

where P̈_{ji}(ϑ̂) is minus the second derivative of the loglikelihood function. Because I_{ji}(ϑ) is a (K × K) matrix (i.e., K = 3 for ϑ = [μ, σ_α, σ_β]), A ≈ (JI) · I_{ji}(ϑ) implies |A| ≈ (JI)^K |I_{ji}(ϑ)|.

With |A| ≈ (JI)^K |I_{ji}(ϑ)|, the approximation in Equation 9 can be rewritten as

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}, \boldsymbol{\xi}; \hat{\boldsymbol{\vartheta}}) + \log P(\boldsymbol{\theta}) + \log P(\boldsymbol{\xi}) + \frac{K}{2}\log(2\pi) - \frac{K}{2}\log(JI) - \frac{1}{2}\log|\mathbf{I}_{ji}| + O\!\left((JI)^{-1}\right), \tag{11}
\]

where log P(θ), log P(ξ), (K/2)log(2π), and (1/2)log|I_{ji}| are all terms of order O(1). Ignoring the O(1) terms, Equation 11 results in the following equation:

\[
\log P(\mathbf{y}) = \log P(\mathbf{y} \mid \boldsymbol{\theta}, \boldsymbol{\xi}; \hat{\boldsymbol{\vartheta}}) - \frac{K}{2}\log(JI) + O\!\left((JI)^{-1}\right). \tag{12}
\]

It follows from Equation 12 that N is JI for item response models with random item parameters.
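Putting the two results together, the recommended check amounts to recomputing the BIC with the N that matches the model at hand rather than accepting a package default. A minimal sketch follows; the log-likelihood values, J, and I are placeholders, and in practice they come from the fitted models.

```python
import math

def bic(loglik: float, k: int, n: int) -> float:
    """BIC = -2 * log-likelihood + k * log(n), following Equations 7 and 12."""
    return -2.0 * loglik + k * math.log(n)

J, I = 500, 20   # persons, items (placeholders)

# Fixed-item 2PL: K = 2I item parameters, N = J (Equation 7).
bic_fixed = bic(loglik=-5180.3, k=2 * I, n=J)

# Random-item model: K = 3 for [mu, sigma_alpha, sigma_beta], N = J * I (Equation 12).
bic_random = bic(loglik=-5235.9, k=3, n=J * I)

print(f"fixed-item model BIC : {bic_fixed:.1f}")
print(f"random-item model BIC: {bic_random:.1f}")
```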

1.

In Mplus, the one-parameter item response model with a random person effect and a random item effect can be fit using Bayesian analysis so that a likelihood-based Bayesian information criterion (BIC) is not available.

Footnotes

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

  1. Bates D., Mächler M., Bolker B., Walker S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1-48.
  2. Cai L. (2015). flexMIRT version 3.0: A numerical engine for flexible multilevel and multidimensional item analysis and test scoring [Computer software]. Raleigh-Durham, NC: Vector Psychometric Group.
  3. De Boeck P. (2008). Random item IRT models. Psychometrika, 73, 533-559.
  4. Kass R. E., Raftery A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.
  5. Muthén L. K., Muthén B. O. (1998-2015). Mplus user's guide (7th ed.). Los Angeles, CA: Author.
  6. Nandakumar R., Hotchkiss L. (2012). PROC NLMIXED: For estimating parameters of IRT models. Applied Psychological Measurement, 38, 404-405.
  7. Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.
  8. Thissen D., Wainer H. (1982). Some standard errors in item response theory. Psychometrika, 47, 397-412.
  9. Tierney L., Kadane J. B. (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81, 82-86.
