Abstract
This brief report derives the N in the penalty term of Schwarz's (1978) Bayesian information criterion (BIC) for two-parameter logistic item response models. The results show that N is the number of persons for fixed-item models, whereas it is the number of observations (the number of persons times the number of items) for random-item models. Given these results, the authors recommend that researchers calculate the BIC themselves, or validate the BIC value reported in software output, rather than accepting the reported value without checking the implicit assumptions made by the software.
Keywords: item response theory, Bayesian information criterion, information function
It is common to use information criteria such as Schwarz's (1978) Bayesian information criterion (BIC) in model selection for item response models, as exhibited by their automatic computation in software packages. There are, however, discrepancies across software packages in the formula used for the BIC of item response models. The discrepancies concern the sample size N in the penalty term of the BIC. As an example, for the one-parameter item response model with a random person effect and a fixed item effect estimated by (approximate) marginal maximum likelihood, N is the number of persons in flexMIRT (Cai, 2015), Mplus (Muthén & Muthén, 1998-2015), and SAS NLMIXED (e.g., Nandakumar & Hotchkiss, 2012, for item response models), whereas it is the total number of observations (i.e., the number of persons times the number of items) in the glmer function of the lme4 R package (Bates, Mächler, Bolker, & Walker, 2015). In the glmer function, N is also the total number of observations for the one-parameter item response model with a random person effect and a random item effect; that model, however, cannot be fit using flexMIRT, SAS NLMIXED, or Mplus.¹
In spite of the popularity of the BIC, to the authors' knowledge, no derivation is available for the N to be used in the BIC of item response models. Different calculations of the BIC can result in different model selections, and the same calculation can be misleading if it is correct for one model but not for another. The aim of this brief report is to provide such a derivation, based on previous derivations of the BIC in general (e.g., Kass & Raftery, 1995, p. 779). The report focuses on two-parameter logistic (2PL) item response models for binary responses; the derivation below applies equally to the one-parameter, two-parameter, and three-parameter item response models for binary responses.
Schwarz (1978) derived the BIC as an asymptotic approximation of the Bayesian posterior probability of a candidate model $M_k$:

$$\log \Pr(\mathbf{y} \mid M_k) = \log L(\mathbf{y} \mid \hat{\boldsymbol{\xi}}_k, M_k) - \frac{d_k}{2}\log N + O(1), \qquad (1)$$

where $\mathbf{y}$ is the data, $\boldsymbol{\xi}_k$ is the set of parameters of model $M_k$, $L(\mathbf{y} \mid \hat{\boldsymbol{\xi}}_k, M_k)$ is the likelihood of the data evaluated at the parameter estimates $\hat{\boldsymbol{\xi}}_k$, $d_k$ is the number of parameters, and $N$ is the sample size. For a large sample size, the $O(1)$ term can be dropped in Equation 1. Below, the derivation of $N$ is presented for (a) the 2PL item response model having a random person effect and a fixed item effect and (b) the 2PL item response model having a random person effect and a random item effect. In the derivations, the notation for the candidate model $M_k$ as it appears in Equation 1 is omitted for simplicity.
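To make the role of $N$ concrete, the following Python sketch (illustrative only; the log-likelihood value and the counts are hypothetical) computes the BIC implied by Equation 1 with the $O(1)$ term dropped, once with $N$ equal to the number of persons and once with $N$ equal to the number of observations, mirroring the software discrepancy described above.

```python
import numpy as np

def bic(loglik: float, d: int, n: int) -> float:
    """BIC = -2 log L + d log N, i.e., -2 times Equation 1 without the O(1) term."""
    return -2.0 * loglik + d * np.log(n)

loglik = -5234.7             # hypothetical marginal log-likelihood of a fitted model
n_persons, n_items = 500, 20
d = 2 * n_items              # 2PL with fixed items: one discrimination and one location per item

print(bic(loglik, d, n_persons))            # N = number of persons
print(bic(loglik, d, n_persons * n_items))  # N = number of observations (persons x items)
```

The two calls differ only in the penalty term, which is exactly the discrepancy at issue in this report.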
Random Person Effect and Fixed Item Effect
The 2PL item response model can be specified as

$$\Pr(y_{pi} = 1 \mid \theta_p) = \frac{\exp[a_i(\theta_p - b_i)]}{1 + \exp[a_i(\theta_p - b_i)]}, \qquad (2)$$

where $p$ is an index of persons ($p = 1, \ldots, P$), $i$ is an index of items ($i = 1, \ldots, I$), $y_{pi}$ is an item response, $a_i$ is an item discrimination parameter, $b_i$ is an item location parameter, and $\theta_p$ is a latent variable. To identify the model, a standard normal distribution can be set on $\theta_p$. For a sample of $P$ independent persons, the marginal probability of the data, $\Pr(\mathbf{y})$, is

$$\Pr(\mathbf{y}) = \int L(\mathbf{y} \mid \boldsymbol{\xi})\, \pi(\boldsymbol{\xi})\, d\boldsymbol{\xi}, \qquad (3)$$

where $L(\mathbf{y} \mid \boldsymbol{\xi}) = \prod_{p=1}^{P} \int \prod_{i=1}^{I} \Pr(y_{pi} \mid \theta_p, a_i, b_i)\, g(\theta_p)\, d\theta_p$, with $g(\theta_p)$ the standard normal density, and $\pi(\boldsymbol{\xi})$ is a prior distribution for the item parameters $\boldsymbol{\xi} = (a_1, b_1, \ldots, a_I, b_I)$.
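As an illustration of the inner integral in Equation 3, the sketch below evaluates the marginal (over $\theta_p$) log-likelihood $\log L(\mathbf{y} \mid \boldsymbol{\xi})$ for the 2PL by Gauss-Hermite quadrature under the standard normal assumption for $\theta_p$. The function and variable names are ours, not from any package, and the simulated data are hypothetical.

```python
import numpy as np

def marginal_loglik_2pl(y, a, b, n_quad=41):
    """log L(y | xi): y is a (P, I) binary matrix; a and b are length-I item parameters."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_quad)
    theta = np.sqrt(2.0) * nodes          # quadrature nodes transformed for a N(0, 1) density
    w = weights / np.sqrt(np.pi)          # corresponding weights (sum to ~1)
    # Item response probabilities Pr(y_pi = 1 | theta) on the grid: shape (n_quad, I)
    p = 1.0 / (1.0 + np.exp(-(a * (theta[:, None] - b))))
    # Likelihood of each person's response pattern at each node: shape (P, n_quad)
    pattern_like = np.prod(np.where(y[:, None, :] == 1, p[None, :, :], 1.0 - p[None, :, :]), axis=2)
    return float(np.sum(np.log(pattern_like @ w)))   # sum of per-person log marginal likelihoods

rng = np.random.default_rng(0)
y_sim = rng.binomial(1, 0.5, size=(500, 20))          # hypothetical 500-person, 20-item data
print(marginal_loglik_2pl(y_sim, a=np.ones(20), b=np.zeros(20)))
```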
The marginal probability of the data can be approximated using the Laplace approximation (Tierney & Kadane, 1986):

$$\Pr(\mathbf{y}) \approx (2\pi)^{d/2}\left|\boldsymbol{\Sigma}(\hat{\boldsymbol{\xi}})\right|^{-1/2} L(\mathbf{y} \mid \hat{\boldsymbol{\xi}})\, \pi(\hat{\boldsymbol{\xi}}), \qquad (4)$$

where $\hat{\boldsymbol{\xi}}$ is the mode of $L(\mathbf{y} \mid \boldsymbol{\xi})\,\pi(\boldsymbol{\xi})$, $d$ is the number of item parameters, and $\boldsymbol{\Sigma}(\hat{\boldsymbol{\xi}})$ is the expected information matrix of the item parameter estimates, $\hat{\boldsymbol{\xi}}$.
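To illustrate Equation 4 on a problem small enough to check directly, the Python sketch below applies the Laplace approximation to a toy one-parameter case (a Bernoulli likelihood with a standard normal prior on the logit) and compares it with the marginal probability obtained by numerical integration. The data, the use of the observed rather than expected curvature, and all names are our own simplifications, not part of the derivation above.

```python
import numpy as np
from scipy import integrate, optimize

# Hypothetical binary data (n = 30) for the toy one-parameter model.
y = np.array([1, 1, 0, 1, 0, 1, 1, 1, 0, 1] * 3)

def log_joint(xi):
    """Log likelihood plus log prior for a Bernoulli model with logit xi and a N(0, 1) prior."""
    p = 1.0 / (1.0 + np.exp(-xi))
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    logprior = -0.5 * xi**2 - 0.5 * np.log(2.0 * np.pi)
    return loglik + logprior

# Mode of the joint and its curvature (observed information), via finite differences.
xi_hat = optimize.minimize_scalar(lambda xi: -log_joint(xi)).x
h = 1e-3
curvature = -(log_joint(xi_hat + h) - 2.0 * log_joint(xi_hat) + log_joint(xi_hat - h)) / h**2

# Laplace approximation (Equation 4 with d = 1) vs. direct numerical integration.
log_laplace = 0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(curvature) + log_joint(xi_hat)
log_exact = np.log(integrate.quad(lambda xi: np.exp(log_joint(xi)), -6.0, 6.0, epsabs=0)[0])
print(log_laplace, log_exact)   # the two log marginal probabilities should agree closely
```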
For a sample of $P$ independent persons, the $\boldsymbol{\Sigma}(\hat{\boldsymbol{\xi}})$ in Equation 4 approaches $P\,\mathcal{I}_1(\hat{\boldsymbol{\xi}})$, where $P$ is the number of persons and $\mathcal{I}_1(\hat{\boldsymbol{\xi}})$ is the Fisher information (or expected information) of the item parameter estimates for a single person (Thissen & Wainer, 1982, p. 411):

$$\mathcal{I}_1(\hat{\boldsymbol{\xi}}) = E\!\left[-\frac{\partial^2 \log L_p(\boldsymbol{\xi})}{\partial \boldsymbol{\xi}\,\partial \boldsymbol{\xi}'}\right]_{\boldsymbol{\xi} = \hat{\boldsymbol{\xi}}}, \qquad (5)$$

where $-\partial^2 \log L_p(\boldsymbol{\xi})/\partial \boldsymbol{\xi}\,\partial \boldsymbol{\xi}'$ is minus the second derivative of the loglikelihood function for person $p$ (called the Hessian matrix or the observed information). The matrix $\mathcal{I}_1(\hat{\boldsymbol{\xi}})$ is a $(d \times d)$ matrix (e.g., a $2I \times 2I$ matrix for the 2PL item response model). Thus, $\left|P\,\mathcal{I}_1(\hat{\boldsymbol{\xi}})\right| = P^{d}\left|\mathcal{I}_1(\hat{\boldsymbol{\xi}})\right|$.
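The determinant identity in the last step is simply the rule $|cA| = c^{d}|A|$ for a $d \times d$ matrix. A quick numerical check, with an arbitrary positive-definite matrix standing in for $\mathcal{I}_1(\hat{\boldsymbol{\xi}})$ and arbitrary values of $d$ and $P$, is shown below.

```python
import numpy as np

rng = np.random.default_rng(0)
d, P = 6, 500                                  # d item parameters, P persons (arbitrary values)
A = rng.standard_normal((d, d))
info_one_person = A @ A.T + d * np.eye(d)      # arbitrary positive-definite d x d stand-in
lhs = np.linalg.det(P * info_one_person)
rhs = P**d * np.linalg.det(info_one_person)
print(np.isclose(lhs, rhs))                    # True: |P * I_1| = P**d * |I_1|
```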
Inserting $\boldsymbol{\Sigma}(\hat{\boldsymbol{\xi}}) \approx P\,\mathcal{I}_1(\hat{\boldsymbol{\xi}})$ into Equation 4 and taking logarithms, the approximation can be written as

$$\log \Pr(\mathbf{y}) \approx \log L(\mathbf{y} \mid \hat{\boldsymbol{\xi}}) + \log \pi(\hat{\boldsymbol{\xi}}) + \frac{d}{2}\log 2\pi - \frac{d}{2}\log P - \frac{1}{2}\log\left|\mathcal{I}_1(\hat{\boldsymbol{\xi}})\right|. \qquad (6)$$

In Equation 6, $\log \pi(\hat{\boldsymbol{\xi}})$, $(d/2)\log 2\pi$, and $-(1/2)\log|\mathcal{I}_1(\hat{\boldsymbol{\xi}})|$ are all of order $O(1)$, so dropping them introduces an error of order $O(1)$. Ignoring these terms, Equation 6 becomes

$$\log \Pr(\mathbf{y}) \approx \log L(\mathbf{y} \mid \hat{\boldsymbol{\xi}}) - \frac{d}{2}\log P. \qquad (7)$$

As presented in Equation 7, the $N$ in the BIC, $\mathrm{BIC} = -2\log L(\mathbf{y} \mid \hat{\boldsymbol{\xi}}) + d\log N$, is $P$, the number of persons.
Random Person Effect and Random Item Effect
Item parameters in Equation 2 can be random effects in specialized measurement problems (see De Boeck, 2008, for a discussion). For a sample of $P$ independent persons and a sample of $I$ independent items, the marginal probability of the data, $\Pr(\mathbf{y})$, is

$$\Pr(\mathbf{y}) = \int L(\mathbf{y} \mid \boldsymbol{\eta})\, \pi(\boldsymbol{\eta})\, d\boldsymbol{\eta}, \qquad (8)$$

where $L(\mathbf{y} \mid \boldsymbol{\eta}) = \int \prod_{p=1}^{P}\prod_{i=1}^{I} \Pr(y_{pi} \mid \theta_p, a_i, b_i)\, \prod_{p=1}^{P} g(\theta_p)\, \prod_{i=1}^{I} h(a_i, b_i \mid \boldsymbol{\eta})\, d\boldsymbol{\theta}\, d\mathbf{a}\, d\mathbf{b}$, $h(a_i, b_i \mid \boldsymbol{\eta})$ is a prior (population) distribution of the item parameters, and $\pi(\boldsymbol{\eta})$ is a prior distribution for $\boldsymbol{\eta}$. When item parameters are random, a log-normal distribution can be imposed on the item discrimination parameters $a_i$ and a normal distribution can be imposed on the item location parameters $b_i$. In such a case, there are three parameters, $\boldsymbol{\eta} = (\beta, \sigma_a^2, \sigma_b^2)$, where $\beta$ is a grand mean across persons and items, $\sigma_a^2$ is the variance of the random item discrimination parameters, and $\sigma_b^2$ is the variance of the random item location parameters.
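A minimal generative sketch of this random-item specification is shown below. The parameter values are hypothetical, the symbols follow the text above, and fixing the mean of the log-discriminations at zero is one possible identification choice, not something prescribed by the derivation.

```python
import numpy as np

rng = np.random.default_rng(1)
P, I = 500, 20
beta, sigma2_a, sigma2_b = 0.0, 0.2, 1.0     # grand mean and the two variances (hypothetical)

theta = rng.standard_normal(P)                                  # random person effects, N(0, 1)
a = rng.lognormal(mean=0.0, sigma=np.sqrt(sigma2_a), size=I)    # random discriminations (log-normal)
b = rng.normal(loc=beta, scale=np.sqrt(sigma2_b), size=I)       # random locations (normal, mean beta)
prob = 1.0 / (1.0 + np.exp(-(a * (theta[:, None] - b))))        # Equation 2 with random a_i, b_i
y = rng.binomial(1, prob)                                       # simulated (P, I) binary responses
```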
Using the Laplace approximation, the marginal probability of the data can be approximated as follows:

$$\Pr(\mathbf{y}) \approx (2\pi)^{d/2}\left|\boldsymbol{\Sigma}(\hat{\boldsymbol{\eta}})\right|^{-1/2} L(\mathbf{y} \mid \hat{\boldsymbol{\eta}})\, \pi(\hat{\boldsymbol{\eta}}), \qquad (9)$$

where $\hat{\boldsymbol{\eta}}$ is the mode of $L(\mathbf{y} \mid \boldsymbol{\eta})\,\pi(\boldsymbol{\eta})$ and $\boldsymbol{\Sigma}(\hat{\boldsymbol{\eta}})$ is the expected information matrix of the parameter estimates, $\hat{\boldsymbol{\eta}}$.
For a sample of $PI$ independent pairs of person $p$ and item $i$, $\boldsymbol{\Sigma}(\hat{\boldsymbol{\eta}})$ in Equation 9 approaches $PI\,\mathcal{I}_1(\hat{\boldsymbol{\eta}})$, where $P$ is the number of persons, $I$ is the number of items, and $\mathcal{I}_1(\hat{\boldsymbol{\eta}})$ is the Fisher information of the parameter estimates for one pair of a person $p$ and an item $i$:

$$\mathcal{I}_1(\hat{\boldsymbol{\eta}}) = E\!\left[-\frac{\partial^2 \log L_{pi}(\boldsymbol{\eta})}{\partial \boldsymbol{\eta}\,\partial \boldsymbol{\eta}'}\right]_{\boldsymbol{\eta} = \hat{\boldsymbol{\eta}}}, \qquad (10)$$

where $-\partial^2 \log L_{pi}(\boldsymbol{\eta})/\partial \boldsymbol{\eta}\,\partial \boldsymbol{\eta}'$ is minus the second derivative of the loglikelihood function for the pair $(p, i)$. When $\mathcal{I}_1(\hat{\boldsymbol{\eta}})$ is a $3 \times 3$ matrix (i.e., for $d = 3$), $\left|PI\,\mathcal{I}_1(\hat{\boldsymbol{\eta}})\right| = (PI)^{3}\left|\mathcal{I}_1(\hat{\boldsymbol{\eta}})\right|$.
With $\boldsymbol{\Sigma}(\hat{\boldsymbol{\eta}}) \approx PI\,\mathcal{I}_1(\hat{\boldsymbol{\eta}})$, the approximation in Equation 9 can be rewritten as

$$\log \Pr(\mathbf{y}) \approx \log L(\mathbf{y} \mid \hat{\boldsymbol{\eta}}) + \log \pi(\hat{\boldsymbol{\eta}}) + \frac{d}{2}\log 2\pi - \frac{d}{2}\log (PI) - \frac{1}{2}\log\left|\mathcal{I}_1(\hat{\boldsymbol{\eta}})\right|, \qquad (11)$$

where $\log \pi(\hat{\boldsymbol{\eta}})$, $(d/2)\log 2\pi$, and $-(1/2)\log|\mathcal{I}_1(\hat{\boldsymbol{\eta}})|$ are all of order $O(1)$. Ignoring these terms, Equation 11 results in the following equation:

$$\log \Pr(\mathbf{y}) \approx \log L(\mathbf{y} \mid \hat{\boldsymbol{\eta}}) - \frac{d}{2}\log (PI). \qquad (12)$$

It follows from Equation 12 that the $N$ in the BIC is $PI$, the total number of observations, for item response models with random item parameters.
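Putting the two results together, the sketch below (our naming, not any package's implementation) shows how the BIC penalty would be formed in each case: $d = 2I$ and $N = P$ when items are fixed (Equation 7), and $d = 3$ and $N = PI$ when both persons and items are random (Equation 12).

```python
import numpy as np

def bic_2pl(loglik: float, n_persons: int, n_items: int, random_items: bool) -> float:
    """BIC for the 2PL under the two specifications derived above."""
    if random_items:
        d, n = 3, n_persons * n_items      # grand mean + two variances; N = P * I (Equation 12)
    else:
        d, n = 2 * n_items, n_persons      # one a_i and one b_i per item; N = P (Equation 7)
    return -2.0 * loglik + d * np.log(n)
```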
Footnotes
1. In Mplus, the one-parameter item response model with a random person effect and a random item effect can be fit only by Bayesian analysis, so a likelihood-based Bayesian information criterion (BIC) is not available.
Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
- Bates D., Mächler M., Bolker B., Walker S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1-48.
- Cai L. (2015). flexMIRT version 3.0: A numerical engine for flexible multilevel and multidimensional item analysis and test scoring [Computer software]. Raleigh-Durham, NC: Vector Psychometric Group.
- De Boeck P. (2008). Random item IRT models. Psychometrika, 73, 533-559.
- Kass R. E., Raftery A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.
- Muthén L. K., Muthén B. O. (1998-2015). Mplus user's guide (7th ed.). Los Angeles, CA: Author.
- Nandakumar R., Hotchkiss L. (2012). PROC NLMIXED: For estimating parameters of IRT models. Applied Psychological Measurement, 38, 404-405.
- Schwarz G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461-464.
- Thissen D., Wainer H. (1982). Some standard errors in item response theory. Psychometrika, 47, 397-412.
- Tierney L., Kadane J. B. (1986). Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81, 82-86.
