Abstract
In this paper, we propose a novel and highly effective variational Bayesian expectation maximization-maximization (VBEM-M) inference method for the log-linear cognitive diagnostic model (LCDM). When a variational Bayesian approach is implemented for the saturated LCDM, the conditional variational posteriors of the parameters to be derived do not belong to the same distributional family as the priors; the VBEM-M algorithm overcomes this problem. Our algorithm can directly estimate the item parameters and the latent attribute-mastery patterns simultaneously. In contrast, Yamaguchi and Okada’s (2020a) variational Bayesian algorithm requires a transformation step to obtain the item parameters of the LCDM. We conducted multiple simulation studies to assess the performance of the VBEM-M algorithm in terms of parameter recovery, execution time, and convergence rate. Furthermore, we conducted a series of comparative studies on the accuracy of parameter estimation for the DINA model and the saturated LCDM, focusing on the VBEM-M, VB, expectation-maximization, and Markov chain Monte Carlo algorithms. The results indicated that our method obtains more stable and accurate estimates, especially for small sample sizes. Finally, we demonstrated the utility of the proposed algorithm using two real datasets.
Keywords: cognitive diagnostic assessments, expectation-maximization algorithm, log-linear cognitive diagnostic model, Markov chain Monte Carlo, variational Bayesian algorithm
1. Introduction
Cognitive diagnostic assessments (CDAs) have developed rapidly over the past several decades, and they are widely used in educational and psychological research (de la Torre, 2009, 2011; de la Torre & Douglas, 2004; DiBello et al., 2007; Haberman & von Davier, 2007; Henson et al., 2009; Junker & Sijtsma, 2001; Rupp et al., 2010; Templin & Henson, 2006; von Davier, 2014a). The primary motivation for the development of CDAs is to ascertain whether or not a student has mastered some fine-grained skills or attributes that are required to solve a particular item. More specifically, not only can CDAs be used to analyze in detail the strengths and weaknesses of students in the areas they are learning, but they can also provide powerful tools to help teachers improve classroom instruction.
There is a wide variety of cognitive diagnostic models (CDMs) available in the published CDA literature (DiBello et al., 2007; Rupp & Templin, 2008b), and many of these are built on strong cognitive assumptions about the processes involved in problem-solving. These CDMs can be broadly classified into three types: compensatory, non-compensatory, and general models. Compensatory models are based on the assumption of attribute compensation, which means that although an examinee may not have mastered all the attributes involved in an item, they are still more likely to score well on that item if they have mastered some of its attributes. This is because the attributes that the examinee has mastered can “compensate” for the other attributes that they have not mastered. The best-known compensatory models are the deterministic inputs, noisy “or” gate (DINO) model (Templin & Henson, 2006) and the linear logistic model (LLM; Maris, 1999). In contrast, non-compensatory models are constructed under the assumption of attribute conjunction, which means that, under an ideal response, an examinee can score on an item only after mastering all of the attributes involved in that item; otherwise, he or she will not be able to answer the item correctly. The widely used non-compensatory (conjunctive) models are the deterministic inputs, noisy “and” gate (DINA) model (Haertel, 1989; Junker & Sijtsma, 2001; Macready & Dayton, 1977) and the reduced reparameterized unified model (rRUM; Hartz, 2002). Some general CDM frameworks have also been established that include a variety of widely applied CDMs, such as the log-linear CDM (LCDM; Henson et al., 2009), the generalized DINA (GDINA; de la Torre, 2011) model, and the general diagnostic model (von Davier, 2008). Although the DINA, DINO, rRUM, and LLM were developed from different application backgrounds, they can in fact be viewed as special cases of the LCDM obtained by restricting certain parameters of its saturated version to zero. Henson et al. (2009) detailed how the LCDM can be transformed into traditional models such as the DINA, DINO, rRUM, and LLM through parameter restrictions. Additionally, Ma and de la Torre (2016) elucidated that the LCDM and GDINA models are equivalent in their saturated forms.
Parameter estimation is the basis of model applications, and it is a prerequisite for the interpretation of complex data in educational psychology. Several strategies have been developed to estimate the parameters of CDMs. Algorithms based on maximum likelihood have been widely used to estimate CDMs in the frequentist framework. Examples using a marginal maximum likelihood method to estimate the parameters of several CDMs via an expectation-maximization (EM) algorithm (Dempster et al., 1977) can be found in the literature (de la Torre, 2009, 2011; Ma & de la Torre, 2016; Ma & Guo, 2019; Maris, 1999). Several R packages, such as “CDM” (George et al., 2016) and “GDINA” (Ma & de la Torre, 2020), have been developed to estimate CDM parameters. However, algorithms based on maximum likelihood have some disadvantages, as elaborated by Yamaguchi and Templin (2022); for example, a maximum likelihood algorithm may converge to a local maximum. Accordingly, it is challenging to discern whether parameter estimates correspond to a global maximum, even if a multiple-starting-value strategy is used to evaluate their optimality. In addition, calculation of the variability (standard errors) of parameter estimates depends on asymptotic theory in the likelihood framework, and an asymptotic distribution with parameter restrictions may not be correct when sample sizes are small.
In parallel with maximum likelihood-based methods, Bayesian statistical methods have also gained widespread attention for inferring various types of CDM parameters (e.g., Chung, 2019; Culpepper, 2015, 2019; Culpepper & Hudson, 2018; DeCarlo, 2012; de la Torre & Douglas, 2004; Henson et al., 2009; Jiang & Carter, 2019; Liu, 2022; Liu et al., 2020; Zhan et al., 2019). More specifically, de la Torre and Douglas (2004) implemented a Metropolis–Hastings (MH) algorithm for estimating the higher-order DINA model parameters. Henson et al. (2009) also adopted the MH algorithm to estimate LCDM parameters. Liu et al. (2020) and Liu (2022) developed the Metropolis–Hastings Robbins–Monro (MH-RM) algorithm (Cai, 2010) to estimate CDM parameters. With the help of conjugate prior distributions, Culpepper (2015) proposed a Gibbs sampling algorithm to estimate the parameters of the DINA model, and the corresponding R package “dina” was developed by Culpepper (2015). Building on the work of Culpepper (2015), a No-U-Turn Gibbs sampler was proposed by da Silva et al. (2018) to estimate the parameters of the DINA model. In addition, the Gibbs sampling algorithm has also been used for updating the Q-matrix in CDMs (Chung, 2019; Culpepper, 2019; Culpepper & Hudson, 2018). DeCarlo (2012) used the OpenBUGS software (Thomas et al., 2006) to estimate reparameterized DINA model parameters. Zhan et al. (2019) published a tutorial on estimating various types of CDMs using the R package “R2jags” (Su & Yajima, 2015), which is associated with the JAGS program (Plummer, 2003). Jiang and Carter (2019) estimated the parameters of the LCDM by means of the Hamiltonian Monte Carlo (HMC) algorithm (Neal, 2011) in the Stan program (Carpenter et al., 2017). However, the computationally intensive nature of Markov chain Monte Carlo (MCMC) estimation of CDM parameters presents a major hurdle to the widespread empirical use of Bayesian approaches in educational research when faced with large samples, numerous items, numerous attributes, and complex models (Yamaguchi & Okada, 2020a; Oka et al., 2023).
Researchers have recently become interested in the variational inference (VI) method as a more flexible and less computationally intensive alternative to traditional Bayesian statistical methods (Bishop, 2006; Blei et al., 2017; Cho et al., 2021; Grimmer, 2011; Jaakkola & Jordan, 2000; Jeon et al., 2017; Oka & Okada, 2023; Rijmen et al., 2016; Urban & Bauer, 2021; Yamaguchi, 2020; Yamaguchi & Martinez, 2023; Yamaguchi & Okada, 2020a, 2020b). Compared to the traditional MCMC methods, the VI method is a deterministic approximation approach that is based on posterior density factorization. This method accomplishes its goal of rapidly and efficiently dealing with large amounts of complex educational psychology data (e.g., large numbers of samples, items, and attributes) by transforming the statistical inference problem of the posterior density into an optimization problem. In view of their many benefits, VI algorithms have been developed to estimate a variety of psychological models such as item response theory models (Rijmen et al., 2016; Urban & Bauer, 2021), generalized linear mixed models (Jeon et al., 2017), and CDMs (Oka et al., 2023; Oka & Okada, 2023; Yamaguchi, 2020; Yamaguchi & Martinez, 2023; Yamaguchi & Okada, 2020a, 2020b).
Recently, Yamaguchi and Okada (2020b) introduced a VI method specifically tailored to the DINA model, marking a significant advancement in this field. This method was derived based on the optimal variational posteriors for each model parameter. Subsequently, Yamaguchi (2020) further extended VB inference by developing an algorithm for the multiple-choice DINA (MC-DINA) model. This extension to the MC-DINA model demonstrated the flexibility and computational efficiency of VB methods. Yamaguchi and Okada (2020a) then developed a VB inference algorithm for saturated CDMs. They ingeniously introduced a G-matrix, reformulating existing generalized CDMs, typically parameterized by attribute parameters, into a Bernoulli mixture model. This reformulation yielded conditionally conjugate priors for the model parameters, simplifying the derivation process and enhancing algorithmic efficiency. Oka et al. (2023) sustained this trajectory of innovation by developing a VB algorithm for a polytomous-attribute saturated CDM. Their work, building on the foundational research of Yamaguchi and Okada (2020a), not only advanced the field but also incorporated a parallel computing configuration. This significantly improved the computational efficiency of the VB algorithm, demonstrating its evolving capability to handle more complex CDM structures. Simultaneously, Oka and Okada (2023) tackled scalability challenges in CDMs by developing an estimation algorithm for the Q-matrix of the DINA model. Their approach, integrating stochastic optimization with VI in an iterative algorithm, showcased the adaptability and robustness of VB methods in dealing with large-scale CDMs. This series of developments highlights the ongoing progress and effectiveness of VB methods in estimating diverse models within the CDM framework.
To date, no VB algorithm has been developed to directly estimate the item parameters of the LCDM with a logit link function. This is largely due to the difficulty of directly deriving the conditional posterior density of these item parameters. Although Yamaguchi and Okada (2020a) proposed the variational EM (VEM) algorithm to estimate the saturated LCDM, they actually used the least-squares transformation method (de la Torre et al., 2011) to convert the estimates of the item response probabilities of the item-specific attribute-mastery patterns, obtained through the VEM algorithm, into the corresponding item parameters of the LCDM. Furthermore, Yamaguchi and Templin (2022) employed a one-to-one mapping within the Bayesian framework to equivalently transform the item response probability parameters, obtained through the Gibbs sampling algorithm, into the item parameters of the LCDM. This paper bridges this gap by proposing a novel and highly effective variational Bayesian EM-maximization (VBEM-M) algorithm for estimating the saturated LCDM. Briefly, we obtain a tight lower bound on the likelihood function of the LCDM using a Taylor expansion (Jaakkola & Jordan, 2000), in which the item parameters appear in a quadratic form. This allows for the existence of a conjugate prior distribution, enabling the implementation of the VI method. Consequently, the VI algorithm can be executed for the LCDM by deriving a specific posterior distribution for the item parameters from a Gaussian prior distribution, which serves as the conjugate prior for the item parameters.
We highlight the advantages of the VBEM-M algorithm over other algorithms from the following perspectives. First, our VBEM-M algorithm overcomes the problem that, when the VI method is implemented for the saturated LCDM formulation, the conditional variational posteriors of the parameters to be derived are not in the same distributional family as the priors. Second, the VBEM-M algorithm can directly estimate the item parameters and latent attribute-mastery patterns (also called “attribute profiles”) simultaneously, unlike Yamaguchi and Okada’s (2020a) VEM algorithm, which requires a two-step process to obtain the item parameter estimates. Third, the VBEM-M algorithm can obtain more stable and accurate estimates than an EM algorithm, especially under high-dimensional and small-sample-size conditions. Finally, our VBEM-M algorithm offers considerable benefits in computing time compared to the time-consuming traditional MCMC algorithms, because the VI method transforms a posterior inference problem into an optimization problem.
The rest of this paper is organized as follows. Section 2 presents the LCDM and its special case, the DINA model; Section 3 introduces the specific implementation of the VBEM-M algorithm for estimating the LCDM. Section 4 presents three simulation studies that evaluate the performance of the VBEM-M algorithm in parameter recovery across different simulation conditions and compares the performance of the VBEM-M, VB, MCMC, and EM algorithms. Section 5 uses two empirical examples to demonstrate the model estimation results of these four algorithms. Finally, some concluding remarks are presented in Section 6.
2. Cognitive diagnostic models
2.1. Log-linear cognitive diagnostic model
In this study, we focus on the LCDM because it is a general model that contains a large number of previously proposed models, such as the DINA, DINO, rRUM, and LLM (Henson et al., 2009). More importantly, the LCDM provides a parameterization that not only characterizes the differences between these various models but also supports more complex data structures (Henson et al., 2009). In fact, any possible set of constraints on the saturated form of the LCDM can be used to define a model for the item responses within the framework of cognitive theory. Moreover, the general parametric form offers a better understanding of the relationships between compensatory and non-compensatory models. Next, we give a brief introduction to the LCDM.
First, we define several indices that will be used throughout this paper. Each examinee is indexed by $i=1,\ldots,N$, each item by $j=1,\ldots,J$, each attribute by $k=1,\ldots,K$, and each latent class corresponding to an attribute profile by $l=1,\ldots,L$. We consider the latent attribute $\alpha_{ik}$ to be a binary variable, where the absence or presence of the corresponding attribute is represented by the values 0 and 1, respectively. $\boldsymbol{\alpha}_i=(\alpha_{i1},\ldots,\alpha_{iK})^{\top}$ is the $K$-dimensional latent attribute profile of the $i$th examinee. In light of the categorical nature of the latent classes, $\boldsymbol{\alpha}_i$ belongs to one of $L=2^{K}$ latent attribute profiles. It will be useful in the following to define $\boldsymbol{\alpha}_l=(\alpha_{l1},\ldots,\alpha_{lK})^{\top}$ as the attribute profile for examinees of class $l$, where $\alpha_{lk}$ is 1 if the examinees of class $l$ have acquired skill $k$ and 0 otherwise. $\mathbf{A}$ denotes the $L\times K$ matrix containing all the attribute profiles. The $Q$-matrix (Tatsuoka, 1983) is a $J\times K$ matrix used to describe the relationship between attributes and items, where $\mathbf{Q}=(q_{jk})_{J\times K}$, and $\boldsymbol{q}_j=(q_{j1},\ldots,q_{jK})^{\top}$ is the $j$th row of the $Q$-matrix; that is, $q_{jk}=1$ if attribute $k$ is required by item $j$, and $q_{jk}=0$ otherwise. Next, a binary latent indicator variable $z_{il}$ is introduced, which satisfies $\sum_{l=1}^{L}z_{il}=1$, where $z_{il}=1$ denotes that the $i$th examinee belongs to the $l$th attribute profile (i.e., $\boldsymbol{\alpha}_i=\boldsymbol{\alpha}_l$). Let $X_{ij}$ be the observed item response of the $i$th examinee to the $j$th item: $X_{ij}=1$ if the $i$th examinee gives the correct answer to the $j$th item, and $X_{ij}=0$ otherwise. The corresponding item response matrix for all examinees answering all items is $\mathbf{X}=(X_{ij})_{N\times J}$, where $i=1,\ldots,N$ and $j=1,\ldots,J$. Then, the probability of a correct response under the LCDM can be expressed as
$$P(X_{ij}=1\mid \boldsymbol{\alpha}_i)=\frac{\exp\{\lambda_{j,0}+\boldsymbol{\lambda}_j^{\top}\boldsymbol{h}(\boldsymbol{\alpha}_i,\boldsymbol{q}_j)\}}{1+\exp\{\lambda_{j,0}+\boldsymbol{\lambda}_j^{\top}\boldsymbol{h}(\boldsymbol{\alpha}_i,\boldsymbol{q}_j)\}} \qquad (1)$$
where $\lambda_{j,0}$ is the intercept parameter, which determines the probability that an examinee answers item $j$ correctly if he or she has mastered none of the attributes measured by that item. $\boldsymbol{\lambda}_j$ is the slope parameter vector, a $(2^{K}-1)$-dimensional vector collecting the main effect and interaction effect parameters of item $j$. $\boldsymbol{h}(\boldsymbol{\alpha}_i,\boldsymbol{q}_j)$ represents a set of linear combinations of $\boldsymbol{\alpha}_i$ and $\boldsymbol{q}_j$:
$$\boldsymbol{\lambda}_j^{\top}\boldsymbol{h}(\boldsymbol{\alpha}_i,\boldsymbol{q}_j)=\sum_{k=1}^{K}\lambda_{j,k}\,\alpha_{ik}q_{jk}+\sum_{k=1}^{K-1}\sum_{k'=k+1}^{K}\lambda_{j,kk'}\,\alpha_{ik}\alpha_{ik'}q_{jk}q_{jk'}+\cdots+\lambda_{j,12\cdots K}\prod_{k=1}^{K}\alpha_{ik}q_{jk} \qquad (2)$$
Combining the latent indicator variable $z_{il}$ and Eq. (2), the LCDM can be rewritten as

$$P(X_{ij}=1\mid \boldsymbol{z}_i)=\prod_{l=1}^{L}\left[\frac{\exp\{\lambda_{j,0}+\boldsymbol{\lambda}_j^{\top}\boldsymbol{h}(\boldsymbol{\alpha}_l,\boldsymbol{q}_j)\}}{1+\exp\{\lambda_{j,0}+\boldsymbol{\lambda}_j^{\top}\boldsymbol{h}(\boldsymbol{\alpha}_l,\boldsymbol{q}_j)\}}\right]^{z_{il}} \qquad (3)$$
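To make the structure of Eqs. (1)–(2) concrete, the following minimal R sketch (not taken from the study's code) evaluates the correct-response probability for an item measuring two attributes; the function name and parameter values are illustrative only, with the slope values mirroring item 7 of Table 7.

```r
# A minimal illustration of Eqs. (1)-(2): correct-response probability
# for an item measuring K = 2 attributes.
lcdm_prob <- function(lambda0, lambda, alpha, q) {
  # lambda = (main effect 1, main effect 2, two-way interaction)
  h <- c(alpha[1] * q[1],                       # main effect of attribute 1
         alpha[2] * q[2],                       # main effect of attribute 2
         alpha[1] * alpha[2] * q[1] * q[2])     # interaction of attributes 1 and 2
  lin_pred <- lambda0 + sum(lambda * h)         # linear predictor in Eq. (1)
  exp(lin_pred) / (1 + exp(lin_pred))           # logistic link
}

# Illustrative parameter values for an item requiring both attributes (q = (1, 1))
lcdm_prob(-1.5, c(2, 2, -0.5), alpha = c(0, 0), q = c(1, 1))  # masters neither attribute
lcdm_prob(-1.5, c(2, 2, -0.5), alpha = c(1, 0), q = c(1, 1))  # masters attribute 1 only
lcdm_prob(-1.5, c(2, 2, -0.5), alpha = c(1, 1), q = c(1, 1))  # masters both attributes
```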
2.2. DINA model
The DINA model, as a special case of the LCDM, has a relatively straightforward structure and is widely adopted in cognitive diagnostic assessments; specialized software packages are also available for a number of estimation techniques grounded in the model. Therefore, we provide a short overview of the traditional DINA model and its interconversion with the LCDM. Two item parameters are introduced in the traditional DINA model for each item $j$: the slipping parameter $s_j$ and the guessing parameter $g_j$. The probability of a correct response can be written as

$$P(X_{ij}=1\mid \boldsymbol{\alpha}_i)=(1-s_j)^{\xi_{ij}}\,g_j^{\,1-\xi_{ij}} \qquad (4)$$

where $\xi_{ij}=\prod_{k=1}^{K}\alpha_{ik}^{q_{jk}}$ is the ideal response. $\xi_{ij}=1$ indicates that examinee $i$ possesses all the attributes required for item $j$; otherwise, $\xi_{ij}=0$. The parameters $s_j$ and $g_j$ can be formally defined by

$$s_j=P(X_{ij}=0\mid \xi_{ij}=1), \qquad g_j=P(X_{ij}=1\mid \xi_{ij}=0) \qquad (5)$$
Since the estimation approach presented in this work is based on the LCDM, we must first convert the DINA model to LCDM format. Our next topic is the connection between the DINA model and the LCDM and how they may be converted back and forth.
Let $\mathcal{K}_j=\{k: q_{jk}=1\}$ denote the index set of attributes investigated by item $j$ and $K_j^{*}=\sum_{k=1}^{K}q_{jk}$ denote the number of investigated attributes. Then, the DINA model can be rewritten in the form

$$P(X_{ij}=1\mid \boldsymbol{\alpha}_i)=\frac{\exp\{\lambda_{j,0}+\lambda_{j,12\cdots K_j^{*}}\prod_{k\in\mathcal{K}_j}\alpha_{ik}\}}{1+\exp\{\lambda_{j,0}+\lambda_{j,12\cdots K_j^{*}}\prod_{k\in\mathcal{K}_j}\alpha_{ik}\}} \qquad (6)$$

where

$$\lambda_{j,0}=\log\left(\frac{g_j}{1-g_j}\right), \qquad \lambda_{j,12\cdots K_j^{*}}=\log\left(\frac{1-s_j}{s_j}\right)-\log\left(\frac{g_j}{1-g_j}\right) \qquad (7)$$

For simplicity, we denote the intercept $\lambda_{j,0}$ as $\eta_j$ and the interaction coefficient $\lambda_{j,12\cdots K_j^{*}}$ as $\lambda_j$, and the DINA model is then equivalent to the following form:

$$P(X_{ij}=1\mid \boldsymbol{\alpha}_i)=\frac{\exp(\eta_j+\lambda_j\,\xi_{ij})}{1+\exp(\eta_j+\lambda_j\,\xi_{ij})} \qquad (8)$$
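Under Eqs. (7)–(8), converting traditional DINA guessing and slipping parameters into the LCDM-style intercept and slope is a simple logit computation. A minimal R sketch with illustrative values:

```r
# Sketch: map DINA (g, s) to the LCDM-style intercept and slope of Eqs. (7)-(8).
dina_to_lcdm <- function(g, s) {
  eta    <- qlogis(g)              # log(g / (1 - g))
  lambda <- qlogis(1 - s) - eta    # log((1 - s) / s) - log(g / (1 - g))
  c(eta = eta, lambda = lambda)
}

dina_to_lcdm(g = 0.2, s = 0.2)
# Reverse mapping: g = plogis(eta), s = 1 - plogis(eta + lambda)
```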
While this study focuses mostly on the LCDM, several of its special cases, such as the DINO model and the LLM, as well as the saturated LCDM itself, are also considered. We will therefore not go into great depth here; instead, the reader should refer to the Supplementary Material for the necessary information.
3. Variational Bayesian EM-maximization algorithm for the LCDM
3.1. Variational Bayesian EM algorithm
Since it is straightforward to convert an approximate conditional posterior distribution problem into an optimization problem using VI methods, these techniques see extensive application in inferring Bayesian models in the area of machine learning (Beal, 2003; Bishop, 2006; Jordan et al., 1999). Next, we briefly outline the implementation process of the variational Bayesian EM (VBEM) algorithm (Beal, 2003). Assume that the observed dataset $\mathbf{Y}$ is produced by model $m$, where model $m$ consists of the latent variables $\mathbf{Z}$ and model parameters $\boldsymbol{\Theta}$. Next, we specify a variational density family $q(\mathbf{Z},\boldsymbol{\Theta})$ over the unknown variables $\mathbf{Z}$ and $\boldsymbol{\Theta}$. The purpose is to find the optimal approximation $q^{*}(\mathbf{Z},\boldsymbol{\Theta})$ to their posterior distribution within this specified variational density family (i.e., $q^{*}(\mathbf{Z},\boldsymbol{\Theta})\approx p(\mathbf{Z},\boldsymbol{\Theta}\mid \mathbf{Y},m)$). Next, we introduce the concept of the evidence lower bound (ELBO), which is critical for determining the optimal $q^{*}$. Let $p(\mathbf{Y}\mid m)$ be the marginal density of the model $m$; the ELBO can then be represented as a lower bound of the logarithm of the marginal density $\log p(\mathbf{Y}\mid m)$:

$$\log p(\mathbf{Y}\mid m)=\log\int p(\mathbf{Y},\mathbf{Z},\boldsymbol{\Theta}\mid m)\,d\mathbf{Z}\,d\boldsymbol{\Theta}\geq \int q(\mathbf{Z},\boldsymbol{\Theta})\log\frac{p(\mathbf{Y},\mathbf{Z},\boldsymbol{\Theta}\mid m)}{q(\mathbf{Z},\boldsymbol{\Theta})}\,d\mathbf{Z}\,d\boldsymbol{\Theta}\equiv \mathcal{L}(q) \qquad (9)$$
where $\mathcal{L}(q)$ denotes the ELBO, which is a functional of the free distribution $q(\mathbf{Z},\boldsymbol{\Theta})$. We need to maximize $\mathcal{L}(q)$ with respect to $q$ so that it approaches $\log p(\mathbf{Y}\mid m)$ as closely as possible. Blei et al. (2017) presented a formula connecting $\log p(\mathbf{Y}\mid m)$ with the ELBO and the Kullback–Leibler (KL) divergence:

$$\log p(\mathbf{Y}\mid m)=\mathcal{L}(q)+\mathrm{KL}\big(q(\mathbf{Z},\boldsymbol{\Theta})\,\|\,p(\mathbf{Z},\boldsymbol{\Theta}\mid \mathbf{Y},m)\big) \qquad (10)$$
Since $\log p(\mathbf{Y}\mid m)$ is a constant with respect to $q$, maximizing the ELBO is equivalent to minimizing the KL divergence. Specifically, the optimal $q^{*}$ obtained in the variational density family is the density that minimizes the KL divergence between itself and the posterior distribution $p(\mathbf{Z},\boldsymbol{\Theta}\mid \mathbf{Y},m)$. To further simplify the variational density $q$, we assume that it satisfies mean-field theory, which has been widely used in variational Bayesian inference (Beal, 2003; Blei et al., 2017; Jordan et al., 1999; Wand et al., 2011). Under mean-field theory, the latent variables and model parameters are mutually independent and each is governed by a separate factor in the variational density, allowing the variational density to be factorized as $q(\mathbf{Z},\boldsymbol{\Theta})=q(\mathbf{Z})\,q(\boldsymbol{\Theta})$. An iterative optimization procedure is implemented by maximizing the mean-field variational density of a parameter of interest while fixing the others. The VB algorithm can be divided into the following two steps:

$$\text{VBE step: } q^{(t+1)}(\mathbf{Z})=\frac{1}{C_{\mathbf{Z}}}\exp\left\{\int q^{(t)}(\boldsymbol{\Theta})\log p(\mathbf{Y},\mathbf{Z}\mid \boldsymbol{\Theta},m)\,d\boldsymbol{\Theta}\right\},$$
$$\text{VBM step: } q^{(t+1)}(\boldsymbol{\Theta})=\frac{1}{C_{\boldsymbol{\Theta}}}\,p(\boldsymbol{\Theta}\mid m)\exp\left\{\int q^{(t+1)}(\mathbf{Z})\log p(\mathbf{Y},\mathbf{Z}\mid \boldsymbol{\Theta},m)\,d\mathbf{Z}\right\} \qquad (11)$$

where $C_{\mathbf{Z}}$ and $C_{\boldsymbol{\Theta}}$ are normalizing constants. To sum up, the variational density for the latent variables is updated in the VBE step, while the variational density for the model parameters is updated in the VBM step. Therefore, the prerequisite for implementing the VBEM algorithm is that the posterior distributions of all parameters, whether latent variables or model parameters, have a closed form. The VEM algorithm proposed by Yamaguchi and Okada (2020a, 2020b) in educational psychometric research is essentially identical to the VBEM algorithm provided by Beal (2003), with the only differences being in nomenclature.
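The decomposition in Eq. (10) can be checked numerically with a toy discrete example. In the R sketch below (all values illustrative and not from the paper), the latent variable takes two values and q(z) is an arbitrary, non-optimal variational density; the ELBO and the KL divergence sum to the log marginal likelihood.

```r
# Toy check of Eq. (10): log p(Y) = ELBO(q) + KL(q || posterior).
prior  <- c(0.6, 0.4)          # p(z)
lik    <- c(0.9, 0.2)          # p(Y | z) for one observed Y
joint  <- prior * lik          # p(Y, z)
logpY  <- log(sum(joint))      # log marginal likelihood
post   <- joint / sum(joint)   # p(z | Y)

q      <- c(0.7, 0.3)          # an arbitrary variational density
elbo   <- sum(q * (log(joint) - log(q)))
kl     <- sum(q * (log(q) - log(post)))

c(logpY = logpY, elbo_plus_kl = elbo + kl)  # identical up to rounding error
```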
3.2. Variational methods in Bayesian logistic regression
As mentioned above, implementing the VBEM algorithm requires a closed form for the posterior distributions of each parameter. Therefore, the VBEM algorithm cannot be directly applied to the LCDM based on the logit link function. To overcome this challenge, we adopt Jaakkola and Jordan’s (2000) variational Bayesian method for logistic regression models to estimate the more complex LCDM in the cognitive diagnostic framework. Specifically, their method uses a Taylor expansion on the logistic function to obtain a tight lower bound, facilitating parameter representation in a Gaussian distribution form that is easily implementable for VI. Next, we will provide the mathematical expression that Jaakkola and Jordan (2000) used for performing the first-order Taylor expansion and the specific derivation of the tight lower bound.
Consider the logistic function $\sigma(x)=\frac{1}{1+e^{-x}}$. The corresponding log logistic function can be written as

$$\log\sigma(x)=-\log\left(1+e^{-x}\right)=\frac{x}{2}-\log\left(e^{x/2}+e^{-x/2}\right) \qquad (12)$$

Denote $f(x)=-\log\left(e^{x/2}+e^{-x/2}\right)$. By calculating the second derivative, we can determine that $f$ is a convex function of the variable $x^{2}$. Therefore, any tangent line of $f$ in $x^{2}$ can serve as its lower bound, as it will always be less than or equal to $f$. A tight lower bound for $f(x)$ can be obtained by executing a first-order Taylor expansion of the function $f$ in the variable $x^{2}$ at the point $\xi^{2}$:

$$f(x)\geq -\log\left(e^{\xi/2}+e^{-\xi/2}\right)-\frac{1}{4\xi}\tanh\!\left(\frac{\xi}{2}\right)\left(x^{2}-\xi^{2}\right) \qquad (13)$$

According to Eqs. (12) and (13), we can derive a tight lower bound of $\sigma(x)$ with the specific form

$$\sigma(x)\geq \sigma(\xi)\exp\left\{\frac{x-\xi}{2}-\lambda(\xi)\left(x^{2}-\xi^{2}\right)\right\}, \qquad \lambda(\xi)=\frac{1}{2\xi}\left[\sigma(\xi)-\frac{1}{2}\right] \qquad (14)$$

which results in a quadratic form in $x$.
Regarding the LCDM, which also employs a logistic form, the argument of the logistic function is a linear combination involving the unknown item parameters, an individual's latent attribute profile, and the known $Q$-matrix (for further details, please refer to Eqs. (1) and (2)). Based on Eq. (14), we can therefore derive a quadratic form in the item parameters. Consequently, the VI algorithm can be implemented for the LCDM by deriving a specific posterior distribution for the item parameters using a Gaussian prior distribution, which serves as the conjugate prior for these parameters. In the next subsection, we focus on deriving the tight lower bound for the LCDM using Eq. (14).
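The behavior of the bound in Eq. (14) is easy to inspect directly. The short R sketch below (illustrative grid of x values and ξ = 2, not from the paper's code) confirms that the bound never exceeds σ(x) and is exact at x = ±ξ.

```r
# Check of the Jaakkola-Jordan bound in Eq. (14):
# sigma(x) >= sigma(xi) * exp((x - xi)/2 - lambda(xi) * (x^2 - xi^2)).
sigma     <- function(x) 1 / (1 + exp(-x))
lambda_xi <- function(xi) (sigma(xi) - 0.5) / (2 * xi)   # = tanh(xi/2) / (4 * xi)
jj_bound  <- function(x, xi) sigma(xi) * exp((x - xi) / 2 - lambda_xi(xi) * (x^2 - xi^2))

x  <- seq(-4, 4, by = 0.5)
xi <- 2
cbind(x, sigma = sigma(x), bound = jj_bound(x, xi))
# bound <= sigma(x) everywhere, with equality at x = xi and x = -xi
```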
3.3. Tight lower bound for the LCDM
In this section, the goal is to derive the tight lower bound for the LCDM outlined above. Before presenting the implementation of our VBEM-M algorithm, we first transform the item response data so that the tight lower bound of the likelihood function is easier to obtain. The item response data $X_{ij}$ are transformed into $Y_{ij}$ with the help of the equation $Y_{ij}=2X_{ij}-1$, so that $Y_{ij}\in\{-1,1\}$. Let $\boldsymbol{\beta}_j=(\lambda_{j,0},\boldsymbol{\lambda}_j^{\top})^{\top}$ collect all item parameters of item $j$ and $\boldsymbol{u}_{jl}=(1,\boldsymbol{h}(\boldsymbol{\alpha}_l,\boldsymbol{q}_j)^{\top})^{\top}$ be the corresponding design vector for latent class $l$; the item response probability of $Y_{ij}$ is then given by

$$P(Y_{ij}\mid z_{il}=1,\boldsymbol{\beta}_j)=\frac{\exp\left(Y_{ij}\,\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}\right)}{1+\exp\left(Y_{ij}\,\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}\right)} \qquad (15)$$

Recalling the logistic function form, the item response probability of $Y_{ij}$ can then be rewritten as follows:

$$P(Y_{ij}\mid z_{il}=1,\boldsymbol{\beta}_j)=\sigma\!\left(Y_{ij}\,\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}\right) \qquad (16)$$

Therefore, the likelihood based on the introduced latent variable $\mathbf{Z}$ can be represented by

$$P(\mathbf{Y}\mid \mathbf{Z},\boldsymbol{\beta})=\prod_{i=1}^{N}\prod_{j=1}^{J}\prod_{l=1}^{L}\left[\sigma\!\left(Y_{ij}\,\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}\right)\right]^{z_{il}} \qquad (17)$$

According to Eq. (14), the tight lower bound for $\sigma\!\left(Y_{ij}\,\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}\right)$ is determined by performing a first-order Taylor expansion with respect to the variable $\left(\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}\right)^{2}$ at the point $\xi_{ijl}^{2}$. Therefore, a tight lower bound of the likelihood for the LCDM can be derived as

$$P(\mathbf{Y}\mid \mathbf{Z},\boldsymbol{\beta})\geq \prod_{i=1}^{N}\prod_{j=1}^{J}\prod_{l=1}^{L}\left[\sigma(\xi_{ijl})\exp\left\{\frac{Y_{ij}\,\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}-\xi_{ijl}}{2}-\lambda(\xi_{ijl})\left(\left(\boldsymbol{\beta}_j^{\top}\boldsymbol{u}_{jl}\right)^{2}-\xi_{ijl}^{2}\right)\right\}\right]^{z_{il}} \qquad (18)$$
Given that the tight lower bound of the likelihood function is of exponential form, using a multivariate normal distribution as the conjugate prior distribution for the item parameters will yield a closed-form posterior distribution. Due to these considerations, in the subsequent computations we implement the VBEM-M algorithm using the tight lower bound of the likelihood function rather than the original likelihood function. Moreover, it is important to highlight that a new local parameter, $\xi_{ijl}$, has been introduced at this stage. Determining the optimal value of $\xi_{ijl}$ is an essential part of our analysis. In this paper, we implement a maximization step to ascertain the most suitable value of $\xi_{ijl}$. The detailed methodology behind this process will be elaborated in the following subsections.
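To see how the bound operates on a single response term after the ±1 recoding, the following R sketch compares the exact likelihood contribution of one response with its tight lower bound for a fixed local parameter; the linear-predictor and ξ values are purely illustrative.

```r
# One response term of the likelihood and its tight lower bound from Eq. (14).
sigma     <- function(x) 1 / (1 + exp(-x))
lambda_xi <- function(xi) (sigma(xi) - 0.5) / (2 * xi)

A  <- 1.2          # illustrative linear predictor for one examinee/item/class
X  <- 1            # observed response
Y  <- 2 * X - 1    # recoded response in {-1, 1}
xi <- 0.8          # illustrative local variational parameter

exact <- sigma(Y * A)                                           # exact likelihood term
bound <- sigma(xi) * exp((Y * A - xi) / 2 - lambda_xi(xi) * (A^2 - xi^2))
c(exact = exact, bound = bound)                                 # bound <= exact
```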
3.4. Fully Bayesian representation of the joint posterior distribution
In the fully Bayesian framework, statistical inference relies on the selection of the prior distribution. The posterior distribution can be derived by combining the prior distribution (prior information) with the likelihood function (sample information). Prior distributions from the following Bayesian hierarchical structures will be considered in this study:
![]() |
(19) |
where $\mathbf{I}$ denotes an identity matrix of appropriate dimension. The parameter $c$ is a truncation parameter. Some literature restricts the main effect terms of the LCDM to non-negative values (Zhan et al., 2019). To accommodate this, the truncation parameter $c$ is introduced to adjust the range of values of the prior for these parameters. For example, when $c$ is set to $-\infty$, the main effects are unrestricted, while setting $c=0$ restricts them to non-negative values. In practice, users can adjust the value of $c$ to restrict the range of the main effects according to their specific requirements. Letting $\boldsymbol{\Theta}$ collect all model parameters, the joint posterior distribution based on the tight lower bound can be represented by
![]() |
(20) |
where
denotes a constant. The logarithm of
can be further expressed as
![]() |
(21) |
3.5. Implementation of VBEM-M algorithm for LCDM
Assuming that the joint variational density of
for the LCDM satisfies mean-field theory, the following equation holds:
![]() |
(22) |
Let
; in terms of Eqs. (9) and (21), the ELBO
can then be derived as
![]() |
(23) |
where
is a tight lower bound of
. Next, we maximize
to obtain estimates of latent variables
, model parameters
, hyperparameters
, and local point parameter
. Specifically, there are three steps to the implementation process:
VBE step: update variational density for latent variable;
VBM step: update variational densities for model parameters and hyperparameters;
M step: update local point parameter
by maximizing
.
In the following text,
denotes the optimal variational posterior in each iteration. To keep things simple, we only present the core formulation for updating. The specifics can be found in the Supplementary Material. The estimation procedure of the VBEM-M algorithm is shown in Table 1. In Table 1 and subsequent tables, all parameters are estimated using their posterior means. In addition, the specific implementation process of each step for the VBEM-M algorithm is shown in Figure 1.
Table 1.
Estimation procedure of the VBEM-M algorithm
| VBEM-M Algorithm |
|---|
Input:
,
,
,
,
,
,
,
, T
|
Initialization:
,
,
,
. |
| Repeat |
(a) VBE-step: update
according to Eq. (25). |
| (b) VBM-step: |
(b1) update
according to Eq. (27). |
(b2) update
according to Eq. (29). |
(b3) update
according to Eq. (31). |
(b4) update
according to Eq. (33). |
(b5) update
according to Eq. (35). |
(c) M-step: update
using Eq. (37). |
Until the absolute difference of the ELBO between two adjacent iterations is less than the convergence threshold, or t > T, where T is the maximum number of iterations. |
Figure 1.
Graphical illustration of the VBEM-M algorithm implementation process. The variational density of the latent variables is updated in the VBE-step. In the VBM-step, the variational densities for the model parameters and hyperparameters are updated. In the M-step, the local parameter is updated by maximizing the ELBO.
(a) VBE step. In this step, we update the variational density of
for each i, where
.
is derived to be a categorical distribution with parameter
. That is,
![]() |
(24) |
where
![]() |
(25) |
(b) VBM step. In this step, we update the variational density for
,
,
,
and
(b1) Update the variational density for
is derived to be a Dirichlet distribution with parameter
. That is,
![]() |
(26) |
where
![]() |
(27) |
(b2) Update the variational density for
is proportional to a multivariate normal distribution with mean vector
and covariance
. That is,
![]() |
(28) |
where
![]() |
(29) |
(b3) Update the variational density for
is proportional to a normal distribution with mean
and variance
. That is,
![]() |
(30) |
where
![]() |
(31) |
where
is the corresponding expected value of the element
in the vector
.
(b4) Update the variational density for
is proportional to a truncated normal distribution with mean
and variance
. Specifically,
![]() |
(32) |
where
![]() |
(33) |
where
denotes the number of all main effect terms,
, and
is the corresponding expected value of the element
in the vector
.
(b5) Update the variational density for
is proportional to a truncated normal distribution with mean
and variance
. Specifically,
![]() |
(34) |
where
![]() |
(35) |
where
denotes the number of all interaction terms.
(c) M step. In this step, we update the local point parameter. To obtain its optimal value, we maximize the objective in Eq. (23) by setting its derivative with respect to the local parameter to zero:
![]() |
(36) |
Therefore, we have
| (37) |
Considering the aforementioned presentation of the VBEM-M algorithm, it is clear that we need to compute a large number of expectations using categorical, Dirichlet, normal, multivariate normal, and truncated normal distributions. Some formulae for calculating these expectations are as follows:
![]() |
(38) |
where $\phi(\cdot)$ denotes the density function of a standard normal distribution and $\Phi(\cdot)$ denotes the cumulative distribution function of a standard normal distribution.
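For instance, the mean of a normal distribution truncated from below, which is the kind of expectation required by the truncated-normal updates when main effects are restricted to be non-negative, can be written directly in terms of φ and Φ. A generic R sketch (not tied to any specific update equation of the algorithm):

```r
# Mean of a normal(mu, sd^2) distribution truncated to [lower, Inf),
# one of the expectations needed for the truncated-normal updates.
trunc_norm_mean <- function(mu, sd, lower) {
  a <- (lower - mu) / sd
  mu + sd * dnorm(a) / (1 - pnorm(a))   # phi(a) / (1 - Phi(a)) is the hazard term
}

trunc_norm_mean(mu = 0.5, sd = 1, lower = 0)   # main effect restricted to be non-negative
```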
4. Simulation study
In the following simulation studies, we address three primary questions. First, how does the VBEM-M algorithm perform under various conditions for the DINA model? Second, how does the VBEM-M algorithm, based on the DINA model, compare with Yamaguchi and Okada’s (2020b) VB method, the MCMC algorithms within the fully Bayesian framework, and the EM algorithm in the frequentist framework under different simulation settings? Third, how does the VBEM-M algorithm compare with the VB, MCMC, and EM algorithms for the saturated LCDM under different simulation conditions? The Supplementary Material showcases the performance of the VBEM-M algorithm for the DINA model under different initial values and in other widely used CDMs, including the DINO model and the LLM.
4.1. Data generation
Item response data $X_{ij}$ are generated from a Bernoulli distribution with the probability of a correct response given by the DINA model. The true values of the item parameters based on the DINA model are constrained by considering four different levels of noise, in order to investigate the relationship between noise and parameter recovery. For each item, the following scenarios are considered: (a1) a low noise level (LNL), (a2) a high noise level (HNL), (a3) slipping higher than guessing (SHG), and (a4) guessing higher than slipping (GHS), each with its corresponding true values of the intercept and slope parameters.
To generate the attribute-mastery patterns, we used the same procedure as Chiu and Douglas (2013), which takes into account the correlations among the attributes. Specifically, latent continuous scores $\boldsymbol{\theta}_i=(\theta_{i1},\ldots,\theta_{iK})^{\top}$ are generated from a multivariate normal distribution $N(\boldsymbol{0},\boldsymbol{\Sigma})$, where the covariance matrix $\boldsymbol{\Sigma}$ has unit diagonal elements and off-diagonal elements equal to $\rho$. As $\rho$ increases from 0 to 1, the correlation between attributes also increases from 0 to its maximum. The attribute profiles $\boldsymbol{\alpha}_i$ are obtained by dichotomizing $\boldsymbol{\theta}_i$: $\alpha_{ik}=1$ if $\theta_{ik}$ exceeds the corresponding cut-point, and $\alpha_{ik}=0$ otherwise. Although the $Q$-matrices are created randomly, they still conform to the identifiability constraints outlined by Chen et al. (2015, 2017), Liu and Andersson (2020), and Xu and Shang (2018). We present the $Q$-matrices used in these simulations in the Supplementary Material.
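A condensed R sketch of this data-generating process is given below. The dichotomization threshold, the noise values, and the small Q-matrix are illustrative placeholders; the exact cut-points of Chiu and Douglas (2013) and the Q-matrices actually used in the study are described in the Supplementary Material.

```r
# Sketch of the simulation design: correlated attributes + DINA responses.
library(MASS)

set.seed(1)
N <- 1000; K <- 5; rho <- 0.3
Sigma <- matrix(rho, K, K); diag(Sigma) <- 1
theta <- mvrnorm(N, mu = rep(0, K), Sigma = Sigma)   # latent continuous scores
alpha <- (theta >= 0) * 1                            # dichotomize into attribute profiles (illustrative cut-point)

# A small illustrative Q-matrix (J = 10 single-attribute items)
Q <- rbind(diag(K), diag(K)[sample(K), ])
J <- nrow(Q)

s <- rep(0.1, J); g <- rep(0.1, J)                   # illustrative low-noise values
xi <- (alpha %*% t(Q)) == matrix(rowSums(Q), N, J, byrow = TRUE)  # ideal responses
P  <- (1 - matrix(s, N, J, byrow = TRUE))^xi * matrix(g, N, J, byrow = TRUE)^(1 - xi)
X  <- matrix(rbinom(N * J, 1, P), N, J)              # observed item responses
```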
4.2. Prior distributions
The prior parameter of the Dirichlet distribution for the class membership probabilities is set to $\mathbf{1}_L$ (Culpepper, 2015; Zhan et al., 2019), where $\mathbf{1}_L$ denotes an $L$-dimensional vector with all elements equal to 1. The remaining hyperparameters are set to fixed constants.
4.3. Estimation software
We implemented four different approaches, namely the VBEM-M algorithm, the VB algorithm, the MCMC sampling algorithm, and the EM algorithm, using the R programming language (R Core Team, 2017) on a desktop computer equipped with an Intel(R) Core(TM) i5-10400 CPU @ 2.90GHz and 16GB RAM. To enhance the computational efficiency of the VBEM-M method, we utilized the R packages “Rcpp” (Eddelbuettel & Francois, 2011) and “RcppArmadillo” (Eddelbuettel & Sanderson, 2014) to interface with C++. The R code for our VBEM-M algorithm can be found in the Supplementary Material. We used the R package “variationalDCM” (Hijikata et al., 2023) to implement the VB method. The MCMC sampling algorithms were implemented separately using the R packages “dina” (Culpepper & Balamuta, 2019), which is integrated with C++, and “R2jags” (Su & Yajima, 2015), which is associated with the JAGS program (Plummer, 2003). The EM algorithm was implemented using the R packages “GDINA” (Ma & de la Torre, 2020) and “CDM” (George et al., 2016), respectively.
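As a usage illustration, the two EM-based comparison methods can be fitted with calls of the following form; this is a sketch assuming X is the N × J response matrix and Q is the J × K Q-matrix, with all other arguments left at their package defaults.

```r
# Illustrative calls for the EM-based comparison methods.
library(GDINA)
fit_gdina <- GDINA(dat = X, Q = Q, model = "DINA")     # EM-GDINA
coef(fit_gdina, what = "gs")                           # guessing/slipping estimates

library(CDM)
fit_cdm <- din(data = X, q.matrix = Q, rule = "DINA")  # EM-CDM
fit_cdm$guess
fit_cdm$slip
```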
4.4. Convergence diagnosis
The VBEM-M algorithm was considered to have converged if the absolute difference between two consecutive iterations was less than the convergence threshold, or if the number of iterations had reached T = 2,000. When using the R packages “dina” and “R2jags” to implement the MCMC sampling algorithms for the DINA model, Culpepper (2015) demonstrated that the sampler converges after 750 iterations; thus, the chain length was set to 2,000 and the first 1,000 iterations were discarded as burn-in. For the saturated LCDM, we chose a chain length of 10,000, with a burn-in of 5,000. For the EM algorithm, when employing the R package “GDINA,” the convergence criterion was that the maximum absolute change in item success probabilities between consecutive iterations was smaller than the specified tolerance or that the number of iterations exceeded 2,000. In addition, when using the R package “CDM,” the iterations end when the maximal change in parameter estimates falls below the specified tolerance.
4.5. Evaluation Criteria
For the item parameters and class membership probability parameters, we assess the accuracy of parameter estimation using bias and root mean square error (RMSE). For the attribute parameters, we adopt the following two evaluation indices: the pattern-wise agreement rate (PAR), which indicates the rate of correct classification for attribute patterns and is formulated as

$$\mathrm{PAR}=\frac{1}{N}\sum_{i=1}^{N}I\left(\hat{\boldsymbol{\alpha}}_i=\boldsymbol{\alpha}_i\right) \qquad (39)$$

and the attribute-wise agreement rate (AAR), which signifies the rate of correct classification for individual attributes and is defined as

$$\mathrm{AAR}_k=\frac{1}{N}\sum_{i=1}^{N}I\left(\hat{\alpha}_{ik}=\alpha_{ik}\right) \qquad (40)$$

where $\boldsymbol{\alpha}_i$ is the true attribute profile of the $i$th student, $\hat{\boldsymbol{\alpha}}_i$ is the estimated value of $\boldsymbol{\alpha}_i$, $\hat{\alpha}_{ik}$ is the estimated value of $\alpha_{ik}$ for the specific attribute $k$, and $I(\cdot)$ denotes the indicator function.
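A compact R sketch of these two indices, assuming alpha_true and alpha_hat are N × K binary matrices of true and estimated attribute profiles:

```r
# Pattern-wise (Eq. 39) and attribute-wise (Eq. 40) agreement rates.
par_rate <- function(alpha_true, alpha_hat) {
  mean(apply(alpha_true == alpha_hat, 1, all))   # whole profile must match
}
aar_rate <- function(alpha_true, alpha_hat) {
  colMeans(alpha_true == alpha_hat)              # one agreement rate per attribute
}
```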
4.6. Simulation study 1
In this simulation study, we explored the performance of the VBEM-M algorithm under various simulation conditions. The number of attributes was set to K = 5; the test length and the corresponding Q-matrix are shown in the Supplementary Material. The following manipulated conditions were considered: (A) the number of examinees N = 1,000 and 2,000; (B) the correlation among attributes ρ = 0, 0.3, and 0.7; and (C) the noise levels LNL, HNL, SHG, and GHS. Fully crossing the levels of these three factors yields 24 conditions (2 sample sizes × 3 correlations × 4 noise levels). There were 100 replications for each simulation condition. The parameter recovery results are displayed in Tables 2 and 3 and Figure 3.
Table 2.
The accuracy of item parameters and class membership probability parameters using the VBEM-M algorithm in simulation study 1
|
|
||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LNL |
|
|
|
|
|
|
|||||||
| RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | ||
|
0.1351(–0.0118) | 0.0920 | 0.2560(0.0365) | 0.1765 | 0.0022(0.0000) | 0.0053 | 0.0976(–0.0038) | 0.0652 | 0.1877(0.0150) | 0.1260 | 0.0015(0.0000) | 0.0038 | |
|
0.1369(–0.0140) | 0.0937 | 0.2337(0.0307) | 0.1617 | 0.0022(0.0000) | 0.0051 | 0.0981(–0.0077) | 0.0664 | 0.1669(0.0133) | 0.1152 | 0.0016(0.0000) | 0.0037 | |
|
0.1388(–0.0104) | 0.0967 | 0.2216(0.0279) | 0.1516 | 0.0021(0.0000) | 0.0045 | 0.1004(–0.0073) | 0.0687 | 0.1600(0.0135) | 0.1078 | 0.0015(0.0000) | 0.0032 | |
|
|
|
|
|
|
||||||||
| HNL | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | |
|
0.1131(–0.0038) | 0.0844 | 0.2240(0.0261) | 0.1621 | 0.0058(0.0000) | 0.0053 | 0.0797(–0.0038) | 0.0559 | 0.1605(0.0132) | 0.1155 | 0.0041(0.0000) | 0.0038 | |
|
0.1126(–0.0088) | 0.0859 | 0.1979(0.0138) | 0.1489 | 0.0056(0.0000) | 0.0051 | 0.0813(–0.0016) | 0.0609 | 0.1417(0.0069) | 0.1059 | 0.0039(0.0000) | 0.0037 | |
|
0.1130(–0.0080) | 0.0889 | 0.1738(0.0144) | 0.1393 | 0.0048(0.0000) | 0.0045 | 0.0822(–0.0038) | 0.0629 | 0.1297(0.0101) | 0.0990 | 0.0036(0.0000) | 0.0032 | |
|
|
|
|
|
|
||||||||
| SHG | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | |
|
0.1406(–0.0093) | 0.0919 | 0.2218(0.0217) | 0.1667 | 0.0035(0.0000) | 0.0053 | 0.1029(–0.0061) | 0.0652 | 0.1649(0.0156) | 0.1185 | 0.0025(0.0000) | 0.0038 | |
|
0.1440(–0.0100) | 0.0935 | 0.2135(0.0171) | 0.1534 | 0.0033(0.0000) | 0.0051 | 0.1026(–0.0070) | 0.0665 | 0.1475(0.0130) | 0.1090 | 0.0025(0.0000) | 0.0037 | |
|
0.1478(–0.0156) | 0.0968 | 0.1932(0.0240) | 0.1444 | 0.0032(0.0000) | 0.0045 | 0.1027(–0.0106) | 0.0688 | 0.1362(0.0144) | 0.1026 | 0.0021(0.0000) | 0.0032 | |
|
|
|
|
|
|
||||||||
| GHS | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | RMSE(Bias) | SD | |
|
0.1038(–0.0064) | 0.0844 | 0.2537(0.0279) | 0.1729 | 0.0035(0.0000) | 0.0053 | 0.0729(–0.0011) | 0.0599 | 0.1852(0.0141) | 0.1227 | 0.0026(0.0000) | 0.0038 | |
|
0.1073(–0.0047) | 0.0860 | 0.2262(0.0193) | 0.1573 | 0.0035(0.0000) | 0.0051 | 0.0752(0.0005) | 0.0610 | 0.1650(0.0131) | 0.1119 | 0.0025(0.0000) | 0.0037 | |
|
0.1071(–0.0074) | 0.0889 | 0.2031(0.0186) | 0.1466 | 0.0032(0.0000) | 0.0045 | 0.0763(–0.0023) | 0.0630 | 0.1470(0.0080) | 0.1042 | 0.0023(0.0000) | 0.0032 | |
Note: The values outside the parentheses represent the RMSE, while the values inside the parentheses indicate the bias. These reflect the average RMSE, bias, and SD for all intercept parameters η, slope parameters λ, and class membership probability parameters π.
Table 3.
The accuracy of attribute profile parameters using the VBEM-M algorithm in simulation study 1
|
|
|||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LNL | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR |
|
0.9792 | 0.9667 | 0.9880 | 0.9672 | 0.9903 | 0.9025 | 0.9787 | 0.9667 | 0.9876 | 0.9669 | 0.9900 | 0.9007 |
|
0.9807 | 0.9718 | 0.9877 | 0.9712 | 0.9918 | 0.9107 | 0.9803 | 0.9727 | 0.9876 | 0.9719 | 0.9914 | 0.9120 |
|
0.9821 | 0.9799 | 0.9874 | 0.9797 | 0.9941 | 0.9290 | 0.9827 | 0.9807 | 0.9872 | 0.9803 | 0.9940 | 0.9307 |
|
|
|||||||||||
| HNL | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR |
|
0.9102 | 0.8959 | 0.9334 | 0.8919 | 0.9413 | 0.6736 | 0.9098 | 0.8939 | 0.9332 | 0.8935 | 0.9413 | 0.6731 |
|
0.9137 | 0.9085 | 0.9372 | 0.9039 | 0.9497 | 0.6980 | 0.9150 | 0.9111 | 0.9375 | 0.9061 | 0.9491 | 0.7031 |
|
0.9345 | 0.9335 | 0.9512 | 0.9286 | 0.9608 | 0.7724 | 0.9364 | 0.9350 | 0.9520 | 0.9301 | 0.9628 | 0.7781 |
|
|
|||||||||||
| SHG | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR |
|
0.9525 | 0.9443 | 0.9707 | 0.9406 | 0.9766 | 0.8172 | 0.9516 | 0.9458 | 0.9724 | 0.9432 | 0.9761 | 0.8213 |
|
0.9596 | 0.9452 | 0.9740 | 0.9435 | 0.9762 | 0.8266 | 0.9592 | 0.9462 | 0.9739 | 0.9432 | 0.9762 | 0.8269 |
|
0.9689 | 0.9618 | 0.9777 | 0.9604 | 0.9826 | 0.9871 | 0.9693 | 0.9630 | 0.9785 | 0.9614 | 0.9831 | 0.8734 |
|
|
|||||||||||
| GHS | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR | AAR1 | AAR2 | AAR3 | AAR4 | AAR5 | PAR |
|
0.9476 | 0.9442 | 0.9689 | 0.9416 | 0.9762 | 0.8153 | 0.9489 | 0.9462 | 0.9685 | 0.9424 | 0.9762 | 0.8186 |
|
0.9476 | 0.9521 | 0.9667 | 0.9467 | 0.9783 | 0.8238 | 0.9485 | 0.9519 | 0.9665 | 0.9478 | 0.9780 | 0.8248 |
|
0.9624 | 0.9620 | 0.9755 | 0.9584 | 0.9818 | 0.8630 | 0.9623 | 0.9623 | 0.9753 | 0.9590 | 0.9813 | 0.8635 |
Note: AAR1 represents the correct classification rate for the first attribute, AAR2 for the second attribute, AAR3 for the third attribute, AAR4 for the fourth attribute, and AAR5 for the fifth attribute. PAR stands for the pattern-wise agreement rate.
Figure 3.
The boxplots of bias and RMSE for η, λ, and π estimated by the VBEM-M, VB, MCMC-dina, MCMC-R2jags, EM-GDINA, and EM-CDM algorithms with a fixed sample size under the LNL condition in simulation study 2.
The following conclusions can be drawn from Tables 2 and 3. (1) Given the correlation and noise levels, when the number of examinees is increased from 1,000 to 2,000, the average RMSE, the average bias, and the standard deviation (SD) for η, λ, and π show decreasing trends. For example, when the correlation among attributes is 0.3 and the LNL is applied, increasing the number of examinees from 1,000 to 2,000 results in the average bias of η decreasing from –0.0140 to –0.0077, and the average bias of λ decreasing from 0.0307 to 0.0133. The average RMSE of η decreases from 0.1369 to 0.0981, the average RMSE of λ from 0.2337 to 0.1669, and the average RMSE of π from 0.0022 to 0.0016. The SD of η decreases from 0.0937 to 0.0664, the SD of λ from 0.1617 to 0.1152, and the SD of π from 0.0051 to 0.0037. (2) When the number of examinees and the noise level are given, the average RMSE for η increases somewhat as ρ increases, which indicates that η is less impacted by the correlation between attributes. λ is substantially more impacted by ρ; specifically, the average bias and RMSE for λ tend to decrease markedly as ρ increases. Meanwhile, the RMSE for π also tends to decrease as ρ increases. For example, when the number of examinees is fixed at 1,000 and the LNL noise level is applied, the average biases of η are –0.0118, –0.0140, and –0.0104 for ρ = 0, 0.3, and 0.7, respectively, and the corresponding average RMSE rises from 0.1351 to 0.1388 as ρ increases; the changes in the bias and RMSE of η are thus slight. However, the decreases in bias and RMSE are markedly greater for λ, with the average bias of λ decreasing from 0.0365 to 0.0279 and the corresponding average RMSE decreasing from 0.2560 to 0.2216. For π, the average bias remains at 0.0000 in all conditions, while the average RMSE exhibits its largest change in the HNL condition, decreasing from 0.0058 to 0.0048. (3) The accuracy of attribute profile recovery is highest under the LNL condition because the noise is the lowest. For example, with the number of examinees fixed at 1,000 and a correlation of ρ = 0, the PAR is 0.9025 under the LNL condition and only 0.6736 under the HNL condition. Under the LNL condition, the AAR values for the five attributes exceed 0.9667 across the various sample sizes and levels of attribute correlation. Moreover, the accuracy of attribute profile recovery tends to improve as ρ increases.
As an illustration, Figure 3 only shows the recovery results for the LNL and HNL conditions for a single sample size. For each item, the bias of the item parameter estimates is almost the same under the LNL and the HNL conditions. Furthermore, when the correlation between attributes is strengthened (ρ from 0 to 0.7), there is little difference in the bias and RMSE of the item parameters within the LNL (or HNL) condition. It was also found that, for both low and high levels of noise, the RMSE is lower when an item measures more attributes. At the low noise level, for instance, the RMSE for the first item, which measures one attribute, is greater than that for the eleventh item, which measures the first three attributes together. Although the bias differs across items at the low and high noise levels, the bias values are basically around 0. Similarly, for both low and high levels of noise, the RMSE is lower when the attributes are more highly correlated. This is because, as the attribute correlation increases, more accurate estimates of the attribute profiles are obtained, which in turn enhances the accuracy of the item parameter estimates. This also provides empirical guidance for practice: when designing items, higher correlations between the measured attributes can increase the accuracy of parameter estimation.
Additionally, we assessed the performance of the VBEM-M algorithm under different initial values (see the Supplementary Material for details); the results showed that our VBEM-M algorithm is not sensitive to the choice of initial values.
4.7. Simulation study 2
The purpose of this simulation study is to compare the proposed method with Yamaguchi and Okada’s (2020b) VB method, the MCMC sampling algorithms, and the EM algorithm in terms of parameter accuracy for the DINA model. Specifically, the R package “variationalDCM” was used to implement Yamaguchi and Okada’s (2020b) VB method, while the R packages “dina” and “R2jags” were used to implement the MCMC sampling algorithms. The EM algorithm was implemented using the R packages “GDINA” and “CDM”.
The simulation design is as follows: the test length was held fixed, and the number of attributes was set to K = 5. The varying conditions of the simulation are as follows: (D) the number of examinees N = 200, 500, 1,000, and 2,000; (E) the correlation among attributes ρ = 0, 0.3, and 0.7; and (F) the LNL and HNL noise conditions. Fully crossing the levels of these three factors yields 24 conditions (4 sample sizes × 3 correlations × 2 noise levels). Each simulation condition was replicated 100 times. The recovery results for the item parameters and attribute profiles for all six methods are shown in Tables 4 and 5. Owing to space limitations, we only present the results for one correlation level in Tables 4 and 5; the other two correlation cases are given in the Supplementary Material. Figure 2 depicts the boxplots of the bias and RMSE for η, λ, and π estimated by the six methods under the LNL condition. Table 6 shows the computation time for these six methods under the same conditions. Here, the displayed computation time is the average time across 100 replications.
Table 4.
The accuracy of item parameters and class membership probability parameters using the VBEM-M, VB, MCMC-dina, MCMC-R2jags, EM-GDINA, and EM-CDM algorithms for the DINA model under the
condition in simulation study 2
|
|||||||
|---|---|---|---|---|---|---|---|
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | ||
|
0.2759(–0.0439) | 0.3068(0.0255) | 0.3043(0.0291) | 0.3045(0.0267) | 0.3660(–0.0533) | 0.3659(–0.0531) | |
|
0.1881(–0.0191) | 0.1973(0.0112) | 0.1968(0.0127) | 0.1970(0.0120) | 0.2050(–0.0172) | 0.2050(–0.0172) | |
|
0.1370(–0.0103) | 0.1400(0.0052) | 0.1400(0.0059) | 0.1400(0.0055) | 0.1426(–0.0088) | 0.1426(–0.0088) | |
|
0.0974(-0.0033) | 0.0987(0.0046) | 0.0988(0.0050) | 0.0987(0.0048) | 0.0994(–0.0024) | 0.0994(–0.0024) | |
|
|||||||
| LNL | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |
|
0.4500(0.1151) | 0.5507(–0.1284) | 0.5470(–0.1441) | 0.5554(–0.1670) | 1.0239(0.2478) | 1.0238(0.2478) | |
|
0.3284(0.0541) | 0.3552(–0.0502) | 0.3551(–0.0564) | 0.3567(–0.0656) | 0.3961(0.0555) | 0.3961(0.0555) | |
|
0.2437(0.0295) | 0.2532(–0.0243) | 0.2527(–0.0272) | 0.2539(–0.0317) | 0.2638(0.0263) | 0.2638(0.0263) | |
|
0.1780(0.0129) | 0.1814(–0.0146) | 0.1814(–0.0159) | 0.1817(–0.0183) | 0.1845(0.0104) | 0.1845(0.0104) | |
|
|||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | ||
|
0.0055(0.0000) | 0.0055(0.0000) | 0.0055(0.0000) | 0.0055(0.0000) | 0.0056(0.0000) | 0.0056(0.0000) | |
|
0.0034(0.0000) | 0.0034(0.0000) | 0.0034(0.0000) | 0.0034(0.0000) | 0.0034(0.0000) | 0.0034(0.0000) | |
|
0.0023(0.0000) | 0.0023(0.0000) | 0.0023(0.0000) | 0.0023(0.0000) | 0.0023(0.0000) | 0.0023(0.0000) | |
|
0.0016(0.0000) | 0.0016(0.0000) | 0.0016(0.0000) | 0.0016(0.0000) | 0.0016(0.0000) | 0.0016(0.0000) | |
|
|||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | ||
|
0.2250(–0.0207) | 0.2533(0.0010) | 0.2456(0.0095) | 0.2465(0.0036) | 0.3180(0.0351) | 0.3181(–0.0351) | |
|
0.1564(–0.0122) | 0.1633(0.0002) | 0.1622(0.0047) | 0.1625(0.0016) | 0.1682(–0.0103) | 0.1681(–0.0102) | |
|
0.1113(–0.0042) | 0.1136(0.0024) | 0.1133(0.0051) | 0.1134(0.0036) | 0.1150(–0.0027) | 0.1150(–0.0027) | |
|
0.0811(–0.0045) | 0.0819(–0.0011) | 0.0817(0.0000) | 0.0818(–0.0005) | 0.0824(–0.0039) | 0.0824(–0.0039) | |
|
|||||||
| HNL | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |
|
0.4108(0.0928) | 0.4754(–0.0237) | 0.4617(–0.0591) | 0.4686(–0.0897) | 0.6678(0.1202) | 0.6679(0.1202) | |
|
0.2874(0.0322) | 0.3064(–0.0179) | 0.3042(–0.0342) | 0.3068(–0.0472) | 0.3192(0.0249) | 0.3192(0.0249) | |
|
0.2061(0.0211) | 0.2118(–0.0047) | 0.2112(–0.0138) | 0.2116(–0.0199) | 0.2165(0.0168) | 0.2165(0.0169) | |
|
0.1489(0.0118) | 0.1509(–0.0011) | 0.1505(–0.0055) | 0.1507(–0.0087) | 0.1524(0.0094) | 0.1524(0.0094) | |
|
|||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | ||
|
0.0111(0.0000) | 0.0113(0.0000) | 0.0107(0.0000) | 0.0107(0.0000) | 0.0146(0.0000) | 0.0146(0.0000) | |
|
0.0077(0.0000) | 0.0077(0.0000) | 0.0076(0.0000) | 0.0077(0.0000) | 0.0088(0.0000) | 0.0088(0.0000) | |
|
0.0056(0.0000) | 0.0057(0.0000) | 0.0057(0.0000) | 0.0057(0.0000) | 0.0060(0.0000) | 0.0060(0.0000) | |
|
0.0041(0.0000) | 0.0041(0.0000) | 0.0042(0.0000) | 0.0042(0.0000) | 0.0043(0.0000) | 0.0043(0.0000) | |
Note: The values outside the parentheses represent the RMSE, while the values inside the parentheses indicate the bias. Here, RMSE and bias denote the average RMSE and bias, respectively, for all intercept parameters η, all slope parameters λ, and all class membership probability parameters π.
Table 5.
The accuracy of attribute profile parameters using the VBEM-M, VB, MCMC-dina, MCMC-R2jags, EM-GDINA, and EM-CDM algorithms for the DINA model under the
condition in simulation study 2
| AAR1 | AAR2 | |||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |||
|
0.9877 | 0.9876 | 0.9876 | 0.9878 | 0.9874 | 0.9874 | 0.9580 | 0.9572 | 0.9579 | 0.9577 | 0.9564 | 0.9564 | ||
|
0.9875 | 0.9875 | 0.9876 | 0.9873 | 0.9875 | 0.9875 | 0.9578 | 0.9576 | 0.9577 | 0.9573 | 0.9576 | 0.9576 | ||
|
0.9886 | 0.9886 | 0.9887 | 0.9886 | 0.9886 | 0.9886 | 0.9622 | 0.9622 | 0.9621 | 0.9620 | 0.9621 | 0.9621 | ||
|
0.9884 | 0.9884 | 0.9883 | 0.9883 | 0.9884 | 0.9884 | 0.9614 | 0.9615 | 0.9613 | 0.9614 | 0.9615 | 0.9615 | ||
| AAR3 | AAR4 | |||||||||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |||
|
0.9832 | 0.9828 | 0.9828 | 0.9829 | 0.9825 | 0.9825 | 0.9820 | 0.9815 | 0.9818 | 0.9815 | 0.9808 | 0.9808 | ||
| LNL |
|
0.9855 | 0.9854 | 0.9853 | 0.9853 | 0.9855 | 0.9855 | 0.9821 | 0.9821 | 0.9820 | 0.9820 | 0.9820 | 0.9820 | |
|
0.9856 | 0.9856 | 0.9856 | 0.9856 | 0.9856 | 0.9856 | 0.9824 | 0.9824 | 0.9825 | 0.9825 | 0.9824 | 0.9824 | ||
|
0.9852 | 0.9852 | 0.9852 | 0.9852 | 0.9852 | 0.9852 | 0.9817 | 0.9817 | 0.9817 | 0.9816 | 0.9817 | 0.9817 | ||
| AAR5 | PAR | |||||||||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |||
|
0.9699 | 0.9690 | 0.9690 | 0.9690 | 0.9686 | 0.9686 | 0.8916 | 0.8892 | 0.8900 | 0.8902 | 0.8868 | 0.8869 | ||
|
0.9729 | 0.9729 | 0.9727 | 0.9727 | 0.9728 | 0.9728 | 0.8964 | 0.8962 | 0.8960 | 0.8954 | 0.8961 | 0.8961 | ||
|
0.9726 | 0.9726 | 0.9724 | 0.9723 | 0.9726 | 0.9725 | 0.9002 | 0.9001 | 0.9001 | 0.8999 | 0.9000 | 0.9000 | ||
|
0.9724 | 0.9724 | 0.9724 | 0.9725 | 0.9724 | 0.9724 | 0.8990 | 0.8990 | 0.8987 | 0.8989 | 0.8990 | 0.8990 | ||
| AAR1 | AAR2 | |||||||||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |||
|
0.9375 | 0.9366 | 0.9370 | 0.9366 | 0.9354 | 0.9354 | 0.8781 | 0.8746 | 0.8762 | 0.8748 | 0.8702 | 0.8701 | ||
|
0.9389 | 0.9386 | 0.9383 | 0.9385 | 0.9382 | 0.9383 | 0.8830 | 0.8818 | 0.8806 | 0.8809 | 0.8802 | 0.8803 | ||
|
0.9408 | 0.9408 | 0.9407 | 0.9407 | 0.9406 | 0.9406 | 0.8878 | 0.8877 | 0.8876 | 0.8877 | 0.8873 | 0.8873 | ||
|
0.9403 | 0.9404 | 0.9402 | 0.9401 | 0.9403 | 0.9404 | 0.8899 | 0.8899 | 0.8900 | 0.8898 | 0.8899 | 0.8899 | ||
| AAR3 | AAR4 | |||||||||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |||
|
0.9244 | 0.9232 | 0.9238 | 0.9235 | 0.9218 | 0.9218 | 0.9174 | 0.9158 | 0.9169 | 0.9170 | 0.9130 | 0.9130 | ||
| HNL |
|
0.9278 | 0.9276 | 0.9272 | 0.9279 | 0.9264 | 0.9265 | 0.9192 | 0.9188 | 0.9188 | 0.9188 | 0.9176 | 0.9176 | |
|
0.9290 | 0.9289 | 0.9287 | 0.9290 | 0.9286 | 0.9286 | 0.9212 | 0.9211 | 0.9209 | 0.9212 | 0.9208 | 0.9209 | ||
|
0.9304 | 0.9304 | 0.9302 | 0.9303 | 0.9303 | 0.9303 | 0.9209 | 0.9209 | 0.9205 | 0.9208 | 0.9209 | 0.9209 | ||
| AAR5 | PAR | |||||||||||||
| VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM | |||
|
0.8939 | 0.8926 | 0.8938 | 0.8922 | 0.8875 | 0.8876 | 0.6570 | 0.6520 | 0.6550 | 0.6526 | 0.6397 | 0.6397 | ||
|
0.9015 | 0.9006 | 0.9004 | 0.9000 | 0.8996 | 0.8996 | 0.6707 | 0.6686 | 0.6675 | 0.6677 | 0.6646 | 0.6647 | ||
|
0.9062 | 0.9060 | 0.9057 | 0.9059 | 0.9058 | 0.9058 | 0.6812 | 0.6809 | 0.6800 | 0.6807 | 0.6798 | 0.6798 | ||
|
0.9064 | 0.9064 | 0.9063 | 0.9063 | 0.9064 | 0.9064 | 0.6838 | 0.6838 | 0.6834 | 0.6833 | 0.6837 | 0.6838 | ||
Note: AAR1 represents the correct classification rate for the first attribute, AAR2 for the second attribute, AAR3 for the third attribute, AAR4 for the fourth attribute, and AAR5 for the fifth attribute. PAR stands for the pattern-wise agreement rate.
Figure 2.
The bias and RMSE of η and λ for each item in simulation study 1. The Q-matrix denotes the skills required by each item along the x-axis, where a black square = “1” and a white square = “0”.
Table 6.
The computational time (in seconds) for the VBEM-M, VB, MCMC-dina, MCMC-R2jags, EM-GDINA, and EM-CDM algorithms, based on the DINA model, in simulation study 2

| Noise | N | VBEM-M | VB | MCMC-dina | MCMC-R2jags | EM-GDINA | EM-CDM |
|---|---|---|---|---|---|---|---|
| LNL | 200 | 0.0721s | 0.0483s | 9.7123s | 163.2525s | 0.0847s | 0.0428s |
| | 500 | 0.1220s | 0.1092s | 23.6093s | 467.2393s | 0.0887s | 0.0668s |
| | 1,000 | 0.2061s | 0.2415s | 46.5265s | 998.5244s | 0.1208s | 0.1126s |
| | 2,000 | 0.3686s | 0.5136s | 93.5255s | 2061.8450s | 0.1949s | 0.2097s |
| HNL | 200 | 0.0783s | 0.1012s | 9.7794s | 170.3141s | 0.1618s | 0.0606s |
| | 500 | 0.1524s | 0.2613s | 23.8735s | 463.9300s | 0.1624s | 0.0886s |
| | 1,000 | 0.2617s | 0.5403s | 47.2185s | 994.7302s | 0.2078s | 0.1376s |
| | 2,000 | 0.4890s | 1.0802s | 94.3035s | 2190.041s | 0.2986s | 0.2667s |
In Tables 4 and 5, as well as in the subsequent simulation studies, the reported RMSE and bias are the average RMSE and average bias. From Tables 4 and 5, we can draw the following conclusions. (1) The VBEM-M algorithm consistently outperforms the other five methods in terms of achieving lower RMSE values for the item parameters η and λ under all four sample sizes, regardless of the LNL or HNL condition. (2) For the EM algorithm, both the EM-GDINA and EM-CDM methods have higher bias and RMSE for the item parameters η and λ than the four other methods, especially for the small sample size of N = 200, under both the LNL and HNL conditions. (3) With the same sample size and noise level, the two MCMC methods (MCMC-dina and MCMC-R2jags) show similar estimation accuracy, as do the two EM methods (EM-GDINA and EM-CDM). (4) For the parameter π, the estimated bias and RMSE of the six methods are essentially the same under identical simulation conditions, with no notable differences. (5) In terms of the accuracy of attribute profile recovery, the results of the six methods are essentially the same under each simulation condition.

From Table 6, we can see that the VBEM-M algorithm is highly efficient in terms of computation time. It is faster than the VB method across most simulation conditions, and this speed advantage becomes more noticeable as the sample size increases. Overall, the computation speed of the VBEM-M algorithm is second only to that of the two EM algorithms, i.e., EM-GDINA and EM-CDM. The two MCMC methods, MCMC-dina and MCMC-R2jags, have longer computation times than the other four methods. Additionally, MCMC-dina is faster than MCMC-R2jags owing to its use of the “Rcpp” and “RcppArmadillo” packages, which are built on the C++ programming language.
4.8. Simulation study 3
This simulation study aims to evaluate the effectiveness of the VBEM-M algorithm on the saturated LCDM by comparing it with Yamaguchi and Okada’s (2020a) VB method, the MCMC sampling algorithms, and the EM algorithm. Specifically, the R package “variationalDCM” was used to implement Yamaguchi and Okada’s (2020a) VB method, the R package “R2jags” was used to implement the MCMC sampling algorithms, and the EM algorithm was implemented using the R package “GDINA”.
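For concreteness, the sketch below illustrates how the EM and VB benchmarks could be invoked in R. This is not the authors' code: the objects resp (an N × J binary response matrix) and Q (the J × K Q-matrix) are placeholders for one simulated dataset, and the variationalDCM argument names and saturated-model label are assumptions about that package's interface rather than verified calls.

```r
# Rough sketch of the comparison estimators for simulation study 3 (assumptions noted above).
library(GDINA)
# EM benchmark: a saturated GDINA model with a logit link corresponds to the saturated LCDM
em_fit <- GDINA(dat = resp, Q = Q, model = "GDINA", linkfunc = "logit")
coef(em_fit, what = "delta")   # item intercepts, main effects, and interactions on the logit scale

library(variationalDCM)
# VB benchmark of Yamaguchi & Okada (2020a); argument names and the "satu_dcm" label are assumed
vb_fit <- variationalDCM(X = resp, Q = Q, model = "satu_dcm")
```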
This simulation was designed with K = 3 attributes and a test length of J = 18 items. In the saturated LCDM, each item's parameter vector is therefore 8-dimensional (2^3 = 8). The true values are shown in Table 7. We conducted simulations across different sample sizes (N = 1,000, 2,000) and three levels of attribute correlation, resulting in six conditions. Each condition was replicated 100 times. Notably, an additional calculation step was needed for Yamaguchi and Okada's (2020a) VB method, because the R package "variationalDCM" only reports the correct-response probabilities for the different attribute-mastery patterns. We transformed these probabilities into LCDM parameters by solving a linear system of equations (Liu & Johnson, 2019; Yamaguchi & Templin, 2022), as sketched below. The parameter recovery results for one attribute-correlation condition are displayed in Tables 8 and 9; the estimation results for the remaining correlation conditions (including ρ = 0.7) are available in the Supplementary Material.
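As a minimal illustration of this conversion step, assuming only that the VB output provides one correct-response probability per attribute-mastery pattern and taking K = 3 as in this study, the linear system can be solved as follows. The object names and the entries of theta are illustrative placeholders, not output from any fitted model.

```r
# Sketch: map class-specific correct-response probabilities to saturated-LCDM
# parameters by inverting the logit-scale design matrix (illustrative values only).
K <- 3
alpha <- as.data.frame(expand.grid(rep(list(0:1), K)))  # all 2^K attribute-mastery patterns
M <- model.matrix(~ .^3, data = alpha)   # intercept, main effects, and all interactions
theta <- c(0.18, 0.82, 0.20, 0.85, 0.22, 0.88, 0.25, 0.95)  # placeholder class probabilities
lambda <- solve(M, qlogis(theta))        # LCDM item parameters on the logit scale
lambda
```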
Table 7.
True values of the item parameters (intercept λ₀, main effects λ₁–λ₃, two-way interactions λ₁₂, λ₁₃, λ₂₃, and three-way interaction λ₁₂₃) for the saturated LCDM in simulation study 3

| Item | λ₀ | λ₁ | λ₂ | λ₃ | λ₁₂ | λ₁₃ | λ₂₃ | λ₁₂₃ |
|---|---|---|---|---|---|---|---|---|
| 1 | –1.5 | 3.5 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | –1.5 | 0 | 3.5 | 0 | 0 | 0 | 0 | 0 |
| 3 | –1.5 | 0 | 0 | 3.5 | 0 | 0 | 0 | 0 |
| 4 | –1.5 | 3.5 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | –1.5 | 0 | 3.5 | 0 | 0 | 0 | 0 | 0 |
| 6 | –1.5 | 0 | 0 | 3.5 | 0 | 0 | 0 | 0 |
| 7 | –1.5 | 2 | 2 | 0 | –0.5 | 0 | 0 | 0 |
| 8 | –1.5 | 2 | 0 | 2 | 0 | –0.5 | 0 | 0 |
| 9 | –1.5 | 0 | 2 | 2 | 0 | 0 | –0.5 | 0 |
| 10 | –1.5 | 1.5 | 1.5 | 1.5 | –0.5 | –0.5 | –0.5 | 1 |
| 11 | –1.5 | 2 | 2 | 0 | –0.5 | 0 | 0 | 0 |
| 12 | –1.5 | 2 | 0 | 2 | 0 | –0.5 | 0 | 0 |
| 13 | –1.5 | 0 | 2 | 2 | 0 | 0 | –0.5 | 0 |
| 14 | –1.5 | 1.5 | 1.5 | 1.5 | –0.5 | –0.5 | –0.5 | 1 |
| 15 | –1.5 | 2 | 2 | 0 | –0.5 | 0 | 0 | 0 |
| 16 | –1.5 | 2 | 0 | 2 | 0 | –0.5 | 0 | 0 |
| 17 | –1.5 | 0 | 2 | 2 | 0 | 0 | –0.5 | 0 |
| 18 | –1.5 | 1.5 | 1.5 | 1.5 | –0.5 | –0.5 | –0.5 | 1 |
Table 8.
The accuracy of the item parameters and class membership probability parameters using the VBEM-M, VB, MCMC-R2jags, and EM-GDINA algorithms for the saturated LCDM in simulation study 3

| | Intercept parameters | | | | Main effect parameters | | | |
|---|---|---|---|---|---|---|---|---|
| | VBEM-M | VB | MCMC-R2jags | EM-GDINA | VBEM-M | VB | MCMC-R2jags | EM-GDINA |
| | 0.3044(–0.0454) | 0.3929(–0.0562) | 0.3312(0.1008) | 0.4784(–0.0602) | 0.4075(0.0720) | 0.5998(–0.0909) | 0.4922(–0.1235) | 0.8472(0.0909) |
| | 0.2128(–0.0238) | 0.2431(–0.0297) | 0.2263(0.0419) | 0.2492(–0.0248) | 0.2944(0.0293) | 0.3701(–0.0272) | 0.3469(–0.0708) | 0.3883(0.0362) |
| | 0.1489(0.0009) | 0.1642(–0.0036) | 0.1600(0.0348) | 0.1661(–0.0002) | 0.2244(0.0051) | 0.2627(–0.0161) | 0.2543(–0.0461) | 0.2689(0.0146) |
| | 0.1115(0.0030) | 0.1177(0.0001) | 0.1162(0.0208) | 0.1183(0.0016) | 0.1695(–0.0030) | 0.1866(–0.0106) | 0.1834(–0.0298) | 0.1884(0.0044) |
| | Interaction parameters | | | | Class membership probabilities | | | |
| | VBEM-M | VB | MCMC-R2jags | EM-GDINA | VBEM-M | VB | MCMC-R2jags | EM-GDINA |
| | 0.5798(–0.0653) | 1.0939(0.0215) | 0.9812(0.3198) | 2.3665(0.1440) | 0.0177(0.0000) | 0.0302(0.0000) | 0.0187(0.0000) | 0.0290(0.0000) |
| | 0.4880(–0.0253) | 0.7015(0.0007) | 0.6916(0.1476) | 0.7824(0.0034) | 0.0109(0.0000) | 0.0179(0.0000) | 0.0110(0.0000) | 0.0173(0.0000) |
| | 0.3960(–0.0014) | 0.5032(0.0042) | 0.5030(0.0855) | 0.5311(0.0060) | 0.0077(0.0000) | 0.0122(0.0000) | 0.0078(0.0000) | 0.0119(0.0000) |
| | 0.3140(0.0023) | 0.3683(0.0019) | 0.3665(0.0455) | 0.3770(0.0020) | 0.0054(0.0000) | 0.0086(0.0000) | 0.0054(0.0000) | 0.0085(0.0000) |

Note: The values outside the parentheses represent the RMSE, while the values inside the parentheses indicate the bias. RMSE and bias here denote the average RMSE and bias over all intercept parameters, all main effect slope parameters, all interaction slope parameters, and all class membership probability parameters, respectively.
Table 9.
Evaluation of the accuracy of the attribute profile estimates using the VBEM-M, VB, MCMC-R2jags, and EM-GDINA algorithms for the saturated LCDM in simulation study 3

| | AAR1 | | | | AAR2 | | | |
|---|---|---|---|---|---|---|---|---|
| | VBEM-M | VB | MCMC-R2jags | EM-GDINA | VBEM-M | VB | MCMC-R2jags | EM-GDINA |
| | 0.9361 | 0.9306 | 0.9315 | 0.9284 | 0.9400 | 0.9334 | 0.9370 | 0.9334 |
| | 0.9428 | 0.9425 | 0.9418 | 0.9421 | 0.9410 | 0.9404 | 0.9403 | 0.9400 |
| | 0.9427 | 0.9423 | 0.9420 | 0.9418 | 0.9441 | 0.9442 | 0.9440 | 0.9438 |
| | 0.9434 | 0.9434 | 0.9433 | 0.9431 | 0.9438 | 0.9438 | 0.9437 | 0.9436 |
| | AAR3 | | | | PAR | | | |
| | VBEM-M | VB | MCMC-R2jags | EM-GDINA | VBEM-M | VB | MCMC-R2jags | EM-GDINA |
| | 0.9355 | 0.9310 | 0.9324 | 0.9304 | 0.8296 | 0.8126 | 0.8204 | 0.8120 |
| | 0.9392 | 0.9391 | 0.9392 | 0.9384 | 0.8401 | 0.8387 | 0.8389 | 0.8380 |
| | 0.9434 | 0.9432 | 0.9431 | 0.9429 | 0.8468 | 0.8462 | 0.8459 | 0.8456 |
| | 0.9429 | 0.9430 | 0.9429 | 0.9428 | 0.8465 | 0.8464 | 0.8464 | 0.8461 |
Note: AAR1 represents the correct classification rate for the first attribute, AAR2 for the second attribute, AAR3 for the third attribute. PAR stands for the pattern-wise agreement rate.
For a more detailed analysis, we split the item parameters other than the intercept into two parts, the main effect parameters and the interaction parameters. From the results, we can draw the following conclusions: (1) As the number of examinees increases, the RMSE of the item parameters decreases for all algorithms. (2) The proposed VBEM-M algorithm performs better than the other algorithms on all item parameters across all conditions, especially on the interactions. Specifically, for the intercept and main effect parameters, VBEM-M has a slight advantage over the other algorithms, whereas it shows a clear advantage in estimating the interaction and class membership probability parameters. In contrast, the EM algorithm performs poorly with small sample sizes: for the interaction parameters, its RMSE exceeds 2 at the smallest sample size. (3) Compared to the other algorithms, VBEM-M performs noticeably better with small sample sizes, with clearly lower RMSE. (4) It is worth noting that, for all algorithms, the interaction terms are estimated less accurately than the main effects, even though their true values are smaller. This suggests that estimating interaction effects is the most challenging aspect of the saturated LCDM. (5) As for the accuracy of the attribute profiles, there is no obvious difference among the algorithms, although VBEM-M still shows slightly higher accuracy than the others.
Table 10 shows the average computation time across 100 replications for the four algorithms in simulation study 3. The results indicate that our algorithm outperforms the other algorithms in terms of computational efficiency. Additionally, an interesting observation is that the EM algorithm takes longer at the smallest sample size than at larger ones, which suggests that the EM algorithm converges more slowly with smaller sample sizes.
Table 10.
The computational time (in seconds) for the VBEM-M, VB, MCMC-R2jags, and EM-GDINA algorithms based on the saturated LCDM in simulation study 3

| | VBEM-M | VB | MCMC-R2jags | EM-GDINA |
|---|---|---|---|---|
| | 0.0564s | 0.0816s | 170.1357s | 1.4199s |
| | 0.0705s | 0.1061s | 478.8929s | 0.6961s |
| | 0.0978s | 0.1827s | 1072.0754s | 0.6939s |
| | 0.1680s | 0.3605s | 3162.5147s | 0.7485s |
5. Empirical example
5.1. Empirical Example 1
In this example, a fraction subtraction test dataset (de la Torre & Douglas, 2004; Tatsuoka, 1990, 2002) was analyzed using the DINA model. The VBEM-M algorithm, the VB algorithm (implemented in the "variationalDCM" package), the MCMC sampling technique (implemented in the "dina" package), and the EM algorithm (implemented in the "GDINA" package) were used for parameter estimation. The test involves 2,144 middle school students responding to 15 fraction subtraction items that measure five attributes: subtract basic fractions, reduce and simplify, separate whole from fraction, borrow from whole, and convert whole to fraction; 536 of the 2,144 students were used in this study (Zhang et al., 2020). The corresponding Q-matrix, parameter estimates, and SDs are shown in Table 11.
Table 11.
The Q-matrix (attributes 1–5) and the estimates of the intercept and interaction parameters, together with the derived guessing (g) and slipping (s) parameters, using the VBEM-M algorithm in empirical example 1

| Item | 1 | 2 | 3 | 4 | 5 | Intercept | Interaction | g | s |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 0 | –3.9286(0.2585) | 4.8606(0.2746) | 0.0193 | 0.2825 |
| 2 | 1 | 1 | 1 | 1 | 0 | –1.3649(0.1225) | 3.4358(0.1893) | 0.2035 | 0.1120 |
| 3 | 1 | 0 | 0 | 0 | 0 | –1.9954(0.2123) | 5.1863(0.2440) | 0.1197 | 0.0395 |
| 4 | 1 | 1 | 1 | 1 | 1 | –1.9774(0.1223) | 3.9219(0.1998) | 0.1216 | 0.1252 |
| 5 | 0 | 0 | 1 | 0 | 0 | –1.8950(0.2203) | 3.0390(0.2398) | 0.1307 | 0.2416 |
| 6 | 1 | 1 | 1 | 1 | 0 | –3.3033(0.1546) | 4.5416(0.2015) | 0.0355 | 0.2247 |
| 7 | 1 | 1 | 1 | 1 | 0 | –2.5221(0.1417) | 5.0019(0.2047) | 0.0743 | 0.0773 |
| 8 | 1 | 1 | 0 | 0 | 0 | –1.8014(0.1785) | 4.7880(0.2181) | 0.1417 | 0.0480 |
| 9 | 1 | 0 | 1 | 0 | 0 | –2.3739(0.1983) | 5.0362(0.2306) | 0.0835 | 0.0652 |
| 10 | 1 | 0 | 1 | 1 | 1 | –1.6155(0.1180) | 4.2588(0.2073) | 0.1658 | 0.0664 |
| 11 | 1 | 0 | 1 | 0 | 0 | –2.2268(0.1952) | 4.3746(0.2246) | 0.0974 | 0.1045 |
| 12 | 1 | 0 | 1 | 1 | 0 | –3.2651(0.1552) | 5.1252(0.2064) | 0.0368 | 0.1347 |
| 13 | 1 | 1 | 1 | 1 | 0 | –1.9080(0.1324) | 3.6173(0.1889) | 0.1293 | 0.1533 |
| 14 | 1 | 1 | 1 | 1 | 1 | –3.5572(0.1459) | 4.9477(0.2081) | 0.0277 | 0.1993 |
| 15 | 1 | 1 | 1 | 1 | 0 | –3.9134(0.1649) | 5.3988(0.2111) | 0.0196 | 0.1846 |
Note: The values outside the parentheses represent the posterior means of the parameters, while the values inside the parentheses indicate the standard deviation.
To facilitate the following item analysis, we transformed the estimates of the intercept and interaction parameters into the traditional estimates of slipping and guessing parameters, as shown in Table 11. Additionally, a comparison of the parameter estimates among the four algorithms can be found in the Supplementary Material. Based on Table 11, the five items with the lowest slipping estimates are items 3, 8, 9, 10, and 7, in that order, with slipping parameters of 0.0395, 0.0480, 0.0652, 0.0664, and 0.0773, respectively. This indicates that these five items are less prone to slipping than the other ten items. Furthermore, the five items with the highest guessing estimates are items 2, 10, 8, 5, and 13, in that order, with guessing parameters of 0.2035, 0.1658, 0.1417, 0.1307, and 0.1293, respectively. Moreover, items 3, 8, and 10 have low slipping parameters and high guessing parameters, indicating that these items are relatively likely to be answered correctly by guessing. It is worth noting an interesting observation for item 1: because its guessing parameter is very small and its slipping parameter is relatively large, students who have not mastered the first attribute are unlikely to answer it correctly by guessing (the probability of a correct response is below 0.0200), and even students who have mastered the first attribute have a probability of a correct response of only about 0.7000 because of slipping.
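As a check on this transformation, and using our own notation (λ_{j,0} for the item intercept and λ_{j,1} for its interaction term, which may differ from the paper's symbols), the guessing and slipping values reported for item 1 follow directly from the inverse-logit mapping:

$$g_j=\operatorname{logit}^{-1}(\lambda_{j,0}),\qquad s_j=1-\operatorname{logit}^{-1}(\lambda_{j,0}+\lambda_{j,1}),$$
$$g_1=\operatorname{logit}^{-1}(-3.9286)\approx 0.0193,\qquad s_1=1-\operatorname{logit}^{-1}(-3.9286+4.8606)\approx 0.2825.$$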
Based on the results in Table S11 in the Supplementary Material, we investigated the relationship between the VBEM-M algorithm and the other three algorithms by analyzing the correlations of the s and g estimates across these algorithms. The correlation between the s estimates from the VBEM-M and VB algorithms is 0.9984, between VBEM-M and MCMC it is 0.9979, and between VBEM-M and EM it is 0.9989. The correlations between the g estimates obtained using the VBEM-M algorithm and those obtained from the VB, MCMC, and EM algorithms are 0.9488, 0.9552, and 0.8632, respectively. These findings suggest that the VBEM-M algorithm's parameter estimates align more closely with those from the VB and MCMC algorithms, as indicated by the higher correlations. In addition, the estimated mixing proportions of the attribute-mastery patterns, $\hat{\pi}_l$ for $l = 1, \dots, 2^5 = 32$, are presented in Figure S2 in the Supplementary Material. Notably, these estimates are highly consistent across the VBEM-M, VB, MCMC, and EM algorithms. A total of 67% of the examinees were classified into the following four attribute profiles: (1,1,1,0,0), (1,1,1,1,0), (1,1,1,0,1), and (1,1,1,1,1), which suggests that a majority of students have mastered the first three attributes. The computation times for the VBEM-M, VB, MCMC, and EM algorithms were 0.1651 s, 0.1661 s, 11.3820 s, and 0.2870 s, respectively.
5.2. Empirical Example 2
In this section, we analyze the Examination for the Certificate of Proficiency in English (ECPE) dataset based on the LCDM. The ECPE has been widely used in previous LCDM research (e.g., Liu & Johnson, 2019; Templin & Bradshaw, 2014; Templin & Hoffman, 2013; von Davier, 2014b), and it includes 0-1 response data from 2,922 examinees on 28 items. Three attributes are measured: morphosyntactic rules, cohesive rules, and lexical rules. Nine of the 28 items measure two attributes, and the others measure one. The VBEM-M algorithm, the VB algorithm (implemented in the "variationalDCM" package), the MCMC algorithm (implemented in the "R2jags" package), and the EM algorithm (implemented in the "GDINA" package) were used for the parameter estimation of the LCDM. However, due to space limitations, we only present the estimation results of the VBEM-M method in Tables 12 and 13. The results of the other algorithms can be found in the Supplementary Material.
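As a rough illustration (not the authors' exact code), the EM benchmark for this example could be fit as follows. The data object name data.ecpe and its layout follow our reading of the CDM package's documentation, and treating a saturated GDINA model with a logit link as the LCDM relies on the model equivalence noted in the Introduction; these details are assumptions to be checked against the package versions used.

```r
# Minimal sketch of the EM benchmark fit for the ECPE data (assumptions noted above).
library(CDM)      # assumed to provide the ECPE responses and Q-matrix as data.ecpe
library(GDINA)    # EM estimation of the saturated model
dat <- data.ecpe$data[, -1]   # drop the examinee id column (assumed layout)
Q   <- data.ecpe$q.matrix     # 28 x 3 Q-matrix
fit <- GDINA(dat = dat, Q = Q, model = "GDINA", linkfunc = "logit")
coef(fit, what = "delta")     # item intercepts, main effects, and interactions
personparm(fit, what = "MAP") # MAP attribute-mastery patterns
```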
Table 12.
The estimates of the item parameters (intercept λ₀, main effects λ₁–λ₃, and two-way interactions λ₁₂, λ₁₃, λ₂₃) using the VBEM-M algorithm in empirical example 2

| Item | λ₀ | λ₁ | λ₂ | λ₃ | λ₁₂ | λ₁₃ | λ₂₃ |
|---|---|---|---|---|---|---|---|
| 1 | 0.8043(0.0576) | 0.6103(0.2493) | 0.7109(0.1066) | – | 0.4428(0.2724) | – | – |
| 2 | 1.0281(0.0572) | – | 1.2528(0.0821) | – | – | – | – |
| 3 | –0.3492(0.0659) | 0.9689(0.2787) | – | 0.3714(0.0929) | – | 0.3094(0.2915) | – |
| 4 | –0.1438(0.0642) | – | – | 1.6936(0.0808) | – | – | – |
| 5 | 1.0740(0.0671) | – | – | 2.0166(0.0890) | – | – | – |
| 6 | 0.8621(0.0661) | – | – | 1.6847(0.0859) | – | – | – |
| 7 | –0.0809(0.0656) | 1.7865(0.2990) | – | 0.9441(0.0941) | – | 0.1457(0.3131) | – |
| 8 | 1.4738(0.0594) | – | 1.9063(0.0895) | – | – | – | – |
| 9 | 0.1172(0.0642) | – | – | 1.1930(0.0801) | – | – | – |
| 10 | 0.0708(0.0467) | 2.0545(0.0841) | – | – | – | – | – |
| 11 | –0.0525(0.0655) | 1.3287(0.2892) | – | 0.9845(0.0943) | – | 0.2637(0.3035) | – |
| 12 | –1.7782(0.0731) | 0.5863(0.2888) | – | 1.3152(0.0985) | – | 0.9094(0.3008) | – |
| 13 | 0.6723(0.0476) | 1.6258(0.0857) | – | – | – | – | – |
| 14 | 0.1837(0.0468) | 1.3824(0.0807) | – | – | – | – | – |
| 15 | 0.9875(0.0666) | – | – | 2.1183(0.0887) | – | – | – |
| 16 | –0.0791(0.0656) | 1.4896(0.2920) | – | 0.8778(0.0939) | – | 0.0136(0.3057) | – |
| 17 | 1.3267(0.0708) | – | 1.0508(0.2745) | 0.6181(0.1291) | – | – | –0.1952(0.2980) |
| 18 | 0.9132(0.0663) | – | – | 1.4051(0.0851) | – | – | – |
| 19 | –0.1952(0.0642) | – | – | 1.8412(0.0812) | – | – | – |
| 20 | –1.4189(0.0706) | 1.0231(0.2775) | – | 0.9529(0.0966) | – | 0.6143(0.2903) | – |
| 21 | 0.1639(0.0656) | 1.0841(0.2886) | – | 1.1344(0.0958) | – | 0.0312(0.3032) | – |
| 22 | –0.8644(0.0661) | – | – | 2.2256(0.0818) | – | – | – |
| 23 | 0.6594(0.0558) | – | 2.0529(0.0834) | – | – | – | – |
| 24 | –0.6815(0.0559) | – | 1.5284(0.0758) | – | – | – | – |
| 25 | 0.0953(0.0467) | 1.1596(0.0792) | – | – | – | – | – |
| 26 | 0.1574(0.0642) | – | – | 1.1265(0.0801) | – | – | – |
| 27 | –0.8658(0.0481) | 1.7058(0.0784) | – | – | – | – | – |
| 28 | 0.5622(0.0650) | – | – | 1.7455(0.0841) | – | – | – |
Note: The values outside the parentheses represent the posterior means of the parameters, while the values inside the parentheses indicate the standard deviation.
Table 13.
The estimates of the class membership probability parameters using the VBEM-M algorithm in empirical example 2

| | (0,0,0) | (1,0,0) | (0,1,0) | (0,0,1) | (1,1,0) | (1,0,1) | (0,1,1) | (1,1,1) |
|---|---|---|---|---|---|---|---|---|
| Estimate | 0.2966 | 0.0098 | 0.0170 | 0.1318 | 0.0071 | 0.0145 | 0.1793 | 0.3439 |
The outcomes of the VBEM-M algorithm were most similar to those of the VB and MCMC algorithms; please refer to the Supplementary Material for details. From Table 12, we found that the estimated interaction terms are relatively small compared to the main effects, indicating that the main effects have a greater influence on the probability of a correct response. Additionally, most of the interaction effects are positive, suggesting that the interactions between skills tend to increase the probability of a correct response. Furthermore, from the estimated class membership probabilities in Table 13, we can observe that the most prevalent attribute-mastery patterns are (0, 0, 0), (0, 0, 1), (0, 1, 1), and (1, 1, 1). This suggests a possible linear hierarchy among the skills: mastering cohesive rules requires first mastering lexical rules, and mastering morphosyntactic rules requires first mastering cohesive rules. This finding is consistent with previous research conclusions (Gierl, Cui, et al., 2007; Gierl, Leighton, et al., 2007).
6. Discussion
In this paper, we propose the novel VBEM-M algorithm for estimating the parameters of the LCDM, which offers fast execution and excellent estimation accuracy. While Yamaguchi and Okada (2020a) introduced a VB method for estimating LCDM parameters, their approach primarily focuses on estimating the probability of correct item responses for specific attribute-mastery patterns, without directly estimating the item parameters. In contrast, our VBEM-M algorithm can simultaneously and directly estimate both attribute-mastery patterns and item parameters.
Since the posterior distributions of the item parameters in the LCDM do not have closed forms, it is difficult to carry out parameter estimation with the classic VBEM algorithm. To get around this problem, our approach replaces the likelihood function of the LCDM with a tight lower bound obtained by Taylor expansion, and inference is then performed on this bound. Under the bound, the terms involving the item parameters take an exponential form that is quadratic in those parameters, allowing a Gaussian distribution to be used as their conjugate prior. Additionally, a new location parameter is introduced in implementing the Taylor expansion, and an extra maximization step is added to the typical VBEM algorithm to seek its optimal local value. Three simulation studies were carried out in this study: the first two focused on the DINA model as a special case of the LCDM, while the third considered the saturated LCDM. The parameter recovery of the VBEM-M algorithm was analyzed under the simulated conditions, and the algorithm was shown to be effective in terms of parameter recovery, execution time, and convergence rate. In addition, the estimation accuracy and computation time of the VB, MCMC, and EM algorithms were investigated in depth.
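For reference, a standard bound of this type is the local variational bound of Jaakkola and Jordan (2000) for the logistic function σ(x); we state only its generic form here, with ξ denoting the local (location) parameter, and whether the paper uses exactly this parameterization is an assumption on our part:

$$\sigma(x)\;\ge\;\sigma(\xi)\exp\!\left\{\frac{x-\xi}{2}-\eta(\xi)\left(x^{2}-\xi^{2}\right)\right\},\qquad \eta(\xi)=\frac{1}{4\xi}\tanh\!\left(\frac{\xi}{2}\right).$$

Because the exponent is quadratic in x, and x is linear in the item parameters, a Gaussian prior is conjugate to the bounded likelihood; the bound is tight only at ξ = ±x, which is why an additional maximization step over the local parameters is needed.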
To begin with, the VBEM-M algorithm produces favorable results in terms of parameter recovery, providing three main benefits. First, the VBEM-M algorithm can be applied across a wide range of sample sizes, and its accuracy improves as the sample size increases. Based on the DINA model, we also found that a higher attribute correlation does not affect the accuracy of some parameter estimates while improving that of others. In addition, the VBEM-M algorithm converges quickly and is not sensitive to the choice of initial values, bringing considerable efficiency gains: it converges in only approximately ten iterations under the different simulation conditions.
The second benefit is that the VBEM-M algorithm has a considerable accuracy advantage over the other algorithms, especially when the sample size is small. For instance, in the DINA model with the smallest sample size (N = 200) under the LNL condition, the RMSEs of one of the item parameters using VBEM-M, VB, MCMC-dina, MCMC-R2jags, EM-GDINA, and EM-CDM are 0.4500, 0.5507, 0.5470, 0.5554, 1.0239, and 1.0238, respectively. It is evident that our method shows clear advantages, particularly over the EM algorithms, although this benefit diminishes as the sample size increases. This makes the VBEM-M algorithm especially reliable in small-sample situations, which often occur in real-world applications.
Finally, the VBEM-M algorithm stands out for its computational efficiency. While not as fast as the EM algorithms, it still holds an advantage over the other algorithms. For example, based on the DINA model with the largest sample size under the LNL condition, the average computation times across 100 replications are 0.3686s, 0.5136s, 93.5225s, 2061.8450s, 0.1949s, and 0.2097s for VBEM-M, VB, MCMC-dina, MCMC-R2jags, EM-GDINA, and EM-CDM, respectively. Compared with the two EM algorithms, our algorithm showed time differences of only 0.1737s and 0.1589s, respectively, while outperforming the remaining algorithms. This suggests that the VBEM-M algorithm performs well in terms of computational efficiency.
While the VBEM-M algorithm has its advantages, it also has some limitations. For instance, as mentioned above, it is not as fast as the EM algorithm. In addition, the VBEM-M algorithm is essentially an approximation of the posterior distribution of the parameters; this approximation works well for the DINA model and some LCDM submodels, as shown in the Supplementary Material. However, its performance in complex LCDMs with high attribute dimensions (e.g., a 32-dimensional item parameter vector when K = 5) still needs to be investigated.
In future studies, we will first explore whether the VBEM-M algorithm can be generalized to other types of CDMs, such as polytomous CDMs and longitudinal CDMs. Second, in this study the Q-matrix was calibrated in advance; in practice, however, it may be mis-specified (Rupp & Templin, 2008a). We therefore plan to modify the VBEM-M algorithm to estimate the Q-matrix and the model parameters simultaneously. Third, although the VBEM-M algorithm converges quickly, it is still slower than the EM algorithm in terms of computation time; we plan to further optimize the code using C++ or Fortran to increase its speed.
Supporting information
The supplementary material for this article can be found at https://doi.org/10.1017/psy.2024.7.
Data availability statement
Publicly available datasets were analyzed in this study. The two datasets can be found as follows: https://cran.r-project.org/web/packages/CDM/index.html.
Funding statement
This research was supported by the general projects of National Social Science Fund of China on Statistics (Grant No. 23BTJ067).
Competing interests
The author has no conflicts of interest to declare that are relevant to the content of this article.
References
- Beal, M. J. (2003). Variational algorithms for approximate Bayesian inference. PhD thesis, University College London, London. https://www.cse.buffalo.edu/faculty/mbeal/thesis/
- Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
- Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational inference: A review for statisticians. Journal of the American Statistical Association, 112(518), 859–877. 10.1080/01621459.2017.1285773
- Cai, L. (2010). High-dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika, 75(1), 33–57. 10.1007/s11336-009-9136-x
- Carpenter, B., Gelman, A., Hoffman, M. D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., & Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1), 1–32. 10.18637/jss.v076.i01
- Chen, Y., Culpepper, S. A., Chen, Y., & Douglas, J. (2017). Bayesian estimation of the DINA Q matrix. Psychometrika, 83(1), 89–108. 10.1007/s11336-017-9579-4
- Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110(510), 850–866. 10.1080/01621459.2014.934827
- Chiu, C. Y., & Douglas, J. (2013). A nonparametric approach to cognitive diagnosis by proximity to ideal response patterns. Journal of Classification, 30, 225–250. 10.1007/s00357-013-9132-9
- Cho, A. E., Wang, C., Zhang, X., & Xu, G. (2021). Gaussian variational estimation for multidimensional item response theory. British Journal of Mathematical and Statistical Psychology, 74(S1), 52–85. 10.1111/bmsp.12219
- Chung, M. (2019). A Gibbs sampling algorithm that estimates the Q-matrix for the DINA model. Journal of Mathematical Psychology, 93, 102275. 10.1016/j.jmp.2019.07.002
- Culpepper, S. A. (2015). Bayesian estimation of the DINA model with Gibbs sampling. Journal of Educational and Behavioral Statistics, 40(5), 454–476. 10.3102/1076998615595403
- Culpepper, S. A. (2019). Estimating the cognitive diagnosis Q matrix with expert knowledge: Application to the fraction-subtraction dataset. Psychometrika, 84(2), 333–357. 10.1007/s11336-018-9643-8
- Culpepper, S. A., & Balamuta, J. J. (2019). dina: Bayesian estimation of DINA model (R package version 2.0.0). https://cran.r-project.org/package=dina
- Culpepper, S. A., & Hudson, A. (2018). An improved strategy for Bayesian estimation of the reduced reparameterized unified model. Applied Psychological Measurement, 42(2), 99–115. 10.1177/0146621617707511
- da Silva, M. A., de Oliveira, E. S., von Davier, A. A., & Bazán, J. L. (2018). Estimating the DINA model parameters using the No-U-Turn Sampler. Biometrical Journal, 60(2), 352–368. 10.1002/bimj.201600225
- Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–22. 10.1111/j.2517-6161.1977.tb01600.x
- de la Torre, J. (2009). DINA model and parameter estimation: A didactic. Journal of Educational and Behavioral Statistics, 34(1), 115–130. 10.3102/1076998607309474
- de la Torre, J. (2011). The generalized DINA framework. Psychometrika, 76(2), 179–199. 10.1007/s11336-011-9207-7
- de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333–353. 10.1007/BF02295640
- DeCarlo, L. T. (2012). Recognizing uncertainty in the Q-matrix via a Bayesian extension of the DINA model. Applied Psychological Measurement, 36(6), 447–468. 10.1177/0146621612449069
- DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In Rao C. R. & Sinharay S. (Eds.), Handbook of statistics, vol. 26, Psychometrics (pp. 979–1030). Elsevier.
- Eddelbuettel, D., & Sanderson, C. (2014). RcppArmadillo: Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis, 71, 1054–1063. 10.1016/j.csda.2013.02.005
- Eddelbuettel, D., & Francois, R. (2011). Rcpp: Seamless R and C++ integration. Journal of Statistical Software, 40, 1–18. 10.18637/jss.v040.i08
- George, A. C., Robitzsch, A., Kiefer, T., Groß, J., & Ünlü, A. (2016). The R package CDM for cognitive diagnosis models. Journal of Statistical Software, 74(2), 1–24. 10.18637/jss.v074.i02
- Gierl, M. J., Cui, Y., & Hunka, S. (2007). Using connectionist models to evaluate examinees' response patterns on tests. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, IL.
- Gierl, M. J., Leighton, J. P., & Hunka, S. M. (2007). Using the attribute hierarchy method to make diagnostic inferences about respondents' cognitive skills. In Leighton J. P. & Gierl M. J. (Eds.), Cognitive diagnostic assessment for education: Theory and applications (pp. 242–274). Cambridge University Press.
- Grimmer, J. (2011). An introduction to Bayesian inference via variational approximations. Political Analysis, 19(1), 32–47. 10.1093/pan/mpq027
- Haberman, S. J., & von Davier, M. (2007). Some notes on models for cognitively based skill diagnosis. In Rao C. R. & Sinharay S. (Eds.), Handbook of statistics, vol. 26, Psychometrics (pp. 1031–1038). Elsevier.
- Haertel, E. H. (1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26(4), 301–321. 10.1111/j.1745-3984.1989.tb00336.x
- Hartz, S. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality [Unpublished doctoral dissertation]. University of Illinois at Urbana-Champaign.
- Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210. 10.1007/s11336-008-9089-5
- Hijikata, K., Oka, M., Yamaguchi, K., & Okada, K. (2023). variationalDCM: An R package for variational Bayesian inference in diagnostic classification models. PsyArXiv. 10.31234/osf.io/f2sqd
- Jaakkola, T. S., & Jordan, M. I. (2000). Bayesian parameter estimation via variational methods. Statistics and Computing, 10(1), 25–37. 10.1023/A:1008932416310
- Jeon, M., Rijmen, F., & Rabe-Hesketh, S. (2017). A variational maximization-maximization algorithm for generalized linear mixed models with crossed random effects. Psychometrika, 82(3), 693–716. 10.1007/s11336-017-9555-z
- Jiang, Z., & Carter, R. (2019). Using Hamiltonian Monte Carlo to estimate the log-linear cognitive diagnosis model via Stan. Behavior Research Methods, 51(2), 651–662. 10.3758/s13428-018-1069-9
- Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., & Saul, L. K. (1999). An introduction to variational methods for graphical models. Machine Learning, 37(2), 183–233. 10.1023/A:1007665907178
- Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258–272. 10.1177/01466210122032064
- Liu, C. W. (2022). Efficient Metropolis-Hastings Robbins-Monro algorithm for high-dimensional diagnostic classification models. Applied Psychological Measurement, 46(8), 662–674. 10.1177/01466216221123981
- Liu, C. W., Andersson, B., & Skrondal, A. (2020). A constrained Metropolis-Hastings Robbins-Monro algorithm for Q matrix estimation in DINA models. Psychometrika, 85(2), 322–357. 10.1007/s11336-020-09707-4
- Liu, X., & Johnson, M. S. (2019). Estimating CDMs using MCMC. In von Davier M. & Lee Y. S. (Eds.), Handbook of diagnostic classification models (pp. 629–649). Springer. 10.1007/978-3-030-05584-4
- Ma, W., & de la Torre, J. (2016). A sequential cognitive diagnosis model for polytomous responses. British Journal of Mathematical and Statistical Psychology, 69(3), 253–275. 10.1111/bmsp.12070
- Ma, W., & de la Torre, J. (2020). GDINA: An R package for cognitive diagnosis modeling. Journal of Statistical Software, 93(14), 1–26. 10.18637/jss.v093.i14
- Ma, W., & Guo, W. (2019). Cognitive diagnosis models for multiple strategies. British Journal of Mathematical and Statistical Psychology, 72(2), 370–392. 10.1111/bmsp.12155
- Macready, G. B., & Dayton, C. M. (1977). The use of probabilistic models in the assessment of mastery. Journal of Educational Statistics, 2(2), 99–120. 10.3102/10769986002002099
- Maris, E. (1999). Estimating multiple classification latent class models. Psychometrika, 64(2), 187–212. 10.1007/BF02294535
- Neal, R. M. (2011). MCMC using Hamiltonian dynamics. In Brooks S. (Ed.), Handbook of Markov chain Monte Carlo (pp. 113–162). Boca Raton, FL: CRC Press/Taylor & Francis.
- Oka, M., & Okada, K. (2023). Scalable Bayesian approach for the DINA Q-matrix estimation combining stochastic optimization and variational inference. Psychometrika, 88, 302–331. 10.1007/s11336-022-09884-4
- Oka, M., Saso, S., & Okada, K. (2023). Variational inference for a polytomous-attribute saturated diagnostic classification model with parallel computing. Behaviormetrika, 50(1), 63–92. 10.1007/s41237-022-00164-0
- Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. In Proceedings of the 3rd International Workshop on Distributed Statistical Computing, vol. 124, 1–8. http://www.ci.tuwien.ac.at/Conferences/DSC-2003/
- R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
- Rijmen, F., Jeon, M., & Rabe-Hesketh, S. (2016). Variational approximation methods. In van der Linden W. J. (Ed.), Handbook of item response theory: Statistical tools (Vol. 2, pp. 259–270). CRC Press.
- Rupp, A. A., & Templin, J. L. (2008a). Effects of Q-matrix misspecification on parameter estimates and misclassification rates in the DINA model. Educational and Psychological Measurement, 68(1), 78–96. 10.1177/0013164407301545
- Rupp, A. A., & Templin, J. L. (2008b). Unique characteristics of diagnostic classification models: A comprehensive review of the current state-of-the-art. Measurement: Interdisciplinary Research and Perspectives, 6(4), 219–262. 10.1080/15366360802490866
- Rupp, A. A., Templin, J. L., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods and applications. Guilford.
- Su, Y. S., & Yajima, M. (2015). R2jags: Using R to run "JAGS" (R package version 0.7-1). http://CRAN.R-project.org/package=R2jags
- Tatsuoka, C. (2002). Data analytic methods for latent partially ordered classification models. Journal of the Royal Statistical Society. Series C: Applied Statistics, 51(3), 337–350. 10.1111/1467-9876.00272
- Tatsuoka, K. K. (1983). Rule space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20(4), 345–354. 10.1111/j.1745-3984.1983.tb00212.x
- Tatsuoka, K. K. (1990). Toward an integration of item-response theory and cognitive error diagnosis. In Frederiksen N., Glaser R., Lesgold A., & Shafto M. (Eds.), Diagnostic monitoring of skill and knowledge acquisition (pp. 453–488). Erlbaum.
- Templin, J., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79(2), 317–339.
- Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287–305. 10.1037/1082-989X.11.3.287
- Templin, J., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32(2), 37–50.
- Thomas, A., O'Hara, B., Ligges, U., & Sturtz, S. (2006). Making BUGS open. R News, 6, 12–17. http://mathstat.helsinki.fi/openbugs/FAQFrames.html
- Urban, C. J., & Bauer, D. J. (2021). A deep learning algorithm for high-dimensional exploratory item factor analysis. Psychometrika, 86(1), 1–29. 10.1007/s11336-021-09748-3
- von Davier, M. (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61(2), 287–307. 10.1348/000711007X193957
- von Davier, M. (2014a). The DINA model as a constrained general diagnostic model: Two variants of a model equivalency. British Journal of Mathematical and Statistical Psychology, 67(1), 49–71. 10.1111/bmsp.12003
- von Davier, M. (2014b). The log-linear cognitive diagnostic model (LCDM) as a special case of the general diagnostic model (GDM). ETS Research Report Series, 2014(2), 1–13.
- Wand, M. P., Ormerod, J. T., Padoan, S. A., & Frühwirth, R. (2011). Mean field variational Bayes for elaborate distributions. Bayesian Analysis, 6(4), 847–900. 10.1214/11-BA631
- Xu, G., & Shang, Z. (2018). Identifying latent structures in restricted latent class models. Journal of the American Statistical Association, 113(523), 1284–1295. 10.1080/01621459.2017.1340889
- Yamaguchi, K. (2020). Variational Bayesian inference for the multiple-choice DINA model. Behaviormetrika, 47(1), 159–187. 10.1007/s41237-020-00104-w
- Yamaguchi, K., & Okada, K. (2020a). Variational Bayes inference algorithm for the saturated diagnostic classification model. Psychometrika, 85(4), 973–995. 10.1007/s11336-020-09739-w
- Yamaguchi, K., & Okada, K. (2020b). Variational Bayes inference for the DINA model. Journal of Educational and Behavioral Statistics, 45(5), 569–597. 10.3102/1076998620911934
- Yamaguchi, K., & Martinez, A. J. (2023). Variational Bayes inference for hidden Markov diagnostic classification models. British Journal of Mathematical and Statistical Psychology, 00, 1–25. 10.1111/bmsp.12308
- Yamaguchi, K., & Templin, J. L. (2022). A Gibbs sampling algorithm with monotonicity constraints for diagnostic classification models. Journal of Classification, 39(1), 24–54. 10.1007/s00357-021-09392-7
- Zhan, P., Jiao, H., Man, K., & Wang, L. (2019). Using JAGS for Bayesian cognitive diagnosis modeling: A tutorial. Journal of Educational and Behavioral Statistics, 44(4), 473–503. 10.3102/1076998619826040
- Zhang, Z., Zhang, J., Lu, J., & Tao, J. (2020). Bayesian estimation of the DINA model with Pólya-gamma Gibbs sampling. Frontiers in Psychology, 11, 384. https://www.frontiersin.org/articles/10.3389/fpsyg.2020.00384