. 2016 Sep 22;82(1):67–85. doi: 10.1007/s11336-016-9512-2

Asymptotic Robustness Study of the Polychoric Correlation Estimation

Shaobo Jin, Fan Yang-Wallentin
PMCID: PMC5591612  PMID: 27660261

Abstract

Asymptotic robustness against misspecification of the underlying distribution for the polychoric correlation estimation is studied. The asymptotic normality of the pseudo-maximum likelihood estimator is derived using the two-step estimation procedure. The t distribution assumption and the skew-normal distribution assumption are used as alternatives to the normal distribution assumption in a numerical study. The numerical results show that the polychoric correlation estimate based on the underlying normal distribution assumption can be substantially biased, even though skewness and kurtosis are not large. The skew-normal assumption generally produces a lower bias than the normal assumption. Thus, it is worth using a non-normal distributional assumption if the normal assumption is dubious.

Electronic supplementary material

The online version of this article (doi:10.1007/s11336-016-9512-2) contains supplementary material, which is available to authorized users.

Keywords: underlying distribution, asymptotic covariance matrix, non-normality, pseudo-maximum likelihood

Introduction

Structural equation models (SEMs) are widely used in social sciences to model latent structures. Typically, normal distributions are assumed for both latent variables and error terms. However, observed measures in surveys are often ordinal. For example, a five-point Likert scale is commonly used in psychometric studies. Conceptually, categorical data should not be incorporated into a SEM by assuming they are continuous. There have been numerous advances in the literature on SEMs with respect to analysing ordinal data as they are. The observed ordinal data are usually assumed to be counterparts of some underlying continuous distributions. A typical choice of the underlying distributions is the standard normal distribution. Olsson (1979) studied the one-step maximum likelihood estimator (MLE) and the two-step MLE of the polychoric correlation coefficient. All parameters (i.e. thresholds and polychoric correlation) are estimated simultaneously for the one-step MLE, whereas the thresholds are estimated from the marginals and the polychoric correlation is computed based on the threshold estimates for the two-step MLE. Olsson showed that under the normality assumption, the one- and the two-step MLEs produce similar polychoric correlation estimates and similar variance estimates. Jöreskog (1994) derived the estimator of the asymptotic covariance matrix of the polychoric correlation estimators for the two-step maximum likelihood procedure (for a more compact expression, see Christoffersson & Gunsjö, 1996, and related references).

The underlying normality assumption is questionable. For example, the underlying normality assumption in the Life Orientation Test dataset (Scheier & Carver, 1985) was rejected by Maydeu-Olivares (2006). In yet another example, income is commonly used in socio-economic status studies (e.g. Chateau, Metge, Prior, & Soodeen, 2012; Hodge & Treiman, 1968; Scharoun-Lee, Adair, Kaufman, & Gordon-Larsen, 2009). A Pareto distribution is classically used to model income (Arnold, 2008). Using a normal distribution to model income is dubious because income is bounded by a lower limit. The question regarding income, however, is commonly categorized in a questionnaire: for example, see the National Longitudinal Study of Adolescent Health dataset (Carolina Population Center, 2009) used by Scharoun-Lee et al. (2009). Thus, “income” is an ordinal indicator with a non-normal underlying distribution. The consequences of violating the underlying normality assumption have been investigated (e.g. Flora & Curran, 2004; Lee & Lam, 1988; Quiroga, 1992). Flora and Curran (2004) generated non-normal data from the Fleishman–Vale–Maurelli method (Fleishman, 1978; Vale & Maurelli, 1983) in which a standard univariate normal random variable is polynomially transformed to introduce skewness and kurtosis. The authors found that the polychoric correlation estimates are only slightly biased when the underlying distribution has a skewness of 0.75 or 1.25 and a kurtosis of 1.75 or 3.75. They found, however, that the polychoric correlation is not robust against extreme underlying non-normality (e.g. skewness = 5 and kurtosis = 50). Lee and Lam (1988) generated non-normal data from an elliptical t distribution and an elliptical contaminated normal distribution and noted that the polychoric correlation estimates based on the normality assumption are fairly robust against non-normal underlying distributions.
The study of Quiroga (1992) was conducted using non-normal data from an underlying bivariate skew-normal distribution and from the Fleishman–Vale–Maurelli method. The author also suggests that the polychoric correlation estimator is robust to non-normality. These studies share two features in common. First, they assume that the underlying distribution is normal to investigate the effect of underlying non-normality. So, a non-normal distribution assumption has not been systematically studied. Second, they are simulation studies. To our knowledge, there are no robustness studies on polychoric correlations from a theoretical standpoint.

Because the polychoric correlation is not distribution-free, tests of the underlying normality assumption are desired. For example, LISREL (Jöreskog & Sörbom, 1996) uses a likelihood ratio test to assess underlying normality, which is equivalent to a Pearson χ2. Maydeu-Olivares, Forero, Gallardo-Pujol, and Renom (2009) and Maydeu-Olivares and Joe (2005, 2006) introduced a variant of the Pearson’s χ2 that is more suitable for the two-step MLE of the polychoric correlation. LISREL (Jöreskog & Sörbom, 1996) also provides the root-mean-square error of approximation (RMSEA) to assess the underlying normality assumption.

If the normality assumption fails, a new assumption of distribution is needed. Quiroga (1992) studied a new underlying distributional assumption whose marginal distributions are weighted averages of a univariate skew-normal distribution and a standard univariate normal distribution. Through an empirical example, the author showed that the polychoric correlation estimates based on the new assumption of distribution produce a smaller χ2 test statistic. The normality assumption has also been criticized in the item response theory and alternative distributions have been studied to account for the underlying non-normality (e.g. see Bolfarine & Bazán, 2010; Lucke, 2014; Woods & Thissen, 2006).

The purpose of this paper is twofold. First, we study robustness against misspecification of the underlying distribution from a theoretical perspective. The effect of distributional misspecification under the two-step maximum likelihood procedure is investigated. Because the two-step MLE is computationally easier (Olsson, 1979) and is implemented in LISREL, we focus only on the two-step MLE for its simplicity and popularity. Second, the underlying distribution is not restricted to a standard normal distribution. The t distribution and the skew-normal distribution are used as alternatives in the present study. In particular, the skew-normal distribution has been applied in item response theory as an alternative to the normality assumption (e.g. see Azevedo, Bolfarine, & Andrade, 2011; Bazán, Branco, & Bolfarine, 2006; Molenaar, 2015; Molenaar, Dolan, & de Boeck, 2012; Santos, Azevedo, & Bolfarine, 2013). Because the underlying distribution cannot be fully determined from ordinal data, we attempt to pinpoint potential alternatives for the bivariate normal distribution assumption.

The remainder of this paper is organized as follows. General theories are presented, followed by numerical examples to illustrate our ideas. A brief conclusion ends the paper.

General Theory

Consider two ordinal variables U and V with mU and mV categories, respectively. The classic polychoric correlation estimation method assumes that there are two underlying continuous variables X and Y for U and V, respectively. The values of U and V are defined through X and Y as

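\[
U=i \quad\text{if}\quad \tau_{i-1}<X\le\tau_i,\quad i=1,2,\ldots,m_U,\qquad
V=j \quad\text{if}\quad \xi_{j-1}<Y\le\xi_j,\quad j=1,2,\ldots,m_V,
\]

where $\tau=(\tau_1,\ldots,\tau_{m_U-1})$ and $\xi=(\xi_1,\ldots,\xi_{m_V-1})$ are thresholds such that

\[
-\infty=\tau_0<\tau_1<\cdots<\tau_{m_U-1}<\tau_{m_U}=\infty,\qquad
-\infty=\xi_0<\xi_1<\cdots<\xi_{m_V-1}<\xi_{m_V}=\infty.
\]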

The true joint distribution function is denoted by F(x,y;ρ,ζ) with two marginal distributions F1(x) and F2(y), where ρ is the correlation coefficient and ζ is the vector of other parameters (e.g. degrees of freedom, location, and scale parameters). The corresponding joint density function is f(x,y;ρ) with marginal densities f1(x) and f2(y). Because the true distribution family is unknown, we assume the underlying distribution to be H(x,y;ρ) with marginal distributions H1(x) and H2(y). The joint density function is h(x,y;ρ) with marginal densities h1(x) and h2(y), respectively. Conventionally, H(x,y;ρ) is taken to be the distribution function of a standard bivariate normal distribution. The normality assumption will be relaxed in our study. We also allow for different marginal distributions both in true underlying distributions and in the assumed ones.
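As a small illustration of the threshold mechanism that links the latent variable X to the ordinal variable U, the following sketch maps a continuous value to its category. The function name `to_ordinal` and the example thresholds are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left

def to_ordinal(x, thresholds):
    """Return the category i in 1..m with tau_{i-1} < x <= tau_i.

    `thresholds` holds the finite, increasing cut points
    tau_1 < ... < tau_{m-1}; tau_0 = -inf and tau_m = +inf are implicit.
    """
    # bisect_left counts the thresholds strictly below x, so a value that
    # equals tau_i is still assigned to category i.
    return bisect_left(thresholds, x) + 1

tau = [-1.0, 0.0, 1.0]  # illustrative thresholds for a four-category item
print([to_ordinal(x, tau) for x in (-2.0, -1.0, 0.5, 2.0)])  # [1, 1, 3, 4]
```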

Two-Step Estimation

Threshold Estimation

Let $n_{ij}$ and $p_{ij}$ be the observed frequency and proportion, respectively, of $U=i$ and $V=j$, for $i=1,\ldots,m_U$ and $j=1,\ldots,m_V$. If the true underlying distribution $F$ is different from the assumed distribution $H$, the MLEs of thresholds will be inconsistent estimators of $\tau_0=(\tau_{1,0},\ldots,\tau_{m_U-1,0})$ and $\xi_0=(\xi_{1,0},\ldots,\xi_{m_V-1,0})$, where the subscript 0 indicates true values. Consider the ordinal variable $U$ first. Denote $n_U=(n_{1\cdot},\ldots,n_{m_U\cdot})$, where $n_{i\cdot}=\sum_{j=1}^{m_V}n_{ij}$ is the marginal total for $i=1,2,\ldots,m_U$. The corresponding marginal proportion is $p_U=(p_{1\cdot},\ldots,p_{m_U\cdot})$. The pseudo-maximum likelihood estimator (PMLE) of $\tau$, denoted as $\hat{\tau}=(\hat{\tau}_1,\ldots,\hat{\tau}_{m_U-1})$, is obtained by maximizing

\[
Q(\tau)=\sum_{i=1}^{m_U}n_{i\cdot}\log\int_{\tau_{i-1}}^{\tau_i}h_1(x)\,dx.
\]

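It is easy to see that $\hat{\tau}$ is a consistent estimator of $\tau$, where $H_1(\tau)=F_1(\tau_0)$, because the observed cell probabilities are consistent estimators of $F_1(\tau_0)$. Similarly, $\hat{\xi}$ is a consistent estimator of $\xi$, where $H_2(\xi)=F_2(\xi_0)$. Let $P$ be an $m_U\times m_V$ matrix with $(i,j)$-th entry $p_{ij}$. Then

\[
\frac{\partial Q(\tau)}{\partial\tau}=nB_U(\tau)^{T}D_U^{-1}(\tau)p_U,\qquad
\frac{\partial^2 Q(\tau)}{\partial\tau\,\partial\tau^{T}}=-nB_U(\tau)^{T}D_U^{-1}(\tau)D_pD_U^{-1}(\tau)B_U(\tau)+nS,
\]

where $n=\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}n_{ij}$ is the total number of observations,

\[
B_U(\tau)=\begin{pmatrix}
h_1(\tau_1) & 0 & \cdots & 0\\
-h_1(\tau_1) & h_1(\tau_2) & \cdots & 0\\
0 & -h_1(\tau_2) & \ddots & \vdots\\
\vdots & \vdots & \ddots & h_1(\tau_{m_U-1})\\
0 & 0 & \cdots & -h_1(\tau_{m_U-1})
\end{pmatrix},
\]

$D_U(\tau)=\mathrm{Diag}\left(\int_{\tau_0}^{\tau_1}h_1(x)\,dx,\ldots,\int_{\tau_{m_U-1}}^{\tau_{m_U}}h_1(x)\,dx\right)$, $D_p=\mathrm{Diag}(p_{1\cdot},\ldots,p_{m_U\cdot})$, $p_U=P1_{m_V}$ with $1_{m_V}$ being an $m_V\times 1$ vector of 1's, and $S$ is a diagonal matrix with $i$-th element

\[
\left(\frac{p_{i\cdot}}{\int_{\tau_{i-1}}^{\tau_i}h_1(x)\,dx}-\frac{p_{i+1,\cdot}}{\int_{\tau_i}^{\tau_{i+1}}h_1(x)\,dx}\right)\frac{\partial h_1(\tau_i)}{\partial\tau_i},
\]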

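for $i=1,\ldots,m_U-1$. The operator $\mathrm{Diag}(\cdot)$ constructs a diagonal matrix using the enclosed vector as diagonal elements. The Taylor expansion of $n^{-1/2}\,\partial Q(\hat{\tau})/\partial\tau$ around $\tau$ is

\[
0=n^{-1/2}\frac{\partial Q(\hat{\tau})}{\partial\tau}
=n^{-1/2}\frac{\partial Q(\tau)}{\partial\tau}
+n^{-1/2}\frac{\partial^2 Q(\tilde{\tau})}{\partial\tau\,\partial\tau^{T}}(\hat{\tau}-\tau)+o_p(1), \tag{1}
\]

where $\tilde{\tau}$ lies between $\hat{\tau}$ and $\tau$. Because both $\hat{\tau}$ and $\tilde{\tau}$ are consistent estimators of $\tau$, $n^{-1}\,\partial^2 Q(\tilde{\tau})/\partial\tau\,\partial\tau^{T}$ is consistent for $-B_U(\tau)^{T}D_U^{-1}(\tau)B_U(\tau)$. So, Eq. (1) implies

\[
n^{1/2}(\hat{\tau}-\tau)=n^{1/2}\left[B_U(\tau)^{T}D_U^{-1}(\tau)B_U(\tau)\right]^{-1}B_U(\tau)^{T}D_U^{-1}(\tau)p_U+o_p(1). \tag{2}
\]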

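Similar arguments applying to $\xi$ yield

\[
n^{1/2}(\hat{\xi}-\xi)=n^{1/2}\left[B_V(\xi)^{T}D_V^{-1}(\xi)B_V(\xi)\right]^{-1}B_V(\xi)^{T}D_V^{-1}(\xi)p_V+o_p(1), \tag{3}
\]

where $p_V=P^{T}1_{m_U}$. Here $B_V$ and $D_V$ are defined by substituting $h_1$ with $h_2$ in $B_U$ and $D_U$. The PMLEs $\hat{\tau}$ and $\hat{\xi}$ are inconsistent in the sense that $\tau$ and $\xi$ are different from the true values $\tau_0$ and $\xi_0$.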
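The threshold step can be sketched in a few lines under one specific assumption, namely that the assumed marginal $H_1$ is standard normal. Maximizing $Q(\tau)$ then amounts to matching each assumed category probability to the observed marginal proportion, so that $\hat{\tau}_i=H_1^{-1}(p_{1\cdot}+\cdots+p_{i\cdot})$. The function name `normal_thresholds` is hypothetical; the marginal proportions are the five-category values used later in the numerical design.

```python
from statistics import NormalDist

def normal_thresholds(marginal_props):
    """Thresholds under a standard normal H1 (an assumption of this sketch):
    tau_i = H1^{-1}(p_1 + ... + p_i) for i = 1, ..., m - 1."""
    std_normal = NormalDist()
    cumulative, thresholds = 0.0, []
    for p in marginal_props[:-1]:  # the last category ends at +infinity
        cumulative += p
        thresholds.append(std_normal.inv_cdf(cumulative))
    return thresholds

print([round(t, 4) for t in normal_thresholds([0.1, 0.2, 0.4, 0.2, 0.1])])
```

With another assumed marginal, only the quantile function would change; the pseudo-true thresholds generally move away from $\tau_0$ whenever $H_1\ne F_1$.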

Polychoric Correlation Coefficient Estimation

Under the distributional assumption H, the assumed cell probability is

\[
\pi_{ij,(H)}=\int_{\tau_{i-1}}^{\tau_i}\int_{\xi_{j-1}}^{\xi_j}h(x,y)\,dy\,dx,
\]

while the true cell probability $\pi_{ij,(F)}$ is obtained by substituting $h(x,y)$ with $f(x,y)$. Conditionally on $\hat{\tau}$ and $\hat{\xi}$, the polychoric correlation $\rho$ is estimated by maximizing

\[
L(\rho,\hat{\tau},\hat{\xi})=\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}p_{ij}\log\pi_{ij,(H)}.
\]

Theorem 2.2 in White (1982) shows that the PMLE is a consistent estimator that minimizes the Kullback–Leibler information (Kullback & Leibler, 1951) under some regularity conditions, one of which is that the absolute value of $\log\pi_{ij,(H)}$ is dominated by a variable with finite expectation. Such a regularity condition is satisfied if $\pi_{ij,(H)}=0$ implies $\pi_{ij,(F)}=0$ for all $(i,j)$. Consequently, Theorem 2.2 in White (1982) shows that $\hat{\rho}$ converges to $\rho$ that minimizes the Kullback–Leibler information

\[
\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}\pi_{ij,(F)}\left[\log\pi_{ij,(F)}-\log\pi_{ij,(H)}\right].
\]

Theorem 1

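Assume $g(\rho,\tau,\xi)=\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}\pi_{ij,(F)}\log\pi_{ij,(H)}$, as a function of $\rho$, has a unique maximum at $\rho$. If $\pi_{ij,(H)}=0$ implies $\pi_{ij,(F)}=0$ for all $(i,j)$, then there exists a root $\hat{\rho}$ of the equation

\[
\frac{\partial}{\partial\rho}\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}p_{ij}\log\pi_{ij,(H)}=0
\]

such that $\hat{\rho}$ is a consistent estimator of $\rho$.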

That is, ρ^ is a consistent estimator of ρ that minimizes the probabilistic divergence between H and F (Kullback, 1959) in the sense of the Kullback–Leibler information. This minimized divergence implies similarities of H and F in terms of cell probabilities.

The assumption in Theorem 1 requires uniqueness of the maximum. In so doing, we rule out all cases with local maxima. If we have several stationary points, we can then only conclude that one of the stationary points minimizes the Kullback–Leibler information.
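The second step can be sketched numerically under the assumption that $H$ is a standard bivariate normal distribution. The code below computes the assumed cell probabilities by one-dimensional Simpson integration and maximizes $\sum_{i,j}p_{ij}\log\pi_{ij,(H)}$ over $\rho$ with a golden-section search. All function names, the integration grid, and the search interval are illustrative choices of this sketch, not the procedure used in the paper; the search also presumes a single maximum, in line with the assumption of Theorem 1.

```python
import math

def phi(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cell_prob(rho, a, b, c, d, n=200):
    """P(a < X <= b, c < Y <= d) under a standard bivariate normal,
    using Simpson's rule over x (x-range truncated to [-8, 8])."""
    a, b = max(a, -8.0), min(b, 8.0)
    s = math.sqrt(1.0 - rho * rho)
    h = (b - a) / n
    total = 0.0
    for k in range(n + 1):
        x = a + k * h
        w = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += w * phi(x) * (Phi((d - rho * x) / s) - Phi((c - rho * x) / s))
    return total * h / 3.0

def cell_probs(rho, tau, xi):
    t = [-math.inf] + list(tau) + [math.inf]
    u = [-math.inf] + list(xi) + [math.inf]
    return [[cell_prob(rho, t[i], t[i + 1], u[j], u[j + 1])
             for j in range(len(u) - 1)] for i in range(len(t) - 1)]

def pseudo_loglik(rho, p, tau, xi):
    pi_h = cell_probs(rho, tau, xi)
    return sum(p[i][j] * math.log(pi_h[i][j])
               for i in range(len(p)) for j in range(len(p[0])) if p[i][j] > 0)

def polychoric_limit(p, tau, xi, lo=-0.95, hi=0.95, tol=1e-6):
    """Golden-section search for the rho that maximizes the pseudo-log-likelihood."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    f = lambda r: pseudo_loglik(r, p, tau, xi)
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

# Sanity check: when p equals the normal cell probabilities at rho = 0.4,
# the maximizer should recover 0.4 (H = F, so there is no misspecification).
tau, xi = [-0.8416, 0.5244], [-1.2816, -0.2533]
p_true = cell_probs(0.4, tau, xi)
print(round(polychoric_limit(p_true, tau, xi), 3))
```

Replacing `p_true` with cell probabilities $\pi_{ij,(F)}$ from a non-normal distribution would give the asymptotic limit of $\hat{\rho}$ under misspecification, which is the quantity examined in the numerical examples.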

Asymptotic Variance of Polychoric Correlations

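Let $L_\rho(\rho,\tau,\xi)$ denote the first order partial derivative of $L(\rho,\tau,\xi)$ with respect to $\rho$. Similar symbols are used to represent other partial derivatives and higher order partial derivatives. $L_\rho(\hat{\rho},\hat{\tau},\hat{\xi})$ can be expanded around $\rho$ for a sufficiently large $n$,

\[
0=n^{1/2}L_\rho(\hat{\rho},\hat{\tau},\hat{\xi})
=\underbrace{n^{1/2}L_\rho(\rho,\hat{\tau},\hat{\xi})}_{\text{i}}
+\underbrace{n^{1/2}(\hat{\rho}-\rho)L_{\rho\rho}(\tilde{\rho},\hat{\tau},\hat{\xi})}_{\text{ii}}+o_p(1), \tag{4}
\]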

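where $\tilde{\rho}$ lies between $\rho$ and $\hat{\rho}$. Term i in Eq. (4) is equivalent to

\[
n^{1/2}L_\rho(\rho,\hat{\tau},\hat{\xi})=n^{1/2}L_\rho(\rho,\tau,\xi)
+n^{1/2}L_{\rho\tau}(\rho,\tilde{\tau},\tilde{\xi})^{T}(\hat{\tau}-\tau)
+n^{1/2}L_{\rho\xi}(\rho,\tilde{\tau},\tilde{\xi})^{T}(\hat{\xi}-\xi)+o_p(1), \tag{5}
\]

where $\tilde{\tau}$ lies between $\hat{\tau}$ and $\tau$ and $\tilde{\xi}$ lies between $\hat{\xi}$ and $\xi$. Hence, if $\pi_{ij,H}=0$ implies $\pi_{ij,F}=0$ for all $i,j$ in a neighbourhood of $(\tau,\xi)$ given the correlation $\rho$, $L_{\rho\tau}(\rho,\tilde{\tau},\tilde{\xi})$ is consistent for $g_{\rho\tau}(\rho,\tau,\xi)$ and $L_{\rho\xi}(\rho,\tilde{\tau},\tilde{\xi})$ is consistent for $g_{\rho\xi}(\rho,\tau,\xi)$. Thus, Eq. (5) is equivalent to

\[
n^{1/2}L_\rho(\rho,\hat{\tau},\hat{\xi})=n^{1/2}L_\rho(\rho,\tau,\xi)
+n^{1/2}g_{\rho\tau}(\rho,\tau,\xi)^{T}(\hat{\tau}-\tau)
+n^{1/2}g_{\rho\xi}(\rho,\tau,\xi)^{T}(\hat{\xi}-\xi)+o_p(1).
\]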

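Likewise, Term ii can be written as

\[
n^{1/2}(\hat{\rho}-\rho)L_{\rho\rho}(\tilde{\rho},\hat{\tau},\hat{\xi})=n^{1/2}(\hat{\rho}-\rho)\,g_{\rho\rho}(\rho,\tau,\xi)+o_p(1),
\]

provided that $\pi_{ij,H}=0$ implies $\pi_{ij,F}=0$ in a neighbourhood of $(\rho,\tau,\xi)$. Hence, combining with Eqs. (2) and (3), Eq. (4) is equivalent to

\[
\begin{aligned}
n^{1/2}(\hat{\rho}-\rho)
&=-\frac{n^{1/2}}{g_{\rho\rho}(\rho,\tau,\xi)}\left[L_\rho(\rho,\tau,\xi)+g_{\rho\tau}(\rho,\tau,\xi)^{T}(\hat{\tau}-\tau)+g_{\rho\xi}(\rho,\tau,\xi)^{T}(\hat{\xi}-\xi)\right]+o_p(1)\\
&=-\frac{n^{1/2}}{g_{\rho\rho}(\rho,\tau,\xi)}\,\mathrm{tr}(\Lambda^{T}P)+o_p(1),
\end{aligned}
\]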

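where $\Lambda=A+E_\tau g_{\rho\tau}(\rho,\tau,\xi)1_{m_V}^{T}+1_{m_U}g_{\rho\xi}(\rho,\tau,\xi)^{T}E_\xi^{T}$ with $A$ being an $m_U\times m_V$ matrix with $(i,j)$-th element $(\partial\pi_{ij,H}/\partial\rho)/\pi_{ij,H}$, $E_\tau=D_\tau^{-1}B_\tau\left(B_\tau^{T}D_\tau^{-1}B_\tau\right)^{-1}$, and $E_\xi=D_\xi^{-1}B_\xi\left(B_\xi^{T}D_\xi^{-1}B_\xi\right)^{-1}$. Note that $\mathrm{tr}(\Lambda^{T}P)=\mathrm{vec}(\Lambda)^{T}\mathrm{vec}(P)$, where $\mathrm{vec}(\cdot)$ stacks the columns of the enclosed matrix and

\[
\sqrt{n}\left[\mathrm{vec}(P)-\mathrm{vec}(\pi_F)\right]\xrightarrow{d}N\left(0,\ \mathrm{Diag}(P)-\mathrm{vec}(P)\mathrm{vec}(P)^{T}\right)
\]

with $\pi_F$ being an $m_U\times m_V$ matrix with $(i,j)$-th entry $\pi_{ij,F}$. The arguments above establish the following theorem.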

Theorem 2

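Let $\hat{\rho}$ be the consistent root of $L_\rho(\rho,\hat{\tau},\hat{\xi})=0$ given $\hat{\tau}$ and $\hat{\xi}$. Assume $\pi_{ij,H}=0$ implies $\pi_{ij,F}=0$ in a neighbourhood of $(\rho,\tau,\xi)$, then

\[
n^{1/2}(\hat{\rho}-\rho)\xrightarrow{d}N(0,\sigma^2),
\]

where $\sigma^2=\left\{\mathrm{tr}\left[\Lambda^{T}(\Lambda\circ\pi_F)\right]-\left[\mathrm{tr}(\Lambda^{T}\pi_F)\right]^2\right\}/g_{\rho\rho}(\rho,\tau,\xi)^2$. Here, matrix $\Lambda$ is evaluated under $(\rho,\tau,\xi)$. The operator $\circ$ implies element-wise multiplication.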

Estimating the Asymptotic Covariance Matrix

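Following Theorem 2, the asymptotic variance of $\hat{\rho}$ can be consistently estimated by

\[
\frac{1}{L_{\rho\rho}(\hat{\rho},\hat{\tau},\hat{\xi})^2}\left[\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}\hat{\lambda}_{ij}^2p_{ij}-\left(\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}\hat{\lambda}_{ij}p_{ij}\right)^2\right],
\]

where $\hat{\lambda}_{ij}$ is the $(i,j)$-th element in $\hat{\Lambda}$. The polychoric correlation between variables $U$ and $V$ satisfies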

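\[
n^{1/2}\left(\hat{\rho}^{(UV)}-\rho^{(UV)}\right)=-\frac{n^{1/2}}{g_{\rho\rho}^{(UV)}(\rho,\tau,\xi)}\,\mathrm{tr}\left(\Lambda_{UV}^{T}P_{UV}\right)+o_p(1),
\]

where the superscript $(UV)$ emphasizes that all quantities are evaluated under the distributional assumption for $U$ and $V$. Similarly, the polychoric correlation between variables $K$ and $Z$ satisfies

\[
n^{1/2}\left(\hat{\rho}^{(KZ)}-\rho^{(KZ)}\right)=-\frac{n^{1/2}}{g_{\rho\rho}^{(KZ)}(\rho,\tau,\xi)}\,\mathrm{tr}\left(\Lambda_{KZ}^{T}P_{KZ}\right)+o_p(1).
\]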

The underlying distributional assumption for U and V can either be the same as that for K and Z or different. Thus, the asymptotic covariance between ρ^(UV) and ρ^(KZ) is consistently estimated by

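\[
\frac{\sum_{a=1}^{m_U}\sum_{b=1}^{m_V}\sum_{c=1}^{m_K}\sum_{d=1}^{m_Z}\hat{\lambda}_{ab}^{UV}\left(p_{abcd}^{UVKZ}-p_{ab}^{UV}p_{cd}^{KZ}\right)\hat{\lambda}_{cd}^{KZ}}{L_{\rho\rho}\left(\hat{\rho}^{(UV)},\hat{\tau}_{UV},\hat{\xi}_{UV}\right)L_{\rho\rho}\left(\hat{\rho}^{(KZ)},\hat{\tau}_{KZ},\hat{\xi}_{KZ}\right)}, \tag{6}
\]

where $p_{abcd}^{(UVKZ)}$ is the sample proportion of observing $U=a$, $V=b$, $K=c$, and $Z=d$. Under the assumption that the underlying distribution is normal and correctly specified, Eq. (6) reduces to the estimator in Jöreskog (1994).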
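Equation (6) is a quadruple sum that is straightforward to evaluate once the matrices $\hat{\Lambda}$ and the second derivatives $L_{\rho\rho}$ are available. The sketch below takes those quantities as given inputs; the function name `acov_polychoric` and the nested-list layout of the four-way proportions are assumptions made only for illustration.

```python
def acov_polychoric(lam_uv, lam_kz, p4, l_rr_uv, l_rr_kz):
    """Evaluate Eq. (6).

    lam_uv, lam_kz : Lambda-hat matrices as lists of rows
    p4[a][b][c][d] : sample proportion of U=a, V=b, K=c, Z=d (0-based)
    l_rr_uv, l_rr_kz : values of L_rhorho for the two variable pairs
    """
    m_u, m_v = len(lam_uv), len(lam_uv[0])
    m_k, m_z = len(lam_kz), len(lam_kz[0])
    # Two-way margins p_ab^{UV} and p_cd^{KZ} of the four-way table.
    p_uv = [[sum(p4[a][b][c][d] for c in range(m_k) for d in range(m_z))
             for b in range(m_v)] for a in range(m_u)]
    p_kz = [[sum(p4[a][b][c][d] for a in range(m_u) for b in range(m_v))
             for d in range(m_z)] for c in range(m_k)]
    total = 0.0
    for a in range(m_u):
        for b in range(m_v):
            for c in range(m_k):
                for d in range(m_z):
                    total += (lam_uv[a][b]
                              * (p4[a][b][c][d] - p_uv[a][b] * p_kz[c][d])
                              * lam_kz[c][d])
    return total / (l_rr_uv * l_rr_kz)
```

When the pair $(K,Z)$ coincides with $(U,V)$, the four-way table is zero off its "diagonal" and the expression collapses to the variance estimator given after Theorem 2, which provides a convenient consistency check.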

A Variant of Two-Step Estimation

The above two-step estimation is applicable to bivariate distributions whose marginal distributions do not depend on unknown parameters. For example, the mean and variance of a bivariate normal distribution are unknown parameters and assuming a standard normal distribution fixes those parameters to known values. In many other distributions, unknown parameters are included in the marginal distributions. Consequently, the above two-step MLE cannot be obtained unless the unknown parameters are prefixed. In such a case, a variant of the two-step MLE can be obtained instead. The MLE maximizes

\[
L(\theta)=\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}p_{ij}\log\pi_{ij,(H)}
\]

with respect to the vector $\theta$ that consists of free unknown parameters. Not all parameters in $\rho$ and $\zeta$ are free parameters. The mean and variance of an ordinal variable are not identified. Thus, the scale and location parameters that do not contribute to the correlation coefficient are not identified. In some distributions (e.g. the skew-normal distribution introduced later), the correlation coefficient is also determined by $\zeta$ and, therefore, is not a free parameter. If $L(\theta)$ is differentiable with respect to $\theta$,

\[
\frac{\partial}{\partial\theta}L(\theta)=\frac{\partial}{\partial\theta}\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}p_{ij}\log\pi_{ij,(H)}=0
\]

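is solved to obtain $\hat{\theta}$. By standard calculation,

\[
n^{1/2}(\hat{\theta}-\theta)=-n^{1/2}\left[E\,L_{\theta\theta}(\theta)\right]^{-1}C(\theta)\,\mathrm{vec}(P)+o_p(1),
\]

where $\theta=(\rho,\zeta)$ and the $k$-th row in $C$ is $\mathrm{vec}(C_k)^{T}$, with the $(i,j)$-th element of $C_k$ being $\frac{1}{\pi_{ij,(H)}}\frac{\partial\pi_{ij,(H)}}{\partial\theta_k}$, provided that $E\,L_{\theta\theta}(\theta)$ is invertible. Assume that the correlation coefficient satisfies $\rho=\rho(\theta)$ in which $\rho_\theta(\theta)$ is nonzero. The delta method (Ferguson, 1996) indicates

\[
n^{1/2}(\hat{\rho}-\rho)\xrightarrow{d}N\left(0,\ \rho_\theta(\theta)^{T}\left[E\,L_{\theta\theta}(\theta)\right]^{-1}C(\theta)\,\Omega\,C(\theta)^{T}\left[E\,L_{\theta\theta}(\theta)\right]^{-1}\rho_\theta(\theta)\right),
\]

where $\Omega$ is the asymptotic covariance matrix of $\mathrm{vec}(P)$. Hence, the asymptotic covariance between $\hat{\rho}^{(UV)}$ and $\hat{\rho}^{(KZ)}$ can be consistently estimated in a similar manner to Eq. (6). Let the matrix $\Upsilon$ be constructed through $\mathrm{vec}(\Upsilon)^{T}=\rho_\theta(\theta)^{T}\left[E\,L_{\theta\theta}(\theta)\right]^{-1}C(\theta)$. Then the asymptotic covariance between $\hat{\rho}^{(UV)}$ and $\hat{\rho}^{(KZ)}$ is consistently estimated by

\[
\sum_{a=1}^{m_U}\sum_{b=1}^{m_V}\sum_{c=1}^{m_K}\sum_{d=1}^{m_Z}\hat{\Upsilon}_{ab}^{UV}\left(p_{abcd}^{UVKZ}-p_{ab}^{UV}p_{cd}^{KZ}\right)\hat{\Upsilon}_{cd}^{KZ}. \tag{7}
\]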

In Olsson (1979), thresholds τ and ξ are parameters for the one-step MLE. However, the thresholds are not always directly estimated for the one-step MLE for other distributions. For example, the density function of a bivariate skew-normal distribution in Azzalini and Valle (1996) is

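\[
f(x,y)=2\phi_2(x,y;\omega)\,\Phi(\alpha_1x+\alpha_2y), \tag{8}
\]

where $\phi_2(\cdot,\cdot;\omega)$ is the density function of the bivariate standard normal distribution with correlation coefficient $\omega$, $\Phi(\cdot)$ is the distribution function of a standard normal distribution, and $\alpha_1$ and $\alpha_2$ control skewness and kurtosis. The covariance matrix of $X$ and $Y$ is

\[
W-\frac{2}{\pi\left(1+\alpha_1^2+2\omega\alpha_1\alpha_2+\alpha_2^2\right)}R, \tag{9}
\]

where

\[
W=\begin{pmatrix}1&\omega\\ \omega&1\end{pmatrix}
\quad\text{and}\quad
R=\begin{pmatrix}
(\alpha_1+\omega\alpha_2)^2 & (\alpha_1+\omega\alpha_2)(\alpha_2+\omega\alpha_1)\\
(\alpha_1+\omega\alpha_2)(\alpha_2+\omega\alpha_1) & (\alpha_2+\omega\alpha_1)^2
\end{pmatrix}.
\]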

Thus, the correlation coefficient is affected by $\omega$, $\alpha_1$, and $\alpha_2$. The marginal distributions are univariate skew-normal distributions with densities

\[
f(x)=2\phi(x)\,\Phi(\alpha x), \tag{10}
\]

where $\alpha=(\alpha_1+\omega\alpha_2)/\left[1+(1-\omega^2)\alpha_2^2\right]^{1/2}$ for $X$ and $\alpha=(\alpha_2+\omega\alpha_1)/\left[1+(1-\omega^2)\alpha_1^2\right]^{1/2}$ for $Y$. Bazán et al. (2006), Molenaar (2015), and Molenaar et al. (2012) have applied the univariate skew-normal distribution to item response theory. The marginal distributions are affected by $\omega$, $\alpha_1$, and $\alpha_2$, as are the thresholds. Therefore, the thresholds are not free parameters. The vector of free parameters in the variant of two-step estimation is $\theta=(\alpha_1,\alpha_2,\omega)$.
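To make the dependence of the correlation on $(\alpha_1,\alpha_2,\omega)$ concrete, the following sketch evaluates the covariance matrix in Eq. (9) and the marginal shape parameters in Eq. (10). The function names are hypothetical, and the code only transcribes the two formulas.

```python
import math

def skew_normal_corr(alpha1, alpha2, omega):
    """Correlation of (X, Y) implied by the covariance matrix in Eq. (9)."""
    c = 2.0 / (math.pi * (1.0 + alpha1**2 + 2.0 * omega * alpha1 * alpha2 + alpha2**2))
    d1 = alpha1 + omega * alpha2
    d2 = alpha2 + omega * alpha1
    var_x = 1.0 - c * d1 * d1
    var_y = 1.0 - c * d2 * d2
    cov_xy = omega - c * d1 * d2
    return cov_xy / math.sqrt(var_x * var_y)

def marginal_alphas(alpha1, alpha2, omega):
    """Shape parameters of the two univariate skew-normal marginals, Eq. (10)."""
    a_x = (alpha1 + omega * alpha2) / math.sqrt(1.0 + (1.0 - omega**2) * alpha2**2)
    a_y = (alpha2 + omega * alpha1) / math.sqrt(1.0 + (1.0 - omega**2) * alpha1**2)
    return a_x, a_y

print(skew_normal_corr(0.0, 0.0, 0.3))   # reduces to omega when alpha1 = alpha2 = 0
print(skew_normal_corr(1.0, 1.0, 0.0))   # nonzero even though omega = 0
```

The second call illustrates why $\omega$ alone cannot be read as the correlation coefficient: with $\omega=0$ and $\alpha_1=\alpha_2=1$ the implied correlation is negative.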

Numerical Examples

A numerical study is conducted in this section to examine the asymptotic bias under different distributional assumptions. Asymptotic limits of PMLE for polychoric correlation coefficients are numerically computed.

Distributional Assumption

Four experiments are conducted in which different true underlying distributions are investigated.

Experiment 1: Elliptical Distribution

In probability and statistics, an elliptical distribution belongs to a broad family of probability distributions. The bivariate joint density function of an elliptical distribution is of the form

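\[
\frac{1}{2\pi\sigma_{11}\sigma_{22}(1-\rho^2)^{1/2}}\,q(z), \tag{11}
\]

where $q(\cdot)$ is a univariate function and

\[
z=\frac{1}{1-\rho^2}\left[\frac{(x-\mu_1)^2}{\sigma_{11}^2}-\frac{2\rho(x-\mu_1)(y-\mu_2)}{\sigma_{11}\sigma_{22}}+\frac{(y-\mu_2)^2}{\sigma_{22}^2}\right]
\]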

with σ11 being the variance of X and σ22 being the variance of Y. An elliptical distribution generalizes the normal distribution and keeps some properties (e.g. Balakrishnan & Lai, 2009; Fang, Kotz, & Ng, 1990; Kelker, 1970). Some examples of the bivariate elliptical distributions that will be used later are

  1. Normal distributions: $q(z)=\exp(-z/2)$;

  2. t(v) distributions with degrees of freedom v: $q(z)=(1+z/v)^{-(v+2)/2}$;

  3. Bivariate uniform distributions: $q(z)=2I\{z\le 1\}$ with $I$ being an indicator function;

  4. Bivariate logistic distributions: $q(z)=4\exp(-z)/\left[1+\exp(-z)\right]^2$;

  5. Bivariate exponential power distributions: $q(z)=2\exp(-z^{\beta}/2)/\left[2^{1/\beta}\Gamma(1+1/\beta)\right]$.
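One quick check on the generators listed above: with the density written as in Eq. (11), switching to polar coordinates shows that it integrates to one exactly when $\int_0^\infty q(z)\,dz=2$. The sketch below verifies this numerically for each generator. The helper `integral_0_inf`, the substitution $z=t/(1-t)$, and the grid size are choices of this illustration, not part of the paper.

```python
import math

def q_normal(z):
    return math.exp(-z / 2.0)

def q_t(z, v):
    return (1.0 + z / v) ** (-(v + 2.0) / 2.0)

def q_uniform(z):
    return 2.0 if z <= 1.0 else 0.0

def q_logistic(z):
    e = math.exp(-z)
    return 4.0 * e / (1.0 + e) ** 2

def q_exp_power(z, beta):
    return 2.0 * math.exp(-z**beta / 2.0) / (2.0 ** (1.0 / beta) * math.gamma(1.0 + 1.0 / beta))

def integral_0_inf(q, n=20000):
    """Midpoint rule for the integral of q over (0, inf), using z = t / (1 - t)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += q(t / (1.0 - t)) / (1.0 - t) ** 2
    return total * h

for name, q in [("normal", q_normal),
                ("t(4)", lambda z: q_t(z, 4.0)),
                ("uniform", q_uniform),
                ("logistic", q_logistic),
                ("EP(0.5)", lambda z: q_exp_power(z, 0.5))]:
    print(name, round(integral_0_inf(q), 4))
```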

The elliptical distribution family plays a very important role in robustness studies (e.g. Kano, Berkane, & Bentler, 1993). In the context of Pearson correlation estimation, Hampel, Ronchetti, Rousseeuw, and Stahel (1986) showed that the PMLE of the covariance matrix is proportional to the MLE under the true distributional assumption, provided that continuous data have been acquired. This result enables us to use any member of the family to estimate the correlation matrix, having the same estimates as if the true distribution were used. Likewise, Berkane, Kano, and Bentler (1994) claimed that “there is practically no cost in treating the distribution as multivariate t with specified (possibly small) degrees of freedom” (Berkane et al., 1994, p. 266) when the true distribution is normal and continuous data are observed. It only slightly inflates the variance of the resulting estimator. Thus, it is worth investigating the effect of an underlying elliptical distribution. Because we have only categorical data, the mean and variance are not identified. But then only the correlation coefficient ρ is the parameter of interest, so we can assume μ1=μ2=0 and σ11=σ22=1.

For some members of the elliptical distribution family, the marginal distribution is still elliptical but not of the same type (Gómez, Gómez-Villegas, & Marín, 2003). The bivariate uniform distribution, the logistic distribution, and the exponential power distribution possess such properties. The support of the bivariate uniform distribution is not the whole Cartesian plane, whereas the other distributions have the whole Cartesian plane as their support. The exponential power distribution includes the normal distribution ($\beta=1$) and the Laplace distribution ($\beta=1/2$) as special cases.

Experiment 2: Skew-Normal Distribution

An elliptical distribution is symmetric. Quiroga (1992) reported that kurtosis does not have strong effects on the polychoric correlation but that skewness increases the bias. The above elliptical distributions examine various values of kurtosis. The following distributions introduce nonzero values of skewness.

A natural generalization of a standard normal distribution is the univariate skew-normal distribution proposed by Azzalini (1985) and extended by Azzalini and Valle (1996) to a multivariate skew-normal distribution. The bivariate density function, covariance matrix and marginal density function are shown in Eqs. (8), (9), and (10), respectively. The ranges of the skewness and excess kurtosis are (-0.9953,0.9953) and [0, 0.8692), respectively (Azzalini & Capitanio, 2014, p. 32). This range is close to the low skewness and low kurtosis case in Flora and Curran (2004). The reader can refer to Azzalini (2005) for an overview of the skew-normal distribution and to Azzalini and Capitanio (2014) for the expressions of skewness and excess kurtosis. Note that the bivariate skew-normal distribution proposed by Azzalini and Valle (1996) is different from the skew-normal distribution in Quiroga (1992). The specification in Azzalini and Valle (1996) is used in the present study for its connection with the skew-t(v) distribution in the next experiment.

Experiment 3: Skew-t(v) Distribution

Skewness can also be introduced to the t(v) distribution. Azzalini and Capitanio (2003) proposed a multivariate skew-t(v) distribution whose bivariate density function is

\[
f(x,y)=2\,t(x,y;\omega,v)\,T\left((\alpha_1x+\alpha_2y)\left[\frac{v+2}{v+(x^2-2\omega xy+y^2)/(1-\omega^2)}\right]^{1/2};\,v+2\right),
\]

where $t(\cdot,\cdot;\omega,v)$ is the density function of a standard bivariate t distribution with correlation $\omega$ and degrees of freedom $v$ and $T(\cdot;v+2)$ is the distribution function of a univariate t distribution with degrees of freedom $v+2$. The covariance matrix of $X$ and $Y$ is

\[
\frac{v}{v-2}W-\frac{2}{\pi\left(1+\alpha_1^2+2\omega\alpha_1\alpha_2+\alpha_2^2\right)}R,
\]

provided that $v>2$. Both marginal distributions are univariate skew-t distributions with density function

\[
f(x)=2\,t(x;v)\,T\left(\alpha x\left[\frac{v+1}{v+x^2}\right]^{1/2};\,v+1\right).
\]

The reader is directed to Azzalini and Capitanio (2003) for the expressions of skewness and excess kurtosis.

Experiment 4: Other Distributions

The skew-normal and t(v) distributions are special cases of the skew-t(v) distribution family. There are many distributions that are not members of the skew-elliptical distribution family. In addition, the underlying distribution cannot be truly determined from the observed ordinal data. It is therefore important to investigate the effect of distributional misspecification using the distributions that do not belong to the skew-t distribution family. A Pareto distribution is commonly used to model income (Arnold, 2008) and income is commonly used as an indicator of socio-economic status. Mardia (1962) proposed a multivariate Pareto distribution in which the bivariate density function is

\[
f(x,y;a,\theta_1,\theta_2)=(a+1)a(\theta_1\theta_2)^{a+1}(\theta_2x+\theta_1y-\theta_1\theta_2)^{-(a+2)},
\]

and the marginal density function is $f(x)=a\theta_i^{a}x^{-(a+1)}$, with $x\ge\theta_1>0$, $y\ge\theta_2>0$, and $a>0$. The correlation coefficient between $X$ and $Y$ is $1/a$, which is always positive.

Numerical Design

Three combinations of categories are used. First, both U and V have five categories with cell probabilities (0.1, 0.2, 0.4, 0.2, 0.1) and (0.1, 0.1, 0.3, 0.3, 0.2), respectively. Second, both U and V have three categories with cell probabilities (0.2, 0.5, 0.3) and (0.1, 0.3, 0.6), respectively. Third, U has three categories with cell probabilities (0.2, 0.5, 0.3) and V has five categories with cell probabilities (0.1, 0.1, 0.3, 0.3, 0.2).

In Experiment 1, β in the exponential power distribution is β=0.3,0.4,0.5,0.6. In Experiments 2 and 3, three values of α1 are considered (α1=0.1,0.5,1) and 20 evenly spaced values of α2 are considered ranging from 0.5 to 10 for both skew-normal and skew-t(v) distributions. Thus, different combinations of univariate skewness and kurtosis are investigated. In Experiments 1, 2 and 3, the degrees of freedom for the t(v) and skew-t(v) distributions are 4, 6, 8, and 10. In Experiment 4, parameters for the Pareto distribution are θ1=θ2=3.

For all experiments, two values of ρ are used: ρ0=0.4,0.6. For the purpose of illustration, the assumed underlying distributions are bivariate normal, skew-normal, and t(v) distributions. The normal assumption consists of only one unknown parameter of interest, ρ. The skew-normal assumption consists of three parameters: α1,α2, and ω that determine the correlation coefficient. The degrees of freedom in the t(v) are prefixed to be 4, 6, 8, 10 and the correlation coefficient is the only parameter of interest. The expressions of the partial derivatives of the skew-normal distribution and t distribution can be found in the supplementary materials.

Numerical Results

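To assess the bias of polychoric correlation estimates, the relative bias (RB) is computed, which is defined as $\mathrm{RB}=100\times(\hat{\rho}-\rho_0)/\rho_0$. Following the definition in Flora and Curran (2004), $|\mathrm{RB}|\le 5$, $5<|\mathrm{RB}|\le 10$, and $|\mathrm{RB}|>10$ indicate slight, moderate, and large bias, respectively. To assess the closeness of the fit, the limit value of RMSEA

\[
\mathrm{RMSEA}=\left[\max\left(\frac{2\sum_{i=1}^{m_U}\sum_{j=1}^{m_V}\pi_{ij,(F)}\log\left(\pi_{ij,(F)}/\pi_{ij,(H)}\right)}{m_Um_V-m_U-m_V},\ 0\right)\right]^{1/2},
\]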

is computed. Owing to space limitations, here only some main results are presented and discussed in this subsection. Complete results can be found in the supplementary materials.
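Both summary measures are simple to compute once the limit of the estimator and the two sets of cell probabilities are available. The sketch below transcribes the definitions of RB and of the limiting RMSEA; the function names and the nested-list representation of the probability tables are assumptions of this illustration.

```python
import math

def relative_bias(rho_limit, rho0):
    """RB = 100 * (rho_limit - rho0) / rho0, in percent."""
    return 100.0 * (rho_limit - rho0) / rho0

def rmsea_limit(pi_f, pi_h):
    """Limiting RMSEA from true (pi_f) and assumed (pi_h) cell probabilities."""
    m_u, m_v = len(pi_f), len(pi_f[0])
    kl = sum(f * math.log(f / h)
             for row_f, row_h in zip(pi_f, pi_h)
             for f, h in zip(row_f, row_h) if f > 0)
    df = m_u * m_v - m_u - m_v  # degrees of freedom; zero for a 2 x 2 table
    return math.sqrt(max(2.0 * kl / df, 0.0))

print(relative_bias(0.36, 0.40))  # about -10, on the border of moderate and large bias
```

Because the RMSEA depends only on the discrepancy between the two probability tables, it can be small even when the relative bias of the correlation is not.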

Experiment 1

Figure 1 displays the RB and RMSEA values when the true correlation is 0.4. As expected, assuming a wrong underlying distribution generally biases the polychoric correlation. Observe that the skew-normal distribution contains the normal distribution as a special case. Thus, both the normal and skew-normal assumptions consistently estimate the polychoric correlation when the true underlying distribution is normal. When the true underlying distribution is a normal distribution or a t distribution, all distributional assumptions produce a low RB (less than 5%). When the true underlying distribution is a uniform distribution or logistic distribution, the normal assumption generally produces a low-biased correlation estimate. However, the t assumption may produce a high RB (Figure 1). The normal and skew-normal assumptions can produce moderately biased polychoric correlations when the underlying distribution is the exponential power distribution with β=0.3 that corresponds to a distribution with high kurtosis (Figure 1). As the kurtosis in the exponential power family decreases, the magnitude of RB concomitantly decreases. When the underlying distribution is non-normal, the normal and skew-normal assumptions may produce different correlation estimates. Thus, the skew-normal distribution adjusts for the underlying non-normality by introducing some degree of skewness. Consequently, the magnitude of RB may become higher but the RMSEA may become lower (Figure 1), which occurs when the number of categories is three for both ordinal variables. The polychoric correlation based on the underlying normal assumption generally underestimates the true correlation coefficient. The t(4) and t(6) assumptions sometimes outperform the normal assumption in Experiment 1.

Fig. 1.

Relative bias (RB) and root-mean-square error of approximation (RMSEA) of correlation estimates when the true underlying distribution belongs to the elliptical distribution family. The true correlation coefficient is 0.4. a RB when both ordinal variables have five categories. b RMSEA when both ordinal variables have five categories. c RB when both ordinal variables have three categories. d RMSEA when both ordinal variables have three categories. Note: Nor = normal, Uni = uniform, Logi = logistic, EP(·) = exponential power distribution with the enclosed value of β.

Experiment 2

As expected, the polychoric correlation is consistently estimated when the true and assumed underlying distributions are both skew-normal (Figure 2). The normal assumption produces negatively biased correlation estimates. It can be moderately or strongly biased unless both α1 and α2 are small. Recall that α1 and α2 control the skewness and kurtosis of the underlying distribution. Small values of α1 and α2 only introduce a small departure from the bivariate normal distribution. All the t distribution assumptions produce similar RBs relative to the normal assumption. Under both the normal and t(v) distribution assumptions, three categories in both ordinal variables generally lead to a higher magnitude of the RB value than five categories in both variables. For example, the RB with five-category variables does not exceed -15 when α1=1 and ρ0=0.4, whereas the RB with three-category variables frequently exceeds -25 under the same condition (Figure 2). As the true value of the correlation increases while the other conditions remain the same, the RB generally becomes smaller (see Figures 7, 8, and 9 in the supplementary materials).

Fig. 2.

Relative bias (RB) of correlation estimates when the true underlying distribution is skew-normal. The true correlation coefficient is 0.4. a α1=0.1 and both ordinal variables have five categories. b α1=0.5 and both ordinal variables have five categories. c α1=1 and both ordinal variables have five categories. d α1=0.1 and both ordinal variables have three categories. e α1=0.5 and both ordinal variables have three categories. f α1=1 and both ordinal variables have three categories.

In Experiment 2, the RMSEA can be misleading when the number of categories is three in both variables. Consider the normal assumption as an example. The magnitude of RB may exceed 10 when α1=0.1,ρ0=0.4, and both ordinal variables have three categories (Figure 2), whereas the RMSEA is still below 0.05 (Figure 3). The pattern is more dramatic when α1=1. The RB is almost -20 when α2=1.5 and ρ0=0.4, but the RMSEA is slightly below 0.05. Thus, the estimated probabilities can be rather close to the true probabilities but the polychoric correlation can be largely biased. This event occurs because RMSEA only measures the closeness between the estimated and true category probabilities, and is not a direct measure of the correlation estimate.

Fig. 3.

Root-mean-square error of approximation (RMSEA) of correlation estimates when the true underlying distribution is skew-normal. The true correlation coefficient is 0.4. Both ordinal variables have three categories. a α1=0.1. b α1=0.5. c α1=1.

On the other hand, although the skew-normal assumption consistently estimates the polychoric correlation in Experiment 2, numerical difficulties such as non-convergence and local maximizers were encountered in the present study. The fit function Lθ can be fairly flat (see Figure 13 in the supplementary materials for an illustration), so a bad choice of starting value for the numerical optimization can lead to these issues. For this reason, 20 starting values are employed. As a result, the skew-normal assumption is computationally much more intensive than the normal assumption.
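The multi-start strategy described above can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation: the toy objective, the crude derivative-free local search, and the names `toy_objective`, `local_minimize`, and `multistart` are all assumptions introduced here. Only the use of 20 starting values comes from the text.

```python
import random


def toy_objective(x):
    """Hypothetical objective with a local minimum near 1.13 and a global one near -1.30."""
    return x**4 - 3.0 * x**2 + x


def local_minimize(f, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Crude local search: move to a better neighbour, halve the step when stuck."""
    x, fx = x0, f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        for cand in (x - step, x + step):
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                break
        else:
            # Neither neighbour improved the objective, so refine the step size.
            step /= 2.0
    return x, fx


def multistart(f, lower, upper, n_starts=20, seed=0):
    """Run the local search from several random starts and keep the best result."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        x, fx = local_minimize(f, rng.uniform(lower, upper))
        if best is None or fx < best[1]:
            best = (x, fx)
    return best


best_x, best_f = multistart(toy_objective, -2.0, 2.0, n_starts=20)
print(f"best x = {best_x:.4f}, objective = {best_f:.4f}")
```

A single start in the wrong basin would stop at the local minimum. Keeping the best of several runs reduces that risk, at a cost that grows roughly in proportion to the number of starts, which is consistent with the computational burden noted above.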

Experiment 3

When the true underlying distribution is a skew-t(4) distribution, the normal and t(v) underlying distributional assumptions lead to a heavily biased polychoric correlation, except when both α1 and α2 are small (Figure 4). A small pair of (α1,α2) introduces only a little skewness and kurtosis into the underlying distribution, which is then similar to a t(4) distribution. As known from Experiment 1, the normal and t(v) underlying distributional assumptions are only slightly biased when the true underlying distribution is a t distribution. The skew-normal assumption may produce only slightly biased correlations when both ordinal variables have three categories and α1 is small (Figure 4). In general, the skew-normal assumption is less biased than the normal and t(v) assumptions. As the degrees of freedom of the skew-t(v) distribution increase, all distributional assumptions become less biased, and the skew-normal assumption in particular is often robust (see the figures in the supplementary materials). This effect is expected because the skew-normal distribution corresponds to the skew-t(∞) distribution, that is, the limit of the skew-t(v) distribution as v tends to infinity. Nevertheless, the normal and t(v) assumptions still produce moderately or heavily biased polychoric correlations. As in the case of the underlying skew-normal distribution, a wrong distributional assumption tends to underestimate the polychoric correlation. Three categories in both ordinal variables generally lead to a larger RB in magnitude than five categories in both ordinal variables, and a higher value of the true correlation coefficient generally leads to less biased estimates. As in Experiment 2, the RMSEA can be misleading: a low RMSEA does not necessarily indicate a low RB (e.g. see Figure 26 in the supplementary materials).

Fig. 4.

Relative bias (RB) of correlation estimates when the true underlying distribution is skew-t(4). The true correlation coefficient is 0.4. a α1=0.1 and both ordinal variables have five categories. b α1=0.5 and both ordinal variables have five categories. c α1=1 and both ordinal variables have five categories. d α1=0.1 and both ordinal variables have three categories. e α1=0.5 and both ordinal variables have three categories. f α1=1 and both ordinal variables have three categories.

Experiment 4

Table 1 shows that all the underlying distributional assumptions tend to be extremely biased when the true underlying distribution is a Pareto distribution. As in Experiments 2 and 3, the polychoric correlation tends to be underestimated across all conditions in Experiment 4. The skew-normal assumption produces a lower RB in magnitude than the normal and t(v) assumptions, although all assumptions generally produce a large RB. The RMSEA tends to be small despite the heavily biased polychoric correlation. In particular, the RMSEA produced by the skew-normal assumption is always low. Note that the Pareto distribution is skewed. Thus, the skew-normal distribution assumption mimics the skewed pattern, although the true correlation coefficient is not consistently estimated.

Table 1.

Relative bias (RB) and root-mean-squared error of approximation (RMSEA) of polychoric correlations in Experiment 4.

                Assumed distribution
mU  mV  ρ0      Normal   t(4)     t(6)     t(8)     t(10)    Skew-normal

RB
3   3   0.4     -47.18   -45.98   -46.29   -46.47   -46.59   -32.89
3   3   0.6     -51.23   -50.17   -50.42   -50.57   -50.68   -38.78
3   5   0.4     -38.23   -37.83   -37.60   -37.60   -37.65   -32.12
3   5   0.6     -43.50   -43.12   -42.92   -42.93   -42.97   -37.99
5   5   0.4     -36.26   -38.19   -36.94   -36.48   -36.28   -31.60
5   5   0.6     -41.96   -43.35   -42.36   -42.01   -41.86   -37.41

RMSEA
3   3   0.4     0.04     0.07     0.05     0.05     0.04     0.00
3   3   0.6     0.05     0.08     0.06     0.06     0.06     0.01
3   5   0.4     0.04     0.05     0.04     0.04     0.04     0.01
3   5   0.6     0.05     0.06     0.05     0.05     0.05     0.01
5   5   0.4     0.03     0.04     0.04     0.03     0.03     0.01
5   5   0.6     0.04     0.05     0.04     0.04     0.04     0.01
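
To make the size of the RB values in Table 1 concrete, the short Python sketch below converts an RB into the asymptotic correlation it implies. It assumes that the RB is a percentage relative bias, RB = 100 × (ρ* − ρ0)/ρ0, where ρ* denotes the asymptotic estimate. That definition is not restated in this section, so it is an assumption here, and the function name `implied_correlation` is hypothetical.

```python
def implied_correlation(rb_percent, rho0):
    """Asymptotic correlation implied by a percentage relative bias.

    Assumes RB = 100 * (rho_star - rho0) / rho0, which is an assumption
    of this sketch rather than a definition quoted from the paper.
    """
    return rho0 * (1.0 + rb_percent / 100.0)


# RB values copied from Table 1 for mU = mV = 3 and rho0 = 0.4.
table1_rb = {"normal": -47.18, "t(4)": -45.98, "skew-normal": -32.89}

implied = {name: implied_correlation(rb, 0.4) for name, rb in table1_rb.items()}
for name, rho_star in implied.items():
    print(f"{name:12s} RB = {table1_rb[name]:7.2f}  implied correlation = {rho_star:.3f}")
```

Under this reading, a true correlation of 0.4 would be estimated as roughly 0.21 under the normal assumption and roughly 0.27 under the skew-normal assumption in the three-category Pareto condition.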

Asymptotic Variance

Figure 5 illustrates the asymptotic variance when the true underlying distribution is a skew-normal distribution. When both ordinal variables have five categories, the skew-normal assumption produces a lower asymptotic variance than the other distributional assumptions. The normal assumption often produces an asymptotic variance similar to that of the t assumption when ρ0=0.4. Otherwise, the normal assumption tends to be slightly less variable than the t assumption. However, Figure 5 also shows that the asymptotic variances under the skew-normal assumption can be substantially higher than those under the other distributional assumptions when both ordinal variables have three categories. Recall that the normal assumption is asymptotically biased (Figure 2); nevertheless, its lower variance may lead to a lower mean squared error than under the skew-normal assumption. Thus, although the skew-normal assumption is asymptotically unbiased, its correlation estimate is likely to depart further from the true value than the estimate under the normal assumption because of the large variation.
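
The trade-off between bias and variance can be made explicit with a standard decomposition of the mean squared error. The following is a sketch rather than a result stated in the paper. It assumes that, under a distributional assumption A, the estimator is asymptotically normal around a limit ρ*_A with asymptotic variance σ²_A, and that the skew-normal assumption (SN) is correctly specified so that its limit equals ρ0. The symbols ρ*_A, σ²_A, and n* are notation introduced here for illustration, with N denoting the normal assumption.

```latex
% Approximate mean squared error at sample size n under assumption A
\mathrm{MSE}_n(A) \approx \left(\rho^{*}_{A} - \rho_0\right)^{2} + \frac{\sigma^{2}_{A}}{n}

% Skew-normal assumption correctly specified: no asymptotic bias
\mathrm{MSE}_n(\mathrm{SN}) \approx \frac{\sigma^{2}_{\mathrm{SN}}}{n}

% The biased normal assumption has the smaller MSE exactly when
\left(\rho^{*}_{\mathrm{N}} - \rho_0\right)^{2} + \frac{\sigma^{2}_{\mathrm{N}}}{n}
  < \frac{\sigma^{2}_{\mathrm{SN}}}{n}
\quad\Longleftrightarrow\quad
n < n^{*} = \frac{\sigma^{2}_{\mathrm{SN}} - \sigma^{2}_{\mathrm{N}}}
                 {\left(\rho^{*}_{\mathrm{N}} - \rho_0\right)^{2}}
```

When σ²_SN exceeds σ²_N, the normal assumption is preferable in mean squared error only for sample sizes below n*. Beyond that point its squared bias, which does not shrink with n, dominates.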

Fig. 5.

Asymptotic variances of correlation estimators when the true underlying distribution is skew-normal. The true correlation coefficient is 0.4. a α1=0.1 and both ordinal variables have five categories. b α1=0.5 and both ordinal variables have five categories. c α1=1 and both ordinal variables have five categories. d α1=0.1 and both ordinal variables have three categories. e α1=0.5 and both ordinal variables have three categories. f α1=1 and both ordinal variables have three categories.

Conclusion and Discussion

In this paper, we study the robustness of polychoric correlation estimation against misspecification of the underlying distributions. The asymptotic polychoric correlation and its asymptotic (co)variance are derived under conditions on the support of the assumed distributions. Unlike in the continuous case, the correlation structure is no longer asymptotically unbiased. Although the bias is sometimes small, a large bias can occur, especially when the true underlying distribution is skewed but a bivariate normal or t distribution is assumed. The numerical examples show that the skew-normal assumption performs as well as the conventional normal assumption when the true underlying distribution is a t distribution and improves on the normal assumption when skewness exists.

Both Flora and Curran (2004) and Quiroga (1992) found that the normal assumption is robust against non-normal data generated from the Fleishman–Vale–Maurelli method. For example, the largest skewness and kurtosis considered in Flora and Curran (2004) are 1.25 and 3.75, respectively. The RB is lower than 10 in most conditions and is lower than 5 when the number of categories is five and ρ0=0.49 (Flora & Curran, 2004, Table 2). Our results show that the polychoric correlation can be heavily underestimated using the normal assumption when the true underlying distribution is a skew-normal distribution whose skewness and kurtosis are bounded by small values. The bias becomes even higher when the true underlying distribution is a skew-t(4) or a Pareto distribution, in which cases the kurtosis is not well defined. Although the skew-normal assumption is also heavily biased sometimes, it greatly improves on the conventional normal assumption. Still, the skew-normal assumption has a much higher variance than the normal assumption when the number of categories is small, so its estimates are volatile. More studies are needed to investigate small-sample volatility in order to provide suggestions for practice.

Lee and Lam (1988) suggested using the correct underlying distributional assumption to estimate the polychoric correlation more accurately if the ordinal data are asymmetric. Because ordinal data carry less information than continuous data, the underlying distribution cannot be inspected visually. If a test of the underlying distribution rejects the assumed distribution, the assumption is questionable and an alternative distributional assumption should be used. In practice, several underlying distributional assumptions can be tested and the most plausible one chosen.

The normal distribution is a special case of the skew-normal distribution. We have shown that both assumptions consistently estimate the polychoric correlation when the true distribution is normal. Thus, the skew-normal assumption, which is able to model skewness and kurtosis, is a natural extension of the conventional normal assumption and frequently outperforms it. However, three parameters are estimated simultaneously under the skew-normal distribution. Because the thresholds are determined through α1, α2, and ω, the gradient and Hessian matrix involve derivatives of the thresholds with respect to α1, α2, and ω. Accordingly, estimation is computationally more difficult than under the normal assumption. In addition, non-convergence and local optimizers were encountered in the present study, and multiple starting values were used to obtain the correlation estimate.

Although only the t and skew-normal assumptions are illustrated as non-normal alternatives in the present study, other distributions that are differentiable with respect to the unknown parameters can be used to estimate the correlation coefficient with the aid of Theorem 1 or Eq. (6). The asymptotic variance and covariance can be estimated using Theorem 2 or Eq. (7). For example, the logistic distribution can be assumed in the two-step estimation and the skew-t distribution can be assumed in the variant of the two-step estimation. It will be of interest to derive analytical expressions for the skew-elliptical distribution family, which includes the skew-normal and skew-t distributions. Our numerical results demonstrate that the skew-normal assumption generally improves on the conventional normal assumption in the idealized case where n is infinite. It would be worthwhile to conduct a simulation study to investigate the small-sample bias in estimating the correlation coefficient and its effects on the bias of parameters in a SEM with ordinal data.
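
For readers who want to see the shape of the two-step procedure referred to above, the following Python sketch applies it to a table of cell probabilities under the conventional normal assumption. It is an illustration under stated assumptions, not the authors' code: the thresholds come from the marginal cumulative probabilities, and the correlation then maximises the multinomial log-likelihood with the thresholds held fixed. Simpson's rule and a golden-section search are simple stand-ins chosen here so that only the standard library is needed, and all function names are hypothetical. Using a different underlying distribution would mean replacing `thresholds` and `rect_prob` with that distribution's quantile function and rectangle probabilities.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def thresholds(marginal):
    """Step 1: thresholds from cumulative marginal probabilities."""
    cuts, cum = [-math.inf], 0.0
    for p in marginal[:-1]:
        cum += p
        cuts.append(STD.inv_cdf(cum))
    return cuts + [math.inf]


def rect_prob(a1, b1, a2, b2, rho, n=200):
    """P(a1 < X <= b1, a2 < Y <= b2) for a standard bivariate normal.

    Integrates phi(x) * P(a2 < Y <= b2 | X = x) over x with Simpson's rule,
    truncating infinite x-limits at +/-8.
    """
    lo, hi = max(a1, -8.0), min(b1, 8.0)
    s = math.sqrt(1.0 - rho * rho)

    def f(x):
        return STD.pdf(x) * (STD.cdf((b2 - rho * x) / s) - STD.cdf((a2 - rho * x) / s))

    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def cell_probs(tau_u, tau_v, rho):
    """Model-implied probabilities for every cell of the contingency table."""
    return [[rect_prob(tau_u[i], tau_u[i + 1], tau_v[j], tau_v[j + 1], rho)
             for j in range(len(tau_v) - 1)]
            for i in range(len(tau_u) - 1)]


def two_step_rho(table, tol=1e-6):
    """Step 2: maximise the log-likelihood over rho with thresholds held fixed."""
    tau_u = thresholds([sum(row) for row in table])
    tau_v = thresholds([sum(col) for col in zip(*table)])

    def neg_loglik(rho):
        model = cell_probs(tau_u, tau_v, rho)
        return -sum(p * math.log(max(q, 1e-300))
                    for p_row, q_row in zip(table, model)
                    for p, q in zip(p_row, q_row))

    # Golden-section search; assumes the objective is unimodal on (-0.99, 0.99).
    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = -0.99, 0.99
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = neg_loglik(c), neg_loglik(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = neg_loglik(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = neg_loglik(d)
    return 0.5 * (lo + hi)


# Self-check: cell probabilities generated under normality with rho = 0.4.
true_table = cell_probs(thresholds([0.2, 0.5, 0.3]), thresholds([0.3, 0.4, 0.3]), 0.4)
rho_hat = two_step_rho(true_table)
print(f"recovered correlation: {rho_hat:.4f}")
```

Because the table in the self-check is generated from a bivariate normal distribution, the assumed and true distributions coincide and the procedure should recover a value close to 0.4. Feeding it cell probabilities from a skewed distribution instead would give the kind of asymptotic estimate whose bias is studied in the experiments.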

Acknowledgments

The research reported in this article has been supported by the Swedish Research Council (VR) under the program: Structural Equation Modeling with Ordinal Variables, 421-2011-1727.

Open Access

This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Contributor Information

Shaobo Jin, Phone: +46184711038, Email: shaobo.jin@statistik.uu.se.

Fan Yang-Wallentin, Phone: +46184715158, Email: fan.yang@statistik.uu.se.

References

  1. Arnold BC. Pareto and generalized Pareto distributions. In: Chotikapanich D, editor. Modeling income distributions and Lorenz curves. New York: Springer; 2008. pp. 119–145.
  2. Azevedo CLN, Bolfarine H, Andrade DF. Bayesian inference for a skew-normal IRT model under the centred parameterization. Computational Statistics & Data Analysis. 2011;55:353–365. doi: 10.1016/j.csda.2010.05.003.
  3. Azzalini A. A class of distributions which includes the normal ones. Scandinavian Journal of Statistics. 1985;12:171–178.
  4. Azzalini A. The skew-normal distribution and related multivariate families. Scandinavian Journal of Statistics. 2005;32:159–188. doi: 10.1111/j.1467-9469.2005.00426.x.
  5. Azzalini A, Capitanio A. Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2003;65:367–389. doi: 10.1111/1467-9868.00391.
  6. Azzalini A, Capitanio A. The skew-normal and related families. Cambridge: Cambridge University Press; 2014.
  7. Azzalini A, Valle AD. The multivariate skew-normal distribution. Biometrika. 1996;83:715–726. doi: 10.1093/biomet/83.4.715.
  8. Balakrishnan N, Lai CD. Continuous bivariate distributions. 2nd ed. New York, NY: Springer; 2009.
  9. Bazán JL, Branco MD, Bolfarine H. A skew item response model. Bayesian Analysis. 2006;1:861–892. doi: 10.1214/06-BA128.
  10. Berkane M, Kano Y, Bentler PM. Pseudo maximum likelihood estimation in elliptical theory: Effects of misspecification. Computational Statistics & Data Analysis. 1994;18:255–267. doi: 10.1016/0167-9473(94)90175-9.
  11. Bolfarine H, Bazán JL. Bayesian estimation of the logistic positive exponent IRT model. Journal of Educational and Behavioral Statistics. 2010;35:693–713. doi: 10.3102/1076998610375834.
  12. Carolina Population Center. National Longitudinal Study of Adolescent to Adult Health (Add Health) [Data file and code book]. 2009. http://www.cpc.unc.edu/projects/addhealth
  13. Chateau D, Metge C, Prior H, Soodeen RA. Learning from the census: The Socio-economic Factor Index (SEFI) and health outcomes in Manitoba. Canadian Journal of Public Health. 2012;4:23–27. doi: 10.1007/BF03403825.
  14. Christoffersson A, Gunsjö A. A short note on the estimation of the asymptotic covariance matrix for polychoric correlations. Psychometrika. 1996;61:173–175. doi: 10.1007/BF02296965.
  15. Fang KT, Kotz S, Ng KW. Symmetric multivariate and related distributions. New York, NY: Chapman and Hall; 1990.
  16. Ferguson TS. A course in large sample theory. New York, NY: Chapman and Hall; 1996.
  17. Fleishman AI. A method for simulating non-normal distributions. Psychometrika. 1978;43:521–532. doi: 10.1007/BF02293811.
  18. Flora DB, Curran PJ. An empirical evaluation of alternative methods of estimation for confirmatory factor analysis with ordinal data. Psychological Methods. 2004;9:466–491. doi: 10.1037/1082-989X.9.4.466.
  19. Gómez E, Gómez-Villegas MA, Marín JM. A survey on continuous elliptical vector distributions. Revista Matemática Complutense. 2003;16:345–361. doi: 10.5209/rev_REMA.2003.v16.n1.16889.
  20. Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA. Robust statistics: The approach based on influence functions. New York, NY: Wiley; 1986.
  21. Hodge RW, Treiman DJ. Social participation and social status. American Sociological Review. 1968;33:722–740. doi: 10.2307/2092883.
  22. Jöreskog KG. On the estimation of polychoric correlations and their asymptotic covariance matrix. Psychometrika. 1994;59:381–389. doi: 10.1007/BF02296131.
  23. Jöreskog KG, Sörbom D. LISREL 8: User’s reference guide. Chicago, IL: Scientific Software International; 1996.
  24. Kano Y, Berkane M, Bentler PM. Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations. Journal of the American Statistical Association. 1993;88:135–143.
  25. Kelker D. Distribution theory of spherical distributions and a location-scale parameter generalization. Sankhyā: The Indian Journal of Statistics, Series A. 1970;32:419–430.
  26. Kullback S. Information theory and statistics. New York, NY: Wiley; 1959.
  27. Kullback S, Leibler RA. On information and sufficiency. Annals of Mathematical Statistics. 1951;22:79–86. doi: 10.1214/aoms/1177729694.
  28. Lee SY, Lam ML. Estimation of polychoric correlation with elliptical latent variables. Journal of Statistical Computation and Simulation. 1988;30:173–188. doi: 10.1080/00949658808811095.
  29. Lucke JF. Positive trait item response models. In: Millsap RE, van der Ark LA, Bolt DM, Woods CM, editors. New developments in quantitative psychology: Presentations from the 77th Annual Psychometric Society Meeting. Vol. 66. New York: Springer; 2014. pp. 199–213.
  30. Mardia KV. Multivariate Pareto distributions. The Annals of Mathematical Statistics. 1962;33:1008–1015. doi: 10.1214/aoms/1177704468.
  31. Maydeu-Olivares A. Limited information estimation and testing of discretized multivariate normal structural models. Psychometrika. 2006;71:57–77. doi: 10.1007/s11336-005-0773-4.
  32. Maydeu-Olivares A, Forero CA, Gallardo-Pujol D, Renom J. Testing categorized bivariate normality with two-stage polychoric correlation estimates. Methodology. 2009;5:131–136. doi: 10.1027/1614-2241.5.4.131.
  33. Maydeu-Olivares A, Joe H. Limited- and full-information estimation and goodness-of-fit testing in 2×n contingency tables: A unified framework. Journal of the American Statistical Association. 2005;100:1009–1020. doi: 10.1198/016214504000002069.
  34. Maydeu-Olivares A, Joe H. Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika. 2006;71:713–732. doi: 10.1007/s11336-005-1295-9.
  35. Molenaar D. Heteroscedastic latent trait models for dichotomous data. Psychometrika. 2015;80:625–644. doi: 10.1007/s11336-014-9406-0.
  36. Molenaar D, Dolan CV, de Boeck P. The heteroscedastic graded response model with a skewed latent trait: Testing statistical and substantive hypotheses related to skewed item category functions. Psychometrika. 2012;77:455–478. doi: 10.1007/s11336-012-9273-5.
  37. Olsson U. Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika. 1979;44:443–460. doi: 10.1007/BF02296207.
  38. Quiroga AM. Studies of the polychoric correlation and other correlation measures for ordinal variables [Unpublished doctoral dissertation]. Uppsala: Uppsala University; 1992.
  39. Santos JRS, Azevedo CLN, Bolfarine H. A multiple group item response theory model with centered skew-normal latent trait distributions under a Bayesian framework. Journal of Applied Statistics. 2013;40:2129–2149. doi: 10.1080/02664763.2013.807331.
  40. Scharoun-Lee M, Adair LS, Kaufman JS, Gordon-Larsen P. Obesity, race/ethnicity and the multiple dimensions of socioeconomic status during the transition to adulthood: A factor analysis approach. Social Science & Medicine. 2009;68:708–716. doi: 10.1016/j.socscimed.2008.12.009.
  41. Scheier MF, Carver CS. Optimism, coping, and health: Assessment and implications of generalized outcome expectancies. Health Psychology. 1985;4:219–247. doi: 10.1037/0278-6133.4.3.219.
  42. Vale CD, Maurelli VA. Simulating multivariate nonnormal distributions. Psychometrika. 1983;48:465–471. doi: 10.1007/BF02293687.
  43. White H. Maximum likelihood estimation of misspecified models. Econometrica. 1982;50:1–25. doi: 10.2307/1912526.
  44. Woods CM, Thissen D. Item response theory with estimation of the latent population distribution using spline-based densities. Psychometrika. 2006;71:281–301. doi: 10.1007/s11336-004-1175-8.

Articles from Psychometrika are provided here courtesy of Springer
