Educational and Psychological Measurement. 2017 Aug 31;79(1):200–210. doi: 10.1177/0013164417725127

Thanks Coefficient Alpha, We Still Need You!

Tenko Raykov,1 George A. Marcoulides2

Abstract

This note discusses the merits of coefficient alpha and the conditions under which they hold, in light of recent critical publications that overlook significant research findings from the past several decades. That earlier research has demonstrated the empirical relevance and utility of coefficient alpha under certain empirical circumstances. The article highlights the fact that, as an index aimed at informing about the reliability of a multiple-component measuring instrument, coefficient alpha is a dependable reliability estimator when those circumstances hold. Alpha should therefore remain in service when these conditions are fulfilled, rather than be abandoned.

Keywords: coefficient alpha, measuring instrument, population discrepancy, reliability, single-factor model


Quality of measurement is of paramount relevance in the behavioral, business, educational, medical, and social sciences and related disciplines. Reliability, as a main index of measurement quality, has attracted an enormous amount of interest among methodologists and substantive researchers over the past century (e.g., McDonald, 1999). As a consequence, a voluminous body of literature has accumulated that is not easy to integrate. It remains essential, however, that due attention be paid to this literature when attempting to extend it with new contributions, particularly those aimed at suppressing main findings that have been well documented over the past several decades.

In spite of the extensive literature on reliability and particularly on coefficient alpha, sweeping claims have been made that measurement quality should not be assessed from this standpoint. Attempts to legitimize such claims are frequently supported by imprecise, incomplete, and incorrect interpretations of, and references to, the original published works. Although it would be nice to think that such problematic viewpoints would be weeded out during the article review process, for a variety of reasons this does not always happen. Recently, McNeish (2018) offered what can be seen in parts as an overly simplified and problematic treatment of reliability, and particularly of coefficient alpha and its relationship to reliability, ultimately calling for the abandonment of alpha as obsolete. In addition to missing a significant portion of the alpha- and reliability-related literature, his discussion is at times one-sided and may even be viewed as potentially misleading. It is the aim of the present note to indicate shortcomings of his account and of exclusively critical recent research on alpha, as well as to point out the continuing relevance of coefficient alpha for reliability- and measurement-related research under appropriate conditions that are examinable in empirical settings. These conditions allow a scholar to assess whether he or she is in a situation where the point and interval estimates of coefficient alpha can be treated as practically dependable estimates of reliability for a given multicomponent measuring instrument (referred to henceforth as “scale”). Although some of our remarks might appear to be excessively critical, like Cudeck (1989) we strongly believe “that it is good for one’s character, not bad for it, to acknowledge past errors and clearly be capable of learning” (p. 317). Ultimately, our concerns are genuine and pertinent to promoting the enrichment of the concepts and practice of quality measurement.

Misunderstandings of Coefficient Alpha and Reliability

The Measure Continuity and Normality “Assumption” Underlying Alpha

Currently the most widely used index purporting to inform about scale reliability is Cronbach’s coefficient alpha (α; e.g., Cronbach, 1951; see also Guttman, 1945). Although this coefficient was popularized by Cronbach (1951), who derived it from several different perspectives, as Cronbach (2004) himself noted, he did not invent it. A number of equivalent coefficients had already been reported in the literature prior to his seminal article (Brennan, 2011). Cronbach (1951), however, did name the coefficient, designating it as the first in a series of analogous coefficients to be labeled in accordance with the Greek alphabet as β, γ, δ, ε, and ζ. While its popularity among empirical behavioral, educational, and social scientists is more than impressive, it is surprising to find views being circulated that α is based on the assumption of continuity and normality of the components comprising a scale or composite under consideration (McNeish, 2018).

To set the stage for the following discussion, as is widely known coefficient alpha is defined in a studied population as follows:

α = [p/(p − 1)] [1 − (Var(Y1) + … + Var(Yp))/Var(Y)],    (1)

where Y1, …, Yp (p > 1) is a given set of components of interest, for example, of a prespecified scale in question, with a sum score denoted by Y = Y1 + … + Yp and Var(.) symbolizing variance.1
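As a purely illustrative aside (ours, not part of the original note), Equation (1) is straightforward to compute from data; a minimal sketch in Python, assuming the component scores are held in an n-by-p NumPy array:

import numpy as np

def coefficient_alpha(scores):
    # scores: n-by-p array; rows are respondents, columns the components Y1, ..., Yp
    p = scores.shape[1]
    component_vars = scores.var(axis=0, ddof=1)    # Var(Yi), i = 1, ..., p
    total_var = scores.sum(axis=1).var(ddof=1)     # Var(Y), with Y = Y1 + ... + Yp
    return (p / (p - 1)) * (1 - component_vars.sum() / total_var)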

We believe it to be especially crucial when talking of “assumptions” underlying α to examine first its definition at the population level, that is, in terms of Equation (1). From this equation, it is readily seen that the only formal assumption that α is based on is the following one (recall, p > 1 was assumed earlier in this note):

Var(Y) > 0,    (2)

that is, the existence of individual differences on the sum score, which is a condition typically fulfilled in empirical research settings. As can also be realized from a thorough reading of Cronbach (1951; see also Guttman, 1945), α entails no essential assumption of (a) item continuity or (b) normality of the items (compare with the declaration by McNeish, 2018, claiming the opposite but without substantiation). (A simple alternative way to see α’s lack of Assumptions (a) and (b) is to note that the well-known Kuder–Richardson reliability estimators/formulas KR-20 and KR-21 are applicable in, and were developed for, the case of binary items, and in fact represent special cases of α; e.g., Cronbach, 1951.)

As shown rigorously in Novick and Lewis (1967), α does not in general equal the reliability coefficient for Y, denoted ρY, which, as is well known, is defined as

ρY = Var(TY)/Var(Y),    (3)

where TY is the true score pertaining to Y (e.g., Raykov & Marcoulides, 2011; Zimmerman, 1975).

We stress that, just like α, the reliability coefficient in Equation (3)—at times also referred to as scale or composite reliability—is defined if and only if Inequality (2) holds, that is, if and only if there are individual differences on Y. (In the remainder of this note, when referring to “reliability” we will mean the coefficient ρY defined in Equation (3), unless stated otherwise, and will assume that Inequality (2) holds.) Thus, no assumption of continuity or normality of Y is needed for this definition of reliability. In actual fact, the reliability of any scale component, and of Y, is defined in any contemporary empirical research setting. The reason is that the only random variables one obtains (measures) in such a setting are bounded both from below and above, and for such variables all moments exist (e.g., DasGupta, 2008); hence, both the numerator and the denominator in Equation (3) are finite numbers and their ratio therefore exists. We also stress that the reliability of any discrete scale component (e.g., a binary item) is just as well defined as that of a continuous component, following Equation (3). (Note that the pertinent true score exists for any given respondent, whether Yj is discrete or continuous, since Yj is a bounded random variable; j = 1, …, p; e.g., Raykov & Marcoulides, 2011.) Hence, scholars can talk of and be interested in determining (evaluating) scale reliability regardless of whether item or component continuity or normality holds. In precisely the same way and sense, it is meaningful to speak of coefficient alpha regardless of whether Conditions (a) or (b) above are fulfilled. (Of course, this fact does not imply that reliability equals α in the general case.)
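To illustrate that reliability is just as well defined for a binary component (our illustration, not from the original note), one can simulate a single dichotomous item whose true score is the respondent-specific response probability; a sketch, assuming a logistic relation between a normal latent trait and that probability:

import numpy as np

rng = np.random.default_rng(0)
trait = rng.normal(size=200_000)    # latent trait for each respondent
t = 1 / (1 + np.exp(-trait))        # true score: P(Y = 1) for each respondent
y = rng.binomial(1, t)              # observed binary (bounded) response
print(t.var() / y.var())            # Equation (3): well defined, roughly 0.2 here

Both variances are finite because the scores are bounded, so the ratio in Equation (3) exists despite the item being discrete.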

By way of summary, neither coefficient alpha nor the scale reliability coefficient are based on assumptions of item continuity and normality, as implied by McNeish (2018) with respect to his comments on coefficient alpha. (In particular, the sample coefficient alpha is a consistent estimator of its population counterpart α in Equation (1), regardless of whether component continuity or normality holds; cf. Raykov, 2012.)

The Definition of Reliability

In order for the field of behavioral, educational, and social measurement to fittingly progress, it is in our opinion essential that a single definition of reliability be used. Historically, logically, and methodologically, we find that this definition can only be the one given in Equation (3). In particular, it is potentially restrictive to define or describe reliability as the squared correlation between true score T and observed score Y (cf. McNeish, 2018), since this is obviously only meaningful when also

Var(T) > 0    (4)

holds. The reason is that otherwise—that is, when there are no true individual differences—this correlation simply does not exist to begin with (cf. Raykov & Marcoulides, 2011). At the same time, behavioral measurement situations where Var(T) = 0 holds are entirely meaningful and far from impossible, since they merely reflect a lack of true individual differences, which is certainly possible in some settings (e.g., in certain subpopulations, or in populations of interest defined in a sufficiently restrictive yet substantively meaningful way). In those cases, accepting the definition of reliability as the correlation between true and observed scores would not be satisfactory, since this correlation is not defined, as just pointed out, while it is perfectly meaningful to speak then of reliability being 0, as follows from its definition in Equation (3) that underlies this note as well.
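To make the restriction explicit, here is a one-step derivation of ours, assuming the classical decomposition Y = T + E with Cov(T, E) = 0:

Corr(T, Y)² = Cov(T, Y)² / [Var(T) Var(Y)] = Var(T)² / [Var(T) Var(Y)] = Var(T)/Var(Y) = ρY,

where the middle equality uses Cov(T, Y) = Cov(T, T + E) = Var(T), and the final cancellation divides by Var(T) and is thus legitimate only when Var(T) > 0.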

Similarly, we believe that there is little benefit, if any, in speaking of a notion of “internal consistency reliability” (e.g., McNeish, 2018). This term has in effect been used in the literature only as a reference to coefficient alpha, and has not been rigorously and formally defined as a concept independent of the latter and in a way that adds significantly to the definition of reliability in Equation (3) (see also McDonald, 1981). Yet this reliability-related term is unfortunately and surprisingly widely circulated in the psychometric and applied behavioral and social research literature. We find that use of this term can do notable disservice to the measurement field and practice, in that it suggests more than one distinct definition, or several equally relevant types, of reliability (cf. Raykov, 2012). Such a conclusion would not, however, be logical or correct, and in fact could be misleading, because alpha and reliability are in general not identical (e.g., Novick & Lewis, 1967).

Last but not least, the theory and practice of behavioral and social measurement cannot benefit from a rendition of the reliability definition as the correlation of two consecutive repeated measurements if “the respondent does not recall their answers from the first administration” (McNeish, 2018). This failed attempt to meaningfully describe reliability apparently relates to the property that, for two parallel measures (with uncorrelated errors), the reliability of either is their correlation (e.g., McDonald, 1999). However, measure parallelism is a markedly stronger assumption than lack of recall of one’s answers, and this lack of recall does not ensure parallelism. The reason is that parallelism of two measures requires that they (a) evaluate the same true score, (b) with the same precision of measurement, and (c) have uncorrelated error terms (i.e., Cov(E1, E2) = 0 holds, where Ej = Yj − Tj is the error score associated with the observed score Yj and pertinent true score Tj, j = 1, 2; e.g., Raykov, Marcoulides, & Patelis, 2015).2 Merely requiring that respondents not recall their answers does not imply any of Conditions (a) through (c); in actual fact, one, two, or all three of these conditions can be violated when only a lack of recall is required.
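For completeness, the property alluded to follows directly from Conditions (a) through (c); a sketch of the standard derivation (ours), under the decomposition Yj = T + Ej, j = 1, 2:

Cov(Y1, Y2) = Cov(T + E1, T + E2) = Var(T), by (a) and (c);
Var(Y1) = Var(Y2) = Var(T) + Var(E), by (b);

hence Corr(Y1, Y2) = Var(T)/[Var(T) + Var(E)] = ρ, the common reliability of the two measures.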

To summarize the discussion in this section, McNeish’s (2018) treatment of reliability, in terms of the definition he uses, is unsatisfactory even for an attempted “tutorial,” which was the aim of his cited account, and in our opinion is in fact quite misleading for this purpose as well.

Misinterpretation of the Relationship Between Coefficient Alpha and Reliability

The relationship between population alpha and reliability has attracted a great deal of research, starting perhaps with the seminal paper by Novick and Lewis (1967; see also Note 1). In that fundamental contribution, they proved a strong and important result that we believe has not yet received the attention it deserves, especially among empirical behavioral and social scientists. As demonstrated in Novick and Lewis (1967), if the error terms of the scale components Yj do not correlate, alpha is lower in the population than reliability, except in the case of a unidimensional scale with equal loadings of the components on their common factor, when alpha and reliability coincide (j = 1, …, p). This result has apparently generated the widespread reference to alpha as a lower bound of reliability, but we stress that it holds only if errors do not correlate (i.e., when any two error terms Ej and Ek correlate 0 in the population; j, k = 1, …, p; j ≠ k). When at least two error terms correlate, however, alpha can as a matter of fact exceed reliability in the population, as repeatedly documented in the literature (e.g., Raykov, 2012, and references therein). We also note here that zero (or, alternatively, nonzero) error covariance is a testable assumption that need not hold in a given empirical setting, and that no assumption of error uncorrelatedness follows from classical test theory (e.g., Zimmerman, 1975).

While alpha is in general a lower bound of reliability for single-factor models with uncorrelated errors, as shown by Novick and Lewis (1967) and mentioned above, coefficient alpha and reliability coincide in the case of true-score equivalent tests with uncorrelated errors (McDonald, 1999), that is, when in addition the loadings are equal for all observed measures (scale components).3 The far-reaching work by Novick and Lewis (1967) may be seen as qualitative in nature, however, since it focuses on the general lower bound property and on when it holds. Their work does not evaluate the extent of the population discrepancy between alpha and reliability, in terms of how this discrepancy depends on the model parameters. In actual fact, it is entirely possible that this discrepancy is negligible in practical terms under certain suitable conditions. (Observe that this discrepancy is nonexistent, i.e., equals 0, in the above case of true-score equivalent measures with uncorrelated errors; Novick & Lewis, 1967.) This insight is also missed by McNeish (2018).
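To make these results concrete (an illustration of ours, not part of the original note), one can compute the population values of alpha and reliability directly from a hypothetical single-factor model with assumed loadings and error (co)variances; a minimal Python sketch:

import numpy as np

def population_alpha(sigma):
    # Equation (1) at the population level; Var(Y) is the total sum of Sigma.
    p = sigma.shape[0]
    return (p / (p - 1)) * (1 - np.trace(sigma) / sigma.sum())

lam = np.array([0.8, 0.7, 0.6, 0.5])    # unequal loadings; factor variance 1
theta = np.diag(1 - lam ** 2)           # uncorrelated error terms
sigma = np.outer(lam, lam) + theta      # implied covariance matrix
rho = lam.sum() ** 2 / sigma.sum()      # Equation (3): Var(TY)/Var(Y)
print(population_alpha(sigma), rho)     # approx. 0.742 < 0.749: alpha is a lower bound

theta[0, 1] = theta[1, 0] = 0.15        # now let two error terms correlate
sigma = np.outer(lam, lam) + theta
rho = lam.sum() ** 2 / sigma.sum()
print(population_alpha(sigma), rho)     # approx. 0.761 > 0.725: alpha exceeds reliability

With equal loadings and uncorrelated errors (e.g., all loadings set to 0.7), the two printed values coincide, in line with Novick and Lewis (1967).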

To address this theoretically and practically important population alpha-to-reliability discrepancy (ARD), Raykov (1997) examined it in quantitative terms and found that the ARD can be treated as empirically ignorable in single-factor models with uncorrelated errors where the loadings are uniformly high, that is, not only when the loadings are equal but also when they are sufficiently similar and high while being unequal (see below). In particular, Table 1 in Raykov (1997) provides upper bounds on the ARD for a wide range of applied settings of practical relevance (with single-factor models and uncorrelated errors, whereby the underlying metric is set by the assumption of unit latent variance). In addition, and for the general case, Equation (19) in Raykov (1997) allows straightforward determination or empirical evaluation (viz., both point and interval estimation) of the ARD as a function of the factor loadings and related model parameters. That formal expression can be used by anyone who (a) is interested in finding out whether alpha can be seen as practically identical to reliability, assuming the values of the parameters needed for evaluating the ARD are known (or can be assumed known to a sufficiently trustworthy degree), or (b) wishes to obtain point and interval estimates of the ARD in an empirical study using sample data (in case of a tenable single-factor model with uncorrelated errors for the scale under consideration; see Note 4).
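For readers who wish to evaluate this bound numerically, here is a short sketch of ours in Python; the formula is Inequality (31) in Raykov (1997), as also used in the appendix Mplus code, and a single-factor model with uncorrelated errors and unit factor variance is assumed:

import numpy as np

def ard_upper_bound(loadings):
    # Upper bound on the population alpha-to-reliability discrepancy (ARD);
    # mirrors the MODEL CONSTRAINT section of the appendix Mplus code.
    lam = np.asarray(loadings, dtype=float)
    p = lam.size
    ml = lam.mean()                  # mean loading (ML in the appendix)
    m = ((lam - ml) ** 2).mean()     # mean squared loading deviation (M)
    return m / ((p - 1) * ml ** 2)

print(ard_upper_bound([0.75, 0.80, 0.85, 0.80]))   # approx. 0.00065: ignorable

With uniformly high and similar, though unequal, loadings the bound is negligible, consistent with Table 1 in Raykov (1997).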

Based on these results in Raykov (1997), Raykov and Marcoulides (2015; see also Raykov, West, & Traynor, 2015, for the complex sampling design case) provided a latent variable modeling (LVM; Muthén, 2002) based procedure for evaluating whether the amount by which alpha falls below reliability is negligible. Their approach makes instrumental use of confidence intervals and permits a researcher to find out, in an empirical setting, whether alpha would be a dependable index (estimator) of scale reliability (see the preceding paragraph for details). That approach is directly applicable when the single-factor loadings of the scale components (with uncorrelated errors) are unequal, which is when its utility is highest. (That coefficient alpha can be used in a completely trustworthy way as an estimator of reliability in the case of equal loadings in this model, with uncorrelated errors, has been known for at least 50 years; e.g., Novick & Lewis, 1967.)
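As a rough stand-in for such interval estimation (a sketch of ours; the published procedure is LVM based and implemented in Mplus, and is not reproduced here), one can bootstrap the sample alpha itself in Python:

import numpy as np

def sample_alpha(x):
    # coefficient alpha per Equation (1); x is an n-by-p array of component scores
    p = x.shape[1]
    return (p / (p - 1)) * (1 - x.var(axis=0, ddof=1).sum() / x.sum(axis=1).var(ddof=1))

def percentile_ci(x, stat, n_boot=2000, level=0.95, seed=1):
    # nonparametric percentile bootstrap confidence interval for stat(x)
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    boots = np.array([stat(x[rng.integers(0, n, size=n)]) for _ in range(n_boot)])
    return tuple(np.quantile(boots, [(1 - level) / 2, (1 + level) / 2]))

# usage: lo, hi = percentile_ci(data, sample_alpha), with data an n-by-p array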

Should We Really Abandon the Use of Coefficient Alpha?

A main message attempted by McNeish (2018) is the general call for abandoning the use of coefficient alpha as obsolete. Based on the research highlighted in the preceding section of this note, however, which is surprisingly missing from his account, this argument is seen as not only unjustifiable but also potentially misleading. In particular, under the assumption of a single-factor model with uncorrelated errors where the conditions indicated in Raykov and Marcoulides (2015) are fulfilled, there is no need—contrary to McNeish (2018) and other exclusively critical recent discussions of coefficient alpha—to abandon the use of α as an index of reliability. In simple terms, the conditions in Raykov (1997, Table 1, p. 344; see also Raykov & Marcoulides, 2015, for their empirical examination) go beyond the assumption of equal factor loadings (in a single-factor model with uncorrelated errors) and are more likely to be fulfilled in more advanced stages of instrument development and revision, as well as possibly in other circumstances. In these scale development phases, the single-factor model with uncorrelated errors is more likely to be plausible and the individual measures used are more likely to show (sufficiently) uniformly high loadings on the underlying common factor. In all these cases, which are not rare in empirical behavioral and social research, coefficient alpha still serves well, and it would thus be unjustifiable to discontinue its use. A numerical example of such a circumstance is provided and discussed in detail in, for instance, Raykov and Marcoulides (2015), where LVM is used to obtain point and interval estimates of coefficient alpha as (practically) dependable estimates of scale reliability. Last but not least, Mplus source code is provided in the appendix for point and interval estimation of an upper bound of the ARD in an empirical setting where the single-factor model with uncorrelated errors is plausible.4

Conclusion

This note refutes the general call for abandoning the use of coefficient alpha in empirical behavioral and social research, which is the essence of the recent discussion provided by McNeish (2018). His argumentation misses extensive relevant literature from the past couple of decades or so, along with its important findings showing the (practical) dependability of coefficient alpha as a reliability index under certain empirically examinable conditions. It is therefore not justifiable, based on the evidence provided in that omitted research, to frivolously abandon the use of coefficient alpha under these conditions, which are explicated in Raykov (1997, Equation [19] and Inequality [31]; see also Table 1, p. 344) and can be assessed in practice as discussed in Raykov and Marcoulides (2015; see also Raykov, 2012; Raykov, West, et al., 2015).

Without doubt, the relationship between coefficient alpha and scale reliability is rather complex, and the extant findings on it are admittedly hard to integrate, as evinced by the enormous body of research related to alpha that has developed over the past 60 years or so. That research does not, however, permit overly simplistic general calls for abandoning alpha—or alternatively for using it all the time—as an index informing about scale reliability. The alpha-to-reliability relationship is further complicated by the simple fact that an empirical behavioral or social researcher typically has access only to a single sample (or at most a very limited number of samples) from a studied population that cannot be exhaustively observed and measured on the variables of interest (scale components). In that sample, due to ubiquitous sampling error, the relationship between sample alpha and sample reliability is unrestricted, that is, it can go in either direction (e.g., Raykov, 2012). Whereas this is a serious epistemological issue, we believe it is nonetheless important to be aware of the empirically examinable conditions under which coefficient alpha is sufficiently close to scale reliability that the two can be considered practically the same. These conditions include, but are not limited to, the case of equal loadings within a single-factor model with uncorrelated error terms; they are explicated in Raykov (1997, Equation 19 and Inequality 31; see also Table 1) and in Raykov and Marcoulides (2015), along with methods for their assessment (see also Raykov, West, et al., 2015, for the complex sampling design case of increasing relevance in contemporary behavioral and social research). Well-informed reliability-related research therefore needs to keep these conditions in mind, rather than fall prey to simplistic calls for abandoning the use of coefficient alpha altogether. Such calls are unjustifiable and misleading, but unfortunately not rare in the recent measurement literature. It is this type of simplistic, poorly informed call that contributes to creating and ultimately widening the gap between measurement methodology and measurement practice, not necessarily the alleged lack of quantitative sophistication of empirical behavioral and social scientists espoused by some of the recent methodological literature.

While the aim of the present note is to refute a general call for abandoning the use of coefficient alpha, this article does not state, claim, or imply that (a) alpha should always be used instead of alternative available reliability estimation methods when they are applicable (see, e.g., Raykov, 2012, and Raykov & Marcoulides, 2016, for some of them, including modifications applicable under assumption violations); or (b) that the deficiencies of coefficient alpha are in general less serious than discussed in the rigorous literature (e.g., Novick & Lewis, 1967; see also Raykov, 1997); or (c) that coefficient alpha has any optimality features that have not been explicated already in that literature. With that said, however, we do not believe there are reasons not to use coefficient alpha, both with its point and interval estimates—for instance, as discussed in Raykov and Marcoulides (2015) when their LVM method is applicable—when suitable empirically examinable conditions are fulfilled, for example, in more advanced scale construction and development phases with single-factor models with uncorrelated errors and uniformly high loadings (see preceding discussion in this note and Raykov, 1997, Equation 19, Inequality (31), and Table 1 for details).

Acknowledgments

We are grateful to V. Savalei for a valuable discussion on reliability estimation.

Appendix

Mplus Source Code for Point and Interval Estimation of an Upper Bound of the Population Discrepancy Between Coefficient Alpha and Scale Reliability

TITLE: EVALUATION OF AN UPPER BOUND FOR THE ARD.
 (SEE INEQUALITY (31) IN RAYKOV, 1997, AND NOTE BELOW).
DATA: FILE = <NAME OF DATA FILE>;
VARIABLE: NAMES = V1-V_P; ! P = # SCALE COMPONENTS (SEE MAIN TEXT).
MODEL: F BY V1* (L1)
 V2-V_P (L2-L_P);
 F@1;
ANALYSIS: BOOTSTRAP = 2000;
MODEL CONSTRAINT: NEW(UPPRBND, M, ML); ! UPPRBND = UPPER BOUND OF THE ARD.
 ML = (L1+L2+…+L_P)/P; ! MEAN LOADING.
 M = (1/P)*((L1-ML)^2+(L2-ML)^2+…+(L_P-ML)^2); ! SEE INEQUALITY (31)
 UPPRBND = M/((P-1)*ML^2); ! IN RAYKOV (1997).
OUTPUT: CINTERVAL(BCBOOTSTRAP); ! YIELDS CIs FOR THE UPPER BOUND OF THE ARD.

Note. CI = confidence interval. When using this command file, enter for “_P” the number of components of the scale under consideration, and adjust the VARIABLE and MODEL commands and the middle pair of rows in the MODEL CONSTRAINT section accordingly, by writing them out in full.

1.

The discussion in this note is developed at the population level unless specifically indicated otherwise. In particular, whenever referring to coefficient alpha and/or reliability, we mean their population values, as defined in Equations (1) and (3), respectively, unless expressly stated otherwise. Similarly, the components of a scale under consideration are assumed, as usual, to be fixed, that is, prespecified rather than randomly drawn from a larger pool of components (e.g., Raykov, 1997). If a scale with p = 2 components is under consideration, appropriate additional restrictions are needed for identification of the single-factor model of relevance later in this discussion.

2.

Testing parallelism of two given (indivisible) measures is not possible with contemporary covariance structure modeling methods, as discussed in detail in Raykov, Marcoulides, and Patelis (2015). This test is possible within an appropriate unidimensional model comprising more than two measures. However, the test of unidimensionality then, as an overall test statistic, is in general not sensitive to “local” violations of homogeneity that are sufficiently weak not to be detected in that statistic. Hence, testing parallelism of two (indivisible) measures is at present either impossible with covariance structure analysis methods or not in general dependable when based on overall goodness of fit indices.

3.

In the single occasion measurement setting that is underlying this note, the congeneric measure model (congeneric test model; Jöreskog, 1971) and the single-factor model are not empirically distinguishable, and so we use or refer to these concepts (models) synonymously.

4.

An application of the alpha point and interval estimation method in Raykov and Marcoulides (2015), along with the method in Raykov (2012) for such estimation of scale reliability, will readily yield, within the same modeling session, point and interval estimates of the ARD for a plausible single-factor model with uncorrelated errors (see also the appendix).

Footnotes

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Brennan, R. L. (2011). Generalizability theory and classical test theory. Applied Measurement in Education, 24, 1-21.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
Cronbach, L. J. (2004). My current thoughts on coefficient alpha and successor procedures. Educational and Psychological Measurement, 64, 391-418.
Cudeck, R. (1989). Analysis of correlation matrices using covariance structure models. Psychological Bulletin, 105, 317-327.
DasGupta, A. (2008). Asymptotic theory of statistics and probability. New York, NY: Springer.
Guttman, L. (1945). A basis for analyzing test-retest reliability. Psychometrika, 10, 255-282.
Jöreskog, K. G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109-133.
McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of Mathematical and Statistical Psychology, 34, 100-117.
McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Erlbaum.
McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23, 412-433.
Muthén, B. O. (2002). Beyond SEM: General latent variable modeling. Behaviormetrika, 29, 87-117.
Novick, M. R., & Lewis, C. (1967). Coefficient alpha and the reliability of composite measurement. Psychometrika, 32, 1-13.
Raykov, T. (1997). Scale reliability, Cronbach’s coefficient alpha, and violations of essential tau-equivalence for fixed congeneric components. Multivariate Behavioral Research, 32, 329-354.
Raykov, T. (2012). Scale development using structural equation modeling. In R. Hoyle (Ed.), Handbook of structural equation modeling (pp. 472-492). New York, NY: Guilford Press.
Raykov, T., & Marcoulides, G. A. (2011). Introduction to psychometric theory. New York, NY: Taylor & Francis.
Raykov, T., & Marcoulides, G. A. (2015). A direct latent variable modeling based procedure for evaluation of coefficient alpha. Educational and Psychological Measurement, 75, 146-156.
Raykov, T., & Marcoulides, G. A. (2016). Scale reliability evaluation under multiple assumption violations. Structural Equation Modeling, 23, 302-313.
Raykov, T., Marcoulides, G. A., & Patelis, T. (2015). The importance of the assumption of uncorrelated errors in psychometric theory. Educational and Psychological Measurement, 75, 634-647.
Raykov, T., West, B. T., & Traynor, A. (2015). Evaluation of coefficient alpha for multiple component measuring instruments in complex sample designs. Structural Equation Modeling, 22, 429-438.
Zimmerman, D. W. (1975). Probability spaces, Hilbert spaces, and the axioms of test theory. Psychometrika, 40, 395-412.
