Abstract
In this article, I argue for a number of important changes to the conceptual foundations of construct validity theory. I begin by suggesting that construct validity theorists should shift their attention from the validation of constructs to the process of evaluating scientific theories. This shift in focus is facilitated by distinguishing construct validation (understood as theory evaluation) from test validation, thereby freeing it from its long-standing focus on psychological measurement. In repositioning construct validity theory in this way, researchers should jettison the outmoded but superficially popular notion that theories are nomological networks in favor of a more plausible pragmatic view of their natures, such as the idea that theories are explanatorily coherent networks. Consistent with this shift in understanding the nature of theories, my recommendation is that construct validation should embrace an explanationist perspective on the theory evaluation process to complement its focus on hypothetico-deductive theory testing. On this view, abductive research methods have an important role to play. The revisionist perspective on construct validity proposed here is discussed in light of relevant developments in scientific methodology and is applied to an influential account of the validation process that has shaped research practice.
Keywords: construct validity, nomological networks, pragmatic theories, theory evaluation, abductive methods, scientific methodology, argument-based validity
Perhaps it is best to view scientific theories as a motley mix, deployed in research practices in very different ways.
—Barker & Kitcher (2014, p. 37)
Validity is a matter of inference and the weighing of evidence; however, in my view, explanatory considerations guide our inferences.
—Zumbo (2009, p. 69)
The notion of construct validity is of signal importance to psychology because it is often taken to be the driving force behind the evaluation of theoretical ideas in relation to measurement endeavors. As is well known, Cronbach and Meehl (1955) provided the first detailed statement of the philosophy and methodology of construct validity. Prior to the publication of their article, validity studies focused heavily on predicting criterion behavior from tests and measurements. By contrast, Cronbach and Meehl’s formulation of the idea of construct validity signaled an important conceptual shift from a concern with predictive devices to one in which empirical evidence was taken as a “sign” of underlying theoretical constructs. Influenced to a great extent by the standard interpretation of mid-20th century philosophy of logical empiricism, 1 they portrayed construct validation as a theory-testing endeavor in which hypotheses about unobservable psychological states were embedded in networks of laws and empirically tested using hypothetico-deductive methods.
Even though Cronbach and Meehl’s original formulation of construct validity exerted an enormous influence on the field, and remains the standard reference in the validity literature, 2 it has been modified and criticized by many validity theorists in a number of ways since its inception. One important early criticism of Cronbach and Meehl’s (1955) work was Embretson’s (1983) push for its redirection to modeling cognitive processes underlying task performance. More recently, Kane’s (2013) argument-based approach to validity, which depicted the process of validation as the development of sound arguments, provided an attractive framework and methodology for researchers seeking guidance in validity, assessment, and measurement research. Also important among the more recent proposals made by psychologists are the ideas that construct validity is moribund and that test validity should be the proper focus of validity inquiries (Borsboom et al., 2009), that construct validity centrally involves making inferences of an explanatory nature (Zumbo, 2009), and that a philosophy of scientific realism, tailored to the demands of psychology, provides the most appropriate metascientific outlook on construct validity (Slaney, 2017).
More recently, a number of philosophers of science have also examined the conceptual foundations of construct validity theory. Alexandrova and Haybron (2016) claimed that construct validity research on well-being is “theory avoidant” and that more philosophical work on the topic is needed. In the face of confusion and debate in the construct validity literature, Stone (2019) has stressed the importance of distinguishing between the construct validity of a measure and construct legitimacy, which is a property of a construct in relation to its surrounding theory. Further, Feest (2020) has undertaken an examination of various aspects of construct validity theory in relation to the Implicit Association Test. She argued that construct validity comes in degrees and that a construct may be worth developing even if its associated test has low construct validity. Finally, Schaffner (2020) depicted construct validation as a “construct-progressivity assessment” exercise in which theories and models are evaluated in terms of their “theoretical” virtues. He applied this perspective to a comparison of two neurobiological theories of fear and anxiety.
In this article, I, too, argue for a number of important changes to the conceptual foundations of construct validity theory—changes that should be of interest to validity theorists and reflective practitioners. In doing so, I will refer briefly to a number of the contributions mentioned above. I begin by suggesting that construct validity theorists should shift their attention from the validation of constructs to the process of evaluating scientific theories. This shift in focus is facilitated by distinguishing the enterprise of construct validation (understood as theory evaluation) from test validation, thereby freeing it from its long-standing focus on psychological measurement. In repositioning construct validity theory in this way, researchers should jettison the outmoded but superficially popular notion in psychology that theories are nomological networks in favor of a more plausible pragmatic view of their natures, such as the idea that theories are explanatorily coherent networks. Consistent with this shift in understanding the nature of theories, my recommendation is that construct validation should embrace an explanationist perspective on the theory evaluation process to complement its focus on hypothetico-deductive theory testing. On this view, abductive research methods have an important role to play. The revisionist perspective on construct validity proposed here is discussed in light of relevant developments in scientific methodology.
The specific goals of this article are to identify a wide range of different explanatory theories found in psychological science and to present a number of underappreciated abductive research methods that usefully serve to evaluate those explanatory theories. To this end, I provide an example of comparative theory evaluation that illustrates the kind of research afforded by the explanationist perspective. I also indicate how my explanationist theory of construct validation provides a supportive theoretical framework for Kane’s (2013) influential argument-based approach to validation practice.
In order to forestall a possible misunderstanding, I use the term validity in the expression “construct validity” to denote a property of cognitive products, such as a scientific explanation. I use the term validation to signify the process of evaluating the epistemic worth of cognitive products, such as appraising explanatory theories by employing the method of inference to the best explanation (cf. Zumbo, 2007).
Distinguishing Construct Validity From Test Validity
It is well known that the notion of construct validity was strongly shaped by its origins in educational and psychological measurement. In this regard, Cronbach and Meehl’s (1955) influential conception of construct validity was formulated in the immediate light of the “Technical Recommendations for Psychological Tests and Diagnostic Techniques” (American Psychological Association, 1954). Construct validity continues to be thought of as an important—some scholars say the most important—form of test validity, displaying a strong concern with educational and psychological measurement. At the same time, there is a less visible strain running through the validity literature that focuses on the validation of constructs (and to a lesser extent, the theories that contain them). In this way, the validity literature constitutes a mix of measurement concerns and theory evaluation endeavors. Historical contingencies are responsible for this intertwining, which modest conceptual engineering can tease apart. I believe that, for now, there are benefits to separating these two aspects of validity and viewing them as complementary undertakings. By creating a division of cognitive labor in this way, the prospects of advancing theory validation thinking and practice, without an immediate concern with measurement, might well improve.
Interestingly, validity theorists occasionally express a preference for separating construct validity from measurement concerns. The revisionist work by Borsboom and his collaborators (Borsboom et al., 2004, 2009) is a prominent case in point. These authors forcefully argued that the notion of construct validity is deeply problematic and that it should be abandoned in favor of retaining and strengthening the idea of test validity, which they take to be the proper concern of validity. For them, construct validity is a property of test-score interpretations understood in terms of constructs and represents the strength of the evidence for such interpretations. By contrast, Borsboom et al. take test validity to capture the old idea that a test is valid if it measures what it purports to measure.
I think Borsboom et al.’s (2009) strong critique of construct validity carries considerable force against Cronbach and Meehl’s mid-20th century formulation of the idea. However, rather than attempt to improve on the nomological network view of scientific theory, they take the inadequacy of that view as a major reason for abandoning the notion of construct validity altogether. My preference is to revise our view of construct validity by improving our conception of scientific theories and, relatedly, providing a deeper understanding of their appropriate validation methods.
Although construct validity and test validity are typically thought of as one undertaking, there is genuine merit in distinguishing them, as Borsboom et al. (2009) did. However, although they are sometimes seen as competitors (as they are in Borsboom et al.’s critique), I do not think that they should be. As I see it, construct validation is directly concerned with employing scientific inference to obtain evidence in the course of evaluating constructs in theories. By contrast, test validation is concerned with ascertaining the accuracy of measuring instruments (tests) in establishing the truth about psychological attributes. Given that each performs a different function, they should be able to coexist peacefully, complement each other, and interact constructively, where appropriate. Indeed, construct validation, understood as theory appraisal, should be used to evaluate the cognitive theories of test performance that Borsboom et al. rightly desire (Borsboom, 2006; Borsboom et al., 2009). This view of construct validation also brings to mind Embretson’s (1983) construct-modeling approach to the topic mentioned in the introduction. Her account emphasized the importance of construct representation, which “is concerned with identifying the theoretical mechanisms that underlie task performance” (p. 180). It also introduced the complementary idea of nomothetic span, which “refers to the network of relationships of a test to other measures” (p. 180). Further, Embretson’s idea of construct representation informatively captured the notion of mechanistic theories described later in this article. Mechanistic theories, it will be seen, explicitly model in detail the causal processes involved in the production of effects.
Additionally, although measurement is unquestionably of major importance in science, a good deal of reputable science focuses on qualitative attributes and, as a consequence, is not directly concerned with measurement. Construct validation, understood as the process of theory appraisal, is largely a qualitative undertaking in which explanatory theories are evaluated using criteria related to explanatory goodness. Borsboom et al. (2009) note that the construct validation literature tells us little about evaluative criteria, such as parsimony. However, an updating of methods of theory appraisal (e.g., the theory of explanatory coherence) would correct matters in this regard, a project that Borsboom has recently engaged in (Borsboom et al., 2021; Maier et al., 2023). Given that scientific methods lie at the heart of Cronbach and Meehl’s (1955) take on construct validity, the appropriateness of the methods used to establish construct validity should be regarded as an indispensable part of its reconceptualization.
Another way of appreciating that the process of theory construction is different from a direct concern with measurement is to invoke the important methodological distinctions between data, phenomena, and theory (Bogen & Woodward, 1988; Haig, 2014; Woodward, 1989). Measurement is often employed in the process of moving from data to empirical phenomena. By contrast, theory construction is typically undertaken to explain empirical phenomena, not data. Here, measurement and theory are usefully contrasted in terms of the different roles they often perform in science: When present, measurement tends to operate in a data-based inductive process, whereas theory construction is a phenomena-based explanatory process. In this way, the methodological gap between data and theory can serve as a natural means for separating measurement concerns from theory construction endeavors. However, it is important to appreciate that empirical phenomena mediate the gap between data and theory, but they do not close it.
From Construct Validity to Theory Evaluation
In the concluding section of their 1955 article, Cronbach and Meehl stated that “the investigation of a test’s construct validity is not essentially different from the general scientific procedures for developing and confirming theories” (p. 300). This assertion deserves examination for several reasons. First, Cronbach and Meehl’s claim simultaneously speaks of the validity of constructs and theories. Despite this dual focus, the ensuing literature on construct validity has focused more on the explication of constructs than on seriously engaging in the construction of theories; indeed, constructs have become psychology’s favored unit of analysis. Here, I take constructs to be components of theories rather than identical with them. On this interpretation, the meaning of constructs is inextricably bound up with the theories in which they are embedded, which provide the appropriate context for their proper understanding. 3 Second, validity theorists and researchers continue to speak of their theories as “nomological nets,” but they show little interest in taking the corresponding idea seriously or in articulating alternative accounts of the nature of theories. Third, as just noted, the Cronbach and Meehl quotation suggests that employing scientific methods lies at the heart of construct validity, which, by small extension, leads to the idea that theories of scientific method should be taken seriously by construct validity theorists. By and large, this has not happened. Finally, Cronbach and Meehl’s assertion suggests that the methods of construct validation should be concerned with the development as well as the confirmation of theories—something that has received insufficient attention in the literature.
These matters will be addressed in this article, but for now I want to press the general point that establishing construct validity should focus primarily on the process of evaluating scientific theories rather than the constructs embedded in them. As a consequence, two major questions need to be addressed: (a) What are scientific theories? and (b) How should they be evaluated? This article concentrates on answering these questions.
What Are Scientific Theories?
Although psychologists have recently shown an increased interest in how to construct scientific theories—the special issue of Perspectives on Psychological Science on “Theory in Psychological Science” (Proulx & Morey, 2021) is a prominent recent example—they have given considerably less attention to the question, “What are scientific theories?” However, this question has been a long-standing focus of philosophers of science, many of whom believe that scientific theories are our most important epistemic achievements. 4 Most prominent among their answers to this question are the two contrasting ideas that theories are sentential syntactic structures cast as formal axiomatic systems and that they are (on one influential formulation) nonsentential semantic entities composing families of models (e.g., Suppe, 1977, 2000). A third, and rather different, view of theories has more recently garnered attention and is now regarded as an attractive outlook. For want of a more exact term, it can be called the pragmatic view of theories (Winther, 2020). As will be seen shortly, it is a view that resists a uniform and straightforward characterization.
In the next section of the article, I briefly consider construct validity’s official embrace of the idea that theories are syntactic structures, depicted as nomological networks, and argue that it is inappropriate for psychology. I then outline the pragmatic perspective on theories, which is better suited to psychology. I set the semantic alternative to one side, partly because I think it assigns too great a role in science to models as formal devices but also because science makes use of a considerable variety of models, some of which have little to do with theories.
Theories as nomological networks
The idea that theories are nomological networks is central to Cronbach and Meehl’s (1955) conception of construct validity, and it has been endorsed by countless validity researchers. Revealingly, this endorsement is in name only; researchers and methodologists seldom, if ever, attempt to characterize their theories as nomological networks in which observational statements are linked to theoretical laws via coordinating definitions.
Nomological networks are quasiformal expressions of the standard logical empiricist idea that theories are interpreted syntactic structures cast as axiomatic systems. As such, they conform, roughly, to what Feigl (1970) called “the orthodox view” of theories. Feigl’s pictorial representation of such theories displays a network of theoretical hypotheses, or postulates, containing primitive concepts, which are connected to empirical concepts grounded in the “soil” of observation or experience. The theoretical hypotheses are scientific laws or approximations thereof. The primitive concepts are named by theoretical terms, which purportedly designate theoretical, or latent, entities. Statements about observed matters of fact are related to the theoretical claims through implicit, or coordinating, definitions of the theoretical concepts.
Tellingly, real-life examples of psychological theories formulated as nomological networks are virtually impossible to come by and for good reason: Nomological networks are rational reconstructions devised by philosophers of science to express their view of how one ought to think about scientific theories. Moreover, given their nature, these networks cannot provide plausible reconstructions of how psychological scientists understand the different kinds of theories they formulate in their research endeavors.
In particular, the nomological network conception of scientific theories inherits a number of problems that have been identified with the syntactic view of theories. These include heavy reliance on a restrictive notion of what counts as a scientific law, adherence to a problematic formulation of the distinction between observation and theory statements, and with it, a notion of correspondence rules linking the two that confusingly moves among concerns with meaning, measurement, and causal relations (Suppe, 2000). 5
True to its name, the idea of a nomological network is centrally concerned with the concept of a scientific law, and I will focus my attention for now on that idea. The first thing to be said about laws is that their understanding is a highly contested matter (e.g., Cartwright et al., 2005). Historically, laws were interpreted as exceptionless regularities of universal scope. However, this account is overly restrictive in at least two ways: (a) It is highly idealized and does not apply to real systems, even in physics (Cartwright, 1983), and (b) its Humean character discourages the search for connections in nature that bind the empirical regularities. A realistic response to the first restriction is to speak of lawlike regularities and empirical generalizations of more-or-less limited scope. One popular response to the second restriction is to interpret regularities as events connected by natural necessity (necessities of nature) rather than logical necessity.
Two further proposed solutions to the problem of laws should be mentioned here. One solution strongly deemphasizes the importance of laws in science and promotes models as its main representational device (e.g., Giere’s, 1999, conception of the semantic view of theories). The other takes an approach to laws that acknowledges that different kinds of empirical regularities are found in science. In this vein, Mitchell (1997, 2000) argued that regularities are distributed on a continuum of spatiotemporal stability, with most falling in the middle ground between universal and accidental generalizations. She called these lawlike regularities “pragmatic laws” in recognition of the contingency and complexity that characterizes much of the biological world. I think that this outlook offers a better way of thinking about laws in psychology’s multiplex subject matter than the traditional logical empiricist account assumed in much of the construct validity literature. Although a few areas of psychology, such as psychophysics, might reasonably claim to have produced stable laws (e.g., Weber’s law and Stevens’s power law), present-day psychology is not a nomothetic science in the strict sense (Teigen, 2002). Instead, for the most part, it settles for the weaker goal of securing empirical generalizations that have acceptable degrees of robustness.
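The two psychophysical laws just mentioned can be stated compactly. The following Python sketch is purely illustrative: the Weber fraction `k` and the Stevens exponent `n` below are placeholder values chosen for readability, not empirical constants for any particular sensory modality.

```python
# Illustrative sketch of two stable psychophysical laws.
# The constants k and n are arbitrary placeholders, not empirical values.

def weber_jnd(intensity, k=0.125):
    """Weber's law: the just-noticeable difference (JND) is a constant
    fraction k of the baseline intensity (delta I / I = k)."""
    return k * intensity

def stevens_magnitude(stimulus, k=1.0, n=0.67):
    """Stevens's power law: perceived magnitude grows as a power
    function of stimulus intensity (psi = k * S**n)."""
    return k * stimulus ** n

# Under Weber's law, doubling the baseline intensity doubles the JND:
assert weber_jnd(200.0) == 2 * weber_jnd(100.0)
```

The contrast with the looser, pragmatic regularities discussed above is instructive: these two relationships can be written as simple closed-form functions, whereas most psychological generalizations cannot.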
A pragmatic view of theories
In sharp contrast to the restrictive prescriptions of the nomological network view of scientific theories, the pragmatic alternative adopts a pluralist view of what theories are. Invoking the words of Barker and Kitcher’s (2014) aphorism presented at the beginning of this article, we might say that pragmatic theories are “a motley mix, deployed in research practices in very different ways” (p. 37). In keeping with this outlook, pragmatic theories are taken to comprise a varied, and changing, set of elements, which reflect the heterogeneous nature of theories in science. These elements include sentences about empirical phenomena, models of causal mechanisms, explanatory coherence relations, research problems, skill sets, and exemplars.
Thus, it should come as no surprise that the pragmatic view of theories cannot be confined to a single characterization, as is the case with the nomological network alternative (e.g., Mormann, 2007; Thagard, 2005; Winther, 2020). Further, the pragmatic account refrains from offering an idealized description of theory structure; indeed, it admits of multiple structures and, moreover, need not assign a high priority to articulating a theory’s structure in and of itself. Furthermore, there is no requirement that a theory be cast in formal terms, though it may have formal elements. Importantly, a theory will likely have a number of nonformal aspects, including such things as the articulation of researcher goals, statements of theory construction practices, exemplars of sound reasoning, and conditions of possible application. 6
Varieties of pragmatic theory
Because of their variety, many pragmatic theories in psychology comprise a set of weak family resemblances. Notably, classificatory theories, instrumentalist theories, dispositional theories, mechanistic theories, and global theories have different natures and purposes, yet they all qualify as pragmatic theories in the broad sense adopted here. A brief word about each will convey something of the heterogeneity involved.
In data-intensive science, data curators often perform the important role of constructing theories in order to enable data to “travel” between scientists both within and across disciplines. Leonelli (2016) suggestively calls these classificatory theories. To perform this role, theorists construct “ontologies” (e.g., the “gene ontology” in bioinformatics). The gene ontology is a labeling device based on biological processes and entities that provides the stability needed to search, retrieve, and transport data from its databases. The increasing acceptance of big data thinking in psychology 7 is likely to see theories of this sort grow in prominence.
Advocates of instrumentalist theories are commonly understood to subscribe to the antirealist doctrine that scientific theories are neither true nor false. They are not explanatory theories that appeal to hidden theoretical entities but are more or less useful devices for summarizing and predicting empirical relationships. Radical behaviorist learning theory (Skinner, 1984) is generally taken to be a prominent example of this outlook in psychology. (For a hint at a different interpretation of theories for Skinner, see Note 10.)
Psychology also features dispositional theories, which are given that name because their referents dispose people to do different sorts of things. Commonly regarded as rudimentary theories, they refer to the existence, although not the nature, of causes that plausibly underlie and give rise to behavior (e.g., Rozeboom, 1984). Early formulations of trait theory in personality psychology and interpreted latent variables in structural equation models appeal to theories that are more or less dispositional in character.
Another important class of theories, known as mechanistic theories, is also employed in psychology to trace the causal processes involved in the production of effects (Wright & Bechtel, 2007). In this way, mechanistic theories offer a more informative causal story than dispositional theories do. I will have more to say about mechanistic theories in the next section.
Finally, I draw attention to the presence of global theories in science (Hooker, 1975). Global theories are wide ranging and are typically composed of a worldview, a methodology and theory of instruments, a specification of what is observable, and provision of a language for data reports. Theories as different as psychoanalytic theory and radical behaviorist theory can be instructively understood as global theories, although it is not common to do so.
Theories as explanatory coherence networks
In this section, I sketch the primary elements of one important formulation of the pragmatic view of theories, known as the explanatory coherentist account (Haig, 2014; Thagard, 1988, 2005). The explanatory coherentist formulation draws interdisciplinary insights from cognitive science, particularly the disciplines of artificial intelligence and psychology, and from contemporary realist philosophy of science. It, therefore, fundamentally differs from the nomological network account of theories.
As already noted, the nomological network account of theories links theoretical and empirical claims through coordinating definitions. By contrast, the coherentist account assigns no major role to definitions and meaning more generally. 8 Instead, the propositions in a theory’s network are held together by coherence relations. Because such theories are explanatory, their constituent claims enter into relations of explanatory coherence, not logical or semantic coherence, as the nomological network view would have it.
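How explanatory coherence relations can adjudicate between rival theories is made computationally concrete in Thagard's ECHO program, which treats coherence as constraint satisfaction in a connectionist network. The following Python sketch conveys the basic idea; the propositions, link weights, and parameter settings are toy illustrations of my own, not values drawn from any published ECHO analysis.

```python
# Minimal sketch of an ECHO-style explanatory coherence network
# (after Thagard). Propositions are units; "explains" relations become
# excitatory links, "contradicts" relations become inhibitory links,
# and the network settles by iterative activation updating.
# All propositions, weights, and parameters here are toy illustrations.

DECAY, EXCITE, INHIBIT = 0.05, 0.04, -0.06

units = ["E1", "E2", "T1", "T2"]   # E1, E2: evidence; T1, T2: rival theories
evidence = {"E1", "E2"}            # evidence units are clamped at 1.0
links = {}

def link(a, b, weight):
    """Create a symmetric link between two propositional units."""
    links[(a, b)] = weight
    links[(b, a)] = weight

link("T1", "E1", EXCITE)           # T1 explains E1
link("T1", "E2", EXCITE)           # T1 also explains E2
link("T2", "E1", EXCITE)           # T2 explains only E1
link("T1", "T2", INHIBIT)          # the rival theories contradict each other

act = {u: (1.0 if u in evidence else 0.01) for u in units}

for _ in range(200):               # let the network settle
    new = {}
    for u in units:
        if u in evidence:
            new[u] = 1.0           # evidence stays clamped
            continue
        net = sum(w * act[v] for (x, v), w in links.items() if x == u)
        a = act[u] * (1 - DECAY)
        a += net * (1 - act[u]) if net > 0 else net * (act[u] + 1)
        new[u] = max(-1.0, min(1.0, a))
    act = new

# T1, which explains more of the evidence, settles at a higher activation.
```

On this toy run, the more explanatorily comprehensive theory T1 settles at a markedly higher activation than its rival T2, mirroring the coherentist idea that acceptance accrues to the theory that coheres best with the evidence as a whole.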
The explanatory coherence view of theories primarily consists of claims about empirical phenomena, models of causal mechanisms, and coherence relations between propositions in the theories. The following description of the three major elements of this formulation contrasts markedly with the idea that theories are syntactically structured nomological networks.
Logical empiricists placed great store in the epistemic importance of observation in science and subscribed to a hard and fast distinction between observation and theory. Accordingly, nomological networks were taken to explain and predict facts about observed data. By contrast, the coherentist pragmatic conception of theories outlined here assigns no special epistemic role to observation. Instead, what matters is the reliability of the justificatory process involved, irrespective of whether data are observed or not (Bogen & Woodward, 1988; Haig, 2014).
Relatedly, the common twofold distinction between observation and theory endorsed by orthodox logical empiricism and featured in nomological networks is rejected in favor of the threefold distinction between data, phenomena, and theory mentioned earlier. Data are observable, but phenomena typically are not because they are abstracted from the data. Importantly, many of them are robust empirical generalizations rather than universal laws. 9
Because data are idiosyncratic to particular investigative contexts, it is the stability and generality of empirical phenomena that makes them, rather than data, the appropriate targets of explanation. Data, by contrast, serve as evidence for phenomena rather than theory. For example, consistent intergenerational IQ score gains serve as data-based evidence for the abstracted phenomenon of the Flynn effect. The Flynn effect, in turn, serves as evidence for the Dickens-Flynn theory of intelligence, which was devised to explain that effect.
On the standard view of logical empiricism, models are of heuristic value in science, but they are not accorded a genuine epistemic role. Thus, for Hempel (1965), analogical models do not play an integral role in the “systematic statement of scientific explanations.” They can, however, contribute to their pragmatic effectiveness in “the context of discovery” where they “may provide effective heuristic guidance in the search for new explanatory principles” (p. 441). For this reason, models do not figure prominently in the nomological network account of theories.
The current strong philosophical interest in the nature and place of models in science reveals a number of different (and contested) accounts of the relation between theories and models: Models are variously understood as equivalent to theories, as components of theories, or as independent entities that may or may not interact with theories (Bailer-Jones, 2009; Frigg, 2023). In the explanatory coherentist view of theories, analogical models are understood as components of theories concerned with detailing knowledge about the causal mechanisms they purportedly refer to. My discussion of the strategy of analogical modeling in the next section adopts an interactionist view of the relation between models and theories.
The logical empiricist view of nomological networks took theories to be explanatory structures, where scientific explanation involved subsuming the events to be explained under covering laws rather than providing causal explanations. By contrast, the explanatory coherentist view of theories typically seeks to articulate causal mechanisms. By tracing the processes involved in the production of phenomena, the appeal to causal mechanisms is an important, time-honored scientific strategy used to understand the empirical world. Saying what causal mechanisms are has been the subject of intense philosophical investigation (Andersen, 2014a, 2014b). For present purposes, it is sufficient to understand mechanisms as structured systems of components and their operations, whose orchestrated workings combine to produce empirical phenomena. Although psychology is often faulted for producing uninformative theories, it has in fact produced a good number of causal-mechanistic theories, albeit in varying degrees of detail. In doing so, a research strategy that is sometimes adopted involves the decomposition of hierarchically organized systems into their components and operations and then building models to better understand the organization that comprises the mechanisms’ activities (Bechtel & Richardson, 2010). The wide-ranging information-processing perspective in psychology has often sought mechanistic explanations using such a strategy (Wright & Bechtel, 2007).
Two additional features of pragmatic theories
Two additional features of the pragmatic take on theories are worth noting. One interesting feature of these theories is that they use different representational modalities. Theories in science are often presented as structures of propositions, which are expressed as statements. As already noted, logical empiricism takes theories to be formal axiomatic structures expressed in logico-mathematical terms. However, in psychology, pragmatic theories are often a mix of propositional and nonpropositional representations, with numerical notation, diagrams, and pictures interleaved with running verbal descriptions.
Mention of this mix of representational features brings to mind an important but largely neglected point: Some pragmatic theories in science are more concerned with directly describing and explaining their objects of investigation than they are with developing formally valid argument patterns to justify conclusions they might reach. In this regard, Bogen and Woodward (2005) took issue with the long-standing and pervasive tradition in the philosophy of science that theory-related activities, such as testing, predicting, and explaining, should be understood as a matter of “inferential relations among sentences” (p. 233). This outlook can be clearly seen in the logical empiricist commitment to the covering-law account of explanation and the hypothetico-deductive theory of confirmation. On the inferential relations account, explanation is conceived as a deductive (or inductive) inferential relation between statements about the explanandum (the event to be explained) and the explanans (the explanation of the event). In this account, confirmation is cast as a relation between statements of observational evidence and the relevant hypotheses or theories. By contrast, the mechanistic view of theories just considered depicts explanations as schematic representations of causal processes that produce empirical phenomena. For example, Darden’s (2006) case history of the development of the theory of spatial memory displays this more direct focus on the objects of investigation.
The other interesting feature of pragmatic theories is that they can be illuminatingly cast as cognitive entities. Many logical empiricists, in search of objective knowledge, rejected accounts of knowledge that appealed to mental processes on the grounds that they were subjective. Popper (1972), who most forcefully criticized psychological conceptions of knowledge, also favored an impersonal epistemology that made no reference to “the knowing subject.” Not surprisingly, his controversial proposal that theories literally exist in a Platonic “third world” of abstract cultural artifacts has been strongly criticized (Gadenne, 2016).
By contrast, the pragmatic outlook on theories presented here contends that theories can be understood as representations that form part of a scientist’s cognitive makeup (e.g., Thagard, 1988, 2005; see also Giere, 1988). The idea outlined above, that theories are explanatorily coherent representations, is of this type. According to Thagard, theories are computational systems that structure complexes of rules, concepts, and stored problem solutions. Rules are the propositional components of the theories, concepts are understood as prototypes (not sets of necessary and sufficient conditions), and stored problem solutions are heuristics that facilitate moving from starting conditions to goals. Thagard (1988) showed how these elements operate in a computational depiction of the wave theory of light. In psychology, cognitive architectures such as Adaptive Control of Thought–Rational (ACT-R) and Soar are prominent examples of theories that depict mental systems as computational processes. Finally, it should be noted that theories can also be regarded as distributed cognitions that are stored in computer databases and located in scientific communities, as well as in the minds of individual scientists.
The pragmatic theories outlined above are explanatory and serve the major scientific goal of scientific understanding. For this reason, I have confined my attention to the justification of explanatory theories that is provided by abductive methods. 10
An Explanation-Centered View of Construct Validation
Although Cronbach and Meehl (1955) explicitly claimed that the process of construct validation centrally involved the employment of scientific methods to develop and confirm scientific theories, limited attention has been paid to this matter in the foundational literature. For example, the methods of statistical significance testing and factor analysis, although mentioned by Cronbach and Meehl (1955), received most of their subsequent conceptual examination outside of validity theory, and these gains have had limited impact on our understanding of construct validation.
More generally, there has been a decided reluctance in the validity literature to take major theories of scientific method seriously. For the most part, it has been uncritically assumed that construct validation involves a straightforward appeal to hypothetico-deductive methods. Some 30 years after the appearance of Cronbach and Meehl’s (1955) article, Landy (1986) went so far as to declare that “the [entire] validation process be considered nothing more and nothing less than traditional hypothesis testing” (p. 1183). A strong commitment to the idea that the validation process is essentially one of hypothetico-deductive testing continues to this day. Later, I will suggest a modification to the standard hypothetico-deductive account of method that would make it useful for evaluating explanatory theories, but for now I want to address a major negative consequence of the widespread adherence to minimalist hypothetico-deductive testing: namely, a failure to sufficiently appreciate the importance of explanatory inference in the validation of scientific theories.
Despite the dominance of the hypothetico-deductive outlook on construct validation, one or two authors have explicitly recommended the adoption of an explanation-centered alternative. Most prominently, Zumbo (2007, 2009) expressed his view that our construct validation efforts should be guided by explanatory considerations in which the goodness of our explanatory theories is assessed by a process of inference to the best explanation. The second epigraph presented at the beginning of the article clearly states his position on this matter. Relatedly, in a previous article (Haig, 1999), I briefly argued for the adoption of a broad explanationist outlook on construct validation in which the generation, development, and comparative appraisal of theories are carried out by different forms of abductive reasoning. An outline of this outlook will be presented shortly.
Interestingly, as I shall indicate later, Meehl (2002) himself attested in his later work to the importance of inference to the best explanation in science and endorsed the idea that scientific theories are appraised in terms of their ability to explain the facts. More recently, and as noted earlier, Schaffner (2020) reinterpreted construct validation as a process in which the epistemic appraisal of competing models or theories over time is undertaken in respect of both empirical and superempirical evaluative criteria. His broad outlook on theory appraisal might reasonably be taken to accommodate inference to the best explanation.
Obviously, a credible account of validating theories by explanatory means has to come to grips with the important but neglected idea of abductive reasoning. Abductive inference is a protean form of reasoning in science that is explanatory in nature. It is widely employed in different ways to generate, develop, and appraise explanatory hypotheses and theories. Thagard (1988) nicely captured the demands of these three theory construction tasks by distinguishing among existential abduction, analogical abduction, and inference to the best explanation. About the first two forms of abduction, he stated, “existential abduction postulates the existence of previously unknown objects, such as new planets, . . . [whereas] analogical abduction uses past cases of hypothesis formation to generate hypotheses similar to existing ones” (p. 54). Because of their role in fashioning knowledge claims, these two forms of abduction can be considered creative abductions. They are different from inference to the best explanation, which Thagard (along with many philosophers of science) took to involve the comparative evaluation of existing hypotheses and theories.
One visible sign of the growth in our knowledge about the role that abduction plays in science is the fact that we now have at our disposal a number of abductive methods, whose use facilitates the making of abductive inferences. I want to show something of this methodological gain by indicating how the three forms of abduction just mentioned can be implemented by employing different abductive research methods that are tailored to their different research ends. A more extensive description of these methods can be found in Haig (2014, 2022).
Exploratory factor analysis
Exploratory factor analysis is a well-known statistical method that facilitates the postulation of latent variables that are thought to produce patterns of correlations in domains of manifest variables (Mulaik, 2009). Intellectual abilities, personality traits, and social attitudes are all theoretical interpretations of latent variables that result from the use of exploratory factor analysis.
However, the basic form of inference captured by exploratory factor analysis is almost always unstated in characterizations of the method. I have argued that the method is best construed as an abductive method of theory generation (Haig, 2005). This characterization is consistent with its classification as a latent variable method. In particular, I maintain that exploratory factor analysis facilitates explanatory reasoning in the form of existential abduction, as described above. Recall that existential abductions permit researchers to hypothesize the existence of unknown entities but not their natures. Further research is needed to expand on the initial primitive conception of these entities. The initial postulation of Spearman’s g, followed by its articulation in terms of information-processing speed, is a clear case in point. Given the elementary nature of the hypotheses it spawns, we may say that exploratory factor analysis bequeaths us dispositional theories in the sense described earlier.
Importantly, the existential abductions advanced by exploratory factor analysis are empowered by the method’s employment of an important methodological principle known as the principle of the common cause (Haig, 2005; Sober, 1988). Briefly, this principle advises researchers to hypothesize one or more common causes in order to explain correlated events unless there is good reason not to do so. The proviso cautions us that exploratory factor analysis is a restricted causal maxim; it is applicable only in those domains deemed to have a common causal structure.
Concisely stated, then, exploratory factor analysis is an existential abductive method that employs the principle of the common cause to generate dispositional theories in order to explain correlated variables. 11
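The existential-abductive pattern just described can be conveyed with a small computational sketch. The sketch below is purely illustrative: the six “test scores,” the latent factor, and its loadings are all simulated, and scikit-learn’s `FactorAnalysis` merely stands in for whichever factor-analytic implementation a researcher might use. The point is the structure of the inference: correlated manifest variables prompt the postulation of a latent common cause.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulate a domain with one common cause: a latent factor g
# driving six manifest test scores (all values are invented).
rng = np.random.default_rng(0)
n = 2000
g = rng.normal(size=n)                                # the latent common cause
loadings_true = np.array([0.9, 0.8, 0.7, 0.7, 0.6, 0.5])
scores = g[:, None] * loadings_true + rng.normal(scale=0.5, size=(n, 6))

# The manifest pattern to be explained: all pairwise correlations positive.
corr = np.corrcoef(scores, rowvar=False)

# Existential abduction via EFA: posit ONE latent variable that,
# if real, would explain the observed correlations.
fa = FactorAnalysis(n_components=1, random_state=0).fit(scores)
est = fa.components_.ravel()
est = est * np.sign(est[0])                           # fix sign indeterminacy
print(np.round(est, 2))                               # recovered loadings
```

Note that the method delivers only the existence claim and the loadings; saying what the postulated entity *is* (an information-processing speed parameter, say) is the further, theory-developing step discussed in the text.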
Analogical modeling
Contemporary studies of scientific practice frequently accord analogical models an important role in furthering scientific knowledge (Abrantes, 1999; Harré, 1976). Their importance can be seen in the explicit use of a strategy of analogical modeling as a resource for developing newly generated explanatory theories. For example, the rudimentary nature of dispositional theories might be improved by providing a more informative account in terms of causal mechanisms. This can be done by building models of the presumed causal mechanisms.
With this pragmatic research strategy, scientists expand their understanding of the nature of the unknown causal mechanisms by conceiving them in terms of what is already known and well understood. That is to say, with analogical modeling, one builds a model of the unknown subject (the causal mechanisms) based on positive analogies derived from the known nature and behavior of a familiar source. Examples of analogical models abound. They include the molecular model of gases, based on an analogy with billiard balls in a container, and the popular computational model of the mind, based on an analogy with the computer. Harré (Harré & Secord, 1972) employed this strategy of analogical modeling and fashioned a rule model of microsocial interaction in social psychology. He appealed to Goffman’s (1969) dramaturgical perspective as the source model for understanding the causal mechanisms at play in ceremonial, argumentative, and other forms of social behavior.
Examples such as these can be reconstructed to conform to the following general argument schema (Haig, 2014, p. 101):
Hypothesis H* about property Q was correct in situation S1.
Situation S1 is like situation S2 in relevant respects.
Therefore, an analogue of H* might be appropriate in situation S2.
Because the reasoning captured by this schema involves explanatory hypotheses about the source and subject, the analogical reasoning involved takes a particular form known as analogical abduction, not abduction in a generic sense. Further, judging the soundness of the argument captured by analogical abduction provides us with an important means for assessing the provisional plausibility of our models of the sought-after causal mechanisms.
Inference to the best explanation
Inference to the best explanation is often employed in science to justify claims about the credibility of theories. As with existential and analogical abduction, inference to the best explanation focuses on the explanatory worth of theories. However, unlike the aforementioned two forms of abduction, inference to the best explanation does this in a comparative manner. It accepts a hypothesis or theory on the grounds that it explains the evidence better than its rivals do. A prominent case in point is Darwin’s (1958) argument for the acceptance of his theory of natural selection because it provided a more coherent explanation of the vast array of relevant evidence than the alternative of divine creation.
In his perspective on construct validity noted earlier, Zumbo (2009) strongly emphasized the centrality of explanatory inference. He said, “Explanation acts as a regulative ideal; validity is the explanation for the test-score variation, and validation is the process of developing and testing the explanation” (p. 69). I strongly agree with his emphasis on explanation, and I want to complement his outlook by considering inference to the best explanation as a method of theory appraisal. 12
Thagard’s (1992) seminal work deserves to be given priority here. He developed a detailed method of inference to the best explanation that he called the theory of explanatory coherence, which provides the means for accepting the better of competing theories. It is important to appreciate that the notion of explanatory coherence is different from the more familiar ideas of logical and probabilistic coherence. The core idea of explanatory coherence is that the constituent propositions of a theory “hold together” because of the explanatory relations among them.
The theory of explanatory coherence adopts a multicriterion perspective on theory appraisal. Broadly speaking, three criteria are employed to determine the explanatory coherence of a theory. They are explanatory breadth, simplicity, and analogy. A theory has greater explanatory breadth than its rivals if it explains more facts than they do, it is simpler if it makes fewer ad hoc assumptions, and its explanatory value is enhanced by its analogical connection to already understood processes.
At a more specific level, determination of the relations of explanatory coherence in a theory is achieved by applying seven principles: symmetry, explanation, analogy, data priority, contradiction, competition, and acceptability. The theory of explanatory coherence is implemented in a connectionist computer program. Simulation studies of a number of prominent cases of theory appraisal in the history of science indicate that the theory of explanatory coherence is a reliable method for judging the best of competing explanations (Thagard, 1992).
Meehl on inference to the best explanation
It should be of interest to construct validity theorists to learn that, throughout his career, Meehl expended considerable effort in spelling out a satisfactory logic of theory testing. In his 1955 validity article with Cronbach, he spoke of the move from statements of evidence to constructs as broadly inductive in nature. Then, briefly influenced by Karl Popper and his falsificationist conception of the hypothetico-deductive method, and subsequently by Imre Lakatos’s neo-Popperian methodology of scientific research programs, Meehl (1990) went beyond them both and formulated a more liberal brand of theory appraisal. However, in arresting fashion, and seemingly opposed to his focus on theory testing, Meehl also stated that the appraisal of first-order scientific theories involved their ability to explain the facts. In one of his last published works on the subject (Meehl, 2002), he averred, “I am firmly convinced that [inference to the best explanation] is the core of all our reasoning in psychology and other empirical sciences” (p. 400). 13 The clue to understanding the apparent tension here is contained in Meehl and Waller’s (2002) following remark: “[Inference to the best explanation] is the reasoning required in all applications of significance tests and confidence intervals, all successful (or “close”) point predictions, and all surprising qualitative observations, so we are taking the principle for granted” (p. 297, emphasis added). I think Meehl’s position on theory appraisal was a witting mixture of hypothetico-deductive reasoning and abductive inference. Given that Meehl was mostly interested in deep-structural, or postulational, explanatory theories, it should come as no surprise that the relevance of explanatory criteria in evaluating their explanatory goodness was a methodological given for him. 
I believe that his commitment to the importance of inference to the best explanation has gone unnoticed as an important background presupposition in his outlook on theory appraisal. From the vantage point of Meehl’s perspective on cliometric metatheory, one gets a clear sense that he developed a view of theory appraisal that focused on explanatory goodness as well as predictive success in both the short- and long-run assessment of theories. I add that Meehl never embraced the idea that the context of discovery can be a methodological space in which abductive reasoning can occur (Rozeboom, 2005).
Combining inference to the best explanation and the hypothetico-deductive method
It is well known that, on its standard formulation, the hypothetico-deductive method judges the goodness of theories solely in terms of predictive accuracy. As previously noted, a central feature of the hypothetico-deductive method is its depiction of the relationship between theory and evidence as one of logical entailment. By contrast, the method of inference to the best explanation takes the relationship between theory and evidence to be explanatory in nature. 14
However, it is noteworthy that those who advocate or employ the hypothetico-deductive method sometimes employ evaluative criteria in addition to predictive success. These include simplicity, scope, and fertility. When a criterion, such as simplicity, is employed, an explanatory consideration is introduced, resulting in a combination of hypothetico-deductive reasoning and inference to the best explanation. A researcher who adopts a combined approach and who assigns a heavier weighting to prediction than to explanation is likely to regard their method as basically hypothetico-deductive in character. By contrast, a researcher who gives primary weighting to explanatory considerations will be inclined to see the approach as an implementation of inference to the best explanation. Either way, a theory that rates well on predictive success and explanatory worth is likely to be judged a better theory than a competing theory that satisfies just one of them.
The method of structural equation modeling is primarily portrayed as a model-testing procedure that is embedded within a hypothetico-deductive framework. However, the method is sometimes presented as a mix of hypothetico-deductive reasoning and inference to the best explanation. In particular, the goodness of fit of the model to the empirical evidence counts as a measure of empirical adequacy and thus satisfies a major requirement of the hypothetico-deductive method. At the same time, parsimony indices can be employed to provide a weighting of the fit statistics (Kaplan, 2000), thereby satisfying the demand for considering the criterion of explanatory worth.
Markus et al. (2008) instructively argued that structural equation modeling can be understood as an operational procedure for implementing inference to the best explanation. They took explanatory power, which is central to explanatory inference, to be a combination of model fit and model parsimony. They illustrated the employment of structural equation modeling in this way by evaluating the comparative worth of two competing factorial models of psychopathy.
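The general idea of weighting fit by parsimony can be illustrated outside of SEM proper with a small, self-contained sketch. The data and models below are invented, and AIC is used here only as a simple stand-in for the parsimony-adjusted fit indices discussed above: a model that genuinely explains the data beats the empty model by a wide margin, while an extra, idle parameter buys almost no additional fit to offset its penalty.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                     # an irrelevant predictor
y = 1.0 + 2.0 * x1 + rng.normal(size=n)     # only x1 matters

def aic(predictors, y):
    """AIC = 2k - 2 log L for an ordinary least-squares model."""
    X = np.column_stack([np.ones(len(y))] + predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    k = X.shape[1] + 1                      # coefficients + error variance
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * resid.var()) + 1)
    return 2 * k - 2 * loglik

aic_null = aic([], y)              # intercept only: no explanation offered
aic_x1 = aic([x1], y)              # the substantive model
aic_both = aic([x1, x2], y)        # one more parameter, negligible fit gain
print(round(aic_null), round(aic_x1), round(aic_both))
```

The structure of the inference is what matters: predictive fit and parsimony are weighed jointly, so the comparison is no longer a purely hypothetico-deductive matter of fit alone.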
It should be clear from the above sketchy treatment of the three abductive methods of exploratory factor analysis, analogical modeling, and inference to the best explanation that the criteria they employ in the appraisal of theories will vary depending on the goals they serve and the investigative context in which they are located. In the context of theory generation, exploratory factor analysis confers judgments of initial plausibility on the theories it spawns, whereupon they are deemed worthy of further pursuit in their ongoing development and appraisal. In the context of theory development, the strategy of analogical modeling looks to go beyond the nascent state of factorial theories and helps model the operative causal mechanisms. It does this through a process of analogical abduction that, if successful, provides an augmented plausibility assessment of the theories in question. And in the context of theory appraisal, the systematic evaluation of well-developed competing theories is undertaken by using methods of inference to the best explanation.
Explanation-Centered Theory Validation: An Example
We are now in a position to see how the explanation-centered view of construct validation in psychology might be carried out in practice. From the perspective presented in this article, the validation process is primarily seen as an exercise in comparative theory appraisal, where the contending theories are evaluated in terms of their explanatory coherence. An instructive example of this approach to theory appraisal was provided by Maier et al. (2023), who employed their own theory of explanatory coherence in order to compare the g-factor theory of intelligence (e.g., Jensen, 1998) with the dynamical theory of intelligence proposed by van der Maas et al. (2006). 15 Both of these are general, pragmatic theories of intelligence that compete to explain a number of relevant empirical phenomena.
Most immediately, the two theories compete to explain the robust empirical phenomenon of the positive manifold. This phenomenon captures the fact that almost all tests of ability correlate positively with one another to a significant degree. The g theory explains this fact primarily in terms of a latent variable (the g factor) interpreted as a common cause that gives rise to the positive manifold. By contrast, the dynamical theory explains the positive manifold in terms of mutualism, that is, a process of reciprocal causation between cognitive processes, which develop over time.
Importantly, the theory of explanatory coherence places considerable weight on the evaluative criterion of explanatory breadth, which refers to the number or classes of facts explained by a theory. Although both the g-factor theory and the dynamical theory explain more than the positive manifold (and, therefore, have explanatory breadth), Maier et al. (2023) showed that the dynamical theory explains a greater range of facts than the g-factor theory. For example, it can deal with the Flynn effect of intergenerational IQ score gains, which the g-factor theory cannot. Also, unlike the g-factor theory, the dynamical theory can explain a number of developmental effects, such as the fact that intelligence is difficult to predict from early childhood performance. Given that explanatory breadth is the major criterion of explanatory coherence, the authors concluded, for this reason among others, that the dynamical theory is better than the g-factor theory. This judgment is broadly consistent with van der Maas et al.’s (2006) evaluation of the merits of the two theories.
Notice, in the example just given, that construct validation as an explanation-focused undertaking typically accepts that the measurement of “constructs” (the customary focus of test validation) has been successfully undertaken and that the focus is primarily on validating the merits of the competing theories; that is, the empirical phenomena in need of an explanation have been established, and the competing theories that purportedly explain those facts are evaluated in terms of their explanatory coherence.
Kane’s Argument-Based Approach to Validation
I noted in my introduction that Kane’s (2013) argument-based approach to validation offers valuable guidance to validation researchers. However, it does not explicitly concern itself with foundational theories of validity. Before concluding my article, I will briefly indicate how my explanationist theory of construct validity might add value to Kane’s approach. Linking Kane’s formulation of validity practice to my own theoretical perspective might afford validity researchers the opportunity to embed elements of their Kane-inspired practice in a new and supportive theoretical framework.
Kane’s current model of argument-based validity considers both the interpretation of theory-based arguments and their application. Because my article restricts its attention to the nature of theories and their validation, I confine my attention to Kane’s theory-based phase. Further, Kane’s perspective on validity is broad in its scope. Here, I limit my attention to construct validation understood as an explanatory enterprise. Thus, in drawing links between my work and Kane’s work, I refer to mechanistic theories, dispositional theories, and the logic of abductive arguments and methods.
Kane’s approach to theory validation allows for a number of options: describing observable attributes without appeal to explanatory theory; formulating low-grade explanatory theories, such as unspecified trait ascriptions (causal or noncausal); and constructing powerful explanatory theories that specify the causal mechanisms involved. However, it is noteworthy that Kane’s description of these options makes sparing use of the relevant foundational literatures in methodology.
From my perspective, Kane’s characterization of descriptive research accepts the popular but simplistic contrast between observation and theory without acknowledging the important role that empirical phenomena play in science. Recall that phenomena are abstracted from data and serve as evidence for theories. Also, Kane’s worry about the cognitive inferiority of trait theories can be allayed by regarding them as dispositional theories, in the sense described earlier. Dispositional theories are often regarded as explanatorily suspect, and Kane acknowledges that this can be a problem. However, properly understood, dispositional ascriptions have the value of extending our referential reach to the existence of new theoretical entities, even though they are characterized indirectly in terms of their presumed effects. Understood in this way, dispositional theories are assessed in terms of their initial plausibility, thereby providing a warrant for their further investigation and development. Such theories are deemed to have modest but genuine explanatory worth.
As noted earlier, a major development in recent philosophy of science has centered on understanding the nature of mechanistic theories—that is, causal process theories that detail how the components of mechanisms combine to produce their effects. Also noted earlier was the fact that Embretson (1983) subscribes to a causal process perspective on construct validation. Mechanistic theories offer an intuitively compelling account of an important class of causal explanatory theories. The explanatory coherentist view of theories considered earlier adopts a strategy for articulating causal mechanisms. It does this by constructing analogical models as a means for understanding the nature of the causal mechanisms under study. This is one way of putting real flesh on the bones of Kane’s general talk of the validation of causal explanatory theories.
Finally, I point out that the explanatory logic embedded in the abductive methods considered in my article provides an antidote to Kane’s tendency to cast the validation process in abstract hypothetico-deductive terms. I regard it as a feature of my article that it deals with abductive methods that are actually used in the construction of explanatory theories. The availability of these methods should be welcomed by practicing validity researchers.
Conclusion
Summary
In this article, I argued that further progress on the conceptual foundations of construct validity theory could be made by giving increased attention to the process of constructing and evaluating scientific theories rather than focusing on their constituent constructs. In order to do this, I suggested that it would be helpful to distinguish construct validation, understood as theory appraisal, from test validation, with its strong concern with measurement. Although measurement can undoubtedly have a bearing on theory appraisal, it is often not a direct part of that process. One of my two major claims was that the pragmatic view of the nature of psychological theories provides a realistic account of science as it is practiced. Most importantly, the pragmatic perspective accommodates the varied natures of psychological theories, whereas the standard logical empiricist alternative of nomological networks imposes a uniform treatment on them. All of the different types of pragmatic theory I identified provide explanations of empirical phenomena. With this important scientific goal in mind, I elaborated on my second major claim that abductive methods of theory appraisal, tailored to the explanatory natures of these theories, should feature prominently in our theory validation efforts. Accordingly, I identified and discussed a number of abductive methods that can be productively employed in such an undertaking.
Discussion
In this final section, I want to highlight three of the many important features of science that are underappreciated in the foundational literature on construct validity. 16 Taking them seriously would enrich our understanding of the nature of construct validation.
Scientific research is carried out by fallible creatures who, with limited cognitive resources, labor in imperfect social institutions to make sense of varied and complex subject matters. It should come as no surprise, then, that scientific knowledge is at once presumptive, conjectural, partial, and corrigible. These features of scientific knowledge are especially evident at the frontiers of scientific research. Frontier research is a neglected topic in the conceptual change literature on scientific theories (Nickles, 2009), and it has received little attention by construct validation theorists. Cryptically stated, frontier epistemology tolerates a high degree of epistemic uncertainty, is tasked with formulating ill-structured problems, regularly faces the demands of creative innovation, emphasizes the generative justification of knowledge claims, and has to settle for weak modes of reasoning and suboptimal methods. These characteristics of frontier research force one to reconsider the nature of conceptual change in science, including the phasic evaluation of theories.
In recognition of the importance of conceptual change in science, I want to draw attention to the relevance of the notion of epistemic iteration for the theory validation process: something that is explicitly acknowledged by both Feest (2020) and Schaffner (2020) in their above-mentioned work on construct validation. One of the many ways in which epistemic iteration occurs can be appreciated by noting, again, that the different methods of existential and analogical abduction, as well as inference to the best explanation, considered in this article confer epistemic appraisals of differing strength. Existential abductions deliver judgments of initial plausibility. In the context of pursuit, analogical abductions bestow stronger plausibility assessments, whereas inferences to the best explanation confer still stronger evaluations of epistemic worth. Importantly, these progressive assessments can be organized into an extended theory evaluation process in which successive stages of knowledge build on the preceding stages in order to improve the credentials of the resulting products (Haig, 2014). Strategies of epistemic iteration are widely employed in good science as an engine of scientific progress (Chang, 2004). Psychologists would improve the field’s understanding of the construct validation process by taking strategies of epistemic iteration more seriously.
An examination of the conceptual foundations of construct validation theory cannot pass without a brief word about the relevance of the philosophy of scientific realism. I acknowledge that there are a number of contemporary nonrealist philosophies that deserve serious attention (Nickles, 2017). However, realist philosophical thinking has animated the foundational literature on construct validity more than any other philosophy, so I will limit my attention to it here. This prominent philosophy underpinned Cronbach and Meehl’s (1955) original article on construct validity as well as the work of a number of subsequent validity theorists (Slaney, 2017).
More often than not, validity theorists have made selective use of global scientific realism (e.g., Psillos, 1999), which is a philosophy of science thought to be applicable to all the sciences. However, it is important to realize that local formulations of realism have surfaced more recently, and they hold considerable promise of a better understanding of the various sciences (e.g., Asay, 2019; Haig & Evers, 2016; Kincaid, 2000; Mäki, 2005). I mentioned at the outset that Slaney (2017) promoted a discipline-specific brand of realism that she thinks appropriate for psychology’s subject matter. An additional local formulation of realism due to Mäki (2005) warrants attention here. Mäki has developed a wide-ranging and systematic form of realism tailored to the special characteristics of economics, but it has considerable relevance for psychology (Haig & Evers, 2016). In depicting realism in local terms, Mäki adopts a strategy that enhances the resourcefulness of realism: He proposes a minimal characterization of realism that will have global application. This is done with reference to the ideas of possible existence (considerable progress is required before one can express confidence in a new entity’s existence), mind dependence (large tracts of the social and behavioral sciences—e.g., beliefs, marriage, and money—are not mind independent, though they are science independent), and possible truth (our theories may not be true right now, but they might be true in the future).
Mäki’s minimal characterization of realism also asserts that there are no requirements about having to study unobservable entities or achieve technological success. Additionally, Mäki insists that formulations of realism will be discipline- or domain-specific, resulting in a number of local realisms. Importantly for him, all local realisms should meet the requirements of minimal global realism just mentioned, as well as heeding the peculiar characteristics of the discipline or field under study. The value for psychology of Mäki’s minimal realism is that it enables one to regard developing disciplines as genuine science, despite their slow and uneven progress, while offering a metatheory that preserves core insights from the broad literature on scientific realism. Exploring the strengths and limitations of different formulations of local scientific realism (and nonrealism, for that matter) holds considerable promise as a useful resource for undergirding validity theory.
A final word: Over time, the notion of construct validity became bloated and unwieldy, making it difficult, if not impossible, to do justice to both construct and test validity at the same time. With this in mind, I argued for the strategic separation of construct validity and test validity in order to give greater attention to the former. This allowed me to highlight the importance of my preferred interpretation of construct validation as theory appraisal. Of course, construct validity and test validity should be brought back together in their more developed forms. Stone (2021) recently proposed that this rejoining might be done by deploying ideas in coherentist epistemology, in particular, Thagard’s (1992) theory of explanatory coherence. In fact, my comparison of the g-factor and mutualism theories of intelligence in terms of explanatory coherence points to one way in which this can be done.
Acknowledgments
I thank Bruno Zumbo and Fran Vertue for extensive helpful comments on an earlier draft of this article.
Many influential critics of logical empiricism targeted its standard formulation. They largely ignored the different formulations of this philosophy, the subtleties in the thinking of its major proponents, and the changes in their views over time. Mormann (2007) provides an instructive discussion of the important contributors to logical empiricism on these matters. My primary purpose in this article is not to criticize logical empiricism but to go beyond its usual formulation in characterizing the nature of scientific theory and its associated methods of appraisal. For this reason, my comments on logical empiricism are based on the standard account.
One index of the enormous influence of Cronbach and Meehl’s (1955) article on the social and behavioral sciences is its very high Google Scholar citation count. At the time of writing, it had garnered 17,730 citations. Relatedly, Campbell and Fiske’s (1959) companion article on the multitrait-multimethod matrix was cited 28,198 times.
See Slaney and Racine (2013) for an informative account of the history, variability, and problematic use of the term “construct” in psychology.
In his introduction to the first edition of The structure of scientific theories, Frederick Suppe (1977) declared that “It is only a slight exaggeration to claim that a philosophy of science is little more than an analysis of theories and their roles in the scientific enterprise” (p. 3). Since then, philosophers of science have given considerable attention to scientific experimentation and other diverse scientific practices. However, the nature and role of theories continues to be of considerable interest to philosophers of science, sometimes in conjunction with the current heavy focus on models in science.
As an interesting point of comparison, Markus and Borsboom’s (2013) extended examination of test validity theory (as distinct from construct validity theory) explicitly focuses on meaning, measurement, and causation as the essential components of test validity theory, and it does so without conflating them.
Partly because of this informality, many theories in psychology are typically subjected to weak forms of appraisal. The lack of stringent theory testing has led to a number of recent recommendations that theories in psychology should be cast in more rigorous formalized mathematical and/or computational structures. The special issue of Perspectives on Psychological Science mentioned earlier contains a number of articles that make this recommendation. For a detailed critical response to these proposals, see Oude Maatman (2021).
For instance, Poldrack and Yarkoni (2016) argued for the value of using formal cognitive ontologies to help define and test theories of mental structure. They predicted that the use of formal ontologies, and their associated large databases, would become increasingly common in psychology. More recently, the National Academies of Sciences, Engineering, and Medicine (2022) issued an extensive report aimed at improving the development and application of formal ontologies in the behavioral sciences.
In my view, construct validity theory has assigned too much importance to the meaning of terms. Plausibly, this has stemmed from the general but restrictive influence of semantic realism and the mistaken belief that we must define, even reach agreement on, key terms before inquiry can begin (this is not to deny the relevance of definition in specific inquiry contexts, such as the adoption of formal ontologies; see Note 7). In a discussion of construct validity, Cronbach (see Glymour, 1980) reported Meehl as saying we should probably abandon the troublesome term “defined.” Glymour (1980) stated that most philosophers of his time would agree that focusing on definition (and meaning) often resulted in conceptual confusion.
Empirical phenomena can be things other than empirical generalizations. For instance, cognitive science often attaches more importance to explaining capacities (e.g., the capacity to learn a second language) than empirical regularities. Because phenomena comprise different sorts of existents, it is more appropriate to emphasize the roles they play in explanation and prediction rather than their natures.
Skinner’s (1984) instrumentalist conception of scientific theories brings with it a strong reservation about the routine employment of the hypothetico-deductive method. He objected to its widespread use in psychology on the grounds that its logical reconstruction of the research process was often a misleading description of scientific inquiry as practiced and gave us false assurances about the credibility of the theories it tested. In its place, he advocated an inductive view of scientific inquiry and regarded theories as well-ordered summaries of empirical relationships. For Skinner, the causal explanatory import of these instrumentalist theories was confined to a consideration of environmental factors (Chiesa, 1992).
Although my article highlights exploratory factor analysis as an abductive method of theory generation, it does not depend on it. There are other abductive methods of theory generation, such as processes of induction (PI; Thagard, 1988) and grounded theory method, to name but two.
I note that although Zumbo (2009) does not focus directly on the inferential nature of inference to the best explanation, he does attend to statistical methods that facilitate drawing explanatory inferences. Additionally, he rightly emphasizes the importance of explanation as a pragmatic endeavor.
I believe that Meehl (2002) was correct in asserting that inference to the best explanation is widespread in human reasoning, but I think he went too far in claiming that it is the core of all of our reasoning. There are other forms of inference that occupy an important role in our reasoning practices, notably inductive inference (in its different forms) and deductive inference. Meehl most assuredly knew this. Perhaps he had in mind that inference to the best explanation features prominently in justifying relevant background knowledge, even when our immediate focus is on other forms of ampliative (content-increasing) inference.
Rozeboom (1997) forcefully argues against the use of the hypothetico-deductive method in psychological research. His criticisms speak to the infirmities of the method understood as a theory of confirmation. However, one can view hypothetico-deductivism as a theory of scientific method instead (Godfrey-Smith, 2021). On this reading, one is free to expand on its standard minimalist description, as I do here.
Maier et al.’s (2023) model of explanatory coherence is a revision of Thagard’s (1992) theory. It is based on the accessible Ising model of computation (Ising, 1925), which differs from the connectionist architecture of Thagard’s model. Maier et al. modified the various principles of Thagard’s model. They maintained that their model of explanatory coherence is an improvement on Thagard’s model because of its ability to deal with the changing methodological requirements of theory evaluation across contexts. For example, it can facilitate inference to the only explanation, where a theory has no genuine competitor.
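The constraint-satisfaction idea behind explanatory coherence can be conveyed with a small sketch. The following is a toy illustration only, not Maier et al.'s (2023) Ising model or Thagard's (1992) ECHO: the propositions (H1, H2, E1, E2), the constraint weights, and the scoring rules are all invented for the example. It treats theory comparison as finding the accept/reject assignment that best satisfies explanatory links, a penalty on jointly accepted rivals, and a priority given to data.

```python
from itertools import product

# Hypothetical network: H1 and H2 are rival hypotheses, E1 and E2 evidence.
# None of these propositions or weights come from the article.
explains = {("H1", "E1"), ("H1", "E2"), ("H2", "E1")}  # explanatory links
contradicts = {("H1", "H2")}                           # competing hypotheses
evidence = {"E1", "E2"}                                # data get a priority bonus
nodes = sorted({n for pair in explains | contradicts for n in pair})

def coherence(accept):
    """Score an accept/reject assignment: reward accepted explanatory links,
    penalize jointly accepted contradictory claims, favor accepting data."""
    score = sum(1.0 for a, b in explains if accept[a] and accept[b])
    score -= sum(2.0 for a, b in contradicts if accept[a] and accept[b])
    score += sum(0.5 for e in evidence if accept[e])
    return score

# Exhaustive search is fine for a tiny network (2**4 assignments here).
best = max(
    (dict(zip(nodes, vals)) for vals in product([True, False], repeat=len(nodes))),
    key=coherence,
)
print(best)  # H1, which explains both pieces of evidence, wins; H2 is rejected
```

The broader theory (H1) is accepted because it explains more of the evidence than its rival, which is the qualitative behavior that explanatory-coherence models formalize; real implementations replace the exhaustive search with connectionist or Ising-style updating.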
These might include further work on network formulations of theories (Borsboom et al., 2022), examination of different theory/theory relations (e.g., incorporation, sublation, replacement, and disregard; Thagard, 1992), and foundational work on alternative methods and methodologies (Haig, 2018).
Transparency
Action Editor: Leonel Garcia-Marques
Editor: Interim Editorial Panel
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.
References
- Abrantes P. (1999). Analogical reasoning and modeling in the sciences. Foundations of Science, 4, 237–270.
- Alexandrova A., Haybron D. M. (2016). Is construct validation valid? Philosophy of Science, 83, 1098–1109.
- American Psychological Association. (1954). Technical recommendations for psychological tests and diagnostic techniques. Psychological Bulletin, 51(2, Pt. 2), 1–38. 10.1037/h0053479
- Andersen H. (2014a). A field guide to mechanisms: Part 1. Philosophy Compass, 9, 274–283.
- Andersen H. (2014b). A field guide to mechanisms: Part 2. Philosophy Compass, 9, 284–293.
- Asay J. (2019). Going local: A defense of methodological localism about scientific realism. Synthese, 196, 587–609.
- Bailer-Jones D. M. (2009). Scientific models in philosophy of science. University of Pittsburgh Press.
- Barker G., Kitcher P. (2014). Philosophy of science: A new introduction. Oxford University Press.
- Bechtel W., Richardson R. C. (2010). Discovering complexity: Decomposition and localization as strategies of scientific research (2nd ed.). MIT Press.
- Bogen J., Woodward J. (1988). Saving the phenomena. Philosophical Review, 97, 303–352.
- Bogen J., Woodward J. (2005). Evading the IRS. In Jones M. R., Cartwright N. (Eds.), Idealization XII: Correcting the model (pp. 233–268). Rodopi.
- Borsboom D. (2006). The attack of the psychometricians. Psychometrika, 71, 425–440.
- Borsboom D., Cramer A. O. J., Fried E. I., Isvoranu A.-M., Robinaugh D. J., Dalege J., van der Maas H. L. J. (2022). Network perspectives. In Isvoranu A.-M., Epskamp S., Waldorp L., Borsboom D. (Eds.), Network psychometrics with R (pp. 9–27). Routledge.
- Borsboom D., Cramer A. O. J., Kievit R. A., Zand Scholten A., Franic S. (2009). The end of construct validity. In Lissitz R. W. (Ed.), The concept of validity: Revisions, new directions, and applications (pp. 135–170). Information Age Publishers.
- Borsboom D., Mellenbergh G. J., van Heerden J. (2004). The concept of validity. Psychological Review, 111, 1061–1071.
- Borsboom D., van der Maas H. L. J., Dalege J., Kievit R. A., Haig B. D. (2021). Theory construction methodology: A practical framework for building theories in psychology. Perspectives on Psychological Science, 16, 756–766.
- Campbell D. T., Fiske D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
- Cartwright N. (1983). How the laws of physics lie. Clarendon Press.
- Cartwright N., Alexandrova A., Efstathiou S., Hamilton A., Muntean I. (2005). Laws. In Jackson F., Smith M. (Eds.), The Oxford handbook of contemporary philosophy (pp. 792–818). Oxford University Press.
- Chang H. (2004). Inventing temperature: Measurement and scientific progress. Oxford University Press.
- Chiesa M. (1992). Radical behaviorism and scientific frameworks: From mechanistic to relational accounts. American Psychologist, 47, 1287–1299.
- Cronbach L. J., Meehl P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
- Darden L. (2006). Reasoning in biological discoveries: Essays on mechanisms, interfield relations, and anomaly resolution. Cambridge University Press.
- Darwin C. (1958). On the origin of species (6th ed.). Mentor.
- Embretson S. (1983). Construct validity: Construct representation versus nomothetic span. Psychological Bulletin, 93, 179–197.
- Feest U. (2020). Construct validity in psychological tests – The case of implicit social cognition. European Journal for the Philosophy of Science, 10, Article 4. 10.1007/s13194-019-0270-8
- Feigl H. (1970). The “orthodox” view of theories: Remarks in defense as well as critique. In Radner M., Winokur S. (Eds.), Theories and methods of physics and psychology (pp. 3–16). University of Minnesota Press.
- Frigg R. (2023). Models and theories: A philosophical inquiry. Routledge.
- Gadenne V. (2016). Is Popper’s third world autonomous? Philosophy of the Social Sciences, 46, 288–303.
- Giere R. N. (1988). Explaining science: A cognitive approach. University of Chicago Press.
- Giere R. N. (1999). Science without laws. University of Chicago Press.
- Glymour C. (1980). The good theories do (with discussion). In Maslow A. P., McKillip R. H., Thatcher M. (Eds.), Construct validity in psychological measurement (pp. 13–21). Educational Testing Service.
- Godfrey-Smith P. (2021). Theory and reality: An introduction to the philosophy of science (2nd ed.). University of Chicago Press.
- Goffman E. (1969). The presentation of self in everyday life. Penguin.
- Haig B. D. (1999). Construct validation and clinical assessment. Behaviour Change, 16, 64–73.
- Haig B. D. (2005). Exploratory factor analysis, theory generation, and scientific method. Multivariate Behavioral Research, 40, 303–329.
- Haig B. D. (2014). Investigating the psychological world: Scientific method in the behavioral sciences. MIT Press.
- Haig B. D. (2018). Method matters in psychology: Essays in applied philosophy of science. Springer Nature.
- Haig B. D. (2022). Abductive research methods in psychological science. In Magnani L. (Ed.), Handbook of abductive cognition. Springer Nature. 10.1007/978-3-030-68436-5_64-1
- Haig B. D., Evers C. W. (2016). Realist inquiry in social science. Sage.
- Harré R. (1976). The constructive role of models. In Collins L. (Ed.), The use of models in the social sciences (pp. 16–43). Tavistock.
- Harré R., Secord P. F. (1972). The explanation of social behavior. Blackwell.
- Hempel C. G. (1965). Aspects of scientific explanation and other essays in the philosophy of science. Free Press.
- Hooker C. A. (1975). On global theories. Philosophy of Science, 42, 152–179.
- Ising E. (1925). Beitrag zur Theorie des Ferromagnetismus [Contribution to the theory of ferromagnetism]. Zeitschrift für Physik, 31, 253–258.
- Jensen A. R. (1998). The g factor: The science of mental ability. Praeger.
- Kane M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–73.
- Kaplan D. (2000). Structural equation modeling: Foundations and extensions. Sage.
- Kincaid H. (2000). Global arguments and local realism about the social sciences. Philosophy of Science (Supplement), 67, 667–678.
- Landy F. J. (1986). Stamp collecting versus science: Validation as hypothesis testing. American Psychologist, 41, 1183–1192.
- Leonelli S. (2016). Data-centric biology: A philosophical study. University of Chicago Press.
- Maier M., van Dongen N., Borsboom D. (2023). Comparing theories with the Ising model of explanatory coherence. Psychological Methods. Advance online publication. 10.1037/met0000543
- Mäki U. (2005). Reglobalizing realism by going local, or (how) should our formulations of scientific realism be informed about the sciences? Erkenntnis, 63, 231–251.
- Markus K. A., Borsboom D. (2013). Frontiers of test validity theory: Measurement, causation, and meaning. Routledge.
- Markus K. A., Hawes S. W., Thasites R. J. (2008). Abductive inferences to psychological variables: Steiger’s question and best explanations of psychopathy. Journal of Clinical Psychology, 64, 1069–1088.
- Meehl P. E. (1990). Appraising and amending theories: The strategy of Lakatosian defense and two principles that warrant it. Psychological Inquiry, 1, 108–141.
- Meehl P. E. (2002). Cliometric metatheory II: Criteria scientists use in theory appraisal and why it is rational to do so. Psychological Reports, 91, 339–404.
- Meehl P. E., Waller N. G. (2002). The path analysis controversy: A new statistical approach to strong appraisal of verisimilitude. Psychological Methods, 7, 283–300.
- Mitchell S. D. (1997). Pragmatic laws. Philosophy of Science, 64, S468–S479.
- Mitchell S. D. (2000). Dimensions of scientific law. Philosophy of Science, 67, 242–265.
- Mormann T. (2007). The structure of scientific theories in logical empiricism. In Richardson A., Uebel T. (Eds.), The Cambridge companion to logical empiricism (pp. 136–162). Cambridge University Press.
- Mulaik S. A. (2009). Foundations of factor analysis (2nd ed.). Chapman & Hall/CRC.
- National Academies of Sciences, Engineering, and Medicine. (2022). Ontologies in the behavioral sciences: Accelerating research and the spread of knowledge. National Academies Press.
- Nickles T. (2009). Life at the frontier: The relevance of heuristic appraisal to policy. Axiomathes, 19, 441–464.
- Nickles T. (2017). Cognitive illusions and nonrealism: Objections and replies. In Agazzi E. (Ed.), Varieties of scientific realism: Objectivity and truth in science (pp. 151–163). Springer.
- Oude Maatman F. J. W. (2021). Psychology’s theory crisis, and why formal modelling cannot solve it. PsyArXiv. 10.31234/osf.io/puqvs
- Poldrack R. A., Yarkoni T. (2016). From brain maps to cognitive ontologies: Informatics and the search for mental structure. Annual Review of Psychology, 67, 587–612.
- Popper K. R. (1972). Objective knowledge: An evolutionary approach. Clarendon Press.
- Proulx T., Morey R. D. (2021). Beyond statistical ritual: Theory in psychological science. Perspectives on Psychological Science, 16, 671–681. 10.1177/17456916211017098
- Psillos S. (1999). Scientific realism: How science tracks the truth. Routledge.
- Rozeboom W. W. (1984). Dispositions do explain: Picking up the pieces after Hurricane Walter. In Royce J. R., Mos L. P. (Eds.), Annals of theoretical psychology (Vol. 1, pp. 205–224). Plenum.
- Rozeboom W. W. (1997). Good science is abductive, not hypothetico-deductive. In Harlow L. L., Mulaik S. A., Steiger J. H. (Eds.), What if there were no significance tests? (pp. 335–391). Erlbaum.
- Rozeboom W. W. (2005). Meehl on metatheory. Journal of Clinical Psychology, 61, 1317–1354.
- Schaffner K. F. (2020). A comparison of two neurobiological models of fear and anxiety: A “construct validity” application? Perspectives on Psychological Science, 15, 1214–1227.
- Skinner B. F. (1984). Methods and theories in the experimental analysis of behavior. Behavioral and Brain Sciences, 7, 511–546.
- Slaney K. L. (2017). Validating psychological constructs: Historical, philosophical, and practical dimensions. Palgrave.
- Slaney K. L., Racine T. P. (2013). What’s in a name? Psychology’s ever evasive construct. New Ideas in Psychology, 31, 4–12.
- Sober E. (1988). The principle of the common cause. In Fetzer J. H. (Ed.), Probability and causality (pp. 211–229). Reidel.
- Stone C. (2019). A defense and definition of construct validity in psychology. Philosophy of Science, 86, 1250–1261.
- Stone C. M. (2021). Psychological construct validity (Publication No. 2463) [Doctoral dissertation, Washington University in St. Louis]. Open Scholarship Arts & Sciences Electronic Theses and Dissertations. https://openscholarship.wustl.edu/art_sci_etds/2463
- Suppe F. (Ed.). (1977). The structure of scientific theories (2nd ed.). University of Illinois Press.
- Suppe F. (2000). Understanding scientific theories: An assessment of developments, 1969–1998. Philosophy of Science, 67, S102–S115.
- Teigen K. H. (2002). One hundred years of laws in psychology. American Journal of Psychology, 115, 103–118.
- Thagard P. (1988). Computational philosophy of science. MIT Press.
- Thagard P. (1992). Conceptual revolutions. Princeton University Press.
- Thagard P. (2005). What is a medical theory? In Paton R., McNamara L. (Eds.), Multidisciplinary approaches to theory in medicine (pp. 47–62). Elsevier.
- van der Maas H. L. J., Dolan C. V., Grasman R. P. P. P., Wicherts J. M., Huizenga H. M., Raijmakers M. E. J. (2006). A dynamical model of general intelligence: The positive manifold of intelligence by mutualism. Psychological Review, 113, 842–861.
- Winther R. G. (2020). The structure of scientific theories. In Zalta E. N. (Ed.), Stanford encyclopedia of philosophy (Spring 2021 ed.). https://plato.stanford.edu/archives/spr2021/entries/structure-scientific-theories
- Woodward J. (1989). Data and phenomena. Synthese, 79, 393–472.
- Wright C., Bechtel W. (2007). Mechanisms and psychological explanation. In Thagard P. (Ed.), Philosophy of psychology and cognitive science (pp. 31–79). Elsevier.
- Zumbo B. D. (2007). Validity: Foundational issues and statistical methodology. In Rao C. R., Sinharay S. (Eds.), Handbook of statistics: Vol. 26. Psychometrics (pp. 45–79). Elsevier.
- Zumbo B. D. (2009). Validity as contextualized and pragmatic explanation, and its implications for validation practice. In Lissitz R. W. (Ed.), The concept of validity: Revisions, new directions, and applications (pp. 65–82). Information Age Publishers.
