A Mixture Approach to Vagueness and Ambiguity

Steven Verheyen; Gert Storms

2013 May 7;8(5):e63507. doi: 10.1371/journal.pone.0063507


Editor: Kevin Paterson

PMCID: PMC3646747 PMID: 23667627

Abstract

When asked to indicate which items from a set of candidates belong to a particular natural language category, inter-individual differences occur: Individuals disagree about which items should be considered category members. The premise of this paper is that these inter-individual differences in semantic categorization reflect both ambiguity and vagueness. Categorization differences are said to be due to ambiguity when individuals employ different criteria for categorization. For instance, individuals may disagree whether hiking or darts is the better example of sports because they emphasize, respectively, whether an activity is strenuous and whether rules apply. Categorization differences are said to be due to vagueness when individuals employ different cut-offs for separating members from non-members. For instance, the decision whether or not to include hiking in the sports category may hinge on how strenuous different individuals require sports to be. This claim is supported by the application of a mixture model to categorization data for eight natural language categories. The mixture model can identify latent groups of categorizers who regard different items as likely category members (i.e., ambiguity), with categorizers within each of the groups differing in their propensity to provide membership responses (i.e., vagueness). The identified subgroups are shown to emphasize different sets of category attributes when making their categorization decisions.

Vagueness and Ambiguity

Many, if not most, of the words we regularly use are vague. Although we are familiar with their meaning, we can at times be uncertain whether they apply to a particular instance or not. Adjectives like tall and bald are textbook examples [1]–[3]. We have no difficulty distinguishing women who are clearly tall from women who are clearly not, but might find it difficult to say whether a woman who is slightly above average height is tall or not. A man with zero hairs is definitely bald and a man with a full head of hair is definitely not. Many a hairdo resides at the boundary of bald and not bald, though.

Vagueness is not restricted to adjectives. Nouns too pick out categories with vague boundaries, which leaves the membership status of some instances unclear [4]–[10]. Perhaps the example that comes easiest to mind is that of the tomato. Is it a vegetable or is it not? Research in psychology has shown that individuals answer this question differently, despite having all indicated that they know the meaning of the terms involved [11], [12]. Inter-individual differences in the answer to category membership questions like these have become an important source of information about the vagueness of words, dating back to the nineteen-thirties [13].

Both in philosophy and psychology the vagueness of language terms is generally thought to result from individuals adopting different cut-offs in separating category members from non-members [3]. According to these prevailing views a word like tall is vague because individuals differ with respect to the height from which they start to call women tall. While one individual may require a woman to be 175 cm in height to be termed tall, someone else may require a woman to be at least 180 cm. In this example, one would of course expect that the proportion of individuals who call a woman tall increases with her height. Typically, candidate instances for a vague word can be organized along a particular dimension of variation (e.g., height), with the word being endorsed more often the further an instance is positioned along the relevant dimension.

In treatments of vagueness one often presumes to know the dimension of variation along which the cuts are made. For tall it might appear trivial that one would use height as a criterion. For bald the issue is already less trivial. One could suggest the number of hairs on one's scalp as a criterion for determining whether someone is rightfully called bald or not. However, the position of the hairs along the scalp and the manner in which the hairs are organized might also matter (see the lengths to which some men go to construct elaborate comb-overs). The issue is even more pronounced for nouns. Membership in noun categories is not determined by necessary and sufficient criteria, but is based on a number of attributes that are merely characteristic of the category [14]. Wittgenstein's treatment of games has become the paradigmatic example [15]. Games share attributes, but there is not one attribute they all have in common. Many games have a competitive element, for instance, but not all (e.g., solitaire). Like most games, solitaire does come with rules and is played for amusement. Instances are regarded better examples of a category, the more of these shared characteristic attributes they possess [16], [17].

The existence of several dimensions of variation opens the door to a notion that is related to, but different from vagueness: ambiguity [3], [18], [19]. Ambiguity arises when individuals employ different criteria to determine whether a word applies to an instance or not. For instance, a difference of opinion as to whether a man is truly bald is the result of ambiguity, not vagueness when one judge uses the number of hairs as a criterion and the other the distribution of hairs across the scalp. Note that the resolution of ambiguity does not resolve vagueness. Even if one agrees that the number of hairs should be used as the criterion for baldness, vagueness may persist when individuals do not agree on the required number of hairs.

The example above employs two dictionary senses of bald (<lacking hair on the head> and <lacking a natural or usual covering>), but ambiguity may be more subtle. The literature on noun categories contains ample evidence that even one's current goals or interests and recent or typical interactions with instances of a category can affect what attributes are accentuated in semantic categorization [20]–[23]. As a result it has proven notoriously difficult to disentangle vagueness and ambiguity [24]. Many accounts of vagueness therefore shy away from the problem by presuming the dimension of variation along which the cuts are made to be known or by leaving it unspecified. That is, they prefer to focus on how vagueness may persist when everyone is assumed to use the same criteria to judge whether a word applies to a particular instance.

In what follows we too want to propose an account of vagueness, but one that does not discard ambiguity from the start. Rather, the account allows for the identification of groups of individuals who employ different dimensions of variation to judge whether a word applies to a particular instance or not. Within each of the identified groups vagueness is thought to arise from individuals adopting different cut-offs along a common dimension of variation. Both in the exposition of the account and in its application we have decided to focus on noun categories because, as was noted above, the vagueness-ambiguity issue seems more pronounced for these parts of speech. The application to other word classes is straightforward, however. As we will see, for the approach to work one only requires categorization decisions towards a set of candidate instances that elicit considerable inter-individual differences. Such borderline items constitute the natural choice of materials in any study on vagueness.

Introduction to the Approach

Imagine we asked a number of individuals to judge whether the label sports applies to a set of candidate items. To start off, we assume that there is no ambiguity in play, only vagueness. That is, everyone uses the same criteria to decide on category membership. The use of different cut-offs to separate members from non-members is the only source of inter-individual differences in categorization. Say all respondents require candidate instances to be activities that are physically demanding. Certain individuals may want to see more evidence of this requirement than others, though. Seeing that hiking is physically more demanding than darts is, some respondents might only deem hiking physical enough to be considered a sport, while others might find both darts and hiking demanding enough. Across all respondents one would expect hiking to be more often endorsed than darts as it meets the category requirements better.

It is clear that in this hypothetical example the individuals' response patterns are informative with respect to the dimension of variation along which the cuts are made. Notably, the responses of any individual would follow a Guttman structure if they were arranged according to the proportion of individuals who endorsed them as category members. A Guttman structure with n entries consists of a series of k zeros, followed by a series of n–k ones (e.g., 0, 0, 0, 1, 1 for n = 5 and k = 3). The order of instances is invariant across individuals. It suggests that all individuals employ the same criteria to decide whether to endorse an instance or not, with a higher probability of being endorsed, the better an instance meets the requirements. The value of k may differ between individuals, suggesting we are dealing with a vague concept without a generally agreed upon cut-off between members and non-members. For instance, the patterns 0, 0, 0, 1, 1 and 0, 1, 1, 1, 1 would indicate that the first respondent imposes a higher requirement than the second respondent does. In our hypothetical example we presumed that all individuals were judging category membership based on how demanding an activity is. The more demanding activities would then be placed more to the right if activities were to be arranged according to the proportion of individuals who endorsed them as sports. If one were not to know beforehand the dimension of variation individuals were using to judge category membership, one could thus infer it from the Guttman structure by establishing that activities are positioned more to the right (are endorsed more often) the more demanding they are.
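The Guttman idea can be made concrete with a small check. The sketch below is only an illustration: the function names and the three response patterns are hypothetical, and it tests for a deterministic Guttman structure, which real categorization data will rarely satisfy exactly.

```python
import numpy as np

def is_guttman(responses):
    """True if, with items sorted by endorsement proportion (ascending),
    every respondent's row is a run of zeros followed by a run of ones."""
    responses = np.asarray(responses, dtype=int)
    order = np.argsort(responses.mean(axis=0), kind="stable")
    ordered = responses[:, order]
    # Within a Guttman row, values never decrease from left to right.
    return bool(np.all(np.diff(ordered, axis=1) >= 0))

def cutoffs(responses):
    """Number of zeros k per respondent; a higher k means a stricter cut-off.
    Only meaningful as a cut-off when the data have a Guttman structure."""
    responses = np.asarray(responses, dtype=int)
    return responses.shape[1] - responses.sum(axis=1)

# Hypothetical data: rows are respondents, columns are candidate items.
data = [
    [0, 0, 0, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
guttman = is_guttman(data)
k_values = cutoffs(data).tolist()
```

Here the first respondent has the largest k and so imposes the highest requirement, while the shared item order reflects the common dimension of variation.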

Let us now assume that the concept of sports is not only vague, but also ambiguous. For instance, among the respondents there are those who feel a sport is an activity that is physically demanding and those who place more emphasis on elements such as rules or competition. These distinct views on what are considered representative category members are expected to result in marked categorization differences between the groups, because instances that satisfy some requirements, do not necessarily satisfy others. Among the former individuals hiking is expected to be more frequently endorsed as an example of sport than darts is, seeing that it is the physically more demanding activity. Among the latter individuals darts is expected to be more frequently endorsed than hiking is since the rule and competition requirements are better met by darts than by hiking. Disagreements like the one whether hiking or darts is the more likely sport can thus be capitalized on to identify ambiguity. In groups that take distinct views on category membership, arranging the candidate instances according to categorization proportions is likely to yield different organizations with different interpretations of the dimensions of variation used to decide on category membership.

Of course, when one has actual categorization decisions at one's disposal one does not know the extent to which the inter-individual differences in categorization are the result of vagueness and ambiguity. One does not know whether there are individuals who employ different criteria for categorization or who employs which criteria. One would like to check the respondent sample for the existence of latent groups, with the understanding that individuals within a group display consistent categorization behavior (i.e., share the same criteria) that is different from the categorization behavior of other groups (i.e., they employ different criteria). Mixture models are appropriate tools to accomplish this. Mixture models are statistical models for representing the presence of sub-populations within an overall population, without requiring that the observed data set should identify the subpopulation to which an individual belongs [25].

The mixture modeling framework we propose allows one to partition a participant sample in a number of latent groups that are different in terms of the dimension of variation they use to judge category membership (i.e., ambiguity). To do so it capitalizes on the insight that the use of different criteria is likely to result in different organizations of the candidate instances (see above). Contrary to our hypothetical example, however, the different organizations of candidate instances are arrived at without prior knowledge of the criteria that are in use in the participant sample or any other a priori division of the participants. Instead, they are inferred from the data. Participants who are placed together in a group are consequently understood to use the same criteria for categorization. These categorizers can, however, display varying degrees of propensity to endorse items as category members (i.e., vagueness).

We will use the mixture approach to verify whether earlier approaches have justly discarded ambiguity when accounting for the vagueness of natural language categories. When our approach suggests a categorization data set is best accounted for by a single group, there is no evidence for ambiguity. All the inter-individual categorization differences are then due to vagueness. When a solution with more than one group is retained, this constitutes evidence for ambiguity in addition to vagueness. Categorization differences between individuals in the same group are believed to be due to vagueness. Categorization differences between individuals from different groups are believed to be due to ambiguity.

The goal of Study 1 is to establish the extent to which vagueness and ambiguity are responsible for the inter-individual differences present in an apparently homogeneous sample of categorizers. Foreshadowing our main result, it will allow us to demonstrate that, contrary to what is customarily assumed, these inter-individual differences are not only due to vagueness, but to ambiguity as well. With the identification of latent groups of categorizers who employ different criteria for categorization, the nature of these criteria differences (i.e., the dimensions of variation employed) is not yet identified, however. As we already mentioned, it is straightforward to investigate these differences by inspecting the relative positions of the instances and relating these positions to external information one might have about the instances. The goal of Study 2 is therefore to uncover the nature of the between-group categorization differences. The different subgroups will be shown to emphasize different sets of category attributes when making their categorization decisions. Before turning to the details of Studies 1 and 2, we will elaborate on the formal details of our approach.

Model Details

From the general class of mixture models we propose to use a mixture item response theory model [26], [27]. Mixture item response theory models are traditionally employed to assess individuals' aptitudes and dispositions in response to a number of test items. However, the one-group variant has also been applied to semantic categorization [28], [29]. This one-group model has in common with traditional approaches to vagueness that it discards ambiguity from the start and presumes that only vagueness is in play. That is, it assumes that all individuals apply the same category requirements and differ only in the degree to which these need to be expressed in an instance to endorse it as a category member. Thus, participants do not differ in terms of the criteria they use, but may do so in the categorization cut-off they employ. This particular model uses the information that is contained in the individuals' response patterns to organize both individuals and instances along a latent dimension, much like the procedure that was outlined for our hypothetical example did. The main difference with this earlier model is that in the mixture model the assumption that all participants adhere to the same dimension is relaxed. Instead, it is assumed that the participants divide in subgroups with a different organization of instances each. Within each subgroup, individuals are still thought to differ in terms of the employed categorization cut-off. The model in Equation (1), then, is a mixture of differently parameterized vagueness-only models of the kind that has already been applied to semantic categorization [28], [29]. It allows for ambiguity in addition to vagueness.

Binary categorization decisions $Y_{ci}$ constitute the input for the mixture model. $Y_{ci}$ takes value 1 when categorizer c decides that instance i is a member of the target category and takes value 0 when c decides that i is not a member. Every one of these categorization decisions is considered the outcome of a Bernoulli trial with the probability of a positive categorization response:

$$\Pr(Y_{ci} = 1 \mid z_c = g) = \frac{\exp\left[\alpha_g\left(\beta_{gi} - \theta_c\right)\right]}{1 + \exp\left[\alpha_g\left(\beta_{gi} - \theta_c\right)\right]} \qquad (1)$$

The model in Equation (1) organizes the candidate instances along a dimension according to their likelihood of being endorsed. A separate dimension is extracted for each group g of categorizers that is inferred from the data. $\beta_{gi}$ indicates the position of instance i along the dimension for group g. Higher values for $\beta_{gi}$ indicate instances that are more likely to be endorsed. It is assumed that individuals in a group employ the same criteria, and that the organization of the instances can thus be conceived of as reflecting the extent to which they meet these requirements. The better an instance meets the criteria, the more likely it is to be endorsed and consequently the higher its $\beta_{gi}$ estimate.

Groups with different criteria will value different attributes in instances, which in turn will affect the relative likelihood with which various instances are endorsed. The model therefore identifies subgroups that require separate $\beta_{gi}$ estimates. An instance i that meets many of the criteria of group g will often be selected by the members of g, resulting in a high $\beta_{gi}$ estimate. The same instance might just meet a couple of the criteria of a different group g'. As i will then not often be endorsed by the members of g', the estimate of $\beta_{g'i}$ will be low.

Individuals who employ the same criteria may still differ regarding the number of instances that make up their selection, depending on the cut-off they use in separating members from non-members. They may select a large or small number of items, depending on whether they require instances to meet the requirements to a small or to a large extent, respectively. Above, we identified the latent dimension with the membership criteria and the positions of instances along the dimension with the extent to which they meet these criteria. In a similar vein, individuals are awarded a position along the dimension, indicating the extent to which they require instances to meet the criteria in order to be endorsed. In Equation (1) $\theta_c$ indicates the position of categorizer c along the dimension for the group c is placed in. With the positions of the instances fixed for all categorizers that belong to the same group, high $\theta_c$ estimates (i.e., high standards) correspond to small extensions, while low $\theta_c$ estimates (i.e., low standards) correspond to large extensions. The $\theta_c$'s in Equation (1) thus capture the degree of liberalness/conservatism categorizers display.

The model in Equation (1) is a probabilistic one. It requires that individuals' response patterns have a probabilistic Guttman structure. That is, an instance that is positioned to the right of the cut-off will not necessarily be endorsed as a category member. Neither does a position to the left of the cut-off imply that the instance will definitely not be endorsed as a category member. Each categorization decision is considered the outcome of a Bernoulli trial with the probability of a positive categorization response determined by the relative position of instance and cut-off. An instance is more likely to be endorsed as a category member the more to the right of the cut-off it is positioned. The more to the left of the cut-off an instance is positioned, the less likely it will be endorsed. Across respondents the probability of selection increases from left to right. A separate $\alpha_g$ for each group determines the shape of the response function that relates the unbounded extent to which an object surpasses/falls short of the cut-off ($\beta_{gi} - \theta_c$) to the probability of categorization (bounded between 0 and 1). Unlike the $\beta_{gi}$'s and the $\theta_c$'s, the $\alpha_g$'s in Equation (1) can only take on positive values.
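To illustrate how the three parameters interact, the following sketch computes the endorsement probability under an assumed logistic response function, the usual choice in item response theory. The function name and the symbols alpha_g, beta_gi, and theta_c are labels chosen here for the group-specific slope, the item position, and the categorizer's cut-off. The hiking and darts positions are the ones reported for sports in the Results, while alpha_g = 1 and theta_c = 0 are purely hypothetical values.

```python
import math

def p_endorse(alpha_g: float, beta_gi: float, theta_c: float) -> float:
    """Probability that categorizer c endorses item i, given c is in group g.

    Assumed logistic form: 1 / (1 + exp(-alpha_g * (beta_gi - theta_c))).
    """
    if alpha_g <= 0:
        raise ValueError("alpha_g must be positive")
    x = alpha_g * (beta_gi - theta_c)
    # Numerically stable evaluation of the logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Item positions for hiking and darts in the two sports groups (from the text);
# the slope and cut-off below are hypothetical illustration values.
positions = {
    "one group": {"hiking": -1.25, "darts": 0.38},
    "other group": {"hiking": 0.45, "darts": -0.95},
}
for group, items in positions.items():
    for item, beta in items.items():
        print(group, item, round(p_endorse(1.0, beta, 0.0), 2))
```

An item positioned exactly at the cut-off is endorsed with probability .5, and raising theta_c (a stricter categorizer) lowers the probability for every item in the group.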

Study 1: Modeling Inter-Individual Differences in Categorization

In Study 1 we will revisit the semantic categorization data the vagueness-only model has already been applied to [29], using the mixture version of the model in Equation (1). We will consider solutions with 1, 2, 3, 4, and 5 latent subgroups. The one-group solution is the equivalent of the vagueness-only analysis that has already been undertaken. For there to be evidence in favor of both ambiguity and vagueness, an analysis of the categorization data with the mixture model would have to yield at least two subgroups of categorizers.

Method

Ethics statement

Study 1 was conducted with the approval of the review board of the University of Leuven. Written informed consent was obtained from all participants.

Participants

Two hundred and fifty first year psychology students at the University of Leuven completed a categorization task as part of a course requirement.

Materials

The materials consisted of 8 categories with 24 items each. The categories included two animal categories (fish and insects), two artifact categories (furniture and tools), two borderline artifact-natural-kind categories (fruits and vegetables), and two activity categories (sciences and sports). The corresponding items included clear members, clear non-members, and borderline cases. Note that throughout the text we will continue to employ an italic typeface to denote items and a small capital typeface to denote categories.

Procedure

Each of the participants was handed an eight page questionnaire to fill out. They were told to carefully read through the 24 items on each page and to decide for each item whether or not it belonged in the category printed on top of the page. Participants indicated their answer by either circling 1 for member or 0 for non-member. Five different orders of category administration were combined with 2 different orders of item administration, resulting in 10 different questionnaires. Each of these was filled out by 25 participants. The categorization data are available for download from the first author's website: http://ppw.kuleuven.be/concat/.

Model analyses

Each category's categorization data were analyzed separately using the model in Equation (1). For every category solutions with 1, 2, 3, 4, and 5 latent subgroups were obtained. This was done using WinBUGS [30] following the procedures that have been outlined for the Bayesian estimation of mixture item response theory models [31]. These include suggestions for the specification of the priors for the model parameters, which we adopted:

with G the number of latent groups (1 to 5), I the number of candidate items (24 for each category) and C the number of categorizers (250 for each category). Latent group membership was parameterized as a multinomially distributed random variable with $\pi_g$ reflecting the probability of membership in subgroup g. $z_c$ is the latent variable that does the group assignment.

The results are based on 3 chains of 10,000 samples each, with a burn-in of 4,000 samples. The chains were checked for convergence and label switching. All reported values are posterior means, except for group membership which is based on the posterior mode of $z_c$.
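The summary step described here (discard the burn-in, pool the chains, take the posterior mode of the group indicator) might look as follows. This is a sketch under stated assumptions, not the WinBUGS workflow itself: the array layout (chains × iterations × categorizers), the zero-based group labels, and the function name are all hypothetical, and it presumes that labels are consistent across chains, i.e., that no label switching occurred.

```python
import numpy as np

def posterior_mode_membership(z_samples, burn_in, n_groups):
    """Most frequently sampled group per categorizer after burn-in.

    z_samples: integer array of shape (chains, iterations, categorizers)
               holding sampled group labels 0 .. n_groups - 1.
    """
    z = np.asarray(z_samples)[:, burn_in:, :]   # drop burn-in from every chain
    pooled = z.reshape(-1, z.shape[-1])         # (retained draws, categorizers)
    counts = np.stack([(pooled == g).sum(axis=0) for g in range(n_groups)])
    return counts.argmax(axis=0)                # ties go to the lower label

# Tiny hypothetical example: 2 chains, 5 iterations, 2 categorizers.
z = np.zeros((2, 5, 2), dtype=int)
z[:, :2, 0] = 1      # burn-in draws for categorizer 0, discarded below
z[:, 2:, 1] = 1      # categorizer 1 is mostly sampled into group 1 ...
z[0, 2, 1] = 0       # ... with a single draw in group 0
membership = posterior_mode_membership(z, burn_in=2, n_groups=2).tolist()
```

With the settings reported in the text, burn_in would be 4,000 and each chain would hold 10,000 iterations.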

Results

Model comparisons

When determining the required number of latent groups, both fit and complexity need to be considered [32]. With additional subgroups come additional parameter estimates. That might provide for an improvement in fit, but not necessarily for a better understanding. The resulting account might end up being too complex for the data. To determine the suitable number of latent groups we will rely upon the Bayesian Information Criterion (BIC). The BIC provides an indication of the balance between goodness-of-fit and model complexity for every solution [33]. Results of a recovery study have shown that the BIC can be used to choose among mixture solutions with a different number of subgroups [31]. The solution to be preferred is that with the lowest BIC. Table 1 holds for every category five BIC values, corresponding to five partitions of increasing complexity. For each category the lowest BIC is set in bold typeface.

Table 1. BIC values for five partitions of the categorization data.

	BIC
Category	1 group	2 groups	3 groups	4 groups	5 groups
fish	3940	3780	3794	3971	4096
fruits	3304	3464	3613	3762	3911
furniture	3438	3602	3750	3586	3735
insects	4546	4406	4483	4626	4776
sciences	4398	4140	4032	4170	4311
sports	3531	3290	3413	3562	3711
tools	3826	3606	3726	3873	4020
vegetables	3490	3610	3757	3906	4054


There were three categories for which the BIC indicated that a one-group solution was to be preferred. This was the case for fruits, vegetables, and furniture. This suggests that the inter-individual categorization differences for these categories only reflect vagueness. The categorizers employed different cut-offs, but used the same criteria to determine category membership. For the remainder of the categories the BIC indicated that multiple groups could be discerned among the categorizers. In the case of insects, sports, fish, and tools the BIC suggested there were two such groups. In the case of sciences the BIC suggested there were three. Hence, for five of the eight categories there is evidence for ambiguity in addition to vagueness. The categorizers could be partitioned in groups that organized the items differently with respect to the target category. This suggests they employed different criteria for category membership (ambiguity). Within each group the categorizers employed different cut-offs (vagueness).
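The selection rule (prefer the solution with the lowest BIC) can be replayed directly on the values in Table 1. The snippet below only re-reads the table; the variable names are illustrative.

```python
# BIC values from Table 1, for solutions with 1 to 5 latent groups.
bic = {
    "fish":       [3940, 3780, 3794, 3971, 4096],
    "fruits":     [3304, 3464, 3613, 3762, 3911],
    "furniture":  [3438, 3602, 3750, 3586, 3735],
    "insects":    [4546, 4406, 4483, 4626, 4776],
    "sciences":   [4398, 4140, 4032, 4170, 4311],
    "sports":     [3531, 3290, 3413, 3562, 3711],
    "tools":      [3826, 3606, 3726, 3873, 4020],
    "vegetables": [3490, 3610, 3757, 3906, 4054],
}

# Number of groups with the lowest BIC (list index 0 is the one-group solution).
best = {category: values.index(min(values)) + 1 for category, values in bic.items()}

for category, n_groups in best.items():
    print(f"{category}: {n_groups} group(s)")
```

The result matches the text: one group for fruits, furniture, and vegetables, two groups for fish, insects, sports, and tools, and three groups for sciences.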

The participants divided in two groups of about equal size for the categories of sports and tools. In the case of sports one group comprised 51% of the participants. The second group comprised the remaining 49%. In the case of tools these percentages equaled 54 and 46. The same participants divided in a larger and smaller group for the categories of fish and insects. In the case of fish the dominant group comprised 79% of the participants. The second group comprised the remaining 21%. For insects these percentages equaled 72 and 28. The three groups for the sciences category comprised 46%, 31%, and 23% of the participants. The magnitudes of these percentages indicate that the model analyses did not just pick out a few oddly behaving categorizers. The smallest of the groups, the second fish group, comprised 52 participants. The BIC is a conservative model selection heuristic that heavily penalizes complex models. As such it protects against a large number of groups with a small number of idiosyncratic categorizers each. Since every group comes with 24 additional parameters, a considerable number of participants had to employ a different set of criteria for them to end up in a separate group.

Group comparisons

With a considerable number of participants in each group, the dimensions of variation employed do not have to differ dramatically. A small reorganization of the items that allows for a better account of the data of a sizeable group of participants suffices for a more complex model to be chosen. When the goal is to model categorization decisions regarding natural language categories this is a desirable property. After all, we do not expect individuals to have radically different ideas about the extension of these categories. That would seriously hamper daily communication. The correlation between the mean $\beta_{gi}$ estimates from each group gives an indication of the difference between their dimensions of variation. This correlation equaled .91 for sports and .89 for tools. These numbers clearly illustrate that a small difference in item organization may suffice for a group to become separated when this group is sizeable enough. (The participants divided in two groups of about equal size for the categories of sports and tools.) The correlation equaled .73 for fish and .79 for insects. For these categories the participants were divided in a larger and a smaller group. For sciences the correlation between the mean $\beta_{gi}$ estimates from the largest group and those from the second largest group equaled .77. Both groups correlated .73 with the smallest group.
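As a minimal sketch of this comparison, the Pearson correlation between two groups' item positions can be computed as below. The six position values per group are hypothetical and serve only to show the mechanics; the function name is likewise an illustrative choice.

```python
import numpy as np

def position_correlation(positions_a, positions_b):
    """Pearson correlation between two groups' item position estimates."""
    return float(np.corrcoef(positions_a, positions_b)[0, 1])

# Hypothetical posterior mean item positions for the same six items.
group_one = [-1.8, -1.2, -0.4, 0.4, 1.1, 2.0]
group_two = [-1.6, 0.5, -0.7, -0.9, 1.3, 1.9]
r = position_correlation(group_one, group_two)
print(round(r, 2))
```

A value close to 1 indicates that the two groups order the items almost identically, whereas lower values point to a more substantial reorganization of the items.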

The correlations between the mean $\beta_{gi}$ estimates from each group indicate that there is common ground among the language users in the different groups: All the correlations were statistically significant (one-tailed t-test). This does not come as a surprise. All participants are part of the same language community where they presumably exchange these category terms without experiencing major misunderstandings. Nevertheless, there appeared to be reliable differences regarding their organization of items with respect to familiar categories like sports, tools, fish, insects, and sciences. For each of these categories the participant sample could be partitioned in subgroups of distinctly behaving categorizers. How much the subgroups differed varied from one category to the other, with the smallest difference emerging for the category of sports and the greatest difference emerging for the category of fish. In both categories one can find clear examples of items that were regarded differently by the participants in the two subgroups, however. The larger group for fish considered whales likelier category members than oysters, while the smaller group had the opposite opinion. For sports a similar difference held for the items hiking and darts, with their respective $\beta_{gi}$'s −1.25 and .38 in one group and .45 and −.95 in the other.

Model fit

The BIC is a relative measure of fit. For a given data set it indicates which model from a set of candidate models is to be preferred in terms of fit and complexity. The BIC is not an absolute measure of fit, however. It does not indicate whether the preferred model adequately describes the data it was fitted to. We used the posterior predictive distribution to see whether this was the case. The posterior predictive distribution represents the relative probability of different observable outcomes after the model has been fitted to the data. It allowed us to assess whether the solutions with multiple groups fit the categorization data in absolute terms. In addition, it was insightful to include the posterior predictive distribution for the one-group model to see how it compares with the more complex models. This is illustrated for sports in Figure 1.

Figure 1. Items are ordered along the horizontal axes according to the number of participants out of 250 who endorsed them as category members. Filled black circles show per item the proportion of participants from Group 1 who provided a positive categorization response. Filled gray squares show per item the proportion of participants from Group 2 who provided a positive categorization response. Outlines of circles and squares represent the posterior predictive distribution of positive categorization decisions for Group 1 and Group 2, respectively. The size of these outlines is proportional to the posterior mass that is given to the various categorization probabilities.

In Figure 1 the 24 candidate items are placed along the horizontal axes in increasing order of endorsement (across all 250 respondents). The BIC indicated that for sports the categorizers divided into two groups of equal size. For each item a filled black circle represents the proportion of participants from the first group who provided a positive categorization response. Filled gray squares represent the proportion of participants from the second group who provided a positive categorization response. The two panels in Figure 1 are identical with respect to these data. Whether a positive or a negative response was favored could depend on the group of categorizers. Item 12 (darts), for instance, was considered a category exemplar by many of the categorizers in Group 1 (black circle), while many of the categorizers in Group 2 (gray square) did not consider it an exemplar. Similar divergences between the groups occurred for items 7 (chess), 9 (billiards), and 11 (hiking). These notable categorization differences support the division the BIC suggested.

The upper panel in Figure 1 shows the posterior predictive distribution of positive categorization decisions resulting from the one-group model. The lower panel shows the posterior predictive distribution of positive categorization decisions resulting from the two-group model. For every item the panels include a separate distribution for each categorization group (circular outlines for Group 1, square outlines for Group 2). The size of the plot symbols is proportional to the posterior mass given to the various categorization probabilities. It is clear that the one-group model did not capture the group differences in categorization that were identified for items 7, 9, 11, and 12. The model predicted categorization probabilities that were in between the categorization proportions that were observed for the two groups. Because the model adopted the same dimension of variation for both groups, it is not surprising that it could not predict very different outcomes. The two-group model could yield different model predictions due to its separate item organization for each group. In the lower panel of Figure 1 the posterior predictive distributions for the two groups are quite different when this was required. In the case of item 12, for instance, a positive categorization response was predicted for the Group 1 members, while negative categorization responses were predicted for the Group 2 members. Figure 1 thus clearly shows that for sports the two-group model provided a better fit to the categorization data than the one-group model did. In addition, inspection of the lower panel of Figure 1 shows that the two-group model's predictions closely mirrored the observed data: The model is appropriate for the data in absolute terms as well. This conclusion holds in all other categories for which the BIC indicated a more complex model was to be preferred.
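The logic of such a posterior predictive check can be sketched as follows. This is not the procedure used in the paper: the posterior draws are uniform random stand-ins rather than output from a fitted model, the observed proportion is hypothetical, and the group size of 125 merely follows from splitting 250 respondents into two equal groups. The sketch simulates, for one item in one group, the proportion of positive responses implied by each posterior draw and asks whether the observed proportion falls inside the central 95% of the simulated proportions.

```python
import random

def posterior_predictive_proportions(prob_samples, n_participants, rng):
    """Simulate one group-level endorsement proportion per posterior draw."""
    proportions = []
    for p in prob_samples:
        positives = sum(1 for _ in range(n_participants) if rng.random() < p)
        proportions.append(positives / n_participants)
    return proportions

def central_interval(values, mass=0.95):
    """Central interval of the simulated proportions (simple order statistics)."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lower = ordered[int((1 - mass) / 2 * last)]
    upper = ordered[int((1 + mass) / 2 * last)]
    return lower, upper

def covers(interval, value):
    return interval[0] <= value <= interval[1]

rng = random.Random(1)
n_group = 125      # 250 respondents in two equal groups
observed = 0.78    # hypothetical observed proportion for one item in one group

# Stand-ins for posterior draws of the item's endorsement probability.
# A two-group model can place the probability near the group's own data;
# a one-group model is pulled toward a value in between the two groups.
two_group_draws = [rng.uniform(0.75, 0.85) for _ in range(2000)]
one_group_draws = [rng.uniform(0.45, 0.55) for _ in range(2000)]

two_group_interval = central_interval(
    posterior_predictive_proportions(two_group_draws, n_group, rng))
one_group_interval = central_interval(
    posterior_predictive_proportions(one_group_draws, n_group, rng))

print("two-group covers observed:", covers(two_group_interval, observed))
print("one-group covers observed:", covers(one_group_interval, observed))
```

A check of this kind is absolute rather than relative: a model can win the BIC comparison and still fail to cover the observed proportions.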

Discussion

The results from Study 1 indicate that inter-individual differences in semantic categorization need not only indicate vagueness. They can result from ambiguity as well. While for some natural language categories (fruits, furniture, vegetables) the most parsimonious account of the inter-individual categorization differences involves the use of different cut-offs for common criteria, other categories (fish, insects, sciences, sports, tools) require the additional assumption of the use of different criteria by different participants. In this respect these results qualify those that involved the application of a vagueness-only model to the same categorization data [29]. That particular model is found to apply here, with the understanding that it does so in subgroups of participants who employ different criteria for category membership.

First and foremost this finding has important implications for the so-called threshold theory of categorization [34]–[36] of which the vagueness-only model was intended to be a formalization. The threshold theory assumes that prior to categorization respondents assess the similarity between item and category. The position of the item along the latent dimension is thought to reflect the outcome of this assessment, with items that are highly similar to the category receiving a position further along the dimension than items that resemble the category to a lesser degree. This item-category similarity is then compared against an internal threshold, the position of the categorizer along the dimension, to decide whether it affords a positive rather than a negative decision. The existence of multiple item organizations for a single category suggests that it might be improper to assume a default similarity assessment outcome that is the same for all language users. Rather, it would appear that there exist a number of these default outcomes, some of which may be more prominent than others. Which one of these defaults is involved in the categorization responses of a particular participant is then indicated by that participant's group membership, with the size of the group providing an indication of the prominence of the corresponding assessment outcome. For each of the categories the extracted subgroups were few in number and considerable in size, suggesting that the goals, interests, experiences and/or interactions with category instances that might be responsible for the ambiguity are largely shared by members of the language community, rather than idiosyncratic.
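The threshold comparison described above can be made concrete with a small sketch. The logistic link and the `scale` parameter are assumptions made for illustration, and the group labels `group_a` and `group_b` are hypothetical. The item positions for hiking and darts are the estimates reported earlier for the two sports groups.

```python
import math

def p_member(item_position, threshold, scale=1.0):
    """Probability of a positive categorization decision.

    Assumption: a logistic function of the distance between the item's
    position and the categorizer's threshold on the same latent dimension.
    """
    return 1.0 / (1.0 + math.exp(-scale * (item_position - threshold)))

# Group-specific item organizations (ambiguity): the same items occupy
# different positions depending on the group of the categorizer.
item_positions = {
    "group_a": {"hiking": -1.25, "darts": 0.38},
    "group_b": {"hiking": 0.45, "darts": -0.95},
}

# Within a group, categorizers differ only in their threshold (vagueness).
for group, positions in item_positions.items():
    for threshold in (-0.5, 0.5):   # hypothetical lenient and strict categorizers
        probs = {item: round(p_member(pos, threshold), 2)
                 for item, pos in positions.items()}
        print(group, "threshold", threshold, probs)
```

In this sketch, moving the threshold changes how many items are endorsed but never their order, whereas switching groups reverses the order of hiking and darts. That is the distinction between vagueness and ambiguity the mixture approach is meant to capture.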

Of course, the finding that for several categories there is more than one dimension of variation that informs categorization has implications for any account of inter-individual categorization differences that presumes this dimension to be known or leaves it unspecified [3]. Such accounts run the risk of attributing differences which in fact result from ambiguity to vagueness and/or attributing differences that result from variation along one dimension to variation along another. To obtain a better understanding of the manner in which these dimensions of variation differ, external measures can be related to the item position estimates of the different groups. In Study 2 the employed criteria are substantiated using attributes that are deemed characteristic of the target categories.

Study 2: Substantiating the Criteria for Semantic Categorization

In Study 2 an explanation of the items' positions along the latent dimensions is attempted. The focus will be on the group differences in these positions since this ambiguity constitutes the cardinal contribution of Study 1. Before, differences like these had only been shown between groups of categorizers that were a priori known to differ in a respect considered important for categorization [21], [22]. The main research question shifts from “Are there group differences in categorization?” in the previous study to “How are the groups different?” in the current study. To answer this question the groups' item position estimates are related to external measures that may reveal the considerations that go into the categorization decisions. First, we determine to what extent attributes that participants consider important for category membership are true of the different candidate items. Then, we obtain a small number of principal components that convey the information that is contained in these attribute applicability judgments. Finally, the item organizations of different groups are regressed upon these principal components to look for distinct patterns of attention to or weighting of the attribute information they represent.
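The three steps can be sketched with NumPy under stated assumptions. The applicability matrix and the item positions below are randomly generated placeholders, the number of retained components is arbitrary, and the function names are hypothetical. The sketch uses a plain principal component analysis via the singular value decomposition and ordinary least squares, which need not match the exact procedures used in the study.

```python
import numpy as np

def principal_component_scores(applicability, n_components):
    """Scores of the items on the first principal components (via SVD)."""
    centered = applicability - applicability.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def regress_on_components(item_positions, scores):
    """Ordinary least squares of item positions on component scores."""
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, item_positions, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((item_positions - fitted) ** 2)
    ss_tot = np.sum((item_positions - item_positions.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)

# Step 1 (placeholder data): 24 items by 10 attributes, 1 = attribute applies.
applicability = rng.integers(0, 2, size=(24, 10)).astype(float)

# Step 2: condense the attribute information into two components.
scores = principal_component_scores(applicability, n_components=2)

# Synthetic item positions: group 1 follows the first component,
# group 2 the second, plus a little noise.
positions_g1 = scores[:, 0] + rng.normal(0.0, 0.1, size=24)
positions_g2 = scores[:, 1] + rng.normal(0.0, 0.1, size=24)

# Step 3: regress each group's item organization on the components.
coef_g1, r2_g1 = regress_on_components(positions_g1, scores)
coef_g2, r2_g2 = regress_on_components(positions_g2, scores)

# coef[0] is the intercept; coef[1] and coef[2] weight the two components.
print("group 1 weights:", coef_g1[1:], "R^2:", r2_g1)
print("group 2 weights:", coef_g2[1:], "R^2:", r2_g2)
```

Distinct weight patterns across groups, as in this synthetic case, are what would indicate that the groups attend to different attribute information.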

Characteristic category attributes were already available for the categories of fish, fruits, insects, sports, tools, and vegetables [37]. For the purpose of this study, additional category attributes were collected for furniture and sciences. For each category a matrix was then constructed indicating which category attributes apply to which category items. The applicability matrices for fish, fruits, insects, and vegetables have already been described elsewhere [28]. The applicability matrices for furniture, sciences, sports, and tools are new.