Author manuscript; available in PMC: 2009 Aug 1.
Published in final edited form as: Cognition. 2008 Apr 23;108(2):543–556. doi: 10.1016/j.cognition.2008.03.002

Sample diversity and premise typicality in inductive reasoning: Evidence for developmental change

Marjorie Rhodes 1, Daniel Brickman 1, Susan A Gelman 1
PMCID: PMC2525567  NIHMSID: NIHMS58863  PMID: 18436200

Abstract

Evaluating whether a limited sample of evidence provides a good basis for induction is a critical cognitive task. We hypothesized that whereas adults evaluate the inductive strength of samples containing multiple pieces of evidence by attending to the relations among the exemplars (e.g., sample diversity), six-year-olds would attend to the degree to which each individual exemplar in a sample independently appears informative (e.g., premise typicality). To test these hypotheses, participants were asked to select between diverse and non-diverse samples to help them learn about basic-level animal categories. Across four between-subjects conditions (N = 133), we varied the typicality present in the diverse and non-diverse samples. We found that adults reliably chose to examine diverse over non-diverse samples, regardless of exemplar typicality; six-year-olds preferred to examine samples containing typical exemplars, regardless of sample diversity; and nine-year-olds were in the midst of this developmental transition.


Inductive reasoning is central to human learning, as most of the knowledge we possess is acquired via inductive inferences rather than through direct instruction or observation. Thus, there has been considerable interest in understanding the basis of inductive reasoning processes throughout development. The question of how inductive reasoning skills develop relates to a major theoretical debate in the field of cognitive development regarding the relative contributions to development of knowledge enrichment and of conceptual change. Do young children generalize knowledge using the same reasoning mechanisms as adults do (sometimes arriving at different conclusions due to limitations in their knowledge base), or does young children's inductive methodology differ systematically from the adult approach (e.g., Carey, 1985; Heit, 2000; Viale & Osherson, 2002)? Within this framework, the goal of the present research was to examine developmental changes in how individuals approach a key challenge of inductive reasoning—determining whether limited evidence provides a strong sample on which to base broader generalizations.

Adults' Criteria for Evaluating Samples

As discussed by Heit (2000), adults have different strategies for evaluating samples that contain single versus multiple pieces of evidence. For example, to evaluate whether a single bird is informative about all birds, adults usually consider how typical the given exemplar is of the category “birds,” and base generalizations on typical exemplars (e.g., robins) more than atypical exemplars (e.g., penguins; Rips, 1975). When evaluating samples containing multiple pieces of evidence, however, adults do not simply sum the typicality of each of the given exemplars; rather, they focus on group-level properties.

To illustrate this distinction, consider the following inductive problem: If a person wants to learn about a biological property of mammals, is it better to examine a sample containing a lion and a tiger, or a sample containing a whale and a llama? Lions and tigers are likely to be perceived as more typical of the category ‘mammal’ than are either whales or llamas (e.g., Barr & Caplan, 1987; Diesendruck & Gelman, 1999). Yet, as described by Osherson, Smith, Wilkie, Lopez, and Shafir (1990), the sample of a whale and llama seems relatively more informative in this case because adults evaluate multiple-exemplar samples by attending to the extent to which the given exemplars, together, cover the relevant inclusive category. Whales and llamas provide broad coverage of mammals in that they represent very different kinds of mammals. Viewing diverse samples as more informative is a robust phenomenon among adult populations (Heit & Feeney, 2005; Kim & Keil, 2003; Lopez, 1995; but see Medin, Coley, Storms, & Hayes, 2003, for a description of cross-cultural variation) and is consistent with theorizing from the philosophy of science indicating that it is more valid to draw conclusions based on evidence obtained from diverse sources (Heit, Hahn, & Feeney, 2004).

Children's Criteria for Evaluating Samples

When evaluating samples that contain single pieces of evidence, children as young as five years old share adults' preferences for typical exemplars (Lo, Sides, Rozelle, & Osherson, 2002; Lopez, Gelman, Gutheil, & Smith, 1992). There is decidedly mixed evidence, however, regarding the criteria that children use to evaluate multiple-exemplar samples.

Lopez et al. (1992) examined preferences for diverse samples developmentally by asking six-year-olds, eight-year-olds, and adults to select whether to generalize a novel property found among diverse or non-diverse sets of animals. In these experiments, adults reliably preferred to generalize properties found among diverse sets, eight-year-olds demonstrated limited sensitivity to diversity, but six-year-olds responded randomly. Gutheil and Gelman (1997) replicated these negative diversity effects among children in a series of studies focusing on basic level animal categories (see also Rhodes, Gelman, & Brickman, in press). Based on these findings, the authors of these studies proposed that young children do not attend to sample diversity to evaluate evidence, and therefore, that there are important developmental changes in the mechanisms that support inductive reasoning.

An alternate perspective, however, suggests that apparent developmental changes in inductive reasoning result from limitations in young children's knowledge base (e.g., Carey, 1985; Heit & Hahn, 2001). To examine this possibility, Heit and Hahn (2001) designed studies that involved items and properties thought to be more familiar to young children. They presented five-year-olds with two samples of dolls, for example, including three diverse dolls belonging to one character and three non-diverse dolls belonging to another character. Participants were then shown another different doll, and were asked to predict which character owned the doll. On these questions, children reliably responded that the target doll belonged to the character that owned the diverse set of dolls. Based on these data, Heit and Hahn (2001) suggest that young children can engage in diversity-based reasoning under simplified conditions.

These findings demonstrate that young children can recognize sample diversity and sort evidence based on diversity. In our view, however, they do not clearly indicate that young children view diverse samples as a stronger basis for induction. Instead, children may solve these problems by recognizing that diverse items better match diverse (rather than non-diverse) sets of evidence (Lo et al., 2002). Similarly, Shipley and Shepperson (2006) report that preschool children prefer to test toys from two sub-classes (e.g., one blue whistle and one red whistle) in order to determine if the whistles ‘made good party favors.’ Although this task also reveals that young children recognize and reason about sample diversity, because participants were not asked to make an inference about a larger set (e.g., whistles not included in either specific sub-class), this study also does not provide clear evidence that young children use sample diversity to determine whether a sample provides a good basis for broader generalizations (i.e., about a larger category or unobserved instance).

The Present Research

In the present study, we further examine diversity-based reasoning in children and also directly examine the possibility that the mechanisms that support inductive reasoning among young children differ systematically from those that support adult induction. Although previous research has documented that young children do not consistently demonstrate preferences for diverse samples on tasks designed to elicit diversity-based reasoning, prior work has not focused on revealing the kinds of strategies that children do in fact apply to such problems. Identifying that children have a consistent approach to these inductive problems would support the proposal that there are meaningful developmental changes in the mechanisms that support induction.

As reviewed above, a requirement for engaging in diversity-based reasoning is that individuals evaluate samples by considering the relations among premise exemplars (e.g., how well the given exemplars, taken together, cover an inclusive category), as opposed to the inductive potential of each premise considered separately (e.g., premise typicality). We proposed that young children's difficulty evaluating multiple-exemplar samples based on diversity may result from a failure to evaluate these group-level relations; instead, we hypothesized that young children evaluate samples by focusing on the individual properties (e.g., the typicality) of each of the given exemplars. If this hypothesis is correct, young children should prefer to base generalizations on multiple-exemplar samples that contain the most informative individual exemplars, regardless of the relations among the exemplars. We tested this possibility by examining how children and adults weight the influence of properties of individual examples in a sample (i.e., typicality) and the properties of whole samples (i.e., diversity) in evaluating inductive potential.

To examine participants' evaluation of different kinds of samples, we asked participants to select pairs of animals that they would like to examine to help them learn about a basic level animal category. In all conditions, participants were given choices between diverse and non-diverse samples of animals (e.g., two dogs of different or the same species) to help them learn about basic-level categories (e.g., dogs). This task explicitly required participants to select the sample they believed was the stronger one, and therefore, is a direct test of evidence evaluation (Lopez, Atran, Coley, Medin, & Smith, 1997; Rhodes et al., in press; Shipley & Shepperson, 2006). We asked participants to select samples to support a general conclusion, as opposed to a specific conclusion, to preclude a strategy based on similarity matching.

To test our hypothesis about the differential influence of typicality and diversity for each age-group, we constructed four sets of stimuli corresponding to a between-subjects design with four conditions. Across the conditions, we varied the amount of typicality present in the diverse and non-diverse sets (see Table 1). We included six-year-olds and adults, as well as a sample of nine-year-olds in order to better chart the course of developmental change. As detailed in Table 1, we hypothesized that when typicality was held constant (all exemplars were typical or atypical) nine-year-olds and adults would select diverse samples and six-year-olds would choose randomly. When typicality varied across the two samples, however, we hypothesized that six-year-olds would choose to examine the typical samples, regardless of whether they contained diverse or non-diverse exemplars, but that adults would reliably select to examine diverse sets. When typicality and diversity were directly pitted against each other (e.g., a strategy based on diversity would favor the diverse set and a strategy based on typicality would favor the non-diverse set), we expected nine-year-olds to demonstrate chance-level responding, reflecting the relative fragility of their preference for diversity (e.g., Gutheil & Gelman, 1997; Lopez et al., 1992; Rhodes et al., in press).

Table 1. Summary of conditions and hypotheses.

Condition | Non-diverse set | Diverse set | Hypotheses
All Typical | Typical | Typical | Adults favor diverse; 9-year-olds favor diverse; 6-year-olds respond at chance
All Atypical | Atypical | Atypical | Adults favor diverse; 9-year-olds favor diverse; 6-year-olds respond at chance
Diverse-Typical | Atypical | Typical | Adults favor diverse; 9-year-olds favor diverse; 6-year-olds favor diverse
Diverse-Atypical | Typical | Atypical | Adults favor diverse; 9-year-olds respond at chance; 6-year-olds favor non-diverse
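
The logic behind the adult and six-year-old predictions in Table 1 can be sketched in a few lines of Python. This is only an illustration of the two candidate strategies; the names `CONDITIONS` and `predict_choice` are hypothetical and do not come from the study materials, and the intermediate nine-year-old pattern is deliberately not modeled.

```python
# Illustrative sketch only: the condition table mirrors Table 1, but the
# function and variable names are hypothetical, not the authors' materials.
CONDITIONS = {
    # condition: (typicality of non-diverse set, typicality of diverse set)
    "All Typical": ("typical", "typical"),
    "All Atypical": ("atypical", "atypical"),
    "Diverse-Typical": ("atypical", "typical"),
    "Diverse-Atypical": ("typical", "atypical"),
}


def predict_choice(strategy, condition):
    """Return 'diverse', 'non-diverse', or 'chance' for a given strategy."""
    non_diverse, diverse = CONDITIONS[condition]
    if strategy == "diversity":
        # Group-level strategy: the diverse set wins whatever its typicality.
        return "diverse"
    if strategy == "typicality":
        # Exemplar-level strategy: prefer whichever set holds typical items;
        # if the sets do not differ in typicality, nothing guides the choice.
        if non_diverse == diverse:
            return "chance"
        return "diverse" if diverse == "typical" else "non-diverse"
    raise ValueError(f"unknown strategy: {strategy}")


for name in CONDITIONS:
    print(name, predict_choice("diversity", name), predict_choice("typicality", name))
```

The two strategies make the same prediction only in the Diverse-Typical condition, which is why the Diverse-Atypical and equal-typicality conditions are the informative ones.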

Method

Participants

Participants (N = 133) included 43 six-year-olds (20 male, 23 female; M age = 6;4, range = 5;5–7;2), 42 nine-year-olds (15 male, 27 female; M age = 9;2, range = 8;2–10;4), and 48 college students (19 male, 29 female; M age = 18;9, range = 18;2–19;9); ages are given as years;months. An additional 10 six-year-olds and 14 college students participated in pre-testing of the stimuli (see below). Children were recruited from a public elementary school in a Midwestern university town via letters sent home to all children in participating classrooms; only children who returned a signed permission letter were included. College students were recruited from an introductory psychology subject pool and received partial course credit for participating.

Stimulus Development

To test these hypotheses it was necessary to create sets of stimuli that varied as desired in terms of typicality. To do so, we collected color photographs of 20-30 animals from each of five basic-level animal categories: birds, cats, dogs, fish, and pigs. First, a group of 14 college students rated the typicality of each exemplar with respect to its category on a seven-point scale, following the “goodness of example” procedure developed by Rosch (1975). The task was presented via computer. The items were blocked by category, with the categories, and items within each category, presented in a separate random order for each participant. Based on these ratings, we selected two species of each category that were perceived as highly typical and two species that were perceived as less typical. Analyses confirmed that the average typicality ratings of the species selected as typical and atypical exemplars significantly differed (see Table 2).

Table 2. Animal species selected as typical and atypical exemplars of each basic level category, with mean (SD) typicality ratings.

Category | Typical species | Mean (SD) | Atypical species | Mean (SD) | t¹
Dog | Golden retriever | 7.0 (.00) | German Spitz Mittel | 5.6 (2.1) | 2.35*
 | Black Labrador | 6.8 (.40) | Hairless Chinese Crested | 5.6 (1.9) |
Fish | Goldfish | 6.9 (.45) | Yellow Boxfish | 5.0 (2.3) | 3.96**
 | Blue Angel Fish | 6.6 (.55) | Puffer Fish | 3.1 (0.9) |
Pigs | Yorkshire | 6.8 (.45) | Warthog | 3.4 (1.0) | 7.45**
 | Hampshire | 6.6 (.55) | Javelina | 3.4 (0.9) |
Cats | Orange Tabby | 6.8 (.45) | Checkered Ragdoll | 5.6 (2.1) | 3.23**
 | Grey and White Longhair | 7.0 (.00) | Red Cameo Persian Longhair | 5.6 (1.3) |
Birds | Cardinal | 6.8 (.45) | Owl | 4.8 (2.4) | 4.27**
 | Blue Jay | 6.8 (.45) | Variable Oystercatcher | 5.6 (1.7) |

¹ Paired-sample t-test comparing mean typicality ratings for species selected as typical and atypical exemplars, with 13 degrees of freedom.
* p < .05.
** p < .01.

To confirm that young children also perceived the typicality of the selected examples as desired, we conducted a pre-test with a group of 10 six-year-olds. Children were interviewed individually. The task was introduced as follows, “I am making a book about different animals. For each kind of animal, I have to pick the best picture to put in the book.”

Children were asked to choose pictures for each animal category. For example, children were presented with four pictures of dogs, including two from the atypical set and two from the typical set, and were asked, “We want to pick the picture that best shows DOG, so that anyone who looked at our book would understand what DOG really means. What is the best picture to show DOG?” After their initial choice, participants were asked to select the ‘next best’ two additional times. Thus, for each set, we obtained a ranking of how well children thought each exemplar represented the animal category. The animal categories, and the items within each category, were presented in a separate random order for each participant. As predicted, at least nine out of the ten children chose the two typical exemplars as their first two choices of ‘best examples’ for each animal category, significantly more often than expected by chance (sign tests, ps < .05). Based on these findings, we constructed four sets of stimuli, one for each of the conditions described in Table 1. These stimuli are summarized in Table 2.
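
As a rough check on the sign tests reported for the pre-test, the following Python sketch computes the binomial tail probability for nine of ten children. It assumes a chance rate of .5 per child, which the text does not state; the function name `binomial_upper_tail` is illustrative rather than part of the original analysis.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption: each child is equally likely to succeed or fail under chance.
one_sided = binomial_upper_tail(9, 10)
two_sided = min(1.0, 2 * one_sided)
print(f"one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

Under that assumption the one-sided probability is 11/1024, or about .011, and doubling it gives about .021, both below .05. If chance were instead defined as randomly ordering the four pictures, the probability of placing both typical exemplars first would be 1/6 per child and the resulting p value would be smaller still.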

Main Study Procedures

Children were tested individually in a quiet area of their elementary school. Children were given the following instructions, “We're going to pretend that you are a scientist who is trying to learn about animals. Your job is to pick the best set of animals to look at to help you learn about each kind of animal. For each question, look at each pair of animals carefully, and pick the best set to look at to help you learn new things about animals.”

Children were then asked a series of five questions. For each question, children were shown two sets of animals, a diverse and a non-diverse set, each containing color photographs (9 × 13 cm) of two animals. Each pair of animals was mounted on a piece of white paper. For example, children were shown one diverse (e.g., a black lab and a golden retriever) and one non-diverse (e.g., two different golden retrievers) set of dogs, and told, “Here are two sets of dogs. You are a scientist who wants to find out if dogs have ulnar arteries. Which set of dogs do you want to look at to learn about dogs?” Children pointed to indicate their responses. Adults completed a questionnaire version of the task. All questions referred to categories using generic nouns (e.g., “to learn about DOGS”). This language choice, as well as the instructions, which referred to learning about kinds of animals, ensured that children understood the task as an attempt to learn about a category as a whole, not just the given exemplars, as even preschoolers understand generic noun phrases as referring to categories, not only pictured exemplars or a subset (e.g., Gelman & Raman, 2003; Hollander, Gelman, & Star, 2002). Questions were presented in a separate random order for each participant. Presentation of the diverse set on the participants' right or left was counter-balanced across questions and participants.

Participants of each age-group were randomly assigned to one of four conditions. All participants received identical instructions and questions. Across the conditions, we varied the typicality of the diverse and non-diverse sets to form four conditions (see Table 3).

Table 3. Sample Stimuli for category DOG for each condition.

Set | All Typical | All Atypical | Diverse-Typical | Diverse-Atypical
Diverse | Black Labrador & Golden Retriever | Hairless Chinese Crested & German Spitz Mittel | Black Labrador & Golden Retriever | Hairless Chinese Crested & German Spitz Mittel
Non-diverse¹ | Two Golden Retrievers | Two Hairless Chinese Crested | Two Hairless Chinese Crested | Two Golden Retrievers

¹ Although the non-diverse sets included animals of the same species, the two exemplars in each set were easily identifiable as different individuals (e.g., they were slightly but noticeably different shades of color, size, etc.).

Results

Selections of the diverse samples were scored as ‘1’; selections of non-diverse samples were scored as ‘0.’ In order to account for the binomial structure of the data, we conducted a series of binomial regressions, using the chi-square distributed deviance-based test (Faraway, 2006; see Footnote 1). Total mean proportions by age-group and condition are presented in Figure 1; mean proportions for each category set are presented in Table 4.
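
To make the scoring and the comparison to chance concrete, here is a minimal Python sketch of a deviance-based (likelihood-ratio) test of one observed proportion against .5, the simplest case of a binomial regression with only an intercept. It is not the authors' analysis: the response vector is hypothetical, responses are treated as independent, and the function names are illustrative. The full models with age-group and condition as predictors are not reproduced here.

```python
from math import erfc, log, sqrt


def deviance_vs_chance(diverse_choices, n_choices, p0=0.5):
    """Deviance statistic for H0: P(diverse) = p0 given binomial counts."""

    def term(observed, expected):
        # By convention, a cell with zero observations contributes nothing.
        return observed * log(observed / expected) if observed > 0 else 0.0

    k, n = diverse_choices, n_choices
    return 2.0 * (term(k, n * p0) + term(n - k, n * (1 - p0)))


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return erfc(sqrt(x / 2.0))


# Hypothetical data: 1 = diverse set chosen, 0 = non-diverse set chosen.
responses = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
k, n = sum(responses), len(responses)
g = deviance_vs_chance(k, n)
print(f"proportion = {k / n:.2f}, deviance = {g:.2f}, p = {chi2_sf_1df(g):.3f}")
```

The deviance is zero when the observed proportion equals the chance rate and grows as the proportion moves toward 0 or 1; it is referred to a chi-square distribution with one degree of freedom because the alternative model has one free parameter more than the chance model.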

Figure 1. Proportion of diverse responses by age and condition.

Table 4. Proportion of Diverse Responses for Each Stimulus Set by Age-group and Condition.

Conditions: AT = All Typical; AA = All Atypical; DT = Diverse-Typical; DA = Diverse-Atypical.

     | Six-year-olds               | Nine-year-olds              | Adults
Set  | AT  | AA  | DT   | DA    | AT    | AA   | DT   | DA  | AT   | AA    | DT   | DA
Dog  | .55 | .55 | 1**  | .36   | .82*  | 1**  | .80  | .40 | .85* | .83*  | .92* | .82*
Pig  | .55 | .45 | 1**  | .09*  | .64   | .73  | .50  | .50 | .85* | .92*  | .92* | .36
Bird | .64 | .45 | 1**  | .00** | .91** | .73  | .60  | .70 | .85* | .92*  | .92* | 1**
Cat  | .55 | .63 | .90* | .09*  | .82*  | .73  | .90* | .50 | .85* | .83*  | .83* | .73
Fish | .64 | .45 | .70  | .64   | .91*  | .82* | .40  | .40 | .85* | .83** | 1**  | .73

Note. Comparisons to proportion expected by chance using binomial regression.
* p ≤ .05.
** p < .01.

As hypothesized, there was an overall significant interaction between age-group and condition, χ²(7) = 55.96, p < .001. Among adults, there was no effect of condition, χ²(3) = 3.14, p > .3. Adults selected diverse samples significantly more often than expected by chance in all conditions.

Among nine-year-olds, there was a significant effect of condition, χ²(3) = 16.38, p < .001. Comparisons among the conditions indicated that nine-year-olds selected the diverse samples more often in the All Typical and All Atypical conditions than in the Diverse-Atypical condition (Diverse-Atypical compared to All Typical, χ²(1) = 12.20, p < .001, and All Atypical, χ²(1) = 10.65, p < .01). Nine-year-olds chose the diverse samples more often than expected by chance in all conditions, except for the Diverse-Atypical condition.

Among six-year-olds, we also found a significant effect of condition, χ²(3) = 56.56, p < .001. Comparisons among conditions indicated that when typicality differed between the two samples, six-year-olds reliably chose to examine the sets that contained the typical exemplars, regardless of diversity. Thus, in the Diverse-Typical condition, six-year-olds chose the diverse set more often than in any other condition (ps < .001), and did so more often than expected by chance. But, in the Diverse-Atypical condition, they chose the non-diverse (i.e., typical) set more often than in any other condition (ps < .01), and chose the diverse set significantly less often (i.e., they chose the non-diverse set more often) than expected by chance. When the level of typicality was equal across the two samples, six-year-olds demonstrated chance-level responding.

Discussion

A fundamental challenge of cognitive development is to establish criteria for evaluating when samples of limited evidence provide strong bases for inductive inferences. The goal of the present research was to examine developmental changes in how people evaluate whether limited samples of evidence provide a good basis for induction. In this study, young children used different criteria than adults for making such judgments. Six-year-olds valued item typicality, whereas older children and adults valued overall sample diversity. These findings support our proposal that young children focus on properties of individual exemplars, whereas adults focus on group-level properties involving relations among exemplars. Nine-year-olds were somewhat in the midst of this transition; they favored diverse samples, except when doing so required selecting atypical exemplars instead of typical exemplars.

Children's Evaluation of Multiple Exemplar Samples

When typicality differed between the diverse and non-diverse samples, six-year-olds chose typical sets (regardless of sample diversity). Based on an examination of only the conditions in which typicality varied across samples, one possible explanation for these findings could be that young children and adults have different approaches to integrating information about typicality and diversity. In other words, individuals of all ages may recognize the value of both typicality and diversity, but when the two factors are pitted against each other, young children are drawn toward typicality (e.g., Heit et al., 2004; see also Osherson et al., 1990). Another related possibility is that young children value typical samples more than adults do because they fail to recognize atypical examples as category members (e.g., Bjorklund & Thompson, 1983), and therefore, do not recognize diverse samples containing atypical members as valuable.

Considering young children's performance across all four of the conditions included in this experiment, however, undermines the possibility that our findings relate only to developmental differences in how individuals integrate sample diversity and typicality. Young children did not select diverse samples even when the level of typicality was held constant across the two samples. Thus, children appear not to recognize the value of diverse evidence. Across conditions, young children's performance is predicted only by sample typicality; when typicality varies, young children make reliable predictions; when typicality is held constant, they demonstrate chance-level responding.

Adults' Evaluation of Multiple-Exemplar Samples

These findings also highlight an important characteristic of adult inductive reasoning. Specifically, the ability to distinguish the inductive potential of individual exemplars from the inductive potential of a set of exemplars appears to be a critical component underlying adult cognition. Although this distinction has been discussed in prior theoretical work (Heit, 2000), we are aware of no previous study that systematically examined this ability by varying the inductive potential of the individual exemplars in a sample. Thus, this pattern of findings demonstrates that examining cognitive development is a critical method for isolating the processes that contribute to adult cognition (Heit & Hahn, 2001; Lopez et al., 1992).

Nonetheless, these findings do not imply that typicality is unimportant within adults' inductive reasoning. As described by Osherson et al. (1990), adults evaluate the relations among individual pieces of evidence in order to identify the extent to which a given sample covers the relevant category. Often, the sample containing the most diverse examples provides the greatest coverage and is therefore viewed as the strongest sample, as was found in the present study. We selected atypical items that were only moderately atypical, however, and Osherson et al. (1990) indicate that sometimes adults will conclude that two non-diverse but highly typical items provide better coverage than two diverse but highly atypical items. Thus, in general, adults approach sample evaluation by considering how the exemplars in a category relate to one another and the category as a whole.

The Present Findings in the Context of Prior Work

We have suggested that differences in methodology underlie the inconsistencies across prior reports with respect to children's diversity-based reasoning. Children succeed on tasks that require matching diverse items to diverse sets (Heit & Hahn, 2001) or reasoning about a limited class of items (Shipley & Shepperson, 2006), but have more difficulty when they are required to evaluate samples for the purpose of making broader generalizations. Future work more directly comparing children's performance on different types of reasoning tasks would help to clarify these discrepancies. Also, given the present findings, it is important to consider that preferences for typical information may have biased young children toward diverse sets in some prior work. For example, Lo et al. (2002) found some evidence of diversity effects among preschool children; however, an informal examination of their items suggests that more typical items were often included in the diverse sets (e.g., children rated a ladybug and an ant as stronger evidence than a ladybug and a beetle; we suspect that the first sample is more diverse and more typical than the second sample).

An additional factor that will be important to consider in resolving the discrepancies across reports is content domain. In the present research, we focused on animal kinds for two reasons: to be consistent with much prior research on induction, and because biological kinds are so inductively rich that understanding how young children reason about such categories is central to understanding how they learn about the world. Other studies of diversity-based reasoning in children have focused on artifact categories (Heit & Hahn, 2001; Shipley & Shepperson, 2006), which may be more accessible to young children. However, young children generally demonstrate at least as sophisticated inductive reasoning skills when considering the hidden properties of animals as when considering the properties of artifacts (see Gelman, 2003 for a review). Therefore, we doubt that domain differences are primarily responsible for children's successes and failures on tasks related to diversity-based reasoning. Nonetheless, in future research, it would be helpful to directly examine possible domain effects.

Why Might Young Children Focus on Individual Exemplars?

The present findings, which document some limitations in young children's inductive reasoning, may seem at odds with other research in cognitive development that has revealed an impressive level of sophistication in young children's category-based induction. For example, young children systematically extend information that they learn about individual category members to other category members, even when confronted with conflicting perceptual information (Gelman & Markman, 1986). Thus, young children consider two sources of information that could be inductively relevant, and systematically select to base their inferences on category membership.

We suggest that young children's strong beliefs about category homogeneity may underlie both their success on these category-based induction tasks and their difficulty with sample diversity. On a number of tasks, young children have been documented to view categories as more strongly determinative of properties than adults do (e.g., Diesendruck & Halevi, 2006; Gelman & Kalish, 1993; Taylor, 1996; Waxman, Medin, & Ross, 2007), such that they predict that category members will share category-linked properties even when adults rely on individuating information. In this way, children appear to ignore within-category variability that adults find important, focusing instead on within-category commonalities. An overall tendency to overlook within-category variability could also explain children's failure to use diversity (e.g., Gutheil & Gelman, 1997). In both cases, young children focus on certain individual exemplars as standing in for the category as a whole. Over development, as children increasingly consider within-category variability, they may begin to recognize the value of sampling from diverse category members before drawing conclusions about a category as a whole.

In sum, the present findings contribute to an ongoing broad research program aimed at identifying the extent to which cognitive development is driven by knowledge enrichment as well as by conceptual reorganization. Both processes clearly contribute to development, as knowledge differences have been documented to influence children's reasoning on other tasks (e.g., Carey, 1985), as well as adults' tendency to rely on diversity to guide induction (Medin et al., 2003; Proffitt, Coley, & Medin, 2000). The present work suggests, however, that the mechanisms that support inductive reasoning about natural kinds also undergo reorganization across the elementary school years.

Acknowledgments

This research was supported by NICHD grant HD-36043 to the third author. The first and second authors were supported by funding from the Michigan Prevention Research Training Grant (NIH grant number T32 MH63057-03). We are grateful to the parents, teachers, and children of the Ann Arbor Public Schools for participating in this research.

Footnotes

1. Although this analysis plan is the appropriate approach given the structure of our data, we also analyzed these data through a 3 (age-group) × 4 (condition) analysis of variance with total proportion of diverse responses as the dependent variable. These analyses revealed an identical pattern of significant results.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Barr RA, Caplan LJ. Category representations and their implications for category structure. Memory & Cognition. 1987;15:397–418. doi: 10.3758/bf03197730.
  2. Bjorklund DF, Thompson BE. Category typicality effects in children's memory performance: Qualitative and quantitative differences in the processing of category information. Journal of Experimental Child Psychology. 1983;35:329–344.
  3. Carey S. Conceptual change in childhood. Cambridge, MA: Bradford Books; 1985.
  4. Diesendruck G, Gelman SA. Domain differences in absolute judgments of category membership: Evidence for an essentialist account of categorization. Psychonomic Bulletin & Review. 1999;6:338–346. doi: 10.3758/bf03212339.
  5. Diesendruck G, Halevi H. The role of language, appearance, and culture in children's social category-based induction. Child Development. 2006;77:539–553. doi: 10.1111/j.1467-8624.2006.00889.x.
  6. Faraway JJ. Extending the linear model with R: Generalized linear, mixed effects, and nonparametric regression models. Boca Raton, FL: Chapman & Hall/CRC; 2006.
  7. Gelman SA. The essential child: Origins of essentialism in everyday life. New York: Oxford University Press; 2003.
  8. Gelman SA, Kalish CW. Categories and causality. In: Pasnak R, Howe ML, editors. Emerging themes in cognitive development. New York: Springer-Verlag; 1993. pp. 3–32.
  9. Gelman SA, Markman EM. Categories and induction in young children. Cognition. 1986;23:183–209. doi: 10.1016/0010-0277(86)90034-x.
  10. Gelman SA, Raman L. Preschool children use linguistic form class and pragmatic cues to interpret generics. Child Development. 2003;74:308–325. doi: 10.1111/1467-8624.00537.
  11. Gutheil G, Gelman SA. Children's use of sample size and diversity information within basic-level categories. Journal of Experimental Child Psychology. 1997;64:159–174. doi: 10.1006/jecp.1996.2344.
  12. Heit E. Properties of inductive reasoning. Psychonomic Bulletin & Review. 2000;7:569–592. doi: 10.3758/bf03212996.
  13. Heit E, Feeney A. Relations between premise similarity and inductive strength. Psychonomic Bulletin & Review. 2005;12:340–344. doi: 10.3758/bf03196382.
  14. Heit E, Hahn U. Diversity-based reasoning in children. Cognitive Psychology. 2001;43:243–273. doi: 10.1006/cogp.2001.0757.
  15. Heit E, Hahn U, Feeney A. Defending diversity. In: Ahn W, Goldstone RL, Love BC, Markman AB, Wolff P, editors. Categorization inside and outside the lab: Festschrift in honor of Douglas L Medin. Washington, DC: American Psychological Association; 2004.
  16. Hollander MA, Gelman SA, Star J. Children's interpretation of generic noun phrases. Developmental Psychology. 2002;38:883–894. doi: 10.1037//0012-1649.38.6.883.
  17. Kim NS, Keil FC. From symptoms to causes: Diversity effects in diagnostic reasoning. Memory & Cognition. 2003;31:155–165. doi: 10.3758/bf03196090.
  18. Lo Y, Sides A, Rozelle J, Osherson D. Evidential diversity and premise probability in young children's inductive judgment. Cognitive Science. 2002;26:181–206.
  19. Lopez A. The diversity principle in the testing of arguments. Memory & Cognition. 1995;23:374–382. doi: 10.3758/bf03197238.
  20. Lopez A, Atran S, Coley JD, Medin DL, Smith EE. The tree of life: Universal and cultural features of folkbiological taxonomies and inductions. Cognitive Psychology. 1997;32:251–295.
  21. Lopez A, Gelman SA, Gutheil G, Smith EE. The development of category-based induction. Child Development. 1992;63:1070–1090.
  22. Medin DL, Coley JD, Storms G, Hayes BL. A relevance theory of induction. Psychonomic Bulletin & Review. 2003;10:317–332. doi: 10.3758/bf03196515.
  23. Osherson DN, Smith EE, Wilkie O, Lopez A, Shafir E. Category-based induction. Psychological Review. 1990;97:185–200.
  24. Proffitt JB, Coley JD, Medin DL. Expertise and category-based induction. Journal of Experimental Psychology: Learning, Memory, & Cognition. 2000;26:811–828. doi: 10.1037//0278-7393.26.4.811.
  25. Rhodes M, Gelman SA, Brickman D. Developmental changes in the consideration of sample diversity in inductive reasoning. Journal of Cognition and Development. In press.
  26. Rips LJ. Inductive judgments about natural categories. Journal of Verbal Learning and Verbal Behavior. 1975;14:665–681.
  27. Rosch E. Cognitive representations of semantic categories. Journal of Experimental Psychology: General. 1975;104:192–233.
  28. Shipley EF, Shepperson B. Test sample selection by preschool children: Honoring diversity. Memory & Cognition. 2006;34:1444–1451. doi: 10.3758/bf03195909.
  29. Taylor M. The development of children's beliefs about social and biological aspects of gender differences. Child Development. 1996;67:1555–1571.
  30. Viale R, Osherson D. Cognitive development, culture, and inductive judgment. In: Viale R, Andler D, Hirschfeld LA, editors. Biological and cultural bases of human inference. Mahwah, NJ: Lawrence Erlbaum Associates; 2002.
