Author manuscript; available in PMC: 2015 Jul 1.
Published in final edited form as: J Exp Child Psychol. 2014 Mar 17;123:15–35. doi: 10.1016/j.jecp.2014.01.009

Using the axis of elongation to align shapes: Developmental changes between 18 and 24 months

Linda B Smith 1, Sandra Street 1, Susan S Jones 1, Karin H James 1
PMCID: PMC4030647  NIHMSID: NIHMS577438  PMID: 24650776

Abstract

An object’s axis of elongation serves as an important frame of reference for forming 3-dimensional representations of object shape. By several recent accounts, the formation of these representations is also related to experiences of acting on objects. Four experiments examined 18- to 24-month-old (N = 103) infants’ sensitivity to the elongated axis in action tasks that required extracting, comparing and physically rotating an object so that its major axis was aligned with that of a visual standard. In Experiments 1 and 2, the older infants precisely rotated both simple and complexly shaped 3-dimensional objects in insertion tasks in which the visual standard was the rectangular contour defining the opening in a box. The younger infants performed poorly. Experiments 3 and 4 provide evidence on emerging abilities in extracting and using the most extended axis as a frame of reference for shape comparison. Experiment 3 showed that 18 month olds could rotate an object to align its major axis to the direction of their own hand motion and Experiment 4 showed that they could align the major axis of one object to that of another object of the exact same 3-dimensional shape. The results are discussed in terms of theories of the development of 3-dimensional shape representations, visual object recognition, and the role of action in these developments.

Keywords: Shape, visual object recognition, perception-action, infancy


Visual object recognition is central to human cognition in many domains including category learning (Dickinson, Leonardis, Schiele & Tarr, 2009), problem solving (Cavanagh, 2011), and tool use (Lockman, 2000). However, as several recent reviews note (Nishimura, Scherf & Behrmann, 2009; Smith, 2009), the development of visual object recognition has received little systematic attention beyond early infancy. The present experiments are motivated by recent findings indicating marked changes in the representation of object shape in the second year of life (Smith, 2009; Smith & Jones, 2011; Yee, Jones & Smith, 2012). The experiments specifically focus on one structural property of object shape – the most elongated axis – that has been proposed as a key frame of reference for object constancy and object categorization. As background, we first provide the theoretical rationale for focusing on the most elongated or “major” axis as a stepping stone to understanding early visual object recognition, and then we review the specific findings from previous research with 18- and 24-month olds that motivate the present experiments.

The axis of elongation

Three-dimensional objects project different 2-dimensional contours when viewed from different directions. This fact yields two fundamental problems in explaining human visual object recognition – object constancy and object categorization. These are illustrated in Figure 1. Object constancy concerns perceivers’ ability to recognize the same object, for example a specific spoon, from different viewing perspectives (e.g., Biederman & Gerhardstein, 1993; see also Pinto, Cox & DiCarlo, 2008). Object categorization concerns how perceivers recognize instances of the same category, different spoons, given their idiosyncratic variations in shape along with their different viewing perspectives (e.g., Biederman, 1987; Edelman & Duvdevani-Bar, 1997; Rosch, 1999). Proposed theoretical solutions to both problems often posit that perceivers use the most elongated axis of the object as a reference frame for comparing 2-dimensional images. The utility of the axis of elongation is that it is a structural property of the whole object that is relatively invariant and recoverable across many different views and that is often shared across different instances of the same category (Marr & Nishihara, 1978; Biederman, 1987; Lowe, 1987). The idea, then, is that perceivers solve the perspective problem by aligning – physically or mentally – the major axes of to-be-compared shapes (e.g., Marr & Nishihara, 1978; Sekuler, 1996; Tarr, 2003; see also Dickinson, Leonardis, Schiele & Tarr, 2009). Consistent with this idea, and as also apparent in Figure 1, the shape similarity of objects is more readily perceived when the views of objects are aligned by their major axis (Sekuler, 1996; Tarr, 2003; see also Dickinson, Leonardis, Schiele & Tarr, 2009).

Several recent proposals have further suggested that the major axis – and physical rotations and alignments with respect to that axis – play a role in the developmental processes that build integrated 3-dimensional representations of object shape from 2-dimensional views (Smith, 2009; Graf, 2006; Farivar, 2009; Cutzu & Tarr, 2007; Pereira, James, Jones, & Smith, 2010; James, Swain, Jones & Smith, 2013). The idea is that by rotating the major axis (in all three planes) children self-generate the visual information that is the basis for building integrated views (see Graf, 2006; Farivar, 2009) and that by stacking and aligning objects, children extract the major axis as a frame of reference for comparing shapes (see Smith, 2009). These proposals highlight the importance of studying young children’s sensitivity to and use of the major axis as a visual frame of reference.

Figure 1.


A and B illustrate how perceived identity (object constancy) and shape similarity (object categorization) benefit from aligning objects by their most elongated axes. C illustrates rectangles with their elongated axis aligned to the spoons in B and the logic behind the task of asking children to align different shaped objects to a rectangular standard.

Only a handful of studies, all with older children, have explicitly examined the role of the major axis in children’s shape perception and object recognition (see Gregory, Landau & McCloskey, 2011; Ons & Wagemans, 2011; Smith, 2005). Studies with young infants show early developments of perhaps related abilities, including volumetric completion, that is, the visual expectation that the unseen side of a convex volume will also be convex (Soska & Johnson, 2008). Volumetric completion indicates a sensitivity to whole-object 3-dimensional shape. Other results show that after young infants’ looking behavior has been habituated to a cylinder, they see new views of a different-shaped object (e.g., brick) as more different than new views of the familiarized object (Kraebel, West & Gerhardstein, 2007). This result implicates processes that support the recognition of the multiple views of the same object. Also relevant are studies showing that young infants perceive frontal-plane rotations of a 2-dimensional shape as more similar to each other than other rotations or other shapes (Moore & Johnson, 2009; Quinn & Liben, 2009); these results could (but do not necessarily) indicate attention to and comparison of shape in terms of the major axis. All in all, these findings from young infants underscore the likelihood that object constancy depends on a suite of skills with a protracted developmental course (Smith, 2009; 2013). One line of evidence – to which we turn next – suggests that a later period in infancy, the period between 18 and 24 months, is one of change with respect to sensitivity to structural properties of object shape, including, we propose, the major axis.

Emerging representations of 3-dimensional shape

One line of studies (Augustine, Smith & Jones, 2011; Pereira & Smith, 2009; Smith, 2003; Smith & Jones, 2011; Yee et al., 2012) presented toddlers with rich typical instances and also with sparse 3-dimensional representations of common objects (for example, a camera, ice cream cone, and hair brush) and in a forced choice task asked them to select the named object (e.g., where is the ice cream?). The sparse representations consisted of only 2–3 geometric volumes representing major object parts in the correct spatial arrangement so as to convey the structural properties of the characteristic category shape but with no high frequency spatial information and no surface features. Older children recognized instances of the named categories from these sparse 3-dimensional representations just as well as they did rich typical examples but the younger children only recognized the rich instances. Further, individual children’s abilities to recognize object categories from sparse structural shape representations were related to their familiarity with the tested basic level categories and to their ability to rapidly learn and generalize new object names (Augustine et al., 2011; Smith, 2003; Yee et al., 2012). Theories of visual object categorization see recognition based on the sparse geometry of 3-dimensional shape as a high-level visual skill that uses the major axis as the object-centric frame of reference (Marr & Nishihara, 1978; Biederman, 1987).

A second line of relevant research examined the object views that toddlers present to themselves when they are visually and manually exploring objects (Pereira et al., 2010; James, Swain, Jones & Smith, 2013). When adults generate views of 3-dimensional objects for themselves by holding and rotating the objects (Pereira et al., 2010) or when they control the views they see through other means (James, Humphrey, Vilis, Corrie, et al., 2002; Perrett et al., 1992) they systematically show themselves planar views. Planar views, as illustrated in Figure 2, are views in which the major axis of elongation is parallel or perpendicular to the line of sight (LOS). When 12- to 36-month-old children are given novel objects to hold and visually examine, they, like adults, oversample planar views and particularly those views in which the major axis is elongated relative to the line of sight (Pereira et al., 2010; James et al., 2013). Self-generated planar views have been shown to support rapid object recognition in adults (Harman et al., 1999). Critically, they have also been shown to be developmentally related to toddlers’ ability to recognize common object categories from sparse representations of shape using 2–3 volumes (James, Swain, Jones & Smith, 2013). These findings fit the idea that object views organized around the major axis support the building of object-centered and sparse representations of 3-dimensional shape, and as such they imply that these older infants may be sensitive to the major axis as a reference frame for object recognition. However, there is no independent evidence that young children can extract or mentally manipulate the major axis as a visual property of objects. Providing this evidence is the goal of the present experiments.

Figure 2.


Top: Planar views of a rectangular block defined by the relation of the major axis to the line of sight (LOS), perpendicular and parallel. Bottom: Three insertion tasks that require aligning the rectangular block so that it is perpendicular or parallel to the line of sight.

Rationale for the experimental method

Although the larger program of research motivating these experiments concerns developmental changes in visual object recognition, the specific experiments reported here focus solely on the question of whether 18- and 24-month old children can visually extract the most elongated axis of an object in order to align the major axis of the object with that of a differently shaped visual standard. Aligning the major axes of objects with different shapes cannot be solved by matching local visual similarities or parts and thus provides strong evidence of the extraction of this object property. As illustrated at the bottom of Figure 1, our task required children to physically align 3-dimensional objects to a rectangular standard. In Experiments 1 and 2, the main experiments, we used insertion tasks to assess this ability; given the poor performance of the 18 month olds in these tasks as compared to the much better performance of the 24 month olds, Experiments 3 and 4 used variants of the main procedure to better understand the emerging skills of 18 month olds.

In Experiments 1 and 2, we asked 18 and 24 month olds to physically rotate objects so as to align the elongated axis of the object with the elongated axis of a rectangular hole, with the child’s goal being to insert the object into that hole. We chose an insertion task because past research shows that by 12 months of age, infants understand that the goal of the task is to put the object “in” (even if they cannot always succeed in fitting the object into the opening: see Hayashi, Takeshita & Matsuzawa, 2006; von Hofsten, 2007), and because an insertion task potentially provides a direct way of measuring sensitivity to the major axis of elongation. In the task in Experiment 1 (see Figure 2), the to-be-inserted object would fit into the hole if its most elongated axis was aligned to the most elongated axis of the rectangular opening. Thus success required comparing the major axes of both the object and opening and, if necessary, rotating the object so as to align its axis to the axis of the hole. Our measure of success, however, was not an actual successful insertion of the object into the opening. Instead, because our interest is in children’s sensitivity to visual information, we measured the alignment of object and opening after the child’s initial movement of the object to the opening and prior to any feedback from a failure to fit the object into the hole. This is the standardly used measure of the visual information used to plan an action (see Örnkloo & von Hofsten, 2007). Planning the action must be based on visual information and a planned aligned rotation would implicate the mental alignment of the object to the standard. Accordingly, and to limit possible effects of feedback from success or failure in an actual insertion attempt, children were tested with each unique object only once.

Our decision to use insertion as the main task was suggested by a series of experiments by Schutts, Örnkloo, von Hofsten, Keen, & Spelke (2009). They examined the insertion skills of 15 to 30 month olds in tasks in which the children had to select an object from a set of objects (spheres, cubes) and then fit that object into a same-shaped hole. In general, children under 24 months had difficulty in this task whereas children 24 months and older succeeded, a developmental pattern also observed in other insertion tasks (e.g., Örnkloo & von Hofsten, 2007; Street, James, Jones & Smith, 2011). However, Schutts et al. also used a second version of this task in which the holes were replaced with 2-dimensional silhouettes of the projected shape (circles and squares) and children were asked to select and set the 3-dimensional objects onto their matching 2-dimensional projections. Although the manual demands were considerably less in this silhouette task than in the insertion task proper, Schutts et al. (2009) found similar developmental and performance patterns across the two versions of the task. They therefore proposed that there were overlapping components of the two tasks that limited younger children’s success and that one of these difficulties was the perception of the shape similarity between the to-be-inserted 3-dimensional object and the 2-dimensional contour, the skill we seek to measure.

The objects used in Experiment 1 are shown in Figure 3. Four of the objects were variants of rectangular blocks and thus (if viewed from the appropriate perspective) had 2-dimensional projections that matched that of the rectangular opening, a shape similarity that could benefit comparing and aligning these targets to the rectangular standard. Four of the objects were simple elongated shapes with minimal part structure but different in overall shape from a rectangular block, and lacking a 2-dimensional projection that matched the rectangular opening. These objects were used to address the question of whether children could abstract, compare, and align objects to the rectangular opening by the major axis alone. Finally, four objects were complex multi-part shapes that were richly detailed toy instances of real things (with canonical orientations). These were used to strongly challenge children’s ability to abstract the major axis of an object.

Figure 3.


The stimulus objects used in Experiments 1, 2 and 3: Blocks, Simple shapes and Complex shapes. Also shown is the orientation of the objects at presentation to the children in the insertion tasks.

Experiment 1

The insertion box used in Experiment 1 had a large rectangular opening, larger than the to-be-inserted object, such that the object –if roughly aligned – could pass through easily. More specifically, the size of the opening relative to the objects allowed for a 25° difference in the orientations of the major axes of object and rectangular opening. We chose to use this large opening because this particular task had not been previously used with young children and we wanted it to be enjoyable and self-motivating. Further, our interest is in children’s alignment of the object on first approach to the hole –that is, on the basis of visual information alone – and therefore our goal was not precise measurement of their ultimate skill in inserting the object. A pilot study indicated that children did not adjust well to trial-to-trial changes in the orientation of the rectangular opening. Therefore children were randomly assigned to either a Vertical or Horizontal opening condition, and on every trial attempted to align objects to a hole of constant orientation. Half the objects were presented to the child with the major axis vertically oriented and half with the major axis horizontally oriented, so that on half the trials alignment required the to-be-inserted object be rotated 90°.

Method

Participants

The participants were 16 (8 girls, 8 boys) 18 month olds (range=16.2 to 18.2 months; M=17.25 months) and 16 (9 girls and 7 boys) 24 month olds (range = 22.87 to 24.67 months; M=23.95 months). The children were recruited from a working- and middle-class population in the Midwest and had no known developmental disorders. Parents reported normal visual acuity. Half the children at each age level were randomly assigned to the Vertical Opening condition and half to the Horizontal Opening condition.

Stimuli and design

The overall size and ratios of the major and minor axes of the objects were based on our previous research on toddlers’ manual and visual exploration of objects (Pereira et al., 2010). When the major axis of the object is 15 to 20 cm and neither of the two minor axes exceeds 7 cm, toddlers comfortably hold the object with one hand at each end of the major axis so that the major axis is elongated horizontally, or with one or two hands at the center and around the minor axes so that the major axis is elongated vertically. Accordingly, the 12 selected stimulus objects met these constraints for easy holding with the major axis oriented both horizontally and vertically. The visual standard was a 21.5 cm × 9 cm rectangular opening that was cut into one side of a 43.5 cm × 45 cm × 13 cm cardboard box. The same box was rotated to form the receptacle for both the Vertical and Horizontal Opening conditions.

There were 12 unique objects (shown in Figure 3), 4 of each of the three kinds: (1) Blocks – 4 unique rectangular blocks made of wood, sponge, plastic, and cloth, with different colors and other surface properties; (2) Simple shapes – 4 novel forms made from wood, plastic, and clay that could stand alone only in one orientation, with the major axis aligned either vertically (2 objects) or horizontally (2 objects) with respect to gravity; and (3) Complex shapes – 4 toy versions of known categories with canonical vertical orientations of the major axis (radio, ice cream cone) or horizontal orientations (tiger, boat). The dimensions of the Blocks were 18 cm by 7 cm by 4.25 cm and the dimensions of the other objects (Simple and Complex) fell within the following ranges on each dimension: length, 16 to 19 cm; width, 6 to 8 cm; and depth, 3 to 5 cm. The area of the largest plane of any object ranged from 114 to 128 sq cm, whereas the area of the rectangular opening was 193.5 sq cm. All objects could be fit into the opening if their major axis was turned to within 25° of the major axis of the opening. The major axis of each object exceeded the minor axis of the opening by at least 7 cm. Children received only one trial with each unique object as the goal was to measure their ability to use visual information about the object’s major axis rather than their ability to adjust their behavior in response to feedback from attempted insertions. Half of the stimulus objects of each kind were presented to the child to insert with the vertical axis elongated and half were presented with the horizontal axis elongated, with the orientation at presentation of each unique object shown in Figure 3. The order of presentation of the individual objects was randomly determined for each child.
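As a quick check on the stimulus geometry, the reported dimensions can be worked through as follows. This is only an arithmetic restatement of the figures given in the text; the symbol names are introduced here for illustration and do not appear in the original.

```latex
\begin{align*}
A_{\text{opening}} &= 21.5\ \text{cm} \times 9\ \text{cm} = 193.5\ \text{cm}^2 \\
A_{\text{block}}   &= 18\ \text{cm} \times 7\ \text{cm} = 126\ \text{cm}^2
                      \quad (\text{within the reported } 114\text{ to }128\ \text{cm}^2) \\
L_{\min} - W_{\text{opening}} &= 16\ \text{cm} - 9\ \text{cm} = 7\ \text{cm}
\end{align*}
```

The last line shows why a misaligned object could not simply be pushed through: even the shortest object (16 cm) is 7 cm longer than the 9 cm minor axis of the opening.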

In sum, the between-subject factors in the design were Age Group and Opening Orientation (Vertical or Horizontal). The within-subject factors were the 3 categories of stimulus objects (Block, Simple, Complex) and the Object-Opening Alignment (Aligned, Not Aligned) at presentation.

Head camera

A head camera was used to provide an unobstructed and child-centric view of the task that showed the child’s hands on the objects, and the relation between each object and the opening. The head-mounted camera was embedded in a headband that could be placed on the child’s head in one movement. The camera used was a Watec model WAT-230A. This model has 512 × 492 effective image frame pixels, weighs 30 g and measures 36 mm × 30 mm × 30 mm. The lens used was Watec model 1920BC-5, with a focal length of 1.9 and an angle of view of 115.2° on the horizontal and 83.7° on the vertical. The camera could be adjusted slightly to ensure that it was properly aligned to the center of the visual field when the head and eyes were directed forward (see Yoshida & Smith, 2008, for further calibration and validation studies). Power and video cables were sufficiently long and supported so that children could freely move their heads and bodies unhindered by the cords. A second camera recorded the child’s activity from a third-person side view.

Procedure

The child sat at a table next to the parent and beside the seated experimenter. At the start of the session the child was given an engaging toy with buttons to push that caused animals to pop up. The main experimenter distracted the child with this toy as a second experimenter placed the head camera on the child’s head and adjusted it so that the button that was being pushed by the child was centered in the head camera view at the moment of pushing. This was repeated three times with the push-button toy in different positions relative to the child’s midline. When this was accomplished, the toy was removed and the testing box was placed directly in front of the child within easy reach, and secured so as to not move. The experimenter demonstrated the insertion of one rectangular block and asked the child to copy her with the instructions “put it in the box”. The experiment began immediately after this one demonstration and practice trial. On each test trial the to-be-inserted object was placed in front of the child and directly in front of the hole in its designated orientation, and the child was asked to “put it in the box.” Children were given as much time as they wanted and allowed to make multiple attempts until they either successfully inserted the object or released it (dropped it or handed it back to the researcher). No direct feedback was given and all actions on the part of the child were accepted with a warm smile, equanimity, and a “ready for the next one?” The parents were instructed not to help their children. There were 12 trials, one for each unique object. The order of object presentation was randomly determined for each child. The entire experimental session lasted less than fifteen minutes.

Coding

Children’s performances were coded from the head camera view. The dependent measure was the alignment of the object’s major axis to that of the rectangular opening on the very first insertion attempt for that object. More specifically, the dependent variable was the angle of each object’s major axis relative to the major axis of the opening at the point in the first approach that was just prior to the object’s touching a side (if not perfectly aligned) or beginning to pass through the plane of the front face and opening of the box (if sufficiently well aligned). To calculate this angle, as shown in Figure 4, two lines were drawn, one bisecting the head camera image of the opening along its major axis and the second bisecting the image of the object along its major axis. The angle formed by the two lines was measured. Children could also orient the object for insertion so that the major axis was not elongated in the view but was instead foreshortened (parallel to the LOS: see Figure 2). For these attempts, no angle was measured and the response was scored as “foreshortened”. Two measures were derived from the coding of these angles: a categorical and an interval measure. For the categorical coding, the major axis was scored as Aligned to that of the opening if the measured angle was 25° or less. All other approaches, including those scored as “foreshortened” were categorized as Not Aligned. The second, interval measure, was Alignment Error of the major axes of the object and the standard (opening) and could vary from 0° to 90°.

Figure 4.


Drawings from head-camera images of “Not aligned” and “Aligned” responses. The dashed lines show the elongated axes of the opening and the object that were used to measure the angle of alignment.
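The angle coding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors’ coding tool: each axis is assumed to be given as two image points on the bisecting line, and the function names (`axis_angle_deg`, `alignment_error_deg`, `code_approach`) are hypothetical. Only the 25° tolerance, the 0° to 90° range, and the treatment of foreshortened approaches come from the text.

```python
import math

# Tolerance stated in the text: 25 degrees or less counts as Aligned.
ALIGN_TOLERANCE_DEG = 25.0


def axis_angle_deg(p1, p2):
    """Orientation of the undirected line through p1 and p2, in [0, 180)."""
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    return math.degrees(math.atan2(dy, dx)) % 180.0


def alignment_error_deg(object_axis, opening_axis):
    """Smallest angle between the two axes, from 0 to 90 degrees."""
    diff = abs(axis_angle_deg(*object_axis) - axis_angle_deg(*opening_axis)) % 180.0
    return min(diff, 180.0 - diff)


def code_approach(object_axis, opening_axis, foreshortened=False):
    """Return (categorical label, Alignment Error or None).

    Foreshortened approaches get no angle and are scored Not Aligned,
    as described in the coding section.
    """
    if foreshortened:
        return ("Not Aligned", None)
    error = alignment_error_deg(object_axis, opening_axis)
    label = "Aligned" if error <= ALIGN_TOLERANCE_DEG else "Not Aligned"
    return (label, error)
```

For example, `code_approach(((0, 0), (1, 1)), ((0, 0), (1, 0)))` describes an object held at 45° to a horizontal opening and is scored Not Aligned. Because the axes are undirected lines, reversing the order of the two points on either axis leaves the result unchanged.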

Whether or not children ultimately inserted the object, and the number of adjustments attempted, were also coded. A main coder coded all of the data and those data were used for analysis. A second coder coded a randomly selected 25% of children’s performances. Cohen’s Kappa for the inter-rater reliability of the measurement of angles (within 5°) was .79 (coder agreement = .88); for the categorical judgment of Aligned or Not Aligned, Kappa was .98 (coder agreement = .96).
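For readers unfamiliar with the reliability statistic, Cohen’s Kappa corrects the raw proportion of coder agreement for the agreement expected by chance. The sketch below shows the standard computation for two coders’ categorical judgments; the function name and the example labels are illustrative assumptions and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two coders rating the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the chance agreement implied by each coder's
    marginal category frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical category; kappa is undefined,
        # and treating it as perfect agreement is a convention chosen here.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four trials (not study data).
coder_1 = ["Aligned", "Aligned", "Not Aligned", "Not Aligned"]
coder_2 = ["Aligned", "Aligned", "Not Aligned", "Aligned"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this toy example the coders agree on 3 of 4 trials (.75) and chance agreement is .50, so Kappa is .50.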

Results and discussion

Overall, children found the task highly enjoyable. The 18-month-old children managed to insert the object, often after multiple adjustments, on 58% of the trials whereas the 24 month olds succeeded (again with adjustments) on 79% of their trials (t (34) = 7.87, p < .001). On the first approach of the object to the opening, children sometimes produced a foreshortened approach, rotating the object to align its major axis with their LOS and the direction of hand motion. On average 18 month olds rarely did this, on only .08 (SD = .11) of the trials; 24 month olds used the foreshortened approach reliably more often than did 18 month olds (t (34) = 3.28, p < .05), but it was also not their dominant approach, occurring on only .20 (SD = .16) of the trials. These trials are included as “Not Aligned” in the categorical measure of alignments of the two major axes (object and standard) and were excluded from the analyses of the interval (degrees) measure.

Our main question was whether the toddlers at each age level aligned the major axis of the object with the rectangular opening as they first moved the object towards the box. Although the insertion task did not strictly demand such alignment (given the size of the opening and the possibility of a foreshortened insertion strategy), a finding that children did physically rotate the objects to align the major axis with that of the rectangular opening would provide strong evidence of their ability to use the elongated axis as a frame of reference when comparing visual entities.

Figure 5, left panel, shows the mean proportion of first approaches on each trial in which the major axes of object and opening were aligned by the categorical measure. The proportions of aligned trials were submitted to a 2 (Age) X 2 (Opening Orientation: Horizontal/Vertical) X 3 (Object Type) X 2 (Object-Opening Alignment at Presentation: Aligned/Not Aligned) ANOVA. The analysis revealed a main effect of Age (F(1,29) = 4.432, p<.05, η2=0.47), with older children performing better than younger children overall, and a main effect of Object Type (F(1,29) = 4.90, p<.05, η2=0.43), with performance on Blocks better (by Tukey’s HSD, p < .05) than performance on both Simple and Complex shapes. Although the differences between stimulus conditions are small, this Object Type result indicates that children were better able to align the major axes of the object and opening when presented with the blocks than when given the more complex shapes, a finding that could indicate that overall shape similarity matters to the early use of object-centric reference frames. There was also a reliable interaction between Age and Object Type (F(1,29) = 4.78, p<.05, η2=0.26), as the alignment advantage of the Blocks over the other stimulus categories was greater for 18 month olds than for 24 month olds. Neither the main effect of Object-Opening Alignment at presentation nor any interaction with this factor approached significance (p > .30 in all cases). The lack of an effect of the object’s orientation is not surprising given children’s active manipulation and visual exploration of the objects prior to insertion. That is, the children frequently held, manually explored, rotated, put down and picked up the object before attempting to insert it, and on over 82% of trials they changed the orientation of the object from that at presentation prior to the initial approach to the box.

In summary, by the categorical measure of alignment on first approach, 24 month olds showed greater ability than 18 month olds in aligning the major axis of an object to the major axis of a visual standard.

Figure 5.


Three measures of performance in Experiment 1 as a function of age group. Left: Proportion of trials on which the axis of elongation of the object and the rectangular slot were aligned (within the 25° tolerance of the opening) on initial approach (just prior to insertion). Center: Mean degree of measured alignment of the elongated axis of the object and the slot. The standard error of the mean is provided for both measures. Right: Histogram of all insertions within 10° bins (bins labeled by the maximum angle included) across all participants and stimuli for 18 month olds and for 24 month olds.

Figure 5, center, shows the means and standard errors for children’s Alignment Errors – the number of degrees by which the orientation of the major axes of the object and the opening differed. Children’s scores on this measure were submitted to a 2 (Age) X 2 (Opening Orientation: Horizontal/Vertical) X 3 (Object Type) X 2 (Object-Opening Alignment at Presentation: Aligned/Not Aligned) ANOVA. There was a main effect of Age (F(1,29) = 13.29, p<.001, η2=0.68) with the angle of difference smaller for older than for younger children. There was also a main effect of Object Type (F(1,29) = 4.72, p<.05, η2=0.46). Post hoc comparisons (Tukey’s HSD, p < .05) indicate that children more precisely aligned the Blocks than the Simple Shapes or the Complex Shapes, again indicating greater difficulty in aligning objects that were not simple rectangular shapes. There was no interaction between Age and any of these factors. There was no main effect of Object-Opening Alignment at presentation and no interactions involving this factor.

The developmental differences in ability to align major axes are perhaps revealed most clearly by considering the variation in precision on individual trials. The right-most panel of Figure 5 shows the histogram of the mean angle of first approach for each trial within each age group (collapsed across stimulus categories). The distribution for 24 month olds is unimodal and skewed, with the mode at 20° (within the tolerance of the opening). Not only is the distribution for 18 month olds markedly shifted right, with the mode at 50°, but 90° and 10° alignments are also very common. This distribution raises the possibility that the 18 month olds were not actually trying to align the axes of object and opening.
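
The binning used for the right-most panel can be sketched in a few lines of Python. This is only an illustration of "10° bins labeled by the maximum angle included"; the function names are hypothetical, and the handling of angles that fall exactly on a bin edge (assigned here to the lower bin) is an assumption, since the text does not state the convention that was used.

```python
import math
from collections import Counter

def bin_label(error_deg, width=10):
    """Return the label of the bin holding error_deg, named by the bin's maximum angle.

    Assumption: an angle exactly on a bin edge (e.g., 20) goes into the lower bin (label 20).
    """
    if not 0 <= error_deg <= 90:
        raise ValueError("alignment error must lie between 0 and 90 degrees")
    # An error of exactly 0 still belongs to the first bin, labeled by its maximum.
    return max(width, math.ceil(error_deg / width) * width)

def histogram(errors, width=10):
    """Count first-approach angles per bin, including empty bins, from 10 up to 90."""
    counts = Counter(bin_label(e, width) for e in errors)
    return {label: counts.get(label, 0) for label in range(width, 90 + width, width)}
```

Under this convention, a first approach measured at 17° falls in the bin labeled 20, which is the bin containing the mode reported for the 24 month olds.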

The above analyses concern children’s first attempts because it is these first attempts with each object that provide information about children’s use of the visual information without feedback from prior attempts to insert the object. However, children’s adjustments after the initial attempt support the observed developmental differences. For the 24 month olds, 3 of the 16 children aligned and successfully inserted all 12 objects on the first attempt. The remaining 13 children in this older age group typically adjusted the orientation of the axis of the object on the second attempt so as to make the angle of misalignment smaller (M = .67 of second attempts, SD = .18). For the 18 month olds, 3 of the children never made a second attempt, despite not succeeding on the first attempt. The remaining younger children did make multiple attempts. However, in .66 of the attempted adjustments, these children rotated the object in the wrong way, so as to increase misalignment, and thus decreased the angle of misalignment on only .34 (SD = .25) of the second attempts. The lack of improvement on the 18 month olds’ second attempts suggests either that they did not perceive the differences in orientation of the major axes of the object and opening, or that they perceived the differences but did not recognize their relevance to the task, or that they perceived the differences but did not know how to make the necessary corrections to bring the two axes into alignment.

In summary, the results of Experiment 1 indicate the following: First, 24 month olds are able to extract the major axis of a 3-dimensional object and align it to the major axis of a rectangular opening, although they do not do so precisely on all first attempts. Second, this ability, at least as measured in this task, increases markedly between 18 and 24 months of age, with 18 month olds providing few instances of alignment. Third, the ease or difficulty of this alignment appears to be related to the shape similarity of the to-be-inserted objects and the opening.

Experiment 2

Several properties of the task used in Experiment 1 may have obscured children’s emerging abilities to assess and align the major axes of two visual forms. First, the insertion task required that children hold and move the objects along pathways perpendicular to the frontal plane. The need to orient the object to the opening while simultaneously moving it forward to the box could have limited children’s (and perhaps particularly the 18 month olds’) ability to demonstrate a sensitivity to the relation between the major axes of the object and the opening. Previous studies using insertion tasks suggest that moving objects downward into an opening is an earlier-acquired motor skill (Hayashi et al., 2006). Thus, in Experiment 2, we asked children to insert objects into trays laid flat on a table. The core demand of the task is the same as in Experiment 1, namely to align the major axis of the object with that of a standard (container), but the hand movements required to perform the act and the direction of motion are different. Second, in Experiment 1, the opening in the box was large so that precision in aligning the objects was not required. We chose a big opening in that first task to limit children’s possible frustration with failed insertions. However, both younger and older children appeared to enjoy the task regardless of their success, and the large size of the opening may have encouraged less precision and alternative ways of inserting the object (the foreshortened insertions) and thus may over-estimate or under-estimate the abilities of the older children. Accordingly, the insertion task in Experiment 2 was designed to encourage and require precision: the objects “just fit” into the containers, requiring deviations in the angles of axes to be no greater than 5° in order for the object to be inserted.
Again, the main question does not concern success in insertion per se, but the alignment of the major axis of the object with that of the container on initial approach.

Method

Participants

The participants were 32 children divided equally between two age groups – 18 month olds (7 males, 9 females; mean age = 17.2 mo; range = 16.1 to 18.2 mo) and 24 month olds (8 males, 8 females; mean age = 24.2 mo; range = 23.33 to 25.3 mo). The children were recruited from a working- and middle-class population in the Midwest, had no known developmental disorders and had normal visual acuity by parental report. All were tested in the laboratory. At each age level, half of the children were assigned to a Vertical Opening condition and half to a Horizontal Opening condition. None of the children had participated in Experiment 1 but many were participating in other unrelated experiments (concerning, for example, language and early number concepts) on the same testing day.

Stimuli, procedure and coding

A puzzle-like tray was made from thick Styrofoam sheets (30 cm × 50.5 cm) mounted onto a board. The tray had 3 rectangular openings (18.5 cm × 7.5 cm) cut into it as shown in Figures 2 and 4. We used a tray with 3 openings (all the same size) in order to invoke the idea of a puzzle in which the objects could be fit, and to accommodate the experimenter’s intermittent demonstrations of the task goal as described below. The trays were laid flat on the table, like a puzzle, such that an object could be fit downward into the opening, as one might place an object into a toddler puzzle board or into a tight fitting box. The trays were presented to the child either with the extended axes of the openings aligned horizontally (left to right) or vertically (near to far) with respect to the child. Six objects from Experiment 1 were used (the left-most object of each type in Figure 3, with the exception that some children in Experiment 2 used a car instead of the boat as the horizontal complex object). For all children, there were two of each Object Type, one presented in a vertical orientation (with respect to gravity) and the other in a horizontal orientation (with respect to gravity). In sum, there were six trials involving 2 Blocks, 2 Simple shapes, and 2 Complex shapes.

The general procedure was the same as in Experiment 1. To begin the experiment, the experimenter took a practice object (a block that was not used as an insertion object) and placed it in one of the openings on the puzzle board to demonstrate the task goal. The experimenter repeated this action after the first three trials, and as needed when a child did not attempt an insertion. Children were free to put the test object into any available opening. The experimental session, including fitting and calibrating the head camera, lasted less than 15 minutes.

As in Experiment 1, children’s performances were coded from the head camera view. The angle formed by the major axes of the object and of the rectangular opening was again measured at the point in the initial approach just prior to when the object passed the surface plane of the container (if aligned) or just prior to when the object hit the surface of the box (if not aligned). Coders also scored successful insertions, defined as fitting the object entirely into an opening so that it lay flat. One coder served as the main coder for all of the data. A second coder scored a randomly selected 25% of the insertions: inter-rater reliabilities using Cohen’s Kappa were 0.81 for angle at first approach (within 5°, inter-coder agreement = .89) and 1.0 (inter-coder agreement = 1.00) for first approach successes in inserting the object into the container (that is, instances in which the object was so well aligned with the opening that it went into the container on first approach). The central dependent measures were the categorical measure of Alignment (Aligned-Not Aligned), reflecting success on initial approach (defined as precision within the 5° tolerance of the opening), and the interval measure of Alignment Error (0 to 90°), the angle of difference between the major axes of the object and the opening on first approach.
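
The two dependent measures can be made concrete with a short Python sketch. It is an illustration only, not the coding procedure itself: it assumes that each major axis has already been coded as an orientation in degrees within the head-camera image, and the function names are hypothetical. Because an axis has no direction, orientations that differ by 180° describe the same axis, so the difference is folded into the 0 to 90° range.

```python
def alignment_error(object_axis_deg, opening_axis_deg):
    """Interval measure: smallest angle (0 to 90 degrees) between two undirected axes."""
    diff = abs(object_axis_deg - opening_axis_deg) % 180.0
    return min(diff, 180.0 - diff)

def is_aligned(error_deg, tolerance_deg):
    """Categorical measure: Aligned if the error is within the tolerance of the opening."""
    return error_deg <= tolerance_deg
```

With a tolerance of 5° (the tray openings in Experiment 2) an approach with a 20° error is scored as Not Aligned, whereas with a tolerance of 25° (the box opening in Experiment 1) the same approach is scored as Aligned.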

Results and discussion

Younger and older children’s approaches to this task differed. Eighteen-month-old children had considerable difficulty in fitting the objects into the openings, and generally made one initial attempt that ended with the object on top of the opening, only partway into the opening, or given back to the experimenter. The younger children made second attempts or adjustments on fewer than 5% of trials. The 24-month-old children, in contrast, approached the opening with each object already roughly aligned and then maneuvered the object into the opening with a series of small adjustments. After a maximum of 3 small adjustments, the older children succeeded in laying the object fully into the opening on 100% of the trials. Again, however, the critical dependent measure is not the end result of the effort, but the alignment of the axes on initial approach, because these alignments must be based on visual information about object and container shapes and not on feedback from a failed attempt. These results are shown in Figure 6.

Figure 6.

Three measures of performance in Experiment 2 as a function of age group. Left: Proportion of trials on which the axis of elongation of the object and the rectangular slot were aligned (within the 5° tolerance of the opening) on initial approach (just prior to insertion). Center: Mean degree of measured alignment of the elongated axis of the object and the slot. The standard error of the mean is provided for both measures. Right: Histogram of all insertions within 10° bins (bins labeled by the maximum angle included) across all participants and stimuli for 18 month olds and for 24 month olds.

The proportions of trials on which children succeeded in aligning the object with the opening were submitted to a 2 (Age) X 3 (Object Type) X 2 (Orientation: Horizontal/Vertical) X 2 (Object-Opening Alignment at Presentation: Aligned/Not Aligned) ANOVA. The analysis revealed a reliable main effect of Age (F(1,31) = 48.202, p<.001, η2=0.71), with 24-month-old children aligning precisely (within 5°) on first approach on .49 (SD=.24) of the trials and 18 month olds doing so on .10 (SD=.13) of the trials. If we had used the more tolerant 25° criterion of Experiment 1 for the categorization of an initial approach as Aligned, then the older children in the present experiment would have aligned the major axes of object and opening on .90 of the trials, whereas the younger children would have done so on only .37 of the trials. There were no main effects of Object Type (F < 1.00), Orientation of the opening, or Object-Opening Alignment at Presentation, and no interactions involving these factors.

These results, shown in the left-most panel of Figure 6, suggest that if a task demands precision alignment, the 24 month olds can provide it. Further, the lack of an effect of Object Type in the experiment suggests that the older children, at least, could extract and align the major axes of differently shaped visual forms: the 24 month olds’ proportions of precise alignments (within 5°) were .48 (SD=.18) for the Blocks, .52 (SD=.23) for the Simple Shapes, and .46 (SD=.17) for the Complex Shapes. The younger children also showed no effect of Object Type on the categorical measure of Alignment, but in their case, the result reflects a floor level of performance with all objects.

The interval measure of Alignment Error at first attempt for each child was submitted to a 2 (Age) X 3 (Object Type) X 2 (Orientation: Horizontal/Vertical) X 2 (Object-Opening Alignment at Presentation: Aligned/Not Aligned) analysis of variance. This analysis also yielded only a reliable main effect of Age (F(1,31) = 37.52, p<.001, η2=0.76). As is apparent in the center panel of Figure 6, older children performed much better than younger children by this measure. The main effect of Object Type was again not significant (F < 1.00). Across all objects, the mean angle at initial approach was .14 (SD = .11) for 24-month-old children and .40 (SD = .15) for the 18-month-old children. The histogram of the degree of initial alignment across trials (using the same 10° bins as in Experiment 1) shows the contrast between the high degree of precision of the approaches made by the 24 month olds and the generally poor alignments of the 18 month olds. In short, the task in Experiment 2 called for greater precision in aligning the major axes of the objects and openings than did the task in Experiment 1. The 24 month olds showed that they could readily meet this standard; the 18 month olds could not.

The results from this experiment clarify those of Experiment 1. In both experiments, 24 month olds showed greater skill than 18 month olds in aligning the object to the opening. However, Experiment 1 did not require precise alignments, whereas Experiment 2 did, and accordingly, the age difference in successful alignment observed in Experiment 1, while reliable, was not as substantial as the age difference observed in Experiment 2. The task in Experiment 2 also involved insertion motions that were downward and known to be easier to execute than lateral movements away from the body (e.g., Hayashi et al., 2006). The task properties of Experiment 2 exaggerated the developmental differences observed in Experiment 1, showing marked gains in sensitivity to the major axes of to-be-compared shapes between 18 and 24 months. Experiments 3 and 4 focused on 18 month olds only, and were designed to assess component abilities that may be relevant to the developmental differences observed in the first two experiments.

Experiment 3

Past research shows that when 18 month olds are given objects to hold and visually explore, they oversample the planar views of those objects – views in which the major axis is perpendicular or parallel to the line of sight (Pereira et al., 2010). Holding an object in ways that bias or oversample planar views indicates both sensitivity to the major axis and the ability to orient that axis with respect to the perceiver’s own body and viewing perspective. Prior studies of toddlers in insertion tasks (Hayashi et al., 2006; Örnkloo & von Hofsten, 2007) also suggest that they can orient objects with respect to their own viewing perspective to produce a foreshortened approach, as shown in Figure 2. In these foreshortened insertion attempts, the major axis is also aligned with the path of motion: alignment of an object with the path of motion has been shown to highlight attention to the aligned object axis in both adult (Sekuler, 1996) and child studies (Smith, 2005). Thus, positioning the major axis in relation to one’s own view or in relation to the direction of an action may be a developmental step towards aligning the axis of one visual object to another visual object. Accordingly, Experiment 3 was designed to determine whether 18 month olds could align the major axes of the same objects used in the prior experiments with a path of motion parallel to their line of sight, if given an insertion task that required foreshortened insertions (see Figure 2).

Method

Participants

The participants were nineteen 18 month olds (8 males, 11 females; mean age = 17.8 mo, range = 16.27 to 19.13 mo). The children were recruited from a working- and middle-class population in the Midwest and had no known developmental disorders. Parents reported normal visual acuity. One additional child was recruited but the data were not included due to experimenter error. None of these children participated in Experiments 1 or 2 but most participated in other tasks on other topics (language, number concepts, categorization) on the same day as their testing in this experiment.

Stimuli and procedure

The full set of 12 stimuli used in Experiment 1 (4 Blocks, 4 Simple Shapes, and 4 Complex Shapes) were given one by one to the child to insert, for a total of 12 unique trials. All aspects of the stimuli and procedure were identical to those in Experiment 1 except that in this experiment, an 11.5 cm diameter hole was cut into a 26.5 cm × 46 cm × 36 cm cardboard box. Each object fit easily into the hole when its major axis was roughly perpendicular to the face of the box (and thus aligned with the direction of motion from the child to the box), but did not fit if the major axis was aligned parallel to the surface of the box.

Coding

Performance was coded from the head camera perspective. Two major coding categories were defined with respect to the Alignment of the major axis to the line of sight on the approach to initial insertion: 1) Aligned: the major axis of the object was parallel to the line of sight (within 25° of a 90° angle to the frontal plane of the box) versus 2) Not Aligned: any other approach. Successful versus Unsuccessful Insertions on initial approach were also recorded. Two independent coders coded the same random sample of 25% of the trials; Cohen’s Kappa for inter-rater reliability was 0.81 for Alignment on initial approach (inter-coder agreement = .91) and 1.0 for Successful/Unsuccessful Insertion on initial approach.
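
For readers less familiar with the statistic, the relation between raw inter-coder agreement and Cohen’s Kappa can be sketched as follows. The function below is a generic textbook implementation written for illustration (its name and the category labels are hypothetical, and it is not the software used for the coding reported here): Kappa discounts the observed agreement by the agreement expected if the two coders assigned categories independently at their own base rates, which is why Kappa (0.81) is lower than raw agreement (.91).

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's Kappa for two coders' category labels on the same trials."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of trials")
    n = len(coder_a)
    # Observed agreement: proportion of trials given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over labels.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1.0 - expected)
```

Calling `cohens_kappa` with two lists of labels such as "Aligned" and "Not Aligned", one list per coder and one entry per trial, returns a value of 1.0 when the coders agree on every trial and 0 when they agree no more often than chance.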

Results and discussion

This was an easy task for 18 month olds; they succeeded in inserting the object on their first attempt on 72% of the trials, and the angle of the major axis of the object on approach to the box was parallel to the line of sight (as defined above) on 86% of all initial attempts. In a 3 (Object Type) X 2 (Object Orientation at Presentation) ANOVA, there was no reliable effect of either factor on performance (for Object Type, F(1,17) = 4.10, p < .10; for Object Orientation, F(1,17) = .60, p < .40). Thus, when planning a goal-directed action that did not require comparing and matching the elongated axis of a visual object to that of a visual standard, 18 month olds clearly showed that they could align the major axis to the path of motion and line of sight. Further, this ability did not appear to depend on the complexity of object shape, as children held and properly rotated all the objects, including the complex shapes. Thus, the results of Experiment 3 indicate that in planning an action, 18 month olds can identify the major axis so as to align that axis with their own line of sight or path of motion, a sensitivity also indicated by the planar bias in how they hold objects for visual exploration.

Experiment 4

A critical question concerning the 18 month olds’ performances in Experiments 1 and 2 is whether their difficulties reflect problems in extracting the major axis from the shape or perhaps only limits on their ability to plan and execute rotations of objects. The goal of Experiment 4 was to assess this possibility with a simpler task that also involved rotating objects. We simplified the task by removing three possible impediments to 18-month-old children’s ability to compare and align the major axis of a 3-dimensional object to that of a visual standard. All these possible impediments concern the nature of the visual standard, that is, the opening. First, in Experiments 1 and 2, whereas the objects to be inserted were 3-dimensional, with multiple 2-dimensional views, the standard was the 2-dimensional contour around the opening. Comparing the shape properties of 3-dimensional objects and 2-dimensional contours may be particularly difficult for young children (see Shutts et al., 2009). Second, the 2-dimensional entity was in each case the contour around an opening: the visual properties of openings, including contrast, shading and whether the hole is seen as figure or ground, differ considerably from the visual properties that define the shapes of pictures and 3-dimensional volumes (Bertamini & Croucher, 2003; Giralt & Bloom, 2000; Peterson, 1999; Shutts et al., 2009), and thus the use of a task requiring aligning objects with the negative space of an opening could be a critical limitation. Third, shape complexity in and of itself may be a limiting factor in the extraction of the major axis, an idea suggested by 18 month olds’ better (albeit not strong) performance with blocks than with the simple and complex shapes in Experiment 1. Very young children may be able to rotate and align 3-dimensional objects to match the orientation of other same-shaped 3-dimensional objects. Doing so would not necessarily require perceptual isolation of the major axes of the two objects.

The Experiment 4 task that resulted from these considerations is illustrated in Figure 7. Children were first presented with two blocks, the “flankers”, which were fixed to the tabletop. Both flankers were either vertically oriented (as in the figure) or horizontally oriented with respect to gravity. Children were then given a block to insert that was presented in an orientation that either matched or did not match the orientation of the flankers. Their insertions were coded as Aligned or Not Aligned based only on the orientation of the major axis of the inserted object (and not on whether the three blocks formed a straight line or were evenly distributed). Note that there are at least three ways in which the child could solve this task: by comparing and aligning the major axes of the blocks; by comparing and aligning the height of the placed object to the heights of the fixed flankers; and by comparing shapes (long versus tall rectangles). These correlated cues are all related to the structural property of the axis of elongation and are correlated in sets of everyday things with similar shapes. The question for Experiment 4 was whether in this supportive context, with these redundant cues, 18 month olds could compare the shapes of objects and could physically rotate one object to bring its elongated axis into alignment with the elongated axes of the flankers.

Figure 7.

Illustration of the flanker task used in Experiment 4: The child is presented with fixed flanker blocks and is asked to insert a block to match the flankers. Insertions may be aligned or not aligned, as illustrated, with the axes of elongation of the flanker blocks.

Method

Participants

The participants were sixteen 18 month olds (9 males, 7 females; mean age = 18.2 mo, range = 17.53 to 18.63 mo). Three additional children refused to do the task. The children were recruited from a working- and middle-class population in the Midwest and had no known developmental disorders. Parents reported normal visual acuity. Half of the children were randomly assigned to a Vertical Flankers condition and half to a Horizontal Flankers condition. None of the children participated in Experiments 1, 2, or 3.

Stimuli

Twelve plastic rectangular blocks (17.75 × 7.5 × 3.5 cm) were created. Four blocks, all the same color, were used as flankers: two were affixed to a wood board (61 × 25.5 × 1 cm) in a vertical orientation and two were affixed to a wood board in a horizontal orientation. In each case, the two flankers were separated by a space large enough to accommodate another block inserted between them in either a vertical or a horizontal orientation. The remaining 8 blocks, each a unique color, were used as the test objects to be placed between the flankers.

Procedure and Coding

The overall procedure was identical to that used in the prior experiments. However, in this experiment, as in Experiment 1, the orientation of the test objects at presentation was alternated across trials. Each test object was placed on the table in front of the child in either a horizontal or vertical orientation with respect to gravity. The children were told to put the object in between the other two blocks and to “make it look the same”. The experimenter demonstrated a correct alignment prior to the child’s first trial and then every 2 to 3 trials thereafter. There were a total of 8 trials, one for each uniquely colored block, and the experimental session lasted less than fifteen minutes. The videos were coded for either correct or incorrect alignment of the major axis of each test object with the orientation of the major axes of the flankers. Coding was from the head camera view as in the prior experiments. Two coders independently scored the same randomly selected 25% of trials and produced 100% agreement.

Results and Discussion

Children’s placements of the objects were scored as matching or not matching the flankers. The 18 month olds were highly successful in this task, aligning the objects with the flankers on 70% of the trials, and doing so equally well in the Horizontal (69% aligned) and Vertical (71% aligned) Flanker conditions, and equally well whether the object to be acted upon was presented aligned (72%) or not aligned (and thus had to be rotated) to the flankers (68%). A 2 (Flankers: Horizontal/Vertical) X 2 (Object Orientation at Presentation: Horizontal/Vertical) analysis of variance yielded no main effects or interactions. Nine of the 16 children placed the objects correctly on 7 or 8 of the total 8 trials; the number of children achieving this rate of success exceeds that expected by chance (binomial exact probability, p < .0001). Thus, in a context in which 3-dimensional objects were compared to 3-dimensional objects and in which multiple cues supported comparison and alignment, the 18 month olds were able to physically rotate objects so that their axis of elongation matched that of another object. It is also the case that only two orientations were physically possible because the objects had to be flat on the table, a real world physical constraint that limited response options. Nonetheless, the success of 18 month olds in this task clearly shows that their difficulties in Experiments 1 and 2 were not due to problems in physically rotating the objects, because they readily executed those rotations in the Experiment 4 task. Their success also suggests that the ability to align objects by their axis of elongation might emerge first in physically constrained contexts in which the to-be-compared objects are visually similar and thus have multiple cues to support alignment.
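
The order of magnitude of the reported binomial probability can be checked with a short Python sketch. The setup below is an assumption made for illustration, since the text does not spell out how the test was constructed: chance is taken to be .5 on each trial (only two orientations were physically possible), trials and children are treated as independent, and the helper name is hypothetical.

```python
from math import comb

def binomial_upper_tail(n, k_min, p):
    """Probability of k_min or more successes in n independent trials with success rate p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Assumed chance level: .5 per trial, because only two orientations were possible.
# Probability that a single child is correct on 7 or 8 of 8 trials by chance.
p_child = binomial_upper_tail(8, 7, 0.5)

# Probability that 9 or more of 16 children reach that criterion by chance.
p_group = binomial_upper_tail(16, 9, p_child)
```

Under these assumptions a single child reaches the criterion by chance with probability 9/256 (about .035), and the probability that nine or more of 16 children do so is far below .0001, in line with the reported value.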

General Discussion

The most extended axis of an object is believed to play an important role in object constancy and object categorization by serving as an object-centered reference frame for aligning, comparing, and integrating object views (e.g., Amir, Biederman & Hayworth, 2012; Graf, 2006; Farivar, 2009; Marr & Nishihara, 1978). The present experiments are the first of which we are aware that have attempted to directly assess very young children’s abilities to physically align visual entities by their axes of elongation. The results show that, by the end of the second year, children can plan actions that physically align quite different shapes by their most elongated axis, plans that would require the internal comparison of the two elongated axes, a necessary component for the use of the major axis as a reference frame for visual object recognition. The results further suggest that these abilities develop during the period from 18 to 24 months. These developmental differences do not appear to be due to the visual-motor skills needed to perform the required actions: Experiments 1 and 2 found the same results in tasks requiring different actions, and Experiments 3 and 4 showed that 18 month olds could rotate the objects to align the major axis to the path of motion or to a same-shaped object. Thus, the overall pattern of results is consistent with the hypothesis that during this developmental period, children are increasingly able to visually isolate the axes of elongation of different visual forms.

The developmental timing of these achievements occurs at the same time as other potentially related changes in object perception. Each connecting line in Figure 8 indicates an empirically reported correlation between two achievements in 18 to 24 month olds. First, marked changes between 18 and 24 months have been reported in young children’s ability to recognize common objects – chairs, buckets, cars – from sparse major-part 3-dimensional representations of object shape (e.g., Smith, 2003; Pereira & Smith, 2009). This ability to recognize basic level categories from such sparse structural information, in turn, has been shown to be correlated with the size of individual children’s object name vocabularies (Smith, 2003; Pereira & Smith, 2009) and to the strength of the shape bias – the generalization of newly learned object names to new instances by similarity in shape (Yee et al., 2012). As noted in the introduction, during this same developmental period, when children are given objects to hold and visually examine, they increasingly orient those objects to oversample planar views (Pereira et al., 2010), views in which the major axis is extended in the frontal plane. The strength of a planar bias in self-generated views has been shown to be correlated with children’s abilities to recognize basic level categories from sparse representations of shape and with object name vocabulary size (James et al., 2013). These developments may be interconnected by the emerging sensitivity to the most elongated axis of an object as the object-centric frame of reference, supporting object constancy and categorization. This is the key hypothesis to be tested next in this program of research.

Figure 8.

Illustration of the web of inter-related developmental changes between 18 and 24 months in tasks related to the perception of object shape. Letters denote citations for reported correlations.

Developing reference frames

The 18 month olds’ level of success in Experiments 3 and 4 not only helps in interpreting the findings of Experiments 1 and 2 but may also provide potential insights into the developmental progression and experiences that support the use of the major axis as a reference frame for object recognition. The insertion task in Experiment 3 did not require matching the most extended axis of the held object to that of the opening, but instead required aligning the major axis of the held object to the child’s own body – to the path of the hand’s motion in moving the object to the opening and to the child’s line of sight to the goal. Within this task context, the 18 month olds succeeded. The developmental priority of orienting objects with respect to the actor’s own body has been noted in studies of early tool use in this same age period. For example, toddlers can orient a tool such as a hairbrush or a spoon appropriately to their own body before they can do so with respect to another body (e.g., a doll: see McCarty, Clifton, & Collard, 2001; Hayashi et al., 2006). One hypothesis suggested by this pattern is that access to the major axis may first emerge as a reference frame for manual actions on individual objects and with respect to the relation between the axis of elongation and the actor (see also, Smith, 2005). This proposal also fits the evidence of a planar bias in self-generated views in 18 month olds in that these are views that align the major axis with respect to the viewer (Pereira et al., 2010).

The 18-month-old children’s performances in Experiment 4 may also provide insight into the kinds of experiences that may support the development of the major axis as a frame of reference for comparing objects. In the Experiment 4 flanker task, children could succeed by matching on a single shape dimension (tall or short rectangles) or by matching just the heights of the inserted and flanker objects without perceptually isolating the major axes. Further, given the physics of blocks and tables, objects could be positioned only in specific ways. These correlated factors and physical constraints may be important characteristics of the developmental training ground for frames of reference for shape comparison. That is, our conjecture is that these redundant properties and real physical constraints, as exist in the everyday tasks of stacking blocks, inserting dolls in beds, and lining up toy cars, may provide critical experiences for extracting the major axis as a reference frame for comparing shapes.

A role for action?

Our experimental tasks were action tasks. As such, they required motor planning and execution skills beyond the perception and representation of an object’s axis of elongation. Dominant theories of the human visual system divide that system into two distinct and separate functionalities, vision for planning actions and vision for object recognition, and these are proposed to be sensitive to different kinds of information and to operate via different computational principles (Goodale & Milner, 1992; Ungerleider & Mishkin, 1982). From this perspective, the demonstration of sensitivity to an object’s major axis in planning an action would not necessarily imply sensitivity to the major axis in building visual representations for object recognition.

However, there are theoretical arguments and growing evidence from several directions that link visual object recognition and its development to action and to the full set of multi-sensory experiences (not just visual experiences but also dynamically coupled haptic and proprioceptive information) that emerge from active engagement with objects (Cooke, Jakel, Wallraven & Bulthoff, 2007; Cuijpers, Smeets & Brenner, 2004; Farivar, 2009; Hommel, Müsseler, Aschersleben, & Prinz, 2001; James et al., 2001, 2002; Simmons & Barsalou, 2003). For example, holding an object provides direct haptic information about 3-dimensional shape, and 3-dimensional shape recognition is improved when adults explore an object both haptically and visually as compared to only visually (e.g., Craddock & Lawson, 2008). Further, haptic information may be more view-invariant than visual information and thus, when coupled with visual exploration, may support the development of more view-invariant visual representations (see Lawson, 2009). Infants’ joint haptic and visual experiences of objects begin early (see Rochat, 1989) and thus may be part of the developmental pathway leading to the extraction of visual frames of reference for aligning differently oriented objects.

Other experimental evidence from adults also implicates action in visual object recognition. For example, planning an action on an object has been shown to prime (and in some cases alter) the subsequent visual recognition of that object (e.g., Helbig, Graf, & Kiefer, 2006; see Smith, 2005 for similar evidence from children). Evidence from neuroimaging studies supports these behavioral findings by showing activation in motor and premotor areas both in adults (e.g., Buxbaum & Kalenine, 2010; Martin & Chao, 2001; Cross et al., 2012) and in children (Dekker et al., 2011; James & Swain, 2011; James & Bose, 2011) when manipulable objects are viewed. These ideas are consistent with the many developmental demonstrations of the mutual influences between visual processes involved in action and in object recognition, and with the growing evidence on links between action and visual object recognition in infants and children (e.g., James et al., 2013; Ruff & Saltarelli, 1993; Smith, 2005; Perone et al., 2008; Soska, Adolph & Johnson, 2010).

Thus, the tasks used in the present study may not only provide a measure of infants’ sensitivity to the elongated axis of an object but may also exemplify the kinds of action contexts that support the extraction and representation of this property. When toddlers lay a doll onto a doll bed or stack blocks on top of each other, the motor activities of holding and rotating the objects in the service of these outcomes are organized around the objects’ major axes. These actions in turn generate dynamic multisensory information, and they generate visual consequences such as aligned dolls and beds, and aligned or misaligned blocks. These actions and their consequences may be critical components of the experiential history that builds greater sensitivity to the nonmetric properties of 3-dimensional shape, and that also builds object representations that are based on and make use of those properties. The present findings show that sensitivity to the axis of elongation, an object-centered frame of reference for representing object shape, strengthens in action tasks during the second year of life.

  • Visual object recognition changes markedly as toddlers manually act on objects.

  • Aligning objects by their most elongated axis is one critically developing skill.

  • Aligning objects by their major axis improves markedly between 18 and 24 months.

Acknowledgments

This research was supported by NICHD grants R01HD 28675 and R01 HD057077. SS was supported by NICHD training grant 5T32HD007475.


References

  1. Amir O, Biederman I, Hayworth KJ. Sensitivity to nonaccidental properties across various shape dimensions. Vision Research. 2012;62:35–43. doi: 10.1016/j.visres.2012.03.020.
  2. Augustine E, Smith LB, Jones SS. Parts and relations in young children's shape-based object recognition. Journal of Cognition and Development. 2011;12(4):556–572. doi: 10.1080/15248372.2011.560586.
  3. Bertamini M, Croucher CJ. The shape of holes. Cognition. 2003;87:33–54. doi: 10.1016/s0010-0277(02)00183-x.
  4. Biederman I. Recognition-by-components: a theory of human image understanding. Psychological review. 1987;94(2):115–147. doi: 10.1037/0033-295X.94.2.115.
  5. Biederman I, Gerhardstein PC. Recognizing depth-rotated objects: evidence and conditions for three-dimensional viewpoint invariance. Journal of Experimental Psychology: Human perception and performance. 1993;19(6):1162. doi: 10.1037//0096-1523.19.6.1162.
  6. Cavanagh P. Visual cognition. Vision research. 2011;51(13):1538–1551. doi: 10.1016/j.visres.2011.01.015.
  7. Craddock M, Lawson R. Repetition priming and the haptic recognition of familiar and unfamiliar objects. Perception & psychophysics. 2008;70(7):1350–1365. doi: 10.3758/PP.70.7.1350.
  8. Cutzu F, Tarr MJ. Representation of three dimensional object similarity in human vision; Paper presented at the SPIE Electronic Imaging: Human Vision and Electronic Imaging II; San Jose, CA. 2007.
  9. Dekker T, Mareschal D, Sereno MI, Johnson MH. Dorsal and ventral stream activation and object recognition performance in school-age children. NeuroImage. 2011;57(3):659–670. doi: 10.1016/j.neuroimage.2010.11.005.
  10. Edelman S, Duvdevani-Bar S. A model of visual recognition and categorization. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences. 1997;352(1358):1191–1202. doi: 10.1098/rstb.1997.0102.
  11. Farivar R. Dorsal-ventral integration in object recognition. Brain Research Reviews. 2009;61(2):144–153. doi: 10.1016/j.brainresrev.2009.05.006.
  12. Giralt N, Bloom P. How special are objects? Children’s reasoning about objects, parts, and holes. Psychological Science. 2000;11(6):497–501. doi: 10.1111/1467-9280.00295.
  13. Goodale MA, Milner AD. Separate visual pathways for perception and action. Trends in neurosciences. 1992;15(1):20–25. doi: 10.1016/0166-2236(92)90344-8.
  14. Graf M. Coordinate transformations in object recognition. Psychological Bulletin. 2006;132(6):920–945. doi: 10.1037/0033-2909.132.6.920.
  15. Gregory E, Landau B, McCloskey M. Representation of object orientation in children: Evidence from mirror-image confusions. Visual cognition. 2011;19(8):1035–1062. doi: 10.1080/13506285.2011.610764.
  16. Harman KL, Humphrey GK, Goodale MA. Active manual control of object views facilitates visual recognition. Current Biology. 1999;9:1315–1318. doi: 10.1016/s0960-9822(00)80053-6.
  17. Hayashi M, Takeshita H, Matsuzawa T. Cognitive development in chimpanzees. Springer Tokyo: 2006. Cognitive development in apes and humans assessed by object manipulation; pp. 395–410.
  18. Helbig HB, Graf M, Kiefer M. The role of action representations in visual object recognition. Experimental Brain Research. 2006;174(2):221–228. doi: 10.1007/s00221-006-0443-5.
  19. Hommel B, Müsseler J, Aschersleben G, Prinz W. The theory of event coding (TEC): A framework for perception and action planning. Behavioral and brain sciences. 2001;24(05):849–878. doi: 10.1017/s0140525x01000103.
  20. James KH, Humphrey GK, Goodale MA. Manipulating and recognizing virtual objects: Where the action is. Canadian Journal of Experimental Psychology. 2001;55:111–120. doi: 10.1037/h0087358.
  21. James KH, Humphrey GK, Vilis T, Corrie B, Baddour R, Goodale MA. “Active” and “passive” learning of three-dimensional object structure within an immersive virtual reality environment. Behavior Research Methods, Instruments & Computers. 2002;34:383–390. doi: 10.3758/bf03195466.
  22. James KH, Swain SN. Only self-generated actions create sensori-motor systems in the developing brain. Developmental Science. 2011;14:673–678. doi: 10.1111/j.1467-7687.2010.01011.x.
  23. James KH, Bose P. Self-generated actions during learning objects and sounds create sensori-motor systems in the developing brain. Cognition, Brain & Behavior. 2011;15:485–503.
  24. James KH, Swain SN, Jones SS, Smith LB. Young Children's Self-Generated Object Views and Object Recognition. Journal of Cognition and Development. 2013 doi: 10.1080/15248372.2012.749481. online view.
  25. Kraebel KS, West RN, Gerhardstein P. The influence of training views on infants' long-term memory for simple 3D shapes. Developmental psychobiology. 2007;49(4):406–420. doi: 10.1002/dev.20222.
  26. Lawson R. A comparison of the effects of depth rotation on visual and haptic three-dimensional object recognition. Journal of experimental psychology. Human perception and performance. 2009;35(4):911. doi: 10.1037/a0015025.
  27. Lockman JJ. A perception–action perspective on tool use development. Child Development. 2000;71(1):137–144. doi: 10.1111/1467-8624.00127.
  28. Lowe DG. Three-dimensional object recognition from single two-dimensional images. Artificial Intelligence. 1987;31:355–395.
  29. Marr D, Nishihara HK. Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society of London, Series B. 1978;200:269–294. doi: 10.1098/rspb.1978.0020.
  30. Martin A, Chao LL. Semantic memory and the brain: structure and processes. Current opinion in neurobiology. 2001;11(2):194–201. doi: 10.1016/s0959-4388(00)00196-3.
  31. McCarty ME, Clifton RK, Collard RR. The beginnings of tool use by infants and toddlers. Infancy. 2001;2(2):233–256.
  32. Nishimura M, Scherf S, Behrmann M. Development of object recognition in humans. F1000 biology reports. 2009;1 doi: 10.3410/B1-56.
  33. Ons B, Wagemans J. Development of differential sensitivity for shape changes resulting from linear and nonlinear planar transformations. i-Perception. 2011;2(2):121. doi: 10.1068/i0407.
  34. Örnkloo H, von Hofsten C. Fitting objects into holes: On the development of spatial cognition skills. Developmental Psychology. 2007;43(2):404–416. doi: 10.1037/0012-1649.43.2.404.
  35. Perrett DI, Harries MH, Looker S. Use of preferential inspection to define the viewing sphere and characteristic views of an arbitrary machined tool part. Perception. 1992;21:497–497. doi: 10.1068/p210497.
  36. Pereira AF, Smith LB. Developmental changes in visual object recognition between 18 and 24 months of age. Developmental science. 2009;12(1):67–80. doi: 10.1111/j.1467-7687.2008.00747.x.
  37. Pereira A, James K, Jones S, Smith LB. Early biases and developmental changes in self-generated object views. Journal of Vision. 2010;10(11):1–13. doi: 10.1167/10.11.22.
  38. Perone S, Madole KL, Ross-Sheehy S, Carey M, Oakes LM. The relation between infants’ activity with objects and attention to object appearance. Developmental psychology. 2008;44(5):1242. doi: 10.1037/0012-1649.44.5.1242.
  39. Peterson MA. Organization, segregation and object recognition. Intellectica. 1999;28:37–51.
  40. Peterson MA. Object perception. In: Goldstein EB, editor. Blackwell handbook of perception. Oxford, UK: Blackwell; 2001. pp. 168–203.
  41. Pinto N, Cox DD, DiCarlo JJ. Why is real-world visual object recognition hard? PLoS computational biology. 2008;4(1):e27. doi: 10.1371/journal.pcbi.0040027.
  42. Rochat P. Object manipulation and exploration in 2-to 5-month-old infants. Developmental Psychology. 1989;25(6):871–884.
  43. Rosch E. Principles of categorization. Concepts: core readings. 1999:189–206.
  44. Ruff HA, Saltarelli LM. Exploratory play with objects: Basic cognitive processes and individual differences. In: Bornstein MH, O’Reilly AW, editors. The role of play in the development of thought. New directions for child development No. 59: The Jossey-Bass education series. San Francisco, CA US: Jossey-Bass; 1993. pp. 5–16.
  45. Sekuler AB. Axis of elongation can determine reference frames for object perception. Canadian Journal of Experimental Psychology. 1996;50(3):270–279. doi: 10.1037/1196-1961.50.3.270.
  46. Simmons WK, Barsalou LW. The similarity-in-topography principle: Reconciling theories of conceptual deficits. Cognitive Neuropsychology. 2003;20(3–6):451–486. doi: 10.1080/02643290342000032.
  47. Shutts K, Örnkloo H, Von Hofsten C, Keen R, Spelke ES. Young children’s representations of spatial and functional relations between objects. Child Development. 2009;80(6):1612–1627. doi: 10.1111/j.1467-8624.2009.01357.x.
  48. Smith LB. Learning to recognize objects. Psychological Science. 2003;14(3):244–250. doi: 10.1111/1467-9280.03439.
  49. Smith LB. Action alters shape categories. Cognitive Science. 2005;29:665–679. doi: 10.1207/s15516709cog0000_13.
  50. Smith LB. From fragments to geometric shape: Changes in visual object recognition between 18- and 24- months. Current Directions in Psychological Science. 2009;18(5):290–294. doi: 10.1111/j.1467-8721.2009.01654.x.
  51. Smith LB, Jones SS. Symbolic play connects to language through visual object recognition. Developmental science. 2011;14(5):1142–1149. doi: 10.1111/j.1467-7687.2011.01065.x.
  52. Soska KC, Johnson SP. Development of Three-Dimensional Object Completion in Infancy. Child development. 2008;79(5):1230–1236. doi: 10.1111/j.1467-8624.2008.01185.x.
  53. Soska KC, Adolph KE, Johnson SP. Systems in development: motor skill acquisition facilitates three-dimensional object completion. Developmental psychology. 2010;46(1):129–138. doi: 10.1037/a0014618.
  54. Street SY, James KH, Jones SS, Smith LB. Vision for action in toddlers: The posting task. Child development. 2011;82(6):2083–2094. doi: 10.1111/j.1467-8624.2011.01655.x.
  55. Tarr MJ. Visual object recognition: Can a single mechanism suffice? In: Peterson MA, Rhodes G, editors. Perception of faces, objects, and scenes: Analytic and holistic processes. Oxford, England: Oxford University Press; 2003. pp. 177–211.
  56. Mishkin M, Ungerleider LG. Contribution of striate inputs to the visuospatial functions of parieto-preoccipital cortex in monkeys. Behavioural brain research. 1982;6(1):57–77. doi: 10.1016/0166-4328(82)90081-x.
  57. Von Hofsten C. Action in development. Developmental Science. 2007;10(1):54–60. doi: 10.1111/j.1467-7687.2007.00564.x.
  58. Yee M, Jones SS, Smith LB. Changes in visual object recognition precede the shape bias in early noun learning. Frontiers in psychology. 2012;3 doi: 10.3389/fpsyg.2012.00533.
  59. Yoshida H, Smith LB. What’s in view for toddlers? Using a head camera to study visual experience. Infancy. 2008;13(3):229–248. doi: 10.1080/15250000802004437.
