Seeing our 3D world while only viewing contour-drawings

Maddex Farshchi; Alexandra Kiba; Tadamasa Sawada

2021 Jan 22;16(1):e0242581. doi: 10.1371/journal.pone.0242581

Editor: Markus Lappe²

PMCID: PMC7822326 PMID: 33481778

Abstract

Artists can represent a 3D object by using only contours in a 2D drawing. Prior studies have shown that people can use such drawings to perceive 3D shapes reliably, but it is not clear how useful this kind of contour information actually is in a real dynamical scene in which people interact with objects. To address this issue, we developed an Augmented Reality (AR) device that can show a participant a contour-drawing or a grayscale-image of a real dynamical scene in an immersive manner. We compared the performance of people in a variety of run-of-the-mill tasks with both contour-drawings and grayscale-images under natural viewing conditions in three behavioral experiments. The results of these experiments showed that the people could perform almost equally well with both types of images. This contour information may be sufficient to provide the basis for our visual system to obtain much of the 3D information needed for successful visuomotor interactions in our everyday life.

Introduction

Artists can represent a 3D scene with 3D objects by using only contours in a 2D contour-drawing, and people can recognize the scene and objects reliably from such drawings [1–9]. There are computer vision algorithms that try to emulate this skill of artists and can generate contour-drawings from 2D photographic-images of 3D scenes [10, 11] and from 3D information contained in the scene [12, 13]. These contours represent abrupt changes of the luminance, color, or texture in the image and characteristic features in the 3D information. These characteristic features include self-occluding boundaries on the surface of objects [14–16], ridges on the surface [17, 18], as well as sharp edges on surfaces (see [12, 13] for reviews). Neither the luminance-polarity nor luminance-gradients are present in a contour-drawing.

Human beings can see the shape and position of a 3D object veridically when given only 2D drawings of it, and they can also recognize such objects reliably [1, 3–6, 19–27]. These well-known facts present a problem because according to Inverse-Problem Theory, the recovery of the shape of a 3D object from a 2D drawing is an ill-posed inverse problem. There are infinitely many 3D interpretations of a 2D contour-drawing. Note that line-drawings lack most depth cues, including binocular disparity, shading, and cast-shadows. This inverse problem can be resolved by imposing a priori constraints on the family of possible 3D interpretations [3, 5, 28–30]. Now, consider ordinary objects we see and use in our everyday life. Such objects are not composed of a random scattering of points. They can be characterized by regularities of their shape, for example, their symmetry, volume, the planarity of their contours, and the presence of rectangular corners. These regularities, which introduce specific features into the drawing, also could be used to detect the presence and shape of an object [30–33]. Our visual system could make use of these regularities as the a priori constraints needed to recover the shape of a 3D object from a 2D contour-drawing of the object [34–38].

Prior studies that tested 3D perception from contour drawings have shown that people can obtain 3D information from the contour drawing, but it is not clear just how useful such contour information actually is in real dynamical 3D scenes in which ordinary objects are recognized and utilized under natural viewing conditions. These studies generated contour-drawings of objects taking care to avoid using degenerate views [5, 30], but note that in the real 3D scenes, objects will often be seen with degenerate views. Also note that people often change their viewing positions providing them with different views of the objects. Put simply, people can interact with the objects in real dynamical scenes.

Visual perception in real dynamical 3D scenes can be studied using the XR (Augmented-, Mixed-, and Virtual-Reality) technology. This XR technology can provide immersive experiences of a 3D scene that can be controlled by a computer. It has been shown that people can see the 3D information in the scene and they can interact within the scene on the basis of the visual information provided by the XR technology even if the scene is not fully photorealistic [39–41].

We developed an Augmented Reality (AR) device that can show a participant both a contour-drawing and a grayscale-image of a real dynamical 3D scene in an immersive manner (see [42–44] for earlier studies using AR devices to test the human visual system). The gray-scale images were used as a control. They provided a baseline for the performance of a participant conducting our kind of tasks while wearing this device in our experiments. Our AR device allowed us to determine how well the participant can interact dynamically with objects in a scene, by using only contours in the contour-drawing or by using luminance-polarity and luminance-gradients in the gray-scale image.

General methods

AR device

The AR device used in this study showed, in an immersive manner, a contour-drawing and a grayscale-image that represented a scene "out there" (Fig 1). The device was composed primarily of a smart phone (Lenovo Phab 2 Pro) and a wearable stereoscope (VR head-set). The phone ran the Google Android OS and was equipped with an LCD screen and a camera located on the back of the phone. The two halves of the screen were seen individually by the two eyes of a participant who looked through the lenses of the stereoscope. The distance between the eyes and the screen, when the stereoscope was worn, was 8.5 cm. The screen’s resolution was 1248 × 2560 pixels, and its size was 6.9 × 14.2 cm.

The phone’s camera captured a photographic image of the scene in front of the participant. A contour-drawing and grayscale-image representing the scene were generated from this photographic image. This image was first converted to a grayscale-image I_G. Then, I_G was passed through a set of image filters to generate the contour-drawing I_C:

I_C = 0.5 \, \left| (I_G * B) * S_V \right| + 0.5 \, \left| (I_G * B) * S_H \right|

where B was a Gaussian filter and S_V and S_H were Sobel filters [10, 11, 45] that emphasized vertical and horizontal edges, respectively:

B = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}

S_V = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix}

S_H = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix}

Sobel filters were chosen because of their computational simplicity. This allowed our AR device to process the photographic images in near real-time. It is worth mentioning that an analogy of this algorithm with the visual system’s process of edge detection in the primary visual cortex has been discussed [46]. This image process was implemented as an Android app using OpenCV library [47]. The resolution of the original photographic image and of the processed images was the same as the resolution of the screen (1248 × 2560 pixels).
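The filter chain described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Android/OpenCV app used in the study: the function names `filter3x3` and `contour_drawing` are hypothetical, the image is a list of rows of gray levels, and the border handling (edge replication) is an assumption because the text does not specify it.

```python
# Kernels as given in the text.
B = [[1 / 16, 2 / 16, 1 / 16],
     [2 / 16, 4 / 16, 2 / 16],
     [1 / 16, 2 / 16, 1 / 16]]
S_V = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
S_H = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def filter3x3(img, kernel):
    """Apply a 3x3 kernel to img (a list of rows of numbers).

    Assumption: pixels outside the image repeat the nearest edge pixel.
    The kernel is applied without flipping; because B is symmetric and the
    Sobel outputs are passed through an absolute value, flipping would not
    change the final contour image.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out


def contour_drawing(gray):
    """Compute I_C = 0.5*|(I_G * B) * S_V| + 0.5*|(I_G * B) * S_H|."""
    blurred = filter3x3(gray, B)
    grad_v = filter3x3(blurred, S_V)
    grad_h = filter3x3(blurred, S_H)
    return [[0.5 * abs(v) + 0.5 * abs(h) for v, h in zip(row_v, row_h)]
            for row_v, row_h in zip(grad_v, grad_h)]


if __name__ == "__main__":
    # A dark-to-bright vertical step edge: only columns near the step respond.
    step = [[0.0, 0.0, 0.0, 100.0, 100.0, 100.0] for _ in range(6)]
    print(contour_drawing(step)[2])
```

In this sketch a uniform region produces zeros everywhere, while a luminance step produces a response that is independent of the step's polarity, which is why luminance-polarity is absent from the contour-drawing.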

Two image segments (1248 × 1280 pixels) taken from regions in the grayscale-image I_G or in the contour-drawing I_C were shown on the left and right halves of the screen (Fig 2). These regions were horizontal translations of one another in I_G and I_C. The size of the translation Δ_S could be adjusted to allow the participant to fuse the retinal images of the halves of the screen when the screen was viewed with the stereoscope. Note that the two halves of the screen were seen binocularly but binocular depth cues (binocular disparity and vergence) could not be used to perceive the 3D scene. These cues simply represented a frontoparallel plane, but their effect on the immersive experience with the AR device seemed to be small [48].

Fig 2 — These regions are horizontal translations of one another for Δ_S in I_G and I_C.
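The selection of the two image segments can be sketched as follows. This is a hedged illustration only: the helper names `segment_starts` and `crop` are hypothetical, and the sketch assumes that the pair of segments is centred in the processed image and that the right-eye segment is shifted rightwards by Δ_S pixels. The text states only that the regions are horizontal translations of one another.

```python
def segment_starts(image_width, segment_width, delta_s):
    """Return the first column of the left-eye and right-eye segments.

    Assumptions (not stated in the text): the two segments are centred as a
    pair within the image, and the right-eye segment is the one shifted
    rightwards by delta_s pixels.
    """
    slack = image_width - segment_width
    if delta_s < 0 or delta_s > slack:
        raise ValueError("delta_s must lie between 0 and image_width - segment_width")
    left = (slack - delta_s) // 2
    return left, left + delta_s


def crop(image, start, width):
    """Cut a vertical strip of the given width out of image (a list of rows)."""
    return [row[start:start + width] for row in image]


if __name__ == "__main__":
    # Image and segment widths from the text; delta_s = 100 is a made-up value.
    left, right = segment_starts(image_width=2560, segment_width=1280, delta_s=100)
    print(left, right)
```

Both crops come from the same monocular image, so they contain no binocular disparity; changing `delta_s` only changes where each eye finds the image on its half of the screen.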

A small wide-angle lens was attached to the camera to widen the camera’s field of view. The visual angle of each image segment that was displayed on half of the screen was 53 × 54° from the camera when this lens was attached. The visual angle of each screen half was 58 × 59° from the eye of the participant. The refresh-rate of the screen was 10 Hz because of the time required to generate the grayscale-image and the contour-drawing. These processes also introduced a delay of 200–300 msec into the time required to refresh the screen. The refresh-rate (10 Hz) and the delay (200–300 msec) should be acceptable for an immersive experience when the AR device was used, but they could degrade a participant’s interaction in the AR environment (see [49–51] for reviews; see the Appendix). Note that the contour-drawing was always generated regardless of whether the contour-drawing or the grayscale-image was shown on the screen. This made the refresh-rate and the delay constant with both types of images.

A shutter panel was also attached to the camera. This panel fully occluded the camera’s view when it was closed. Opening the shutter panel triggered a click of the middle button of a mouse. This click was used to mark the onset of a trial in the experiments.

Procedure

The experiments were conducted in a well-lit room. The participant’s tasks were different from one another, but all of the tasks required interacting with objects on a desk in front of the participant. During these tasks, the participant sat on a chair in front of this desk and viewed the objects on it through our AR device. Note that the participant could only see objects that were relevant to the task at hand during each trial. All other objects were hidden from view.

The two conditions with different image filters were blocked within each experiment and the participant put on the AR device with one of the image filters before each block. The block started with an adaptation phase during which the participant was asked to look at their hands and to examine the scene through the device for 1 minute. The participant got used to viewing through the filter in this adaptation phase. All of the objects relevant to the tasks used in the experiments were hidden from the participants during the adaptation phase.

Before each trial, the shutter panel of the AR device was closed to occlude the participant’s view (Fig 3). The trial began by opening the panel. The participant was asked to finish a given task as soon as possible and to press a large green button on a wall in front of the participant when s/he was finished. The response time between opening the shutter panel and pressing the green button was recorded.

Participants were 36 undergraduate students (aged 18 or over) in the Department of Psychology at the National Research University Higher School of Economics. All had normal or corrected-to-normal vision. There were 12 participants in Experiment 1, 12 participants in Experiment 2, and 12 participants in Experiment 3 (see https://osf.io/t5jgb/ for details). All were naïve with respect to the purpose of the study. Written informed consent was obtained from all of the participants. They were compensated with 100 Rubles for their participation.

The experiments described were conducted in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and approved by the institutional review board (the HSE Committee on Interuniversity Surveys and Ethical Assessment of Empirical Research).

Experiment 1: Shape matching

In Experiment 1, we measured performance in a shape matching task with the image filters.

Procedures

A participant was given 12 prism-shaped objects that were randomly oriented on a tray and a box with 12 holes whose shapes corresponded to the 12 individual objects (Fig 4). The shapes of all of the holes were different from one another. They were the same as the shapes of the cross-sections of the individual objects, and the objects could go through only their corresponding holes. The participant had to insert all of the objects into the box by finding their unique holes. This task required matching the shapes of the objects with the shapes of the holes. Note that the participant could not see any of these objects before the first trial. The nature of this task was explained to the participant by using an analogous toy before the experiment. Put simply, this task required recognizing the objects’ shapes and the shapes of their holes and then coordinating their relative positions and orientations.

Fig 4 — The shapes of the holes corresponded to the 12 individual objects.

The experiment had 2 blocks, each consisting of 3 trials. There were 2 groups of 6 participants. The first group ran the block with the contour-drawing filter first. This was followed by the block with the grayscale-image. The second group ran the blocks in the opposite order. The participants were asked to rest for 5 minutes between the blocks, during which they did not wear the AR device.

Results

Fig 5 shows the averaged results observed in Experiment 1. The ordinate shows the response time. The abscissa shows the trials. The colors of the plots (blue and orange) represent the image filters (contour-drawing and grayscale-image) and the styles of the plots represent the two groups of participants. The results were analyzed by using a three-way mixed-design ANOVA with repeated measures on two factors [52]: groups of participants, image filters, and trial numbers in each block (1, 2, and 3). The effect of the trial numbers (F_2,50 = 9.0, p = 0.00046 × 7, where 7 is multiplied for a Bonferroni correction, see [53]) and the interaction between the filters and groups (F_1,50 = 27, p = 3.5 × 10⁻⁶ × 7) were significant. The other effects were not significant: the filters (F_1,50 = 0.31, p = 0.58 × 7), the groups (F_1,10 = 0.016, p = 0.97 × 7), the trial numbers × filters (F_2,50 = 0.22, p = 0.80 × 7), the trial numbers × groups (F_2,50 = 1.4, p = 0.26 × 7), and the trial numbers × filters × groups (F_2,50 = 5.4, p = 0.0075 × 7). (Note that a p-value was set to 1 if it became larger than 1 after being multiplied by the Bonferroni factor.)
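The Bonferroni adjustment used above can be written out as a short sketch. The function names are hypothetical. The reading that the factor 7 equals the number of effects tested in a three-way ANOVA (3 main effects, 3 two-way interactions, and 1 three-way interaction) is an assumption that matches the numbers reported in the text; the text itself only states that 7 is the multiplier.

```python
def n_anova_effects(n_factors):
    """Number of main effects and interactions in a full factorial ANOVA."""
    return 2 ** n_factors - 1


def bonferroni(p, n_tests):
    """Multiply p by the number of tests and cap the result at 1."""
    return min(1.0, p * n_tests)


if __name__ == "__main__":
    m = n_anova_effects(3)  # assumed to be the source of the factor 7
    # Uncorrected p-values reported for Experiment 1.
    reported = {
        "trial numbers": 0.00046,
        "filters": 0.58,
        "trial numbers x filters x groups": 0.0075,
    }
    for name, p in reported.items():
        adjusted = bonferroni(p, m)
        print(name, adjusted, "significant" if adjusted < 0.05 else "not significant")
```

With this adjustment the three-way interaction (0.0075 × 7 = 0.0525) falls just above the 0.05 threshold, which is why it is listed among the non-significant effects even though its uncorrected p-value is small.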

An a posteriori test (Tukey) was performed to test the interaction between the filters and groups. The response time was longer in the first block than in the second block: p = 0.00091 for the group that ran the block with the contour-drawing filter first and p = 0.0095 for the group that ran the block with the grayscale-image first. The effect of the filter was significant neither in the first block (p = 1.0) nor in the second block (p = 1.0).

These results show that a human participant could conduct the shape matching task reliably with both a contour-drawing and a grayscale-image that represented a scene "out there". We did not observe any difference in performance between the two types of representation.

Experiment 2: Object recognition

In Experiment 2, we used our image filters to measure performance in an object recognition task.

Procedure

The participants sat in front of an open box and a collection of toys that were randomly oriented on a tray (Figs 6 and 7). These toys represented a variety of animals. The experimenter said the names of 3 target animals out loud, and the participant was asked to repeat these names while the shutter panel was kept closed to occlude the participant’s view. When the panel was opened, the participant found the toys that represented 3 target animals, which had been chosen from the collection, and put them into the box. Put simply, this task only required that a participant could recognize an object s/he had not seen before.

Fig 6 — (A) 4 collections of animal toys used in Experiment 2: pulp-paper (left-top), stuffed (right-top), plastic-cartoon-like (left bottom), and plastic-realistic (right-bottom) animals. (B) The collection of toy animals on the tray, and the open box as they were arranged on the desk before each trial.

This experiment had 2 blocks, each containing 4 trials. Four collections, consisting of a variable number of animals, were used (Figs 6 and 7). The order of the trials was determined by using the Latin-square method for each image filter. Ten animal toys were used in 3 of the 4 collections and 7 were used in the remaining collection. The target animals were chosen by using the following criteria: (i) no animal was used more than once throughout all blocks for each participant and (ii) individual animals in each collection were used at roughly equal frequencies throughout the experiment (see https://osf.io/t5jgb/ for a list of the target animals used in the experiment).
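A Latin-square ordering of the four toy collections can be sketched as below. The text does not say which Latin square was used or how its rows were assigned to participants, so the cyclic construction and the function name `cyclic_latin_square` are assumptions made for illustration.

```python
def cyclic_latin_square(items):
    """Build an n x n Latin square by cyclically shifting the item list.

    Each item appears exactly once in every row (one participant's trial
    order) and exactly once in every column (one trial position).
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


if __name__ == "__main__":
    collections = ["pulp-paper", "stuffed", "plastic-cartoon-like", "plastic-realistic"]
    for order in cyclic_latin_square(collections):
        print(order)
```

Under this scheme each row would give the trial order for one participant in one block, so that every collection is seen equally often in every trial position across a set of four orders.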

There were 2 groups of 6 participants. The first group ran the block with the contour-drawing filter first. This was followed by the block with the grayscale-images. The second group ran the blocks in the opposite order. The participants rested for 5 minutes between blocks. They did not wear our AR device during these rest periods.

Results

All participants recognized all of the objects. They made no errors in either image filter condition, i.e., contour-drawing or grayscale-image.

Fig 8 shows the averaged results observed in Experiment 2. The two panels of bar-graphs show the results obtained in both image filter conditions. The ordinate shows the response time. The abscissa shows the collections of animals. The brightness of the bars represents the two groups. The results were analyzed by using a three-way mixed-design ANOVA with repeated measures on two factors: groups of participants, image filters, and collections of toys. The two main factors were significant: the filters (F_1,70 = 75, p = 1.0 × 10⁻¹² × 7, where 7 is multiplied for a Bonferroni correction) and the toy collections (F_3,70 = 7.5, p = 0.00020 × 7). The results of the other effects were not significant: the groups (F_1,10 = 0.028, p = 0.87 × 7), the groups × toy collections (F_3,70 = 2.7, p = 0.051 × 7), the groups × filters (F_1,70 = 1.6, p = 0.21 × 7), the filters × toy collections (F_3,70 = 0.062, p = 0.98 × 7), and the groups × filters × toy collections (F_3,70 = 3.2, p = 0.030 × 7).

An a posteriori test (Tukey) was performed to test the effect of the toy collections. The response time was shorter with the plastic-cartoon-like animals than with the stuffed animals (p = 0.00075) and with the pulp-paper animals (p = 0.018). The response time was shorter with the plastic-realistic animals than with the stuffed animals (p = 0.0047). The other pair-wise comparisons were not significant: plastic-cartoon-like vs. plastic-realistic (p = 0.94), plastic-realistic vs. pulp-paper (p = 0.074), and pulp-paper vs. stuffed (p = 0.74).

Experiment 3: Visuomotor coordination

In Experiment 3, we measured performance in visuomotor interactions with objects in two tasks.

Procedure

The participants performed two visuomotor interactions: one with tongs (Fig 9) and the other with bricks (Fig 10). In the tongs task, the participant picked up and moved 7 objects from a tray to an open box. The tongs were used to minimize haptic information being used to perform the task. The participants had to control the tongs using only dynamical visual information provided by the tongs and by the objects on the tray. In the brick task, the participant was given an open box containing 12 rectangular bricks and used his/her hands to build a stack with 4 layers of 3 bricks on the top of a stand.

The experiment had 4 blocks: 2 image filters (contour-drawing and grayscale-image) × 2 tasks (tongs and brick). Each block consisted of 3 trials. Three collections, each consisting of 7 objects, were used in the tongs task, one collection per trial. In the brick task, the 3 trials within each block were repetitions of the same task. The order of blocks and the order of trials within each block of the tongs task were randomized by using the Latin-square method with the following restriction: blocks with the contour-drawing and with the grayscale-image filters were conducted alternately (see https://osf.io/t5jgb/). The participants rested for 5 minutes between blocks. They did not wear our AR device during these rest periods.

Results

Fig 11 shows the averaged results for the tongs and brick tasks observed in Experiment 3. The ordinate shows the response time. The abscissa in Fig 11A shows the trials within each block. The abscissa in Fig 11B shows the collections of objects. The colors of the plots (blue and orange) represent the image filters (contour-drawing and grayscale image).

The results of the tongs task were analyzed by using a two-way within-subject-design ANOVA with repeated measures: image filters and collections of objects. The main effect of the object collections was significant (F_2,55 = 49, p = 6.4 × 10⁻¹³ × 3, where 3 is multiplied for a Bonferroni correction). The effect of the filters and the interaction between the filters and object collections were not significant: the filters (F_1,55 = 0.52, p = 0.48 × 3) and the filters × object collections (F_2,55 = 0.18, p = 0.83 × 3). An a posteriori test (Tukey) was performed to test the effect of the object collections. The response time was shorter with the plastic fruits-and-vegetables than it was with the plastic animals (p = 1.2 × 10⁻⁷) and with the wooden geometrical objects (p = 4.1 × 10⁻¹²). The response time was shorter with the plastic animals than with the wooden geometrical objects (p = 0.0042).

The results of the brick task were analyzed by using a two-way within-subject-design ANOVA with repeated measures: image filters and trials. No effect was significant after the Bonferroni correction: the filters (F_1,55 = 0.0046, p = 0.95 × 3), the trials (F_2,55 = 3.2, p = 0.048 × 3), and their interaction (F_2,55 = 0.29, p = 0.75 × 3).

General discussion

We conducted three behavioral experiments that tested human performance when people interacted dynamically with objects in a real 3D scene, solely on the basis of a contour-drawing, or a grayscale-image that represented the scene. Contour-drawings were generated by applying an image filter to grayscale-images. The filter emphasized edges and eliminated luminance-gradients in the grayscale-image. The gray-scale images were used as a control. They provided a baseline for a participant’s performance in our kind of tasks while wearing our AR device. The effect of the image filter was observed in the response time in an object recognition task (Experiment 2). Responses were slower with contour-drawings than with the grayscale-images, but note that participants could also recognize objects reliably from both contour-drawings and grayscale images. This difference in response time with these two types of images was not observed in our shape matching task (Experiment 1) or in our visuomotor coordination tasks (Experiment 3).

The tasks in Experiments 1 and 3 were designed so that the tasks required a dynamical visuomotor-coordination. The participant had to interact with multiple objects and to control their positions and orientations by using her/his hands. This kind of dynamical visuomotor-coordination is required in many run-of-the-mill tasks in which we use our hands in our everyday life. Note, however, that the tasks in this study did not require precise control of timing or quick reactions to unexpected events that could happen in a real-life scene. This kind of highly dynamical task could not be tested in this study because of technical limitations of our AR device. Note that such rapid processing of visual information is required in sports, and that it has been shown that the human visual system can process static contour-drawings very quickly [3, 6, 9]. So, it is possible that contour information is essential for visual processing in highly-dynamical tasks.

The results of Experiment 2 showed that a participant can recognize an object quite well when given a contour-drawing but recognition was even better when given its grayscale-image. This difference in performance can be attributed: (i) to the image filter used to generate the contour-drawing, and (ii) to the luminance-polarity and the luminance-gradient that are present in the grayscale-image but not in the contour-drawing. The image filter used a very simple algorithm that only emphasized the luminance edges on the basis of local information in the photographic image. This filter missed information in 2D images of a scene, such as edges between two isoluminant regions and between two regions that had different textures [54]. The filter could also miss contours that represent important features of the 3D information in a scene, for example, ridges on the surface of objects [17, 18]. This filter could also detect edges that are usually not drawn as contours in drawings made by an artist, for example, the edge of a shadow [55], and details of texture. These missed and redundant edges can degrade performance in an object recognition task [32, 56–59]. Note that the algorithm of the filter can be analogous to the visual system’s process of edge detection in the primary visual cortex [46]. The human visual system must organize the edges in a retinal image in such a way that makes it possible for an observer to perceive the 3D information in the scene veridically [32, 58].

Also note that the luminance-polarity and the luminance-gradient present in the grayscale-image, but not in the contour-drawing, could also explain the difference in performance observed in Experiment 2. These two types of luminance information helped the visual system organize the luminance distributions and edges in the grayscale-images [42, 60–64]. Also, the luminance-gradient called "shading" could help the visual system to perceive the shape of the object’s surface [18, 25, 27, 65–72] and to decompose the object on the basis of the surface’s shape [73]. Note that this kind of luminance information did not improve performance in Experiments 1 and 3, which suggests that the luminance-gradient and the luminance-polarity are not as important as the contours produced by luminance-edges in the retinal image are for perceiving 3D information in a real scene.

It is well-known that people can obtain 3D information from a contour drawing, but, until now, it was not clear how useful contour information actually is in a real dynamical scene. Our study shows that contour information, alone, is sufficient for ordinary people to perform a variety of run-of-the-mill tasks. We believe that our demonstration suggests that contour information, alone, may be sufficient to provide the basis for our visual system to obtain much of the 3D information needed for successful visuomotor interactions in our everyday life.

Appendix

The AR device used in this study introduced some technical limitations in what the participants could do in Experiments 1, 2, and 3: specifically, the refresh-rate (10 Hz), the delay (200–300 msec), and the visual angles [49–51]. Furthermore, the images shown on the screen of the device were achromatic, and binocular depth cues (binocular disparity and vergence) could not be used to perceive the 3D scenes. Also, the participants could move their heads less freely when they wore the AR device on their heads. All of these factors could degrade the participants’ performance even in the baseline condition with the grayscale-images. Two of the authors (MF, TS) performed these tasks without wearing the AR device to get an idea about how difficult these tasks were under more natural conditions. Specifically, MF and TS ran sessions in Experiments 1, 2, and 3 without wearing the AR device. Their response times in these sessions are summarized in Table 1 (see Figs 5, 8 and 11 for comparison).

Table 1. Response times (sec) of MF and TS, under a more natural viewing condition, measured in sessions run when the AR device was not worn.

The response times with the two image filters in Experiments 1, 2, and 3 are also shown for comparison (average ± standard error).

Exp. 1	Trial-1	Trial-2	Trial-3	Trial-4	Trial-5	Trial-6
MF	56.3	46.2	42.8	38.4	37.7	53.1
TS	56.0	39.8	39.1	40.1	37.5	36.0
Contour	232.4 ± 22.7	181.6 ± 30.7	121.7 ± 16.3	133.7 ± 25.6	108.7 ± 21.6	96.3 ± 15.8
Grayscale	248.9 ± 81.5	140.9 ± 29.1	128.8 ± 23.8	95.7 ± 7.5	118.8 ± 38.6	98.4 ± 19.8
Exp. 2	Plastic-cartoon-like animals	Plastic-realistic animals	Pulp-paper animals	Stuffed animals
MF	6.7	8.6	7.6	7.5
TS	7.2	5.9	11.0	7.8
Contour	42.1 ± 4.4	44.0 ± 4.8	51.5 ± 5.5	56.4 ± 4.9
Grayscale	20.9 ± 1.3	22.7 ± 2.5	32.2 ± 2.0	34.2 ± 3.3
Exp. 3: Tongs-task	Plastic animals	Wooden geometrical objects	Plastic fruits and vegetables
MF	12.5	13.6	12.6
TS	13.0	19.5	14.8
Contour	53.3 ± 2.7	61.9 ± 2.4	40.2 ± 3.1
Grayscale	53.6 ± 3.5	60.1 ± 3.3	37.9 ± 2.4
Exp. 3: Brick-task	Trial-1	Trial-2	Trial-3
MF	21.1	16.5	19.3
TS	16.1	16.3	16.4
Contour	45.2 ± 6.3	38.1 ± 3.4	37.1 ± 2.5
Grayscale	42.8 ± 2.5	40.1 ± 3.6	37.1 ± 2.2

The response times of MF and TS were similar to one another in the individual sessions and they were substantially shorter than the response times of the naïve participants, even with the grayscale-image filter, in Experiments 1, 2, and 3. This suggests that the participants’ performance with both types of image filters (contour-drawing and grayscale-image) was suppressed by the loss of color information as well as by technical factors in the AR device used in the experiments. These technical factors should be minimized to make the viewing condition of the experiments more natural. This will be addressed in a future study.

Acknowledgments

We thank Svetlana V. Salomasova for helping to run the experiments reported in this study.

Data Availability

All empirical data reported in this study are available from https://osf.io/t5jgb/.

Funding Statement

This article was prepared within the framework of the Academic Fund Program at the National Research University Higher School of Economics (HSE University) in 2019 (grant № 19-04-006, awarded to TS) and by the Russian Academic Excellence Project «5-100» (https://www.hse.ru/science/scifund/nug/nug2019). The sponsors or funders played no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Cole F, Sanik K, DeCarlo D, Finkelstein A, Funkhouser T, Rusinkiewicz S, et al. How well do line drawings depict shape? ACM Trans. Graph. 2009; 28(3): 28.
2. Hertzmann A. Why do line drawings work? A realism hypothesis. Percept. 2020; 49(4): 439–451. doi: 10.1177/0301006620908207
3. Kroll JF, Potter MC. Recognizing words, pictures, and concepts: A comparison of lexical, object, and reality decisions. J Verbal Learning Verbal Behav. 1984; 23(1): 39–66. doi: 10.1016/S0022-5371(84)90499-7
4. Pizlo Z. 3D shape: Its unique place in visual perception. Cambridge, MA: MIT Press; 2008.
5. Pizlo Z, Li Y, Sawada T, Steinman RM. Making a machine that sees like us. New York, NY: Oxford University Press; 2014.
6. Potter MC, Faulconer BA. Time to understand pictures and words. Nature. 1975; 253(5491): 437–438. doi: 10.1038/253437a0
7. Sayim B, Cavanagh P. What line drawings reveal about the visual brain. Front Hum Neurosci. 2011; 5: 118. doi: 10.3389/fnhum.2011.00118
8. Walther DB, Chai B, Caddigan E, Beck DM, Fei-Fei L. Simple line drawings suffice for functional MRI decoding of natural scene categories. Proc Natl Acad Sci. 2011; 108(23): 9661–9666. doi: 10.1073/pnas.1015666108
9. Walther DB, Shen D. Nonaccidental properties underlie human categorization of complex natural scenes. Psychol Sci. 2014; 25(4): 851–860. doi: 10.1177/0956797613512662
10. Bhardwaj S, Mittal A. A survey on various edge detector techniques. Procedia Technology. 2012; 4: 220–226.
11. Spontón H, Cardelino J. A review of classic edge detectors. IPOL. 2015; 5: 90–123. doi: 10.5201/ipol.2015.35
12. DeCarlo D. Depicting 3d shape using lines. Proc SPIE. 2012; 8291 (Hum Vis Electron Imag XVII): 361–376.
13. Cole F, Golovinskiy A, Limpaecher A, Barros HS, Finkelstein A, Funkhouser T, et al. Where do people draw lines? ACM Trans. Graph. 2008; 27(3): 88. doi: 10.1145/1360612.1360687
14. DeCarlo D, Finkelstein A, Rusinkiewicz S, Santella A. Suggestive Contours for Conveying Shape. ACM Trans. Graph. 2003; 22(3): 848–855.
15. Koenderink JJ, Van Doorn AJ. The shape of smooth objects and the way contours end. Percept. 1982; 11: 129–137. doi: 10.1068/p110129
16. Koenderink JJ. What does the occluding contour tell us about solid shape. Percept. 1984; 13: 321–330. doi: 10.1068/p130321
17. Judd T, Durand F, Adelson E. Apparent Ridges for Line Drawing. ACM Trans. Graph. 2007; 26(3): 19. http://doi.acm.org/10.1145/1276377.1276401
18. Todd JT. The visual perception of 3D shape. Trends Cog Sci. 2004; 8(3): 115–121. doi: 10.1016/j.tics.2004.01.006
19. Attneave F. Some informational aspects of visual perception. Psychol Rev. 1954; 61: 183–193. doi: 10.1037/h0054663
20. Biederman I. Recognition-by-components: A theory of human image understanding. Psychol Rev. 1987; 94: 115–147. doi: 10.1037/0033-295X.94.2.115
21. Biederman I. Recognizing depth-rotated objects: A review of recent research and theory. Spat Vis. 2000;13(2–3):241–53. doi: 10.1163/156856800741063
22. Hochberg J, Brooks V. Pictorial recognition as an unlearned ability: A study of one child’s performance. Am J Psychol. 1962; 75(4): 624–628.
23. Jahoda G, Deregowski JB, Ampene E, Williams N. Pictorial recognition as an unlearned ability: A replication with children from pictorially deprived environments. In: Butterworth G, editor. The Child’s Representation of the World. New York, NY: Plenum Press, 1977. p. 203–217.
24. Kennedy JM, Ross AS. Outline picture perception by the Songe of Papua. Percept. 1975; 4: 391–406.
25. Koenderink JJ, van Doorn AJ, Christou C, Lappin JS. Shape constancy in pictorial relief. Percept. 1996; 25: 155–164. doi: 10.1068/p250155
26. Li Y, Pizlo Z. Depth cues versus the simplicity principle in 3D shape perception. Top Cogn Sci. 2011 October;3(4):667–85. doi: 10.1111/j.1756-8765.2011.01155.x
27. Tsuruhara A, Sawada T, Kanazawa S, Yamaguchi MK, Corrow S, Yonas A. The development of the ability of infants to utilize static cues to create and access representations of object shape. J Vis. 2010; 10(12): 2. doi: 10.1167/10.12.2
28. Pizlo Z. Perception viewed as an inverse problem. Vis Res. 2001; 41: 3145–3161. doi: 10.1016/s0042-6989(01)00173-0
29. Poggio T, Torre V, Koch C. Computational vision and regularization theory. Nature. 1985; 317: 314–319. doi: 10.1038/317314a0
30. Sawada T, Li Y, Pizlo Z. Shape Perception. In: Busemeyer J, Townsend J, Wang ZJ, Eidels A, editors. Oxford Handbook of Computational and Mathematical Psychology. New York, NY: Oxford University Press, 2015. p. 255–276.
31. Leeuwenberg E, van der Helm PA. Structural Information Theory: The Simplicity of Visual Form. New York, NY: Cambridge University Press; 2013.
32. Michaux V, Jayadevan V, Delp E, Pizlo Z. Figure-ground organization based on 3D symmetry. J. Electron. Imaging. 2016; 25(6): 061606. doi: 10.1117/1.JEI.25.6.061606
33. Perkins DN. Visual discrimination between rectangular and nonrectangular parallelopipeds. Percept Psychophys. 1972; 12(5): 396–400.
34. Jayadevan V, Sawada T, Delp E, Pizlo Z. Perception of 3D symmetrical and nearly symmetrical shapes. Symmetry. 2018; 10(8): 344. doi: 10.3390/sym10080344
35. Li Y. Perception of Parallelepipeds: Perkins’s Law. Percept. 2009; 38: 1767–1781. doi: 10.1068/p6397
36. Li Y, Pizlo Z, Steinman RM. A computational model that recovers the 3D shape of an object from a single 2D retinal representation. Vis. Res. 2009; 49: 979–91. doi: 10.1016/j.visres.2008.05.013
37. Li Y, Sawada T, Shi Y, Kwon TK, Pizlo Z. A Bayesian model of binocular perception of 3D mirror symmetrical polyhedra. J. Vis. 2011; 11(4): 11. doi: 10.1167/11.4.11
38. Perkins DN. How good a bet is good form? Percept. 1976; 5(4): 393–406.
39. de Gelder B, Kätsyri J, de Borst AW. Virtual reality and the new psychophysics. Br J Psychol. 2018 August;109(3):421–426. doi: 10.1111/bjop.12308
40. Scarfe P, Glennerster A. Using high-fidelity virtual reality to study perception in freely moving observers. J Vis. 2015;15(9):3. doi: 10.1167/15.9.3
41. Triesch J, Ballard DH, Hayhoe MM, Sullivan BT. What you see is what you need. J Vis. 2003;3(1):86–94. doi: 10.1167/3.1.9
42. Anstis S. Visual adaptation to a negative, brightness-reversed world: Some preliminary observations. In: Carpenter GA, Grossberg S, editors. Neural Networks for Vision and Image Processing. Cambridge, MA: MIT Press, 1992. p. 1–14.
43. Grush R, Jaswal L, Knoepfler J, Brovold A. Visual adaptation to a remapped spectrum. In: Metzinger T, Windt JM, editors. Open MIND: 16(T). Frankfurt am Main, Germany: MIND Group, 2015. doi: 10.15502/9783958570283
44. Cohen MA, Botch TL, Robertson CE. The limits of color awareness during active, real-world vision. Proc Natl Acad Sci USA. 2020; 117(24): 13821–13827. doi: 10.1073/pnas.1922294117
45. Sobel I. History and Definition of the so-called "Sobel Operator", more appropriately named the Sobel-Feldman Operator. ResearchGate. 2014 February 2 [Cited 2020 June 20]. https://www.researchgate.net/publication/239398674_An_Isotropic_3x3_Image_Gradient_Operator
46. Wu Q, McGinnity M, Maguire L, Belatreche A, Glackin B. Edge detection based on spiking neural network model. ICIC 2007, LNAI 4682. 2007 Aug 21; 26–34.
47. Bradski G. The OpenCV Library. Dr Dobb’s J. Software Tools. 2000.
48. Wijntjes M, Füzy A, Verheij MES, Deetman T, Pont SC. The synoptic art experience. Art Percept. 2016; 4(1–2): 73–105. doi: 10.1163/22134913-00002046
49. Chen JY, Thropp JE. Review of low frame rate effects on human performance. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans. 2007 October 29;37(6):1063–76.
50. Cummings JJ, Bailenson JN. How immersive is enough? A meta-analysis of the effect of immersive technology on user presence. Media Psychology. 2016 April 2;19(2):272–309.
51. Thropp JE, Chen JY. The effects of slow frame rates on human performance. Aberdeen Proving Ground, MD: Army Research Laboratory; 2006 June.
52. Neter J, Kutner MH, Nachtsheim CJ, Wasserman W. Applied Linear Statistical Models. 4th ed. Boston, MA: McGraw-Hill; 1996.
53. Cramer AOJ, van Ravenzwaaij D, Matzke D, Steingroever H, Wetzels R, Grasman RPPP, et al. Hidden multiplicity in exploratory multiway ANOVA: Prevalence and remedies. Psychon Bull Rev. 2016; 23: 640–647. doi: 10.3758/s13423-015-0913-5
54. Rosenholtz R. Texture perception. In: Wagemans J, editor. Oxford Handbook of Perceptual Organization. New York, NY: Oxford University Press, 2015. p. 167–186. doi: 10.1167/15.3.9
55. Metzger W. Laws of Seeing (Spillman L, Lehar S, Stromeyer M, Wertheimer M, translators). Cambridge, MA: MIT Press; 2006.
56. Harrison SJ, Feldman J. The influence of shape and skeletal axis structure on texture perception. J Vis. 2009; 9(6): 13. doi: 10.1167/9.6.13
57. Kwon TK, Agrawal K, Li Y, Pizlo Z. Spatially-global integration of closed, fragmented contours by finding the shortest-path in a log-polar representation. Vis Res. 2015; 125: 143–163.
58. Li Y, Sawada T, Latecki LJ, Steinman RM, Pizlo Z. A tutorial explaining a machine vision model that emulates human performance when it recovers natural 3D scenes from 2D images. J Math Psychol. 2012; 56: 217–231.
59. Sassi M, Vancleef K, Machilsen B, Panis S, Wagemans J. Identification of everyday objects on the basis of Gaborized outline versions. i-Percept. 2010; 1(3): 121–142. doi: 10.1068/i0384
60. Elder JH, Trithart S, Pintilie G, MacLean D. Rapid processing of cast and attached shadows. Percept. 2004; 33(11): 1319–1338. doi: 10.1068/p5323
61. Ghose T, Palmer SE. Extremal edges versus other principles of figure-ground organization. J Vis. 2010; 10(8): 3. doi: 10.1167/10.8.3
62. Kersten D, Knill DC, Mamassian P, Bülthoff I. Illusory motion from shadows. Nature. 1996; 379: 31. doi: 10.1038/379031a0
63. Lauenstein L. Über räumliche wirkungen von licht und schatten. Psychol Forsch. 1938; 22: 267–319.
64. Ramachandran VS. Perceiving shape from shading. Sci Am. 1977; 256(8): 76–83.
65. Kunsberg B, Holtmann-Rice D, Alexander E, Cholewiak S, Fleming R, Zucker SW. Colour, contours, shading and shape: flow interactions reveal anchor neighbourhoods. Interface Focus. 2018; 8: 20180019. doi: 10.1098/rsfs.2018.0019
66. Langer MS, Bülthoff HH. Depth discrimination from shading under diffuse lighting. Percept. 2000; 29: 649–660. doi: 10.1068/p3060
67. Liu B, Todd JT. Perceptual biases in the interpretation of 3D shape from shading. Vis Res. 2004; 44: 2135–2145. doi: 10.1016/j.visres.2004.03.024
68. Mingolla E, Todd JT. Perception of solid shape from shading. Biol Cybern. 1986; 53: 137–151. doi: 10.1007/BF00342882
69. Nasanen R. Spatial frequency bandwidth used in the recognition of facial images. Vis Res. 1999; 39: 3824–3833. doi: 10.1016/s0042-6989(99)00096-6
70. Norman JF, Todd JT, Phillips F. The perception of surface orientation from multiple sources of optical information. Percept Psychophys. 1995; 57: 629–636.
71. Pentland A. Shape information from shading: A theory about human perception. Spat Vis. 1989; 4: 165–182. doi: 10.1163/156856889x00103
72. Todd JT, Reichel FD. Ordinal Structure in the Visual Perception and Cognition of Smoothly Curved Surfaces. Psychol Rev. 1989; 96: 643–657. doi: 10.1037/0033-295x.96.4.643
73. Nefs HT, Koenderink JJ, Kappers AML. The influence of illumination direction on the pictorial reliefs of Lambertian surfaces. Percept. 2005; 34: 275–287. doi: 10.1068/p5179

PLoS One. doi: 10.1371/journal.pone.0242581.r001

Decision Letter 0

Markus Lappe

7 Aug 2020

PONE-D-20-19600

Seeing our 3D world while only viewing contour-drawings

PLOS ONE

Dear Dr. Tadamasa Sawada,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Sep 21 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Markus Lappe

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that Figure(s) [10] in your submission contain copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

1. You may seek permission from the original copyright holder of Figure(s) [10] to publish the content specifically under the CC BY 4.0 license.

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

2. If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

3. We note that Figure [3] includes an image of a [patient / participant / in the study].

As per the PLOS ONE policy (http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research) on papers that include identifying, or potentially identifying, information, the individual(s) or parent(s)/guardian(s) must be informed of the terms of the PLOS open-access (CC-BY) license and provide specific permission for publication of these details under the terms of this license. Please download the Consent Form for Publication in a PLOS Journal (http://journals.plos.org/plosone/s/file?id=8ce6/plos-consent-form-english.pdf). The signed consent form should not be submitted with the manuscript, but should be securely filed in the individual's case notes. Please amend the methods section and ethics statement of the manuscript to explicitly state that the patient/participant has provided consent for publication: “The individual in this manuscript has given written informed consent (as outlined in PLOS consent form) to publish these case details”.

If you are unable to obtain consent from the subject of the photograph, you will need to remove the figure and any other textual identifying information or case descriptions for this individual.

4. Thank you for including your ethics statement: "The experiments were conducted in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and approved by the institutional review board (IRB)"

Please amend your current ethics statement to include the full name of the ethics committee/institutional review board(s) that approved your specific study.

Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this study participants performed simple visuomotor tasks in which they wore an AR device that showed them the visual field, but instead of capturing a full color photograph of the visual field the display showed a grayscale image or the edges in the image. In each task, the participants’ behavior in the grayscale image and the edge image was not substantially different. The authors conclude that their study suggests that contour information is sufficient to the visual system to determine all 3D information that would be required for performing everyday tasks.

Overall, the experiments seem to be performed well, and the analyses were also performed correctly. It would be nice if the actual demographic information about participants was included in the text, and the readers were not just referred to the OSF repository.

The use of dynamic scenes seems to be an important point of the article, but I would guess that if the participant was shown only a static image, they would still be able to perform the task in experiment 2 and the tongs task of experiment 3, and that it would take equally as long to do the task in the line drawing condition as the grayscale image condition.

The claims in this paper rely on a negative finding. This makes it more difficult to justify the claims of the paper, especially with 12 participants per experiment. With a larger sample size the lack of a difference would be more persuasive. Also, to make the claim that this study shows that contours are sufficient to extract all necessary 3D information for everyday tasks, the authors should spend some more time justifying that their three tasks generalize to other everyday tasks.

From previous work (M. Potter or I. Biederman, and from people communicating with drawings throughout history) we know that line drawings of static images are sufficient for image understanding. We also know that in an AR/VR environment people can successfully interact with their environment even if it is not photorealistic (Triesch, Ballard, Hayhoe, and Sullivan, 2003, many others too) – however I will note that this previous AR/VR work almost always gives the participant 3D information from binocular disparity. From our ability to understand cartoon videos, and the ability to even perceive intentionality and emotional content from simple line drawing videos (Heider and Simmel) we know that people can understand dynamic line drawings. So I do not see anything unexpected about these results. I also am surprised that a study like this has not already been performed. So while I don’t see this as very novel or surprising, if it has not been done by anyone else, then something like this should be published.

Reviewer #2: The paper is interesting, and the experiments are clearly described. The AR part is clear, even the stereoscopic part. Nevertheless, the authors must address the following concerns.

- rows 67-79 The authors implemented a simple algorithm for edge detection (not edges, but a combination of the image gradient). It would be very interesting to see how the results of their work might change as a function of the contour detection algorithm, since different algorithms can produce different kinds of contours (e.g. real edges, i.e. edges 1 pixel wide, black on a white background, or more human-like edges, i.e. edges more similar to the ones humans draw). The authors should discuss this point and try to extend at least one of their experiments by using a different edge detection algorithm.

- rows 91-96 I wonder whether the low refresh rate and high lag affected the results of the experiments, mainly because the experiments involve dynamical scenes in which subjects interact with objects (i.e. the poor performance of the device may have flattened the difference between the two conditions). This concern arises from my experience with AR/VR devices that have poor performance. The authors should discuss (and take into consideration) this point.

- Experiment 1: Shape Matching It would be interesting to compare the subjects' performance in this AR experiment with a baseline in real conditions (i.e. without wearing the AR device). This would allow us both to have an idea of the reliability of the response time (e.g. whether it is so high that the differences between the conditions are saturated) and to have an idea of the effect of the depth cue. The authors should discuss this point and try to extend at least one of their experiments by comparing it with the results of one performed without the AR device.

- rows 211-215: These results are affected by the kind of algorithms (conditions) the authors implemented, since the algorithm outputs depend on the object textures. I think this cannot be completely related to the influence of contours on the 3D interpretation of a scene.

- General discussion I am not totally convinced by the authors' explanation, since their contours depend a lot on object textures, i.e. on the chosen algorithm, more than on the effectiveness of the contours themselves (Experiment 2). Moreover, there is no baseline without the AR device. I think that the study could be made more solid by following my previous suggestions.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jan 22;16(1):e0242581. doi: 10.1371/journal.pone.0242581.r002

Author response to Decision Letter 0

17 Sep 2020

Dear Dr. Markus Lappe,

We appreciate your handling of our manuscript and thank the two reviewers for reviewing it. All comments from the reviewers are addressed in this revision of the manuscript. We believe the manuscript is substantially improved thanks to the comments from the reviewers.

The suggestions are addressed point by point below with line numbers in the revised manuscript. All the revisions are in red with balloon comments in the revised manuscript “Revised Manuscript with Track Changes”.

E1. Style. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

We formatted our manuscript according to PLOS ONE’s style requirements and confirmed that files are named properly.

E2. Figure 10. We note that Figure(s) [10] in your submission contain copyrighted images. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

We blacked out all copyrighted parts in Figure 10 (L. 262).

E3. Figure 3. We note that Figure [3] includes an image of a [patient / participant / in the study].

We removed the photo with a person from Figure 3 (L. 129).

E4. Ethics statement. Thank you for including your ethics statement: "The experiments were conducted in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) and approved by the institutional review board (IRB)"

Please amend your current ethics statement to include the full name of the ethics committee/institutional review board(s) that approved your specific study.

We revised the ethics statement to include the full name of the board (the HSE Committee on Interuniversity Surveys and Ethical Assessment of Empirical Research). The revised statement was copied to the “Ethics Statement” field of the submission form (L. 147).

E5. Ethics statement. Please provide additional details regarding participant consent. In the Methods section, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

We revised the ethics statement to specify that written informed consent was obtained from all the participants (L. 146). All participants were undergraduate students (aged 18 or over) (L. 143).

E6. Figures 1, 2, 4, 6, 7, 9 and 10. Copyright

We confirm that we produced all the figures in this manuscript specifically for this manuscript. We also blacked out all potentially copyrighted parts in Figure 1 (L. 81).

Reviewers' comments:

Reviewer #1:

Thank you very much for reviewing our manuscript and for the constructive feedback. We were especially happy to receive your positive evaluation of our study. We also thank you for sharing reference information about prior relevant studies. We revised our manuscript, taking all of your suggestions into account point by point (see below).

In this study participants performed simple visuomotor tasks in which they wore an AR device that showed them the visual field, but instead of capturing a full color photograph of the visual field the display showed a grayscale image or the edges in the image. In each task, the participants’ behavior in the grayscale image and the edge image was not substantially different. The authors conclude that their study suggests that contour information is sufficient to the visual system to determine all 3D information that would be required for performing everyday tasks.

Overall, the experiments seem to be performed well, and the analyses were also performed correctly.

R1a. It would be nice if the actual demographic information about participants was included in the text, and the readers were not just referred to the OSF repository.

Text explaining the demographic information of our participants was added (L. 140). The participants were 36 undergraduate students in the Department of Psychology at the National Research University Higher School of Economics. All had normal or corrected-to-normal vision, and all were naïve with respect to the purpose of the study. No other personal information was collected.

R1b. The use of dynamic scenes seems to be an important point of the article, but I would guess that if the participant was shown only a static image, they would still be able to perform the task in experiment 2 and the tongs task of experiment 3, and that it would take equally as long to do the task in the line drawing condition as the grayscale image condition.

Thank you for raising this issue. The tongs task in Experiment 3 was designed to make haptic information useless when the task was performed. The participants used the tongs, so only dynamical visual information provided by the tongs and the objects on the tray was available. The Procedure section in Experiment 3 was revised to clarify this point (L. 245).

We think that the object recognition task in Experiment 2 could be performed to some extent on the basis of static views of objects. In real dynamical scenes, however, people can interact with objects and can change their view of the objects if they cannot recognize them from their original view. A paragraph was added in the Introduction discussing object recognition in real dynamical scenes (L. 54, see also Comment R1e).

R1c. The claims in this paper rely on a negative finding. This makes it more difficult to justify the claims of the paper, especially with 12 participants per experiment. With a larger sample size the lack of a difference would be more persuasive.

We agree with the reviewer. We revised the Abstract (L.21) and Discussion (L. 290, 336) to address this concern.

R1d. Also, to make the claim that this study shows that contours are sufficient to extract all necessary 3D information for every day tasks, the authors should spend some more time justifying that their three tasks generalize to other everyday tasks.

We added a paragraph discussing the generalization of the tasks as well as its limitations in the Discussion (L. 299).

R1e. From previous work (M. Potter or I. Biederman, and from people communicating with drawings throughout history) we know that line drawings of static images are sufficient for image understanding.

Prior studies (including studies by M. Potter and I. Biederman) that tested 3D perception from contour-drawings used clean contour-drawings of objects and took care to avoid degenerate views of the objects. Note that in real 3D scenes, objects will often be seen from degenerate views (see Comment R1b). Also, contour-drawings that are automatically generated from photographic images of a real scene often lack important contours and have redundant contours. We revised the text in the Introduction (L. 54) and Discussion (L. 312) that discusses the differences between contour-drawings made by artists for human observers and contour-drawings generated by computer algorithms.

Thank you for the references to these prior studies. We added your references to prior studies about the visual perception of contour-drawings, including the studies by M. Potter and I. Biederman. Note that Biederman (1987) was cited in our original submission.

R1f. We also know that in an AR/VR environment people can successfully interact with their environment even if it is not photorealistic (Triesch, Ballard, Hayhoe, and Sullivan, 2003, many others too) – however I will note that this previous AR/VR work almost always gives the participant 3D information from binocular disparity.

We added a paragraph about the nature of an immersive experience of a 3D scene with XR technology and how people interact with such scenes on the basis of the visual information provided by this technology (L. 61). We also added references to prior studies (including Triesch, Ballard, Hayhoe, & Sullivan, 2003) that discussed using XR technology to study visual perception.

It is critical to control visual stimuli systematically when the human visual system is studied. If someone wants to study how degrading the photorealism of stimuli affects their perception, the way the photorealism is degraded must be systematically controlled. We did this by using two types of image filters. We also examined the effect of the degradation by comparing the participants’ performance with both types of filters. We revised the text to make this point clear (L. 21, 116, 290, 344).

R1g. From our ability to understand cartoon videos, and the ability to even perceive intentionality and emotional content from simple line drawing videos (Heider and Simmel) we know that people can understand dynamic line drawings.

This comment is related to two unsolved questions: (i) how well can artists represent 3D scenes and 3D objects by using only contours in their drawings and (ii) how well can an artist’s skill be emulated by computer algorithms. Generating good contour-drawings from a 2D image of a 3D scene, or from acquired 3D information about the scene, is a non-trivial task in computer vision. Our study examined how well people can perceive the 3D information contained in a real scene when a contour-drawing of it was automatically generated by a simple computer algorithm. An analogy of this algorithm with the visual system's process of edge detection in the primary visual cortex has been discussed in some prior studies (L. 99). We revised the text in the Introduction (L. 33) and Discussion (L. 316) to explain the difference between an artist’s drawing of a contour and a contour-drawing made by a computer algorithm.

R1h. So I do not see anything unexpected about these results. I also am surprised that a study like this has not already been performed. So while I don’t see this as very novel or surprising, if it has not been done by anyone else, then something like this should be published.

We hope that our revision of the manuscript addressed all of your concerns and made this study more interesting. We appreciate the comments and suggestions that you made.

Reviewer #2:

Thank you very much for reviewing our manuscript and for your constructive feedback. We were pleased to see your interest in our study. We revised our manuscript to take all of your suggestions into account point by point (see below).

The paper is interesting, and the experiments are clearly described. The AR part is clear, even the stereoscopic part. Nevertheless, the authors must address the following concerns.

R2a. rows 67-79 The authors implemented a simple algorithm for edge detection (not edges, but a combination of the image gradient), it would be very interesting to see how the results of their work might change as a function of the contour detection algorithms that can produce different kinds of contour (e.g. real edges, i.e. edges 1 pixel wide, black on white background or more human-like, i.e. edges more similar to the ones humans draw). The authors should discuss this point and try to extend at least one of their experiments by using a different edge detection algorithm.

Thank you very much for this suggestion. We tried some other filters for edge detection, e.g., the Canny filter, but our AR device could not process these filters as quickly as a Sobel filter. The Sobel filter, introduced in 1968, is one of the simplest algorithms that can be used to emphasize edges in an image. The simplicity of the filter allowed our AR device to process photographic images from a camera in near real-time. We added text explaining the reason we chose the Sobel filter (L. 95).
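As a minimal sketch of what such a filter computes, the Sobel operator can be written as two 3x3 convolutions whose outputs are combined into a gradient magnitude. This is not the code that ran on the AR device; the function name `sobel_magnitude` and the nested-list image format are assumptions made only for illustration.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal (GX) and vertical (GY) gradients.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image (a list of rows).

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# Hypothetical test image: a vertical luminance step from 0 to 255.
step = [[0, 0, 255, 255] for _ in range(4)]
print(sobel_magnitude(step)[1][1])  # 1020.0
```

Each output pixel needs only a fixed number of multiplications and additions, which is consistent with the filter being cheap enough for near real-time use.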

Detecting important edges, while removing redundant edges, from a photographic image and generating a good contour-drawing from the image, or from a 3D model of a scene, are on-going research topics in computer vision. We could say that the newer algorithms are better. We revised the text in the Introduction (L. 33) and the Discussion (L. 316) that address this issue.

R2b. rows 91-96 I wonder whether the low refresh rate and high lag have affected the results of the experiments, mainly since the experiments are related to dynamical scenes, in which subjects interact with objects (i.e. the poor performance of the device has flattened the difference between the two conditions). This concern arises from my experience in AR/VR when the devices have poor performance. The authors should discuss (and take into consideration) this point.

We understand the reviewer’s concern. Prior studies of the human factors that arise when XR technology is used suggest that the refresh rate and the lag of our AR device were acceptable, but these factors could still have affected our participants’ immersive experience. We revised the text in the General methods to address this concern (L. 114). We also discussed it in a new appendix (L. 338, see Comment R2c).

Note that the contour-drawing was always generated regardless of whether the contour-drawing or the grayscale-image was shown on the screen. This control made the refresh rate and delay consistent across the conditions of image filters. We added text explaining this control in the General methods (L. 116).
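The control described above can be sketched as follows. The names `render_frame`, `show_contours`, and `edge_filter` are hypothetical and do not come from the device's software; the only point illustrated is that the edge filter runs on every frame, so both display conditions carry the same processing cost.

```python
def render_frame(gray_frame, show_contours, edge_filter):
    """Return the image to display for one camera frame.

    The edge filter is applied to every frame, even when its output is
    discarded, so that the refresh rate and lag do not depend on the
    display condition.
    """
    contour_frame = edge_filter(gray_frame)  # always computed
    return contour_frame if show_contours else gray_frame
```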

R2c. Experiment 1: Shape Matching It would be interesting to compare the subject performance in this AR experiment with respect to the baseline in real conditions (i.e. without wearing the AR device). This would allow us both to have an idea of the reliability of the response time (e.g. it is so high that the differences between the conditions are saturated) and to have an idea of the effect of the depth cue. The authors should discuss this point and try to extend at least one of their experiments by comparing it with the result of one without wearing the AR device.

Two of the authors (MF, TS) ran sessions in Experiments 1, 2, and 3 without wearing the AR device to get an idea about how difficult these tasks were under more natural conditions. We added the new appendix section reporting these sessions (L. 338). Note that it is very difficult to test any naïve participants in such an interactive experiment in the current pandemic situation.

R2d. rows 211-215: These results are affected by the kind of algorithms (conditions) the authors implemented, since the algorithm outputs depend on the object textures. I think this cannot be completely related to the influence of contours on 3D interpretation of a scene.

We expanded our discussion in the General discussion (L. 316) about the differences between contour-drawings drawn by artists and generated by computer algorithms (see also Comment R2a, R1e, R1g).

R2e. General discussion I am not totally convinced about the explanation of the authors, since their contours depend a lot on object textures, i.e. on the chosen algorithm, more than the effectiveness of the contours themselves (Experiment 2). Moreover, there is not a baseline without wearing the AR device. I think that the study could be more solid by following my previous suggestions.

This study addresses the difference in performance observed with a contour-drawing and with a grayscale-image. The grayscale-images were used as a control. They provided a baseline for a participant's performance in our kind of tasks while wearing our AR device. We revised the text in the Abstract (L. 21) and the General discussion (L. 290) to make this point clearer.

Two of the authors (MF, TS) ran sessions in Experiments 1, 2, and 3 without wearing the AR device and these sessions are reported in the new appendix section (L. 338, see Comment R2c).

We believe your concerns about the quality of contour-drawings used in this study are addressed well in our replies to Comments R2a and R2d (see also Comments R1e and R1g).

We hope that our revision of the manuscript addressed all of your concerns. We appreciate the comments and suggestions that you made.

Attachment

Submitted filename: Response to Reviewers.docx


PLoS One. doi: 10.1371/journal.pone.0242581.r003

Decision Letter 1

Markus Lappe

19 Oct 2020

PONE-D-20-19600R1

Seeing our 3D world while only viewing contour-drawings

PLOS ONE

Dear Tadamasa,

I am happy to report that both reviewers are essentially satisfied with your revision. Reviewer 2 has a few minor points that you should be able to address easily. I will accept the paper once these minor changes have been made.

Please submit your revised manuscript by Dec 03 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Best regards,

Markus

---

Markus Lappe

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Reviewer #1: The authors addressed all of our concerns. They added relevant literature, demographic information, and clarifications about the study. As the study relies on a null finding, they slightly toned down their claims to a more appropriate level. The study may have benefited from a Bayesian analysis to support the claims more strongly. Overall, I see no technical problems with the paper as it is now.

Reviewer #2: The authors addressed the concerns I raised in my review in a satisfactory way (by looking at the answers to the other reviewer too).

In particular, I pointed out that it would be interesting to compare the subject performance in these AR experiments with respect to the baseline in real conditions (i.e. without wearing the AR device). The authors replied that it is very difficult to test any naïve participants in such an interactive experiment in the current pandemic situation. They added an Appendix where two of the authors (MF, TS) ran sessions in Experiments 1, 2, and 3 without wearing the AR device. In a normal situation this is not acceptable, but in the current situation I think this is an added value for the paper. Moreover, the important issue related to the use of only one edge detector, which can be solved by running the experiments by using a different algorithm, is hampered by the pandemic situation. Thus, it is fine for me again.

However, in the Appendix the authors should add a row to each table in order to add the average performances of the experiments with the AR device to simplify the comparison (I think that the “See Figs. 5, 8, 11 for comparison” is not enough). Moreover, they should add a short comment about the comparison. I think that this point (at least) is important to improve the paper.

Then, the work can be published for me.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review?

Reviewer #1: No

Reviewer #2: No

PLoS One. 2021 Jan 22;16(1):e0242581. doi: 10.1371/journal.pone.0242581.r004

Author response to Decision Letter 1

31 Oct 2020

Dear colleagues,

We appreciate your review of our manuscript and the suggestions you made. We were delighted to see that both of the reviewers are mostly satisfied with our last revision. We also thank the reviewers for understanding our situation. All the suggestions are addressed point by point below with line numbers in the revised manuscript. All the revisions are in red with balloon comments in the revised manuscript “Revised Manuscript with Track Changes”.

Reviewer #1:

R1a. The study may have benefited from a Bayesian analysis to support the claims more strongly.

Thank you very much for this suggestion. Note that the results of the experiments reported in this study were analyzed by using multi-way repeated-measure ANOVA, but Bayesian alternatives to multi-way repeated-measure ANOVA are still under discussion (see Nathoo & Masson, 2016 for discussion). Instead, we added confidence intervals based on the t-distribution to the manuscript (l. 186-187, l. 238-240, l. 290-296; see Francis, 2017 for a comparison between the conventional t-test and its Bayesian alternative). We believe they provide better quantitative information about our results.
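For readers unfamiliar with this kind of interval, the sketch below shows one common form: a 95% confidence interval for a within-participant difference between two conditions. It is an illustration under stated assumptions, not the analysis script used for the manuscript. The function name `paired_ci_95`, the small table of critical values, and the data are all hypothetical.

```python
import math
import statistics

# Two-sided 95% critical values of the t-distribution, keyed by degrees of
# freedom. Only the values needed for this illustration are listed.
T_CRIT_95 = {3: 3.182, 11: 2.201}


def paired_ci_95(cond_a, cond_b):
    """Return (mean difference, (lower, upper)) for paired observations."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    df = len(diffs) - 1
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD with n - 1 denominator).
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    half_width = T_CRIT_95[df] * sem
    return mean_diff, (mean_diff - half_width, mean_diff + half_width)


# Hypothetical response times (s) for four participants, not data from the study.
contour = [10, 12, 11, 13]
grayscale = [9, 10, 11, 12]
mean_diff, (lower, upper) = paired_ci_95(contour, grayscale)
print(mean_diff, round(lower, 3), round(upper, 3))
```

An interval that contains zero, as in this made-up example, is compatible with no difference between conditions, and its width indicates how large a difference the data could still be hiding.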

Reviewer #2:

R2a. However, in the Appendix the authors should add a row to each table in order to add the average performances of the experiments with the AR device to simplify the comparison (I think that the “See Figs. 5, 8, 11 for comparison” is not enough). Moreover, they should add a short comment about the comparison. I think that this point (at least) is important to improve the paper.

Thank you very much for this suggestion. We revised Table 1 (l. 366-371) to show average performance under the contour-drawing and grayscale-image conditions in the main experiments.

We also added a short paragraph discussing the difference between the Appendix and the experiments reported in the main text of the manuscript.

References

Francis, G. (2017). Equivalent statistics and data interpretation. Behavior research methods, 49(4), 1524-1538.

Nathoo, F. S., & Masson, M. E. (2016). Bayesian alternatives to null-hypothesis significance testing for repeated-measures designs. Journal of Mathematical Psychology, 72, 144-157.

Attachment

Submitted filename: Response to Reviewers.docx


PLoS One. doi: 10.1371/journal.pone.0242581.r005

Decision Letter 2

Markus Lappe

5 Nov 2020

Seeing our 3D world while only viewing contour-drawings

PONE-D-20-19600R2

Dear Tadamasa,

I am pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With best wishes,

Markus

Markus Lappe

Academic Editor

PLOS ONE

PLoS One. doi: 10.1371/journal.pone.0242581.r006

Acceptance letter

Markus Lappe

9 Nov 2020

PONE-D-20-19600R2

Seeing our 3D world while only viewing contour-drawings

Dear Dr. Sawada:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Markus Lappe

Academic Editor

PLOS ONE

Associated Data

Data Availability Statement

All empirical data reported in this study are available from https://osf.io/t5jgb/.

[pone.0242581.ref001] 1.Cole F, Sanik K, DeCarlo D, Finkelstein A, Funkhouser T, Rusinkiewicz S, et al. How well do line drawings depict shape? ACM Trans. Graph. 2009; 28(3), 28. [Google Scholar]

[pone.0242581.ref002] 2.Hertzmann A. Why do line drawings work? A realism hypothesis. Percept. 2020; 49(4): 439–451. 10.1177/0301006620908207 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref003] 3.Kroll JF, Potter MC. Recognizing words, pictures, and concepts: A comparison of lexical, object, and reality decisions. J Verbal Learning Verbal Behav. 1984. February 1;23(1):39–66. 10.1016/S0022-5371(84)90499-7. [DOI] [Google Scholar]

[pone.0242581.ref004] 4.Pizlo Z. 3D shape: Its unique place in visual perception. Cambridge, MA: MIT Press; 2008. [Google Scholar]

[pone.0242581.ref005] 5.Pizlo Z, Li Y, Sawada T, Steinman RM. Making a machine that sees like us. New York, NY: Oxford University Press; 2014. [Google Scholar]

[pone.0242581.ref006] 6.Potter MC, Faulconer BA. Time to understand pictures and words. Nature. 1975. February 6;253(5491):437–8. 10.1038/253437a0. [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref007] 7.Sayim B, Cavanagh P. What line drawings reveal about the visual brain. Front Hum Neurosci. 2011; 5: 118 10.3389/fnhum.2011.00118 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0242581.ref008] 8.Walther DB, Chai B, Caddigan E, Beck DM, Fei-Fei L. Simple line drawings suffice for functional MRI decoding of natural scene categories. Proc Natl Acad Sci. 2011. June 7;108(23):9661–6. 10.1073/pnas.1015666108. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0242581.ref009] 9.Walther DB, Shen D. Nonaccidental properties underlie human categorization of complex natural scenes. Psychol Sci. 2014. April;25(4):851–60. 10.1177/0956797613512662. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0242581.ref010] 10.Bhardwaj S, Mittal A. A survey on various edge detector techniques. Procedia Technology. 2012. January 1;4:220–226. [Google Scholar]

[pone.0242581.ref011] 11.Spontón H, Cardelino J. A review of classic edge detectors. IPOL. 2015. June;5:90–123. 10.5201/ipol.2015.35. [DOI] [Google Scholar]

[pone.0242581.ref012] 12.DeCarlo D. Depicting 3d shape using lines. Proc SPIE 2012; 8291 (Hum Vis Electron Imag XVII): 361–376. [Google Scholar]

[pone.0242581.ref013] 13.Cole F, Golovinskiy A, Limpaecher A, Barros HS, Finkelstein A, Funkhouser T, et al. Where do people draw lines? ACM Trans. Graph. 2003; 27(3): 88 10.1145/1360612.1360687 [DOI] [Google Scholar]

[pone.0242581.ref014] 14.DeCarlo D, Finkelstein A, Rusinkiewicz S, Santella A. Suggestive Contours for Conveying Shape. ACM Trans. Graph. 2003; 22(3): 848–855. [Google Scholar]

[pone.0242581.ref015] 15.Koenderink JJ, Van Doorn AJ. The shape of smooth objects and the way contours end. Percept. 1982; 11: 129–137. 10.1068/p110129 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref016] 16.Koenderink JJ. What does the occluding contour tell us about solid shape. Percept. 1984; 13: 321–330. 10.1068/p130321 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref017] 17.Judd T, Durand F, Adelson E. Apparent Ridges for Line Drawing. ACM Trans. Graph. 2007; 26(3): 19 http://doi.acm.org/10.1145/1276377.1276401 [Google Scholar]

[pone.0242581.ref018] 18.Todd JT. The visual perception of 3D shape. Trends Cog Sci. 2004; 8(3): 115–121. 10.1016/j.tics.2004.01.006 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref019] 19.Attneave F. Some informational aspects of visual perception. Psychol Rev. 1954; 61: 183–193. 10.1037/h0054663 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref020] 20.Biederman I. Recognition-by-components: A theory of human image understanding. Psychol Rev. 1987; 94: 115–147. 10.1037/0033-295X.94.2.115 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref021] 21.Biederman I. Recognizing depth-rotated objects: A review of recent research and theory. Spat Vis. 2000;13(2–3):241–53. 10.1163/156856800741063. [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref022] 22.Hochberg J, Brooks V. Pictorial recognition as an unlearned ability: A study of one child’s performance. Am J Psychol. 1962; 75(4): 624–628. [PubMed] [Google Scholar]

[pone.0242581.ref023] 23.Jahoda G, Deregowski JB, Ampene E, Williams N. Pictorial recognition as an unlearned ability: A replication with children from pictorially deprived environments In Butterworth G, editor. The Child’s Representation of the World. New York, NY: Plenum Press, 1977. p. 203–217. [Google Scholar]

[pone.0242581.ref024] 24.Kennedy JM, Ross AS. Outline picture perception by the Songe of Papua. Percept. 1975; 4: 391–406. [Google Scholar]

[pone.0242581.ref025] 25.Koenderink JJ, van Doorn AJ, Christou C, Lappin JS. Shape constancy in pictorial relief. Percept. 1996; 25: 155–164. 10.1068/p250155 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref026] 26.Li Y, Pizlo Z. Depth cues versus the simplicity principle in 3D shape perception. Top Cogn Sci. 2011. October;3(4):667–85. 10.1111/j.1756-8765.2011.01155.x. [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref027] 27.Tsuruhara A, Sawada T, Kanazawa S, Yamaguchi MK, Corrow S, Yonas A. The development of the ability of infants to utilize static cues to create and access representations of object shape. J Vis. 2010; 10(12): 2 10.1167/10.12.2 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0242581.ref028] 28.Pizlo Z. Perception viewed as an inverse problem. Vis Res. 2001; 41: 3145–3161. 10.1016/s0042-6989(01)00173-0 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref029] 29.Poggio T, Torre V, Koch C. Computational vision and regularization theory. Nature. 1985; 317: 314–319. 10.1038/317314a0 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref030] 30.Sawada T, Li Y, Pizlo Z. Shape Perception In Busemeyer J, Townsend J, Wang ZJ, Eidels A, editors. Oxford Handbook of Computational and Mathematical Psychology. New York, NY: Oxford University Press, 2015. p.255–276. [Google Scholar]

[pone.0242581.ref031] 31.Leeuwenberg E, van der Helm PA. Structural Information Theory: The Simplicity of Visual Form. New York, NY: Cambridge University Press; 2013. [Google Scholar]

[pone.0242581.ref032] 32.Michaux V, Jayadevan V, Delp E, Pizlo Z. Figure-ground organization based on 3D symmetry. J. Electron. Imaging. 2016; 25(6): 061606 10.1117/1.JEI.25.6.061606 [DOI] [Google Scholar]

[pone.0242581.ref033] 33.Perkins DN. (1972). Visual discrimination between rectangular and nonrectangular parallelopipeds. Percept Psychophys. 1972; 12(5): 396–400 [Google Scholar]

[pone.0242581.ref034] 34.Jayadevan V, Sawada T, Delp E, Pizlo Z. Perception of 3D symmetrical and nearly symmetrical shapes. Symmetry. 2018; 10(8): 344 10.3390/sym10080344 [DOI] [Google Scholar]

[pone.0242581.ref035] 35.Li Y. Perception of Parallelepipeds: Perkins’s Law. Percept. 2009; 38: 1767–1781. 10.1068/p6397 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref036] 36.Li Y, Pizlo Z, Steinman RM. A computational model that recovers the 3D shape of an object from a single 2D retinal representation. Vis. Res. 2009; 49: 979–91. 10.1016/j.visres.2008.05.013 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref037] 37.Li Y, Sawada T, Shi Y, Kwon TK, Pizlo Z. A Bayesian model of binocular perception of 3D mirror symmetrical polyhedra. J. Vis. 2011; 11(4): 11 10.1167/11.4.11 [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref038] 38.Perkins DN. How good a bet is good form? Percept. 1976; 5(4): 393–406. [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref039] 39.de Gelder B, Kätsyri J, de Borst AW. Virtual reality and the new psychophysics. Br J Psychol. 2018. August;109(3):421–426. 10.1111/bjop.12308. [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref040] 40.Scarfe P, Glennerster A. Using high-fidelity virtual reality to study perception in freely moving observers. J Vis. 2015;15(9):3 10.1167/15.9.3. [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref041] 41.Triesch J, Ballard DH, Hayhoe MM, Sullivan BT. What you see is what you need. J Vis. 2003;3(1):86–94. 10.1167/3.1.9. [DOI] [PubMed] [Google Scholar]

[pone.0242581.ref042] 42.Anstis S. Visual adaptation to a negative, brightness-reversed world: Some preliminary observations In Carpenter GA, Grossberg S, editors. Neural Networks for Vision and Image Processing. Cambridge, MA: MIT Press, 1992. p. 1–14. [Google Scholar]

[pone.0242581.ref043] 43.Grush R, Jaswal L, Knoepfler J, Brovold A. Visual adaptation to a remapped spectrum In Metzinger T, Windt JM, editors. Open MIND: 16(T). Frankfurt am Main, Germany: MIND Group, 2015. 10.15502/9783958570283 [DOI] [Google Scholar]

[pone.0242581.ref044] 44.Cohen MA, Botch TL, Robertson CE, The limits of color awareness during active, real-world vision. Proc Natl Acad Sci USA. 2020; 117(24): 13821–13827. 10.1073/pnas.1922294117 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0242581.ref045] 45.Sobel I. History and Definition of the so-called "Sobel Operator", more appropriately named the Sobel-Feldman Operator. Research gate. 2014 February 2 [Cited 2020 June 20] https://www.researchgate.net/publication/239398674_An_Isotropic_3x3_Image_Gradient_Operator

[pone.0242581.ref046] 46.Wu Q, McGinnity M, Maguire L, Belatreche A, Glackin B. Edge detection based on spiking neural network model. ICIC 2007, LNAI 4682. 2007 Aug 21; 26–34.

[pone.0242581.ref047] 47.Bradski G. The OpenCV Library. Dr Dobb’s J. Software Tools. 2000.

48. Wijntjes M, Füzy A, Verheij MES, Deetman T, Pont SC. The synoptic art experience. Art Percept. 2016; 4(1–2): 73–105. doi:10.1163/22134913-00002046

49. Chen JY, Thropp JE. Review of low frame rate effects on human performance. IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans. 2007 Oct 29; 37(6): 1063–76.

50. Cummings JJ, Bailenson JN. How immersive is enough? A meta-analysis of the effect of immersive technology on user presence. Media Psychology. 2016 Apr 2; 19(2): 272–309.

51. Thropp JE, Chen JY. The effects of slow frame rates on human performance. Aberdeen Proving Ground, MD: Army Research Laboratory; 2006 Jun.

52. Neter J, Kutner MH, Nachtsheim CJ, Wasserman W. Applied Linear Statistical Models. 4th ed. Boston, MA: McGraw-Hill; 1996.

53. Cramer AOJ, van Ravenzwaaij D, Matzke D, Steingroever H, Wetzels R, Grasman RPPP, et al. Hidden multiplicity in exploratory multiway ANOVA: Prevalence and remedies. Psychon Bull Rev. 2016; 23: 640–647. doi:10.3758/s13423-015-0913-5

54. Rosenholtz R. Texture perception. In: Wagemans J, editor. Oxford Handbook of Perceptual Organization. New York, NY: Oxford University Press; 2015. p. 167–186. doi:10.1167/15.3.9

55. Metzger W. Laws of Seeing (Spillmann L, Lehar S, Stromeyer M, Wertheimer M, translators). Cambridge, MA: MIT Press; 2006.

56. Harrison SJ, Feldman J. The influence of shape and skeletal axis structure on texture perception. J Vis. 2009; 9(6): 13. doi:10.1167/9.6.13

57. Kwon TK, Agrawal K, Li Y, Pizlo Z. Spatially-global integration of closed, fragmented contours by finding the shortest-path in a log-polar representation. Vis Res. 2015; 125: 143–163.

58. Li Y, Sawada T, Latecki LJ, Steinman RM, Pizlo Z. A tutorial explaining a machine vision model that emulates human performance when it recovers natural 3D scenes from 2D images. J Math Psychol. 2012; 56: 217–231.

59. Sassi M, Vancleef K, Machilsen B, Panis S, Wagemans J. Identification of everyday objects on the basis of Gaborized outline versions. i-Percept. 2010; 1(3): 121–142. doi:10.1068/i0384

60. Elder JH, Trithart S, Pintilie G, MacLean D. Rapid processing of cast and attached shadows. Percept. 2004; 33(11): 1319–1338. doi:10.1068/p5323

61. Ghose T, Palmer SE. Extremal edges versus other principles of figure-ground organization. J Vis. 2010; 10(8): 3. doi:10.1167/10.8.3

62. Kersten D, Knill DC, Mamassian P, Bülthoff I. Illusory motion from shadows. Nature. 1996; 379: 31. doi:10.1038/379031a0

63. Lauenstein L. Über räumliche Wirkungen von Licht und Schatten. Psychol Forsch. 1938; 22: 267–319.

64. Ramachandran VS. Perceiving shape from shading. Sci Am. 1977; 256(8): 76–83.

65. Kunsberg B, Holtmann-Rice D, Alexander E, Cholewiak S, Fleming R, Zucker SW. Colour, contours, shading and shape: flow interactions reveal anchor neighbourhoods. Interface Focus. 2018; 8: 20180019. doi:10.1098/rsfs.2018.0019

66. Langer MS, Bülthoff HH. Depth discrimination from shading under diffuse lighting. Percept. 2000; 29: 649–660. doi:10.1068/p3060

67. Liu B, Todd JT. Perceptual biases in the interpretation of 3D shape from shading. Vis Res. 2004; 44: 2135–2145. doi:10.1016/j.visres.2004.03.024

68. Mingolla E, Todd JT. Perception of solid shape from shading. Biol Cybern. 1986; 53: 137–151. doi:10.1007/BF00342882

69. Nasanen R. Spatial frequency bandwidth used in the recognition of facial images. Vis Res. 1999; 39: 3824–3833. doi:10.1016/s0042-6989(99)00096-6

70. Norman JF, Todd JT, Phillips F. The perception of surface orientation from multiple sources of optical information. Percept Psychophys. 1995; 57: 629–636.

71. Pentland A. Shape information from shading: A theory about human perception. Spat Vis. 1989; 4: 165–182. doi:10.1163/156856889x00103

72. Todd JT, Reichel FD. Ordinal Structure in the Visual Perception and Cognition of Smoothly Curved Surfaces. Psychol Rev. 1989; 96: 643–657. doi:10.1037/0033-295x.96.4.643

73. Nefs HT, Koenderink JJ, Kappers AML. The influence of illumination direction on the pictorial reliefs of Lambertian surfaces. Percept. 2005; 34: 275–287. doi:10.1068/p5179
