Proceedings of the National Academy of Sciences of the United States of America. 2018 Oct 15;115(44):11333–11338. doi: 10.1073/pnas.1800901115

Potential downside of high initial visual acuity

Lukas Vogelsang a,b,1, Sharon Gilad-Gutnick a,1, Evan Ehrenberg a, Albert Yonas c, Sidney Diamond a, Richard Held a,2, Pawan Sinha a,3
PMCID: PMC6217435  PMID: 30322940

Significance

As newborns, we start with poor visual acuity, attributable to retinal and cortical immaturities. This has been considered to be a limitation of early visual processing. We propose that initially poor retinal acuity may, in fact, have adaptive value. It may help set up processing strategies and receptive fields in the cortex that facilitate spatial analysis over extended areas. Observations from children who started their visual journeys with abnormally high visual acuity, and computational simulations with deep neural networks trained on high-resolution or blurred images, corroborate this proposal. The results have implications for understanding normal visual development; identifying causal roots of, and interventions for, important visual processing impairments; and creating better training procedures for computational vision systems.

Keywords: visual development, visual acuity, deep neural networks, spatial integration, sight restoration

Abstract

Children who are treated for congenital cataracts later exhibit impairments in configural face analysis. This has been explained in terms of a critical period for the acquisition of normal face processing. Here, we consider a more parsimonious account according to which deficits in configural analysis result from the abnormally high initial retinal acuity that children treated for cataracts experience, relative to typical newborns. According to this proposal, the initial period of low retinal acuity characteristic of normal visual development induces extended spatial processing in the cortex that is important for configural face judgments. As a computational test of this hypothesis, we examined the effects of training with high-resolution or blurred images, and staged combinations, on the receptive fields and performance of a convolutional neural network. The results show that commencing training with blurred images creates receptive fields that integrate information across larger image areas and leads to improved performance and better generalization across a range of resolutions. These findings offer an explanation for the observed face recognition impairments after late treatment of congenital blindness, suggest an adaptive function for the acuity trajectory in normal development, and provide a scheme for improving the performance of computational face recognition systems.


This work was initiated by a serendipitous referral to our laboratory of a young boy, RK, with an unusual visual history. RK was born in China and placed in an orphanage soon thereafter. He had dense bilateral congenital cataracts, but lack of resources prevented him from receiving treatment until he was 4.5 y old. Two years after his surgery, an American family adopted and brought him to the United States, where he lives today. RK adapted well and his health and intellectual development progressed satisfactorily. However, his parents noticed that despite being otherwise visually proficient, he had difficulty recognizing people’s faces, an impairment that was impeding his ability to socialize. They contacted us to request that we assess RK’s vision. In the laboratory, we tested RK along multiple perceptual and neurological dimensions including acuity, contrast sensitivity, face/nonface discrimination, within-class object individuation, and face individuation. Consistent with his parents’ observations, we found that RK performed normally on all tests except face identification, on which his responses were no better than chance.

A review of the literature reveals that RK is not unique in his performance profile; his recognition deficit echoes results that have previously been reported in studies with adolescent children who underwent treatment for congenital cataracts during the first year of infancy. They exhibit impaired face-discrimination performance even when tested several years after their treatments (1–3). Notably, the periods of deprivation they suffered were quite short, ranging between 2 and 6 mo.

The finding that even brief periods of visual deprivation can lead to face-identification deficits, while appearing to spare other visual skills, has been explained as the manifestation of a critical period for face learning (4, 5). According to this account, face recognition is an “experience-expectant” process, requiring exposure to faces early in development to enable appropriate perceptual and cortical specialization necessary for face-identification abilities (1, 6–8). Children like RK, who pass this period without normal exposure to faces, are expected to exhibit compromised face recognition skills later in life. Recent evidence from nonhuman primates who have undergone controlled deprivation is consistent with these results (8). Monkeys reared without exposure to faces lacked face preference in their looking behavior, and did not exhibit face-specialized cortical domains, in contrast to nondeprived controls. These observations suggest that skills which appear early in the developmental timeline, such as face discrimination, may be particularly vulnerable to visual deprivation. It is worth pointing out, however, that the consequences of delayed treatment of congenital blindness are complex and do not all conform to a unitary template. Several high-level visual skills appear capable of being acquired even after prolonged periods of initial deprivation (9–12). Additional evidence of resilience is provided by studies of low-level vision. Here, findings suggest that the earliest appearing proficiencies of normal development, such as the ability to perceive visual flicker, are also the ones least susceptible to deprivation, a biological analog of the common corporate practice of “first-hired last-fired” (13, 14). Taken together, these studies point to a varied landscape of visual proficiencies following late treatment of congenital blindness, with some skills more susceptible to compromise by visual deprivation during early “sensitive” periods in development.
Specifically, impairments of face processing following early visual deprivation may plausibly admit a sensitive period-based account. However, an explanation that is not dependent on the domain-specific action of visual deprivation would have the virtue of parsimony. Here, we consider a domain-general account and present results from computational tests of its predictions.

Rather than focusing exclusively on the absence of information in the period prior to treatment as a critical contributor to later face-processing impairments (the “experience-expectant” viewpoint), an alternative is to consider how visual experience in the period following treatment differs in children like RK relative to the typically developing newborn. An important source of such differences is the set of maturational processes in the visual system that continue progressing even while the eye has an occlusive pathology like a cataract. A key dimension impacted by such processes is acuity (15–17). Typically developing newborns commence their visual experience with remarkably poor acuity, below 20/600 (18–20), which is well beyond the criterion for legal blindness. Much of this acuity impairment can be attributed to the immature neonatal retina (21, 22), with its reduced foveal photoreceptor packing density (23), as well as to immaturities in the visual cortex (24, 25). Over the initial months of development, these immaturities diminish steadily, leading to a stereotyped pattern of acuity improvement (26). In a child with congenital cataracts, the physiological mechanisms of neural maturation continue to proceed despite the lenticular opacity (27, 28). As a consequence of this maturational progression, once the cataracts are eventually removed, the initial retinal acuity of the child is significantly higher than that of a newborn. Indeed, this was the case with RK, whose visual experience started with an acuity of 20/40.

There are several notable aspects of the vision of children treated for congenital cataracts (29), including a lack of accommodation [given the fixed-focus aphakic correction comprising either intraocular lenses (IOLs) or external glasses], changes in ocular growth (30), and the development of deprivation-induced amblyopia (31, 32). The studies have spanned a spectrum of ages [with age at surgery ranging from less than 6 mo (33–35) to 8 y (36, 37)]. The data show that outcomes are generally better the earlier the surgery is conducted and that long-term acuity does not reach normal levels. Of particular relevance here is the finding that these children exhibit better-than-neonatal acuity rapidly after treatment.

Although high retinal acuity at the outset may appear to be advantageous for the visual system given the richer visual experience it offers, it may, in fact, compromise the development of important visual processes, specifically those involving extended spatial integration. Poor acuity, by definition, introduces blur that reduces effective image resolution. Small patches in blurred images do not contain enough structure to be informative; spatial integration is necessary to detect and discriminate patterns in such images (38). With high-resolution images, however, local processing suffices for discrimination tasks, rendering extended spatial integration superfluous (39).

To formalize this intuition, in a simple fragment-based image classification system (detailed in Methods), we examined fragment sizes needed in high-resolution versus blurred images to obtain similar levels of classification performance. As Fig. 1 shows, to achieve the same z-score (or comparable classification performance), a smaller integration area suffices for high-resolution images relative to blurred ones. The implication is that the early availability of high-resolution information may obviate the development of strategies for integrating over extended spatial extents. Such integration is a prerequisite for the analysis of configural relationships in images (40), which, in turn, are believed to be especially important for facial identification (41, 42). This line of reasoning, first suggested as a possible explanation for some of the observed “sleeper effects” following early visual deprivation (43), predicts that children with initially high acuity would be biased toward local processing. Indeed, this is consistent with data showing that children treated for congenital cataracts are able to perform face discrimination tasks where the patterns differ in local features but are impaired on a task requiring detection of configural changes (44), as well as on holistic face processing (45). In summary, the abnormally high initial retinal acuity experienced by children treated for congenital blindness later in life, compared with typical newborns, may result in compromised spatial integration. We refer to this as the high initial acuity (HIA) hypothesis.

Fig. 1.

(A and B) Confusion matrices derived from all pair-wise comparisons of 400 face images (10 exemplars per 40 individuals). The matrices are averaged across 10 matching regions distributed across the image (an example matching region is indicated by the boxes overlaid on the face images), when the images are processed at high resolution (A) or with blur (B). Lighter shades indicate higher match scores. The reduced distinctiveness of content in the blurred image patches leads to the confusion matrix in B being dominated by indiscriminately high match scores, relative to A. Note that the two confusion matrices are plotted using identical color scales. (C) From each confusion matrix, we can determine how different the “within-class” scores (for the 10 images of the person shown in the reference image) are relative to the scores of the other 390 images. The plots show summary results of 5,208 simulations using different matching region locations and sizes when the images are in high resolution (red curves) or blurred (blue curves). The bold curves represent the averages across all individual simulations (seen here as diffuse background curves). Within-class z-scores are plotted against region sizes. A higher z-score corresponds to better differentiation of within-class images relative to out-of-class images. As is evident from the plots, to achieve the same level of class discrimination, a smaller image fragment suffices for the high-resolution case compared with the blurred one. Face images courtesy of AT&T Laboratories Cambridge.

To test the HIA hypothesis’ predictions about how experience with different acuity levels can affect the development of spatial integration fields and overall face-recognition performance, we trained different instances of a deep convolutional neural network (CNN) (46) on a large database of face images (47) while manipulating the blur of training and test data. It is worth pointing out the analogies between the computational and biological entities implicit in this approach. First, image acuity corresponds to the resolution of the visual input to primary visual cortex, with limitations deriving primarily from retinal factors. This is in keeping with empirical data suggesting that retinal immaturities are significant determinants of observed visual sensitivity (48). In the CNN simulations, the resolution change is imposed at the input layer (akin to the output of the retina), and we examine how the response properties of units in the convolutional layers, which are posited to roughly correspond to primary visual cortex (49), change as a function of the input resolution. Pliability of cortical units is based on results from studies showing that visual deprivation can extend plastic periods (50, 51). The implication of these results is that with the onset of sight, some units in the visual cortex can develop differently depending on input properties, such as resolution. With this background, we examined how receptive fields (RFs) in the first convolutional layer and performance levels of a CNN change when image blur is varied. These tests yielded several notable results.

First, CNNs trained with image sets of different blur levels exhibited markedly different RFs in the first convolutional layer. As we had hypothesized, introduction of blur led to a progressive increase in the spatial extents of the RFs (Fig. 2 A and B) (P < 0.01; two-tailed t-tests comparing RF sizes for each pair of consecutive blur levels). Second, as is clear from Fig. 2C, the presence of high spatial frequencies during training drives the system to behave in a biased way in which making use of those higher spatial frequencies is prioritized, while low spatial-frequency information (necessary for effective generalization to blurred images) is mostly disregarded. The third result, also evident in Fig. 2C, is that none of these training regimens yields broad generalization performance. When testing the network with new images spanning a range of blurs, recognition accuracy peaks at the blur level matching the one used during training. This has an interesting implication: The front end of a network (comprising convolutional layers) trained with blurred images does not impose mandatory low-pass filtering on the inputs. If such filtering were to be employed, then high-resolution images, which subsume low spatial-frequency information, would have yielded the same performance as blurred inputs. Instead, high spatial-frequency information does progress through the network and influences eventual classification. That this influence is a detrimental one suggests that besides being able to utilize low-spatial frequency information, the network also needs to learn the appropriate weights for the high spatial-frequency components of the image data. Taken together, these results bear out our expectation of image resolution as a modulator of the extent of spatial integration by the initial RFs. They also point to the insufficiency of blurred images on their own for achieving broad generalization across resolution levels. 
More pragmatically, they reveal a limitation of CNN-based image recognition systems that are often trained on only high-resolution inputs (46, 52).

Fig. 2.

Results of uniform training on CNNs. (A) Top-10 RFs in the first layer of CNNs trained on images blurred with a Gaussian filter with σ = 0, 1, 2, 3, 4, and corresponding acuities. The stronger the blur, the bigger and smoother the Gabor patches of the depicted RFs appear to be. (B) RF size increases with increasing blur. (C) Performance curves for training and testing CNN instances on different levels of blur. Performance is tied to the blur level of the training set, peaking at the test level that matches training level, although training on blurred images results in better generalization across other resolutions than training on high-resolution images. Face images courtesy of AT&T Laboratories Cambridge.

Motivated by the fact that visual experience in the normally developing brain involves a temporal progression from low to high acuity, we trained additional instances of the CNN using a staged regimen that commenced with blurred images, which were then followed by high-resolution ones. To examine ordering effects comprehensively, we trained four different instances of the CNN, one on each of the following regimens: “blurred-to-high-resolution” (250 epochs of training with blurred images followed by 250 epochs of training with high-resolution images); “high-resolution-to-blurred” (250 epochs of training with high-resolution images followed by 250 epochs of training with blurred images); “blurred-to-blurred” (500 epochs with exclusively blurred training images); and “high-resolution-to-high-resolution” (500 epochs with exclusively high-resolution training images).
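The four regimens can be written down as per-epoch schedules, which makes it easy to see that the two mixed regimens use the same images in aggregate and differ only in order. The following is a minimal sketch, not the authors' training code: the function name `staged_schedule`, the `first_fraction` parameter, and the string labels are hypothetical, and each label only stands for "train one epoch on this image set".

```python
from collections import Counter


def staged_schedule(first, second, first_fraction=0.5, total_epochs=500):
    """Return one image-set label per epoch for a two-stage regimen.

    `first_fraction` is the share of epochs spent on the first image set.
    The 50/50 default mirrors the 250 + 250 epoch regimens in the text.
    """
    if not 0.0 <= first_fraction <= 1.0:
        raise ValueError("first_fraction must lie in [0, 1]")
    n_first = round(first_fraction * total_epochs)
    return [first] * n_first + [second] * (total_epochs - n_first)


# The four regimens compared in the text (labels are illustrative only).
REGIMENS = {
    "blurred-to-high-resolution": staged_schedule("blurred", "high-resolution"),
    "high-resolution-to-blurred": staged_schedule("high-resolution", "blurred"),
    "blurred-to-blurred": staged_schedule("blurred", "blurred"),
    "high-resolution-to-high-resolution": staged_schedule(
        "high-resolution", "high-resolution"
    ),
}

# The two mixed regimens contain the same epochs and differ only in order.
same_content = Counter(REGIMENS["blurred-to-high-resolution"]) == Counter(
    REGIMENS["high-resolution-to-blurred"]
)

# A shorter initial blurred phase, e.g. 20% of training (100 of 500 epochs).
short_blur = staged_schedule("blurred", "high-resolution", first_fraction=0.2)
```

In an actual training loop, each label would select which version of the training set (blurred or original) is fed to the network for that epoch; the network and its weights carry over unchanged between stages.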

These simulations yielded three primary results, summarized in Fig. 3. First, the introduction of blurred images in the training phase consistently induced the system to increase the sizes of its RFs, irrespective of when in the training regimen the blurred training was introduced (Fig. 3A). Interestingly, having high-resolution images follow the blurred ones did not shrink the size of the RFs. In other words, there is a notable asymmetry in terms of the effects of ordering: Starting with high-resolution images and then later introducing blurred ones leads to a significant increase in RF sizes, but the converse (blurred followed by high-resolution images) does not cause the network to reduce the sizes of the already established large RFs.

Fig. 3.

Results of nonuniform training on CNNs. (A) Impact of nonuniform training paradigms on RF sizes. (B) When training paradigm begins with blur, even 20% blurred images are sufficient for the system to produce large RFs. (C) Impact of different training paradigms on performance level obtained with test images subjected to different levels of blur (here quantified as the σ, in pixels, of the convolving Gaussian): Blurred-to-high-resolution shows the best performance and generalization. (D) Two metrics of performance of the four different nonuniform training regimens. Black bars, area under the performance curve, normalized to 1.0 for perfect performance across all resolutions. Gray bars, slope of performance curve (perfect generalization would correspond to a slope of 0.0).

Second, even when the initial phase of training on blurred images is a small proportion of the full training regimen (in our simulations, 20%), it is sufficient to induce the creation of large RFs that are stable over the subsequent training epochs with high-resolution images. Increasing the amount of initial blurred training beyond this proportion does not cause a further expansion of the RFs (Fig. 3B). Drawing a parallel to human development, this observation suggests that although acuity improves rapidly in infancy (26), even the limited amount of time that a child experiences very poor acuity may well be sufficient for the instantiation and consolidation of RFs with large spatial extents.

Third, the order in which images of different resolutions are presented affects subsequent classification performance (Fig. 3C). Blurred-to-high-resolution training produces by far the best and most general classification performance across all resolution levels, as indicated by its having the largest area under the curve (AUC) and the lowest absolute slope (Fig. 3D). In contrast, when the network is trained on high-resolution followed by blurred images, it yields poor performance when tested on high-resolution images, although, in aggregate, it has been trained with precisely the same set of images as in the blurred-to-high-resolution condition. Thus, the superior performance of blurred-to-high-resolution training is a direct consequence of the initial period of blurred training, akin to the typically developing acuity trajectory of the human visual system.
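The two summary metrics can be illustrated with a short sketch. The text does not spell out how the AUC was normalized or how the slope was fitted, so the choices below are assumptions: the area is computed with the trapezoid rule and divided by the width of the tested blur range (so a curve at 100% accuracy everywhere scores 1.0), and the slope is taken from an ordinary least-squares line. The function names and the toy accuracy values are hypothetical and are not results from the paper.

```python
def normalized_auc(blur_levels, accuracies):
    """Trapezoid-rule area under accuracy vs. blur, divided by the blur range.

    Assumes accuracies are fractions in [0, 1], so perfect performance
    at every blur level gives exactly 1.0.
    """
    pairs = list(zip(blur_levels, accuracies))
    area = sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(pairs, pairs[1:])
    )
    return area / (blur_levels[-1] - blur_levels[0])


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Toy curves (made-up numbers): sigma of the test blur vs. accuracy.
sigmas = [0, 1, 2, 3, 4]
flat_curve = [1.0, 1.0, 1.0, 1.0, 1.0]      # perfect generalization
falling_curve = [1.0, 0.9, 0.8, 0.7, 0.6]   # degrades with blur

flat_auc = normalized_auc(sigmas, flat_curve)            # 1.0
flat_slope = least_squares_slope(sigmas, flat_curve)     # 0.0
falling_auc = normalized_auc(sigmas, falling_curve)
falling_slope = least_squares_slope(sigmas, falling_curve)
```

Under these definitions, a regimen that generalizes well has an AUC near 1.0 and a slope near 0.0, which is the pattern the text reports for blurred-to-high-resolution training.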

A potential explanation for the pronounced ordering effects we observe is that the initial learning phase with only high-resolution images has a detrimental effect on generalization across resolution (Fig. 2C), causing a large training error in subsequent blurred learning trials, and finally a strong weight adaptation, even within the first convolutional layer. Thus, high spatial frequency features end up being replaced by those effective for classifying blurred images. This is not the case when the resolution progression is from low to high. Initial training on blurred images produces RFs with large spatial extents and superior generalization across resolutions; this diminishes the need for radical weight adaptation during follow-up training with high-resolution images. Corroborating this point, the blurred-to-blurred and blurred-to-high-resolution training regimens lead to similar RF sizes in the first convolutional layer.

Taken together, results from simulations with deep convolutional neural nets lend support to the proposal that in the developmental progression of visual acuity, initial experience with blurred imagery may help set up RFs capable of encoding image structure over extended spatial extents. This helps reduce reliance on local features and improves generalization performance across a range of image resolutions. The HIA account, while not ruling out the critical period-based explanation, can parsimoniously account for the puzzling observations reported in the literature about early visual deprivation resulting in specific face-processing impairments.

Complementing data from the computational tests described here, the HIA hypothesis finds tentative support in the experimental literature. First, interventions like lid suture, which are akin to strong low-pass filtering, have been reported to result in enlarged RFs in the visual cortex (53). Second, recent studies of population RFs in human visual cortex (54) have reported an association between RF sizes and performance on face tasks, with smaller fields corresponding to compromised configural analysis.

The parsimony of the HIA hypothesis predicts that the consequences of sight recovery after early deprivation are not face-specific but should also apply to other tasks that rely on extended spatial processing. Empirical data support this prediction. Global motion processing is found to be compromised in this population even years after treatment (55). By contrast, patients who developed cataracts after one year show normal levels of global motion thresholds when tested several years after surgery. Individuals who undergo late treatment of congenital blindness have also been shown to be impaired at illusory contour perception and global shape completion but not on local shape discrimination (56). One study (57) found a greater deficit in a feature-spacing change detection task for faces relative to houses, but they attribute this impairment to compromised experience-dependent perceptual narrowing for faces that typically developing individuals exhibit. In a follow-up study (58), they highlight the idea that there may well be a general, rather than face-specific, mechanism used for spacing judgment tasks.

The results here suggest that the initially poor acuity in the normal developmental progression may be a feature of the system, rather than a limitation. Looking beyond acuity, it remains to be seen whether developmental progressions along other dimensions of visual function, such as field of view (59), color saturation (60), windows of simultaneity (61), and attentional control (62), might also confer adaptive benefits for instantiating robust processing strategies. In addition, there is abundant literature on the temporal dynamics of spatial-frequency processing, simulating HIA in computational models or in various behavioral, physiological, and imaging studies on normal adults. This body of research establishes the prevalence of coarse-to-fine sequencing at all levels of cortical processing of sensory information and across different modalities of sensation. For example, a study of temporal integration of rapidly sequenced spatially filtered visual images (63) found that success of the integration process in a normal adult population depended on a coarse-to-fine order of frame presentation. An fMRI study of temporal dynamics of face processing in higher-level visual cortex (64) varied exposure duration and spatial frequency content of filtered images. They demonstrated a low-frequency response to short duration exposures in a number of face-responsive regions of adult subjects, more robust and decaying more slowly with increasing exposure durations in bilateral fusiform face area (FFA), rFFA > lFFA. Other work (65, 66) has examined the benefits of adopting a “coarse to fine” matching strategy especially in the context of binocular stereopsis. Simulations suggest that such a progression might be beneficial for fine-tuning disparity sensitivity in binocular RFs (67). The researchers compared the results of simulated learning in networks, trained with four different sequences of filtered input, to detect binocular disparities in pairs of visual images. 
They found that what they referred to as “developmental” sequences, namely those in which the initial training stage employed exclusively either low- or high-frequency images, consistently outperformed the others, in which the spectral content in the first stage of training was either random or identical to that of the following stage.

The adverse effects of HIA may also have relevance beyond the domain of vision. For instance, a fetus’ auditory experience in the womb comprises low-pass filtered versions of voices and other sounds in the external world (68). This kind of input may induce the development of extended temporal integration mechanisms so as to be able to detect envelope modulation of the auditory inputs. Significantly premature birth limits this low-frequency exposure and immerses the baby in an auditory environment teeming with high temporal frequencies. In a manner analogous to what we have proposed for high initial visual acuity, this abnormal auditory experience may lead to compromised hearing skills. Interestingly, premature birth is indeed found to be associated with higher order auditory impairments (69).

From the perspective of computer vision, our results point to an improved training strategy for enhancing the generalization performance of current CNN-based recognition systems and, more generally, illustrate the potential benefits of incorporating aspects of human developmental trajectories in designing training routines for machine-based systems.

The HIA hypothesis of potential benefits of exposure to reduced resolution imagery early in the developmental timeline raises interesting questions with basic and applied implications. A particularly far-reaching one regarding clinical practice relates to the issue of refractive correction following cataract surgery: When treating infants with cataracts, might it be better to leave their eyes without aphakic correction? We believe that a few considerations argue against changing the standard approach of providing aphakic correction, either via intraocular lenses or external glasses. First, uncorrected aphakia significantly degrades image resolution beyond the acuity limitations imposed by retinal immaturities in infants’ eyes. This is especially true of newborn eyes given that their mean lens power is significantly greater than that of adults (34.4 D vs. 18.8 D) (70). The excessively degraded images in the absence of lenticular correction may prove inadequate for helping to define RF structures. Interestingly, from this perspective, the level of image blur experienced by a typical newborn may be seen as occupying a “Goldilocks” spot: not too high, to prevent shrinkage of RFs, and not too low to contain enough image structure that can guide the development of RF morphology. Second, our simulations indicate that the period of low acuity needed to support extended spatial integration is quite short and needs to be followed by a period of improved acuity. Just extending the duration of poor vision by failing to provide aphakic correction would not be sufficient for obtaining classification performance gains. Also, leaving an eye without aphakic correction might set in motion mechanisms for changing eyeball shape that might permanently compromise visual quality (71). For these reasons, leaving an eye without aphakic correction might not be a recommended course of action following cataract surgery. 
However, it may be worth considering the option of initially undercorrecting the refractive errors and progressively moving toward full correction. In the context of ongoing efforts to provide sight-initiating surgeries for congenitally blind children (72), our results suggest that patients with high postoperative acuity (31) might benefit from a temporary period of optical blur via refractive undercorrection to induce the visual system to develop appropriate spatial integration strategies.

Broadening our scope to include children born without any ocular pathologies, the HIA hypothesis induces us to consider whether newborns who start with better-than-typical acuity might be at risk for developing impairments in configural analysis and, specifically, face recognition. A longitudinal study of many children, whose acuity can be characterized at birth and whose configural processing can be assessed a few years later, can help detect such a link.

Methods

Simulation 1 (Results Presented in Fig. 1).

We used 400 grayscale face images (112 × 92 pixels) taken from the AT&T ORL Database (73), comprising 10 instances of each of 40 individuals, allowing for matching across appearance transformations. All images were duplicated and subjected to a nine-pixel blur kernel such that the final set of 800 images consisted of two copies of each image: one in high resolution and one blurred. For each sample image that was used to query the rest of the database, we matched image patches from 2,604 locations, with 20 box sizes per location. For each image and face patch, we calculated the distance (L1 norm) between the given face patch, Fi, and all other corresponding face patches in each of the remaining images. Ideally, this returns distances to other instances of the same ID (“within ID” values) that are smaller than the distances to all other faces (“across ID” values). The z-score allows us to compare these within-ID distances with the across-ID distances; the larger the gap between the average distances, the larger the corresponding z-score for a given patch. Repeated across all images, this generates a distance matrix from which one can extract the average within-class z-score across all identities, for a given face patch. This is averaged across all patch locations to determine aggregate z-scores. For each image, at every location that a face patch is sampled from, we parametrically change the patch size to calculate the z-score as a function of query box size. By comparing these functions across high-resolution vs. blurred images, we characterize how different patch sizes contribute to achieving the same z-scores depending on image resolution.
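The patch-matching procedure can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the exact z-score formula is not given in the text, so here it is assumed to be the gap between the mean across-ID and mean within-ID L1 distances, divided by the SD of the across-ID distances. The function names and the tiny 2 × 2 toy "images" are hypothetical and stand in for the 112 × 92 face images.

```python
from statistics import mean, pstdev


def crop(image, top, left, size):
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def l1_distance(patch_a, patch_b):
    """Sum of absolute pixel differences between two equally sized patches."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
    )


def within_class_zscore(images, labels, query, top, left, size):
    """Assumed z-score: (mean across-ID - mean within-ID) / SD of across-ID.

    Larger values mean the query patch separates its own identity
    from all other identities more cleanly.
    """
    query_patch = crop(images[query], top, left, size)
    within, across = [], []
    for i, image in enumerate(images):
        if i == query:
            continue
        d = l1_distance(query_patch, crop(image, top, left, size))
        (within if labels[i] == labels[query] else across).append(d)
    spread = pstdev(across)
    if spread == 0:
        raise ValueError("across-ID distances have zero spread")
    return (mean(across) - mean(within)) / spread


# Toy data: five 2 x 2 "images" from two identities (made-up pixel values).
images = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 0]],
    [[9, 9], [9, 9]],
    [[8, 9], [9, 9]],
    [[9, 9], [9, 7]],
]
labels = [0, 0, 1, 1, 1]

# z-score for the same query image at two patch sizes.
z_size_1 = within_class_zscore(images, labels, query=0, top=0, left=0, size=1)
z_size_2 = within_class_zscore(images, labels, query=0, top=0, left=0, size=2)
```

Sweeping `size` (and the patch location) for both the high-resolution and the blurred copies of each image, then averaging, would give z-score curves of the kind plotted against region size in Fig. 1C.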

Simulation 2.

Data for training and testing the CNN.

A total of 50,429 face-cropped color images (100 × 100 pixels) belonging to 388 different face identities (average = 130 ± 20.4 images per class) were used for training and testing our CNN. The data were provided by the Facescrub image database (47), and all identities with fewer than 100 available images were rejected. Images were converted from color to grayscale, zero-centered to the overall mean luminance, and scaled by the overall SD. During training, images were randomly left-right flipped and rotated by up to 25°. To blur images, we applied a Gaussian filter with σ = 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4. The original images (σ = 0) are termed high-resolution. At a viewing distance of 60 cm, a head subtends ∼15° of visual angle. An individual with 20/20 acuity (30 cycles per degree) would be able to resolve 450 cycles across this extent, while a person with 20/600 acuity (1 cycle per degree) could resolve 15 cycles. With this basic constraint and given a face image width of w pixels (the specific value of w is dictated by the requirements of the input layer of the CNN), we computed a Gaussian filter that removed spatial frequencies higher than 1/(w/15) cycles per pixel. Convolving the image with this filter yielded an image approximating the effects of 20/600 acuity. The same procedure was applied for other acuity values. The data were split into 45,385 training and 5,044 test images, aiming at a 90/10% training/test data split while keeping a constant number of test instances per class (5,044/388 = 13 test images per identity).
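The acuity-to-blur arithmetic in this paragraph can be made concrete with a short sketch. The conversion from Snellen acuity to cycles across the face follows the numbers given in the text; the mapping from cutoff frequency to a Gaussian σ is not specified in the text, so the half-amplitude criterion used below is an assumption, and all function names are hypothetical.

```python
import math

FACE_EXTENT_DEG = 15.0  # head at a 60 cm viewing distance, as stated in the text

def cycles_per_face(snellen_denominator):
    """Cycles resolvable across the face for 20/<denominator> acuity."""
    cycles_per_degree = 30.0 * 20.0 / snellen_denominator  # 20/20 -> 30 cycles/deg
    return cycles_per_degree * FACE_EXTENT_DEG

def cutoff_cycles_per_pixel(snellen_denominator, width_px):
    """Highest retained spatial frequency for a face image width_px pixels wide."""
    return cycles_per_face(snellen_denominator) / width_px

def gaussian_sigma_px(cutoff):
    """Gaussian sigma (pixels) for a given cutoff in cycles per pixel.

    ASSUMPTION: sigma is chosen so that the filter's amplitude response,
    exp(-2 * pi^2 * sigma^2 * f^2), equals 0.5 at the cutoff frequency.
    """
    return math.sqrt(math.log(2.0) / 2.0) / (math.pi * cutoff)

for denominator in (20, 600):
    print(denominator, cycles_per_face(denominator))

cutoff = cutoff_cycles_per_pixel(600, width_px=100)  # 15 cycles over 100 pixels
print(cutoff, gaussian_sigma_px(cutoff))
```

A different attenuation criterion at the cutoff would yield a different σ; only the cycles-per-face values are taken directly from the text.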

Network architecture and parameters.

We utilized the AlexNet architecture (46), adjusted the network’s input size to 100 × 100 pixels, and changed the shape of the 96 kernels in the first layer from 11 × 11 to 22 × 22 pixels to allow more reliable RF analyses. The net contains five convolutional layers (ReLU activation function), followed by three fully connected layers (tanh activation function; last layer equipped with softmax, cross-entropy loss function, and momentum optimizer). Dropout of 50% is applied between the fully connected layers. The CNN was implemented with TFLearn, a deep learning library built on TensorFlow, and trained on a single GPU using a fixed learning rate of 0.001 and minibatches of 128 images for 500 epochs. The CNN instances were trained on images in high-resolution and at several different blur levels. In addition, we utilized a high-resolution-to-blurred configuration by training a CNN instance first on high-resolution and later on blurred images (for 250 epochs each). Training an instance in the reverse order provided us with a blurred-to-high-resolution configuration.
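The four training configurations can be summarized as a staged schedule. The sketch below only encodes the epoch counts and ordering stated in the text; the function name, the dictionary, and the use of σ = 0 to denote high-resolution input are illustrative conventions, and no actual network training is performed.

```python
# Hyperparameters stated in the text; the dictionary itself is only a convenience.
HYPERPARAMS = {"learning_rate": 0.001, "batch_size": 128, "total_epochs": 500}

def training_schedule(config, blur_sigma, total_epochs=500):
    """Return a list of (gaussian_sigma, n_epochs) training stages.

    sigma = 0.0 stands for the original, high-resolution images.
    """
    half = total_epochs // 2
    schedules = {
        "high-resolution": [(0.0, total_epochs)],
        "blurred": [(blur_sigma, total_epochs)],
        "high-resolution-to-blurred": [(0.0, half), (blur_sigma, half)],
        "blurred-to-high-resolution": [(blur_sigma, half), (0.0, half)],
    }
    if config not in schedules:
        raise ValueError(f"unknown configuration: {config}")
    return schedules[config]

for sigma, n_epochs in training_schedule("blurred-to-high-resolution", blur_sigma=4.0):
    # A real run would call the training routine here, continuing from the
    # weights left by the previous stage.
    print(f"train for {n_epochs} epochs with sigma = {sigma}")
```

Keeping the total at 500 epochs in every configuration means the staged and single-stage instances receive the same amount of training.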

Weight analysis.

To analyze the RFs of the CNN, we investigated the shape of the Gabor wavelets in the kernels of the first convolutional layer. To quantify their geometric properties, we considered the two most dominant neighboring ellipses, one containing positive and the other containing negative values. We defined the separation between these two ellipses as the critical metric capturing the RF size. We applied thresholding, erosion, and dilation to extract individual ellipses, and ran this process separately for positive and negative weights. For each sign, we kept only the detected ellipse with the largest area. We then determined the two points of largest Euclidean distance in each region. To measure RF size, we used the distance between the midpoint of the two extreme points of the smaller ellipse and the line spanned by the two extreme points of the larger ellipse. The procedure we use for fitting ellipses to the CNN “RFs” to characterize their structure and size is analogous to approaches adopted in neurophysiology wherein parametric forms, such as Gaussian or Gabor functions, are fit to empirically recorded RF maps (74, 75). We decided to fit each subfield with an ellipse, rather than one functional form to the overall RF, so as not to impose any preconceptions on the overall CNN RF organization.
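The geometric part of this measurement can be sketched as follows, assuming the positive and negative subfields have already been extracted as boolean masks (the thresholding, erosion, and dilation steps are omitted). Function names are hypothetical, and the pixel regions themselves stand in for the fitted ellipses.

```python
import numpy as np

def farthest_pair(mask):
    """Return the two pixels (row, col) of a region that are farthest apart."""
    pts = np.argwhere(mask).astype(float)
    diff = pts[:, None, :] - pts[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)          # squared pairwise distances
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    return pts[i], pts[j]

def rf_size(pos_mask, neg_mask):
    """Distance from the midpoint of the smaller region's extreme points
    to the line through the larger region's extreme points.

    Assumes the larger region contains at least two pixels.
    """
    if pos_mask.sum() >= neg_mask.sum():
        big, small = pos_mask, neg_mask
    else:
        big, small = neg_mask, pos_mask
    a, b = farthest_pair(big)
    p, q = farthest_pair(small)
    mid = (p + q) / 2.0
    ab = b - a
    # Perpendicular point-to-line distance via the 2D cross product.
    cross = ab[0] * (mid[1] - a[1]) - ab[1] * (mid[0] - a[0])
    return abs(cross) / np.hypot(ab[0], ab[1])

# Toy 22 x 22 kernel: a long positive bar on row 5, a short negative bar on row 11.
pos = np.zeros((22, 22), dtype=bool)
neg = np.zeros((22, 22), dtype=bool)
pos[5, 2:20] = True
neg[11, 6:16] = True
print(rf_size(pos, neg))
```

In this toy case the two bars are parallel and six rows apart, so the measured separation is six pixels.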

Since some but not all of the 96 kernels in the first layer displayed a reliable Gabor pattern, we took into account the 10, 15, 20, and 24 kernels showing the most dominant structure (assessed using SD). In all configurations, the color mapping was scaled by a fixed value representing the inner 99% of the histogram of all pixel values in the first layer of the CNN instance trained on high-resolution images.
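A minimal sketch of this selection step, assuming that "most dominant structure" means the largest SD of a kernel's weights and that the "inner 99%" refers to the 0.5th to 99.5th percentile range of the reference weights; both readings are assumptions, as are the function names.

```python
import numpy as np

def top_kernels_by_sd(kernels, k):
    """Indices of the k kernels with the largest weight SD, in descending order.

    kernels: array of shape (n_kernels, height, width).
    """
    sds = kernels.reshape(len(kernels), -1).std(axis=1)
    return np.argsort(sds)[::-1][:k]

def display_limits(reference_weights, inner_fraction=0.99):
    """Lower and upper display limits covering the inner fraction of the weights.

    ASSUMPTION: the limits are symmetric percentiles of the reference histogram.
    """
    tail = 100.0 * (1.0 - inner_fraction) / 2.0
    low, high = np.percentile(reference_weights, [tail, 100.0 - tail])
    return float(low), float(high)

# Toy example: four 2 x 2 kernels with increasing contrast in kernels 1, 3, 2.
toy = np.stack([
    np.zeros((2, 2)),
    np.array([[0.0, 1.0], [0.0, 1.0]]),
    np.array([[0.0, 5.0], [0.0, 5.0]]),
    np.array([[0.0, 3.0], [0.0, 3.0]]),
])
print(top_kernels_by_sd(toy, 2))
print(display_limits(toy))
```

Computing the limits once, from the instance trained on high-resolution images, and reusing them for every configuration keeps the displayed kernels visually comparable.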

Acknowledgments

We thank Drs. Shlomit Ben-Ami, Rachel Robbins, Bas Rokers, Jitendra Sharma, and Galit Yovel, and specially acknowledge the contributions of the late Dr. R.H. to this work and, more broadly, to the fields of visual perception and development. This work was supported by the Nick Simons Foundation and National Eye Institute Grant EYR01020517 (to P.S.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. D.G. is a guest editor invited by the Editorial Board.

References

  • 1. Geldart S, Mondloch CJ, Maurer D, de Schonen S, Brent H. The effects of early visual deprivation on the development of face processing. Dev Sci. 2002;5:490–501.
  • 2. Putzar L, Hötting K, Röder B. Early visual deprivation affects the development of face recognition and of audio-visual speech perception. Restor Neurol Neurosci. 2010;28:251–257. doi: 10.3233/RNN-2010-0526.
  • 3. de Heering A, Maurer D. Face memory deficits in patients deprived of early visual input by bilateral congenital cataracts. Dev Psychobiol. 2014;56:96–108. doi: 10.1002/dev.21094.
  • 4. Rivolta D. Cognitive and neural aspects of face processing. In: Prosopagnosia. Springer; Berlin: 2013. pp. 19–40.
  • 5. Röder B, Ley P, Shenoy BH, Kekunnaya R, Bottari D. Sensitive periods for the functional specialization of the neural system for human face processing. Proc Natl Acad Sci USA. 2013;110:16760–16765. doi: 10.1073/pnas.1309963110.
  • 6. de Schonen S, Mathivet H. First come, first served: A scenario about the development of hemispheric specialization in face recognition during early infancy. Eur Bull Cogn Psychol. 1989;9:3–44.
  • 7. Nelson CA. The development and neural bases of face recognition. Infant Child Dev. 2001;10:3–18.
  • 8. Arcaro MJ, Schade PF, Vincent JL, Ponce CR, Livingstone MS. Seeing faces is necessary for face-domain formation. Nat Neurosci. 2017;20:1404–1412. doi: 10.1038/nn.4635.
  • 9. Ostrovsky Y, Andalman A, Sinha P. Vision following extended congenital blindness. Psychol Sci. 2006;17:1009–1014. doi: 10.1111/j.1467-9280.2006.01827.x.
  • 10. Ostrovsky Y, Meyers E, Ganesh S, Mathur U, Sinha P. Visual parsing after recovery from blindness. Psychol Sci. 2009;20:1484–1491. doi: 10.1111/j.1467-9280.2009.02471.x.
  • 11. Held R, et al. The newly sighted fail to match seen with felt. Nat Neurosci. 2011;14:551–553. doi: 10.1038/nn.2795.
  • 12. Gandhi TK, Ganesh S, Sinha P. Improvement in spatial imagery following sight onset late in childhood. Psychol Sci. 2014;25:693–701. doi: 10.1177/0956797613513906.
  • 13. Levi DM, Carkeet AD. Amblyopia: A consequence of abnormal visual development. In: Simons K, editor. Early Visual Development: Normal and Abnormal. Oxford Univ Press; New York: 1993. pp. 391–408.
  • 14. Lewis TL, Maurer D. Effects of early pattern deprivation on visual development. Optom Vis Sci. 2009;86:640–646. doi: 10.1097/OPX.0b013e3181a7296b.
  • 15. Huttenlocher PR, de Courten C, Garey LJ, Van der Loos H. Synaptogenesis in human visual cortex–Evidence for synapse elimination during normal development. Neurosci Lett. 1982;33:247–252. doi: 10.1016/0304-3940(82)90379-2.
  • 16. Banks MS, Crowell JA. Front-end limitations to infant spatial vision: Examination of two analyses. In: Simons K, editor. Early Visual Development: Normal and Abnormal. Oxford Univ Press; New York: 1993. pp. 91–116.
  • 17. Wilson HR. Theories of infant visual development. In: Simons K, editor. Early Visual Development: Normal and Abnormal. Oxford Univ Press; New York: 1993. pp. 560–569.
  • 18. Dobson V, Teller DY. Visual acuity in human infants: A review and comparison of behavioral and electrophysiological studies. Vision Res. 1978;18:1469–1483. doi: 10.1016/0042-6989(78)90001-9.
  • 19. Sokol S. Measurement of infant visual acuity from pattern reversal evoked potentials. Vision Res. 1978;18:33–39. doi: 10.1016/0042-6989(78)90074-3.
  • 20. Courage ML, Adams RJ. Visual acuity assessment from birth to three years using the acuity card procedure: Cross-sectional and longitudinal samples. Optom Vis Sci. 1990;67:713–718. doi: 10.1097/00006324-199009000-00011.
  • 21. Banks MS, Bennett PJ. Optical and photoreceptor immaturities limit the spatial and chromatic vision of human neonates. J Opt Soc Am A. 1988;5:2059–2079. doi: 10.1364/josaa.5.002059.
  • 22. Candy TR, Banks MS. Use of an early nonlinearity to measure optical and receptor resolution in the human infant. Vision Res. 1999;39:3386–3398. doi: 10.1016/s0042-6989(99)00035-8.
  • 23. Yuodelis C, Hendrickson A. A qualitative and quantitative analysis of the human fovea during development. Vision Res. 1986;26:847–855. doi: 10.1016/0042-6989(86)90143-4.
  • 24. Jacobs DS, Blakemore C. Factors limiting the postnatal development of visual acuity in the monkey. Vision Res. 1988;28:947–958. doi: 10.1016/0042-6989(88)90104-6.
  • 25. Kiorpes L, Movshon JA. Neural limitations on visual development in primates. In: Chalupa LM, Werner JS, editors. The Visual Neurosciences. MIT Press; Cambridge, MA: 2004. pp. 158–173.
  • 26. Daw N. Visual Development. 3rd Ed. Springer; New York: 2014.
  • 27. Boas JAR, Ramsey RL, Riesen AH, Walker JP. Absence of change in some measures of cortical morphology in dark-reared adult rats. Psychon Sci. 1969;15:251–252.
  • 28. Hendrickson A, Boothe R. Morphology of the retina and dorsal lateral geniculate nucleus in dark-reared monkeys (Macaca nemestrina). Vision Res. 1976;16:517–521. doi: 10.1016/0042-6989(76)90033-x.
  • 29. Medsinge A, Nischal KK. Pediatric cataract: Challenges and future directions. Clin Ophthalmol. 2015;9:77–90. doi: 10.2147/OPTH.S59009.
  • 30. Lambert SR. Changes in ocular growth after pediatric cataract surgery. Dev Ophthalmol. 2016;57:29–39. doi: 10.1159/000442498.
  • 31. Ganesh S, et al. Results of late surgical intervention in children with early-onset bilateral cataracts. Br J Ophthalmol. 2014;98:1424–1428. doi: 10.1136/bjophthalmol-2013-304475.
  • 32. Ma F, Ren M, Wang L, Wang Q, Guo J. Visual outcomes of dense pediatric cataract surgery in eastern China. PLoS One. 2017;12:e0180166. doi: 10.1371/journal.pone.0180166.
  • 33. Maurer D, Lewis TL, Brent HP, Levin AV. Rapid improvement in the acuity of infants after visual input. Science. 1999;286:108–110. doi: 10.1126/science.286.5437.108.
  • 34. Jacobson SG, Mohindra I, Held R. Development of visual acuity in infants with congenital cataracts. Br J Ophthalmol. 1981;65:727–735. doi: 10.1136/bjo.65.10.727.
  • 35. Hartmann EE, Lynn MJ, Lambert SR; Infant Aphakia Treatment Study Group. Baseline characteristics of the infant aphakia treatment study population: Predicting recognition acuity at 4.5 years of age. Invest Ophthalmol Vis Sci. 2014;56:388–395. doi: 10.1167/iovs.14-15464.
  • 36. Lesueur LC, Arné JL, Chapotot EC, Thouvenin D, Malecaze F. Visual outcome after paediatric cataract surgery: Is age a major factor? Br J Ophthalmol. 1998;82:1022–1025. doi: 10.1136/bjo.82.9.1022.
  • 37. Latif K, Shakir M, Zafar S, Rizvi SF, Naz S. Outcomes of congenital cataract surgery in a tertiary care hospital. Pak J Ophthalmol. 2015;30:28–32.
  • 38. Kwon M, Liu R, Chien L. Compensation for blur requires increase in field of view and viewing time. PLoS One. 2016;11:e0162711. doi: 10.1371/journal.pone.0162711.
  • 39. Smith FW, Schyns PG. Smile through your fear and sadness: Transmitting and identifying facial expression signals over a range of viewing distances. Psychol Sci. 2009;20:1202–1208. doi: 10.1111/j.1467-9280.2009.02427.x.
  • 40. Goffaux V, Hault B, Michel C, Vuong QC, Rossion B. The respective role of low and high spatial frequencies in supporting configural and featural processing of faces. Perception. 2005;34:77–86. doi: 10.1068/p5370.
  • 41. Young AW, Hellawell D, Hay DC. Configurational information in face perception. Perception. 1987;16:747–759. doi: 10.1068/p160747.
  • 42. Peterson MA, Rhodes G. Perception of Faces, Objects, and Scenes: Analytic and Holistic Processes. Oxford Univ Press; Oxford: 2003.
  • 43. Maurer D, Mondloch CJ, Lewis TL. Sleeper effects. Dev Sci. 2007;10:40–47. doi: 10.1111/j.1467-7687.2007.00562.x.
  • 44. Le Grand R, Mondloch CJ, Maurer D, Brent HP. Neuroperception. Early visual experience and face processing. Nature. 2001;410:890. doi: 10.1038/35073749.
  • 45. Le Grand R, Mondloch CJ, Maurer D, Brent HP. Impairment in holistic face processing following early visual deprivation. Psychol Sci. 2004;15:762–768. doi: 10.1111/j.0956-7976.2004.00753.x.
  • 46. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems. Curran Assoc; Red Hook, NY: 2012. pp. 1097–1105.
  • 47. Ng HW, Winkler S. A data-driven approach to cleaning large face datasets. In: International Conference on Image Processing (ICIP). IEEE; Paris: 2014. pp. 343–347.
  • 48. Kiorpes L, Movshon JA. Peripheral and central factors limiting the development of contrast sensitivity in macaque monkeys. Vision Res. 1998;38:61–70. doi: 10.1016/s0042-6989(97)00155-7.
  • 49. Yamins DLK, DiCarlo JJ. Using goal-driven deep learning models to understand sensory cortex. Nat Neurosci. 2016;19:356–365. doi: 10.1038/nn.4244.
  • 50. Cynader M, Berman N, Hein A. Recovery of function in cat visual cortex following prolonged deprivation. Exp Brain Res. 1976;25:139–156. doi: 10.1007/BF00234899.
  • 51. Fagiolini M, Pizzorusso T, Berardi N, Domenici L, Maffei L. Functional postnatal development of the rat primary visual cortex and the role of visual experience: Dark rearing and monocular deprivation. Vision Res. 1994;34:709–720. doi: 10.1016/0042-6989(94)90210-0.
  • 52. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–444. doi: 10.1038/nature14539.
  • 53. Swindale NV, Mitchell DE. Comparison of receptive field properties of neurons in area 17 of normal and bilaterally amblyopic cats. Exp Brain Res. 1994;99:399–410. doi: 10.1007/BF00228976.
  • 54. Witthoft N, et al. Reduced spatial integration in the ventral visual cortex underlies face recognition deficits in developmental prosopagnosia. bioRxiv:10.1101/051102. Preprint, posted April 29, 2016.
  • 55. Ellemberg D, Lewis TL, Maurer D, Brar S, Brent HP. Better perception of global motion after monocular than after binocular deprivation. Vision Res. 2002;42:169–179. doi: 10.1016/s0042-6989(01)00278-4.
  • 56. McKyton A, Ben-Zion I, Doron R, Zohary E. The limits of shape recognition following late emergence from blindness. Curr Biol. 2015;25:2373–2378. doi: 10.1016/j.cub.2015.06.040.
  • 57. Robbins RA, Nishimura M, Mondloch CJ, Lewis TL, Maurer D. Deficits in sensitivity to spacing after early visual deprivation in humans: A comparison of human faces, monkey faces, and houses. Dev Psychobiol. 2010;52:775–781. doi: 10.1002/dev.20473.
  • 58. Robbins RA, Shergill Y, Maurer D, Lewis TL. Development of sensitivity to spacing versus feature changes in pictures of houses: Evidence for slow development of a general spacing detection mechanism? J Exp Child Psychol. 2011;109:371–382. doi: 10.1016/j.jecp.2011.02.004.
  • 59. Patel DE, et al.; OPTIC Study Group. Study of optimal perimetric testing in children (OPTIC): Normative visual field values in children. Ophthalmology. 2015;122:1711–1717. doi: 10.1016/j.ophtha.2015.04.038.
  • 60. Dobkins KR, Anderson CM, Kelly J. Development of psychophysically-derived detection contours in L- and M-cone contrast space. Vision Res. 2001;41:1791–1807. doi: 10.1016/s0042-6989(01)00070-0.
  • 61. Lewkowicz DJ. Perception of auditory-visual temporal synchrony in human infants. J Exp Psychol Hum Percept Perform. 1996;22:1094–1106. doi: 10.1037//0096-1523.22.5.1094.
  • 62. Colombo J. The development of visual attention in infancy. Annu Rev Psychol. 2001;52:337–367. doi: 10.1146/annurev.psych.52.1.337.
  • 63. Parker DM, Lishman JR, Hughes J. Temporal integration of spatially filtered visual images. Perception. 1992;21:147–160. doi: 10.1068/p210147.
  • 64. Goffaux V, et al. From coarse to fine? Spatial and temporal dynamics of cortical face processing. Cereb Cortex. 2011;21:467–476. doi: 10.1093/cercor/bhq112.
  • 65. Wilson HR, Blake R, Halpern DL. Coarse spatial scales constrain the range of binocular fusion on fine scales. J Opt Soc Am A. 1991;8:229–236. doi: 10.1364/josaa.8.000229.
  • 66. Rohaly AM, Wilson HR. Nature of coarse-to-fine constraints on binocular fusion. J Opt Soc Am A Opt Image Sci Vis. 1993;10:2433–2441. doi: 10.1364/josaa.10.002433.
  • 67. Dominguez M, Jacobs RA. Developmental constraints aid the acquisition of binocular disparity sensitivities. Neural Comput. 2003;15:161–182. doi: 10.1162/089976603321043748.
  • 68. Griffiths SK, Brown WS, Jr, Gerhardt KJ, Abrams RM, Morris RJ. The perception of speech sounds recorded within the uterus of a pregnant sheep. J Acoust Soc Am. 1994;96:2055–2063. doi: 10.1121/1.410147.
  • 69. Ragó A, Honbolygó F, Róna Z, Beke A, Csépe V. Effect of maturation on suprasegmental speech processing in full- and preterm infants: A mismatch negativity study. Res Dev Disabil. 2014;35:192–202. doi: 10.1016/j.ridd.2013.10.006.
  • 70. Gordon RA, Donzis PB. Refractive development of the human eye. Arch Ophthalmol. 1985;103:785–789. doi: 10.1001/archopht.1985.01050060045020.
  • 71. Baily C, O’Keefe M. Paediatric aphakic glaucoma. J Clin Exp Ophthalmol. 2012;3:203.
  • 72. Sinha P, Held R. Sight-restoration. F1000 Med Rep. 2012;4:17. doi: 10.3410/M4-17.
  • 73. Samaria F, Harter A. Parameterisation of a stochastic model for human face identification. In: Proceedings of 2nd IEEE Workshop on Applications of Computer Vision. IEEE; Sarasota, FL: 1994.
  • 74. Ringach DL. Spatial structure and symmetry of simple-cell receptive fields in macaque primary visual cortex. J Neurophysiol. 2002;88:455–463. doi: 10.1152/jn.2002.88.1.455.
  • 75. Niell CM, Stryker MP. Highly selective receptive fields in mouse visual cortex. J Neurosci. 2008;28:7520–7536. doi: 10.1523/JNEUROSCI.0623-08.2008.

