PLOS Computational Biology. 2021 Feb 4;17(2):e1008548. doi: 10.1371/journal.pcbi.1008548

Functional parcellation of mouse visual cortex using statistical techniques reveals response-dependent clustering of cortical processing areas

Mari Ganesh Kumar 1,*,#, Ming Hu 2,*,#, Aadhirai Ramanujan 1, Mriganka Sur 2,‡,*, Hema A Murthy 1,‡,*
Editor: Blake A Richards
PMCID: PMC7888605  PMID: 33539361

Abstract

The visual cortex of the mouse brain can be divided into ten or more areas that each contain complete or partial retinotopic maps of the contralateral visual field. It is generally assumed that these areas represent discrete processing regions. In contrast to the conventional input-output characterizations of neuronal responses to standard visual stimuli, here we asked whether six of the core visual areas have responses that are functionally distinct from each other for a given visual stimulus set, by applying machine learning techniques to distinguish the areas based on their activity patterns. Visual areas defined by retinotopic mapping were examined using supervised classifiers applied to responses elicited by a range of stimuli. Using two distinct datasets obtained using wide-field and two-photon imaging, we show that the area labels predicted by the classifiers were highly consistent with the labels obtained using retinotopy. Furthermore, the classifiers were able to model the boundaries of visual areas using resting state cortical responses obtained without any overt stimulus, in both datasets. With the wide-field dataset, clustering neuronal responses using a constrained semi-supervised classifier showed graceful degradation of accuracy. The results suggest that responses from visual cortical areas can be classified effectively using data-driven models. These responses likely reflect unique circuits within each area that give rise to activity with stronger intra-areal than inter-areal correlations, and their responses to controlled visual stimuli across trials drive higher areal classification accuracy than resting state responses.

Author summary

The visual cortex has a prominent role in the processing of visual information by the brain. Previous work has segmented the mouse visual cortex into different areas based on the organization of retinotopic maps. Here, we collect responses of the visual cortex to various types of stimuli and ask if we could discover unique clusters from this dataset using machine learning methods. The retinotopy based area borders are used as ground truth to compare the performance of our clustering algorithms. We show our results on two datasets, one collected by the authors using wide-field imaging and another a publicly available dataset collected using two-photon imaging. The proposed supervised approach is able to predict the area labels accurately using neuronal responses to various visual stimuli. Following up on these results using visual stimuli, we hypothesized that each area of the mouse brain has unique responses that can be used to classify the area independently of stimuli. Experiments using resting state responses, without any overt stimulus, confirm this hypothesis. Such activity-based segmentation of the mouse visual cortex suggests that large-scale imaging combined with a machine learning algorithm may enable new insights into the functional organization of the visual cortex in mice and other species.

Introduction

Visual cortex of higher mammals can be segmented into different functional visual areas. Each area has a distinct representation of the visual field and presumably a unique contribution to visual information processing. Historically, the functions of multiple cortical visual areas have been studied in non-human primates, which have well-defined areal parcellations based on visual field representations [1–4]. In the past few years, it has become clear that the mouse visual cortex can also be divided into different visual areas based on retinotopic organization. Indeed, mice have emerged as important models for studying the structure, function, and development of visual cortical circuits owing to their size, cost, and amenability to genetic perturbations [5].

Different methods have been used to define retinotopically organized visual cortical areas of the mouse brain. These methods include electrophysiological recording of receptive fields [6], intrinsic signal imaging of visual field maps [7–9], voltage-sensitive dye imaging [10], and two-photon calcium imaging of receptive fields and maps [11, 12]. These techniques rely on retinotopic maps within representations of the contralateral visual field to derive visual areas. Precise visual area boundaries can be identified based on the sign of visual field representations [1, 9, 13–18], based on the principle that adjacent visual areas share a common representation of either the vertical or horizontal meridian and have essentially mirror-imaged maps across the common border. However, it is not clear if each of these retinotopically defined regions also has a unique functional role in processing visual information. Here, we use data-driven classifiers for studying the six most reliably identified visual areas in mice, namely, primary visual cortex (V1), lateromedial area (LM), anterolateral area (AL), rostrolateral area (RL), anteromedial area (AM) and posteromedial area (PM).

In contrast to the classical approach of studying these different visual areas using their neuronal tuning properties to different stimuli, this paper attempts to study whether the visual area responses can be differentiated from each other for a given stimulus set utilizing data-driven approaches. We use datasets obtained using wide-field and two-photon imaging. Previous studies have used data-driven approaches such as convolutional neural networks (CNNs) [19] and localized semi-nonnegative matrix factorization (LocalNMF) [20] to derive insights from mouse visual cortex responses obtained using wide-field imaging. In this work, we use principal component analysis (PCA) and linear discriminant analysis (LDA) as dimension reduction techniques to define a subspace that discriminates between neurons/pixels from different visual areas.

The wide-field dataset consists of responses of visual cortex to stimuli such as drifting gratings (varying orientation, spatial frequency, and temporal frequency) and natural movies, collected using single-photon wide-field imaging. This imaging technique enables us to acquire data from a large field of view, albeit at a spatial resolution where, owing to light scattering, each pixel pools the activity of multiple neurons. The retinotopic border of each area gives a segmentation of visual areas, which is used as ground truth for machine learning models. Utilizing this ground truth information, data-driven models are built using supervised and semi-supervised approaches to identify the visual area boundaries.

The population responses are first projected to a lower dimension space using techniques such as principal component analysis (PCA) and linear discriminant analysis (LDA). The data-driven models are trained using the reduced dimension representation and tested on unseen data. We find that supervised data-driven approaches are able to identify visual area borders with high accuracy. This supervised pipeline is then applied to a publicly available dataset provided by the Allen Institute for Brain Science [21] (available from: http://observatory.brain-map.org/visualcoding). This dataset is collected using two-photon microscopy and consists of individual neuronal responses recorded from the six visual areas. With this dataset, we get an accuracy significantly better than random chance. Resting state or spontaneous responses, obtained without overt visual stimuli, from both datasets are also able to classify areal borders, though classification is better with trial-averaged visually driven responses. The findings suggest that the activity patterns of different visual areas can be used to reliably and accurately classify their borders. Correlation analyses indicate that intra-area correlations are significantly higher than inter-area correlations for both datasets, providing a basis for the classification. We further validate these results by removing retinotopic information gradually from the wide-field training data. The results indicate that different visual areas can be distinguished by statistical characteristics expressed in their visually-driven or resting state activity.

1 Materials and methods

1.1 Ethics statement

The experiments for collecting the wide-field dataset (Section 1.2.1) were carried out under protocols approved by MIT’s Animal Care and Use Committee (Protocol Approval Number: 1020-099-23) and conform to NIH guidelines.

1.2 Datasets

This section briefly describes two datasets collected using wide-field and two-photon imaging, respectively. In both datasets, the mouse visual cortex was first partitioned into different visual areas using a retinotopic map [9]. The wide-field dataset was collected by the authors on awake, head-fixed mice that transgenically express GCaMP6f or GCaMP6s. Since this dataset was collected using single-photon wide-field imaging, the spatial resolution is limited to the pixels of the microscopic image. The two-photon dataset is a public dataset released by the Allen Institute for Brain Science. Unlike the wide-field imaging dataset, this dataset has individual neuronal responses recorded from different areas and mice.

1.2.1 Wide-field dataset

Animals and surgery. The dataset for this work was collected from five adult mice (>8 weeks old) of either sex. These mice expressed GCaMP6f or GCaMP6s in excitatory neurons of the forebrain. The mouse lines were generated by crossing Ai93 (for GCaMP6f) and Ai94 (for GCaMP6s) with Emx1-IRES-Cre lines from Jackson Labs. The surgical procedure for implanting the head-post and imaging window was similar to that previously described [22]. Mice were anesthetized using isoflurane (3% induction, 1.5-2% during surgery). A custom-built metal head-post was attached to the skull using dental cement, a 5 mm diameter craniotomy was performed over visual cortex of the left hemisphere, and a glass coverslip was glued over the opening (Fig 1A). Care was taken not to rupture the dura mater. The core body temperature was maintained at 37.5°C using a heating blanket (Harvard Apparatus). After recovery from surgery, mice were acclimatized to head fixation and then imaged while awake.

Fig 1. Experimental setup.


A) Diagrammatic representation of a mouse prepared for wide-field imaging. B) Diagrammatic representation of the custom-made one-photon wide-field imaging setup along with the display screen. C) Display configuration relative to imaging the left cortex. D) Visual cortex map showing the sign of visual field representations along with areal boundaries for six visual areas used in this study.

Imaging and visual stimulation. The imaging device used to prepare this dataset was a custom-made one-photon wide-field microscope. The light emitted from a blue LED (Thorlabs) was used to excite the GCaMP, and a monochrome CCD camera (Thorlabs) collected the emitted fluorescent signal (Fig 1B). Cortical responses were recorded at 1392 × 1040 resolution, with a spatial resolution of 200 pixels per mm of cortex, at a maximal frame rate of 20 Hz.

Visual stimuli were presented to head-fixed mice using a large display screen placed perpendicular to the right retina, at an angle of 30° relative to the body axis of the animal. The visual display subtended 131° horizontal × 108° vertical at 12 cm eye-to-screen distance. The display was gamma-corrected, and placement was ensured to cover nearly the entire contralateral visual field (Fig 1C). The mean luminance of the screen was kept at ≈55 cd/m².

Retinotopic mapping of the visual cortex was first performed using periodic moving bars with checkerboard texture that were 14° wide and spanned the width or height of the monitor. The boundaries of the 6 core visual areas were defined according to procedures described in [9] (Fig 1D). In S1 Fig, we show the horizontal and vertical retinotopy for all five mice along with the area borders. These retinotopically defined boundaries were considered as ground truth for delineating visual areas based on responses to different visual stimuli.

Table 1 gives a summary of different stimuli that were presented. The stimulus set included drifting sinusoidal gratings of different orientations and movement directions, with varying spatial and temporal frequencies; and four natural movies chosen from the Van Hateren movie database [23]. For each movie, additional noisy versions were created by perturbing their spatial correlations, as demonstrated in [22].

Table 1. Summary of different stimuli shown to mice.
S. No Stimuli Name Description
1 Directions/Orientation 16 different sinusoidal gratings with varying directions from 0° to 360° with a step of 22.5°. The spatial and temporal frequencies were fixed at 0.03 cycles/degree and 3 Hz, respectively. Michelson contrast of 0.8 was used.
2 Spatial-Frequency 5 different sinusoidal gratings with spatial frequency increasing exponentially from 0.01 cycles/degree to 0.16 cycles/degree. The temporal frequency was fixed at 3 Hz. For each spatial frequency, the direction was varied from 0° to 360° with a step of 45°. Michelson contrast of 0.8 was used.
3 Temporal-Frequency 5 different sinusoidal gratings with temporal frequency increasing exponentially from 0.5 Hz to 8 Hz. The spatial frequency was fixed at 0.03 cycles/degree. For each temporal frequency, the direction was varied from 0° to 360° with a step of 45°. Michelson contrast of 0.8 was used.
4 Natural Movies 4 different movies with natural scenes. For each movie, additional noisy versions were created by perturbing their spatial correlations, as demonstrated in [22].

The different stimuli mentioned in Table 1 were presented 10 times in a block design. In each block, a random permutation of the different stimuli was used (for example, in the case of directional stimuli, gratings drifting in different directions were presented in random order for each block or trial). The duration of each natural movie stimulus was 4 secs, and that of all the other stimuli was 2 secs. In between two consecutive stimuli, a blank grey screen was shown for 2 secs to capture the baseline responses. All imaging was done with awake head-fixed mice at rest. In addition to the stimuli mentioned in Table 1, resting state responses were also collected for 15 mins while awake head-fixed mice rested in complete darkness.

1.2.2 Two-photon dataset

This dataset is a subset of the Allen Brain Observatory dataset, available publicly at http://observatory.brain-map.org/visualcoding, as part of the Allen Mouse Brain Atlas [21]. This dataset has individual neuronal responses recorded from mice of different transgenic Cre-lines. This work uses the Cre-lines “Emx1-IRES” (the whole cortex) and “Nr5a1” (layer 4 specific), which have recordings from all six visual areas.

The neuronal responses to three different natural movies from this dataset were used in our data-driven analysis. The durations of these three movies were 30, 30, and 120 secs, respectively. It is to be noted that the set of natural movies presented in this dataset is different from the ones used for collecting the wide-field dataset. In addition to natural movie responses, we also use the spontaneous/resting state activity of the cells. Resting state activity in this dataset was collected for 5 mins with a plain grey screen.

For each transgenic line, the dataset had neuronal responses of various mice collected using four different session types. In each session, different stimuli were used. For Natural Movies 1, 3, and spontaneous activity, all the neurons that were recorded during “Session A” from a particular transgenic line were selected for analysis. Similarly, for Natural Movie 2, all the neurons from “Session C2” were selected for analysis. Detailed information on different sessions, stimuli, and the data-collection procedure is given in [24]. Since this dataset consists of individual neuronal responses, the numbers of neurons from each area varied from session to session. In Table 2, we detail the number of neurons available from each area for the Cre-lines “Emx1-IRES” and “Nr5a1”. Similar details for other Cre-lines in the Allen Brain Observatory dataset are shown in S1 Table and the corresponding results are given in S2 Table.

Table 2. Number of neurons available for analysis for each Cre-line and session from the Allen Institute dataset.
Cre-line AL LM RL AM PM V1
Emx1-IRES (Session A) 1235 1446 1963 241 536 2199
Emx1-IRES (Session C2) 1148 1238 2085 226 552 964
Nr5a1 (Session A) 178 256 1074 110 203 441
Nr5a1 (Session C2) 106 267 1023 115 234 149

1.3 Feature extraction from neuronal responses

In a traditional setting, the selectivity of neurons to a given visual stimulus set is characterized by their tuning curves: the averaged firing rate expressed as a function of one or more parameters describing the stimuli. The problem with this approach is that the response of a neuron to a particular stimulus is reduced to a single scalar value. This section proposes an alternative way to represent the selectivities of neurons, using dimensionality reduction techniques such as principal component analysis (PCA) and linear discriminant analysis (LDA).

1.3.1 Principal component analysis

PCA is a statistical dimensionality reduction technique which can be used to find directions of maximum variability. PCA was first introduced in [25] as an analog of the principal axis theorem and was later developed further in [26]. PCA is used extensively in diverse fields, from neuroscience to physics, because it is a simple, non-parametric method of obtaining relevant information from complex datasets. Mathematically, PCA can be defined as an orthogonal linear transform from a set of possibly correlated bases into a set of orthogonal bases called principal components. The principal components are the eigenvectors of the covariance matrix, obtained by solving the eigenvalue problem:

$\Sigma = E[(X - \bar{\mu})(X - \bar{\mu})^t]$  (1)
$\Sigma V = V D$  (2)

where X is the data matrix with each column corresponding to an instance of the data, $\bar{\mu}$ is the mean of the data matrix, $\Sigma$ is the covariance matrix, V is the matrix whose columns are the eigenvectors of $\Sigma$, and D is a diagonal matrix of eigenvalues. Any given data instance $x_i$ can be represented as a weighted sum of the principal components.

$x_i = \sum_{n=1}^{N} \alpha_{in} v_n$  (3)

where N is the total number of principal components and $\alpha_{in}$ is the contribution of the n-th principal component $v_n$ to the i-th example. The principal component corresponding to the largest eigenvalue captures the maximum variance present in the dataset. For any given K, the reconstruction from the principal components corresponding to the top K eigenvalues, weighted by the corresponding coefficients $\alpha_{in}$, can be shown to give the minimum reconstruction error [27].

PCA is a general technique for reducing dimensionality and has been widely used for modeling neuronal responses [28–32]. In this paper, we use PCA to reduce the response length (time series) of a neuronal population to a lower dimension by accumulating statistics across neurons. The PCA features computed in this way represent the neuronal response in a lower-dimensional space.
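As an illustration of this step, here is a minimal sketch using scikit-learn, assuming `responses` is an (n_pixels × n_timepoints) array of trial-averaged responses; the array shape and the choice of 50 components are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data: one trial-averaged dF/F time series per pixel.
rng = np.random.default_rng(0)
responses = rng.standard_normal((5000, 400))   # (n_pixels, n_timepoints)

pca = PCA(n_components=50)                     # component count is a free choice
features = pca.fit_transform(responses)        # (n_pixels, 50) PCA features
print(f"variance retained: {pca.explained_variance_ratio_.sum():.2f}")
```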

1.3.2 Linear discriminant analysis

LDA is a supervised dimension reduction technique. PCA finds a set of directions explaining the variance in the dataset without using any class labels. On the other hand, LDA uses the class labels to find a linear projection that discriminates them [33].

Let C be the total number of visual areas. The scatter between different visual areas can be explained by the matrix $\Sigma_b$ as given in Eq 4

$\Sigma_b = \frac{1}{C} \sum_{c=1}^{C} (\bar{\mu}_c - \bar{\mu})(\bar{\mu}_c - \bar{\mu})^t$  (4)

where $\bar{\mu}_c$ is the mean of the responses of visual area c, and $\bar{\mu}$ is the mean of the entire dataset. The LDA projection matrix $\hat{W}$ is computed using the scatter matrix $\Sigma_b$ and covariance matrix $\Sigma$ as given in Eq 5.

$\hat{W} = \arg\max_{W} \frac{|W^t \Sigma_b W|}{|W^t \Sigma W|}$  (5)

Any given data instance $x_i$ can be represented as a weighted sum of LDA components.

$x_i = \sum_{n=1}^{N} \beta_{in} w_n$  (6)

where N is the total number of LDA dimensions chosen and $\beta_{in}$ is the contribution of the n-th LDA basis $w_n$ (the n-th discriminant direction) to the i-th example. Similar to PCA, LDA is also used to reduce the dimension of population neuronal responses. However, PCA is primarily a technique for data compression, whereas LDA projects to a subspace where the discrimination between the classes is maximal. We specifically employed LDA using the visual area segmentation (obtained by retinotopy) to obtain the projection matrix.
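A minimal sketch of the combined PCA-then-LDA projection, assuming `responses` holds one trial-averaged time series per pixel/neuron and `area_labels` holds the retinotopically derived area label of each; the dimensionalities are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def pca_lda_features(responses, area_labels, n_pcs=50):
    """Project per-unit time series to PCA, then to the LDA subspace."""
    pcs = PCA(n_components=n_pcs).fit_transform(responses)
    # With C = 6 areas, LDA yields at most C - 1 = 5 discriminant directions.
    lda = LinearDiscriminantAnalysis(n_components=5)
    return lda.fit_transform(pcs, area_labels)
```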

2 Results

2.1 Supervised classification of mouse visual areas

In this experiment, the responses of visual areas to various stimuli were tested for signatures that discriminate among them using supervised models. To ensure that the results were not biased towards a classifier, different supervised classifiers, namely, parametric unimodal and multimodal Bayesian classifiers [27], neural networks [34] and SVM classifiers [35] were used to classify visual areas based on their responses.

The wide-field/two-photon calcium response for any given visual stimulus was first averaged across trials and projected to a lower-dimensional space using PCA followed by LDA (Section 1.3). It is to be noted that the stimulus information (Section 1.2) was only used to average across trials, and no other explicit information about the stimulus configuration or retinotopy was given as input (we cannot exclude the possibility that stimuli such as natural movies contain implicit retinotopic information). The reduced dimension feature vectors were then used to train different supervised classifiers.

For the wide-field dataset, pixels in each visual area were divided into ‘training’ and ‘test’ sets. The models were trained using only about 5% of the data from each area, chosen randomly, because nearby pixels in wide-field data are highly correlated; with 5% sampling, the training pixels are scattered widely enough that their mutual correlations are minimal. If the features of pixels within each area are similar, then it should be possible to classify them using the trained models.

The two-photon dataset captures the responses of relatively few individual neurons from each area (Table 2). Hence, 5% of the total number of neurons from each area was not sufficient for training the supervised models. Thus, for the two-photon dataset, we randomly sampled 50% of the neurons from each area and kept the remainder for testing. A general pipeline for the supervised model is given in Fig 2. The different classifiers used are described briefly below, after a sketch of the per-area train/test split.
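A minimal sketch of the per-area random split described above, assuming `labels` is a NumPy array with one area label per pixel or neuron; `train_frac` would be 0.05 for wide-field pixels and 0.5 for two-photon neurons:

```python
import numpy as np

def split_per_area(labels, train_frac, seed=0):
    """Randomly pick train_frac of the units within each visual area."""
    rng = np.random.default_rng(seed)
    train_idx = []
    for area in np.unique(labels):
        idx = np.flatnonzero(labels == area)
        n_train = max(1, int(round(train_frac * idx.size)))
        train_idx.append(rng.choice(idx, size=n_train, replace=False))
    train_idx = np.concatenate(train_idx)
    test_idx = np.setdiff1d(np.arange(labels.size), train_idx)
    return train_idx, test_idx
```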

Fig 2. Pipeline for supervised classification of visual areas.


A) Block diagram for supervised classification of visual cortex. B) The pixels chosen for training the classifiers are shown as black dots. C) Result of classifying visual cortex using the supervised GMM classifier. Boundaries in black denote the ground truth.

2.1.1 Uni-modal Bayesian classifier

In this method, visual areas are modeled using a parametric unimodal distribution. The equation for the Gaussian distribution is given below:

$p(x|\lambda) = \mathcal{N}(x|\bar{\mu}, \Sigma) = (2\pi)^{-d/2} |\Sigma|^{-1/2} \times$  (7)
$\exp\{-\frac{1}{2}(x - \bar{\mu})^T \Sigma^{-1} (x - \bar{\mu})\}$  (8)

where x is a d-dimensional random vector that describes the response of a pixel or neuron, and $\lambda = \{\bar{\mu}, \Sigma\}$ are the parameters that describe the Gaussian distribution. For each visual area, the parameters are determined using maximum likelihood estimation (MLE). With the generative models trained using MLE, Bayes’ rule is used for classification, where it is assumed that the priors are all equal.

$\hat{A} = \arg\max_{1 \le k \le A} p(x|\lambda_k)$  (9)

where A is the total number of visual areas, $\lambda_k$ describes the model for the k-th visual area, and $\hat{A}$ is the predicted visual area. The Bayes classifier in this setup reduces to a maximum likelihood classifier (Eq 9). As the covariance matrix of each class is different, the boundaries between different areas can be hyperquadric surfaces.
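A minimal sketch of this maximum-likelihood Gaussian classifier (Eqs 7–9) using SciPy; the data layout (one feature row per pixel/neuron) is an assumption:

```python
import numpy as np
from scipy.stats import multivariate_normal

def fit_area_gaussians(X_train, y_train):
    """One Gaussian per visual area, fit by maximum likelihood."""
    models = {}
    for area in np.unique(y_train):
        Xa = X_train[y_train == area]
        models[area] = multivariate_normal(
            mean=Xa.mean(axis=0),
            cov=np.cov(Xa, rowvar=False),
            allow_singular=True)  # guards against rank-deficient covariances
    return models

def predict_area(models, X):
    """Maximum-likelihood rule of Eq 9, with equal priors."""
    areas = list(models.keys())
    log_lik = np.column_stack([models[a].logpdf(X) for a in areas])
    return np.array(areas)[log_lik.argmax(axis=1)]
```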

2.1.2 Multimodal Bayesian classifier

Gaussian mixture models (GMMs) are used to model the visual areas in this method. GMMs are generative models; the motivation for using them is that each visual area is represented by a multimodal probability density function, which can, in principle, model any non-linear boundary between the visual areas. A GMM is a density function formed as a weighted sum of M component densities, given by the equation:

$p(x|\lambda) = \sum_{k=1}^{M} w_k \, \mathcal{N}(x|\bar{\mu}_k, \Sigma_k)$  (10)

where x is a d-dimensional random vector describing the neuronal response, $\mathcal{N}(x|\bar{\mu}_k, \Sigma_k)$, $k = 1, \dots, M$, are the unimodal component densities with means $\bar{\mu}_k$ and covariances $\Sigma_k$, and $w_k$ denotes the weight of the k-th component. $\lambda = \{w_k, \bar{\mu}_k, \Sigma_k\}_{k=1}^{M}$ defines the GMM for a particular visual area. Fitting a GMM is an incomplete-data problem; hence, the mixture parameters are estimated iteratively using Expectation-Maximization (E-M). The procedure for training these parameters using E-M is detailed in [27]. A separate GMM is trained for each visual area by randomly sampling pixels from that area. For classifying the test data, the Bayes classifier given in Eq 9 was used. The number of mixtures in the GMMs is a hyperparameter and was estimated empirically.
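A minimal sketch of the per-area GMM classifier of Eqs 9 and 10, using scikit-learn; the number of mixtures (8 here) stands in for the empirically tuned hyperparameter:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_area_gmms(X_train, y_train, n_mix=8):
    """One GMM (Eq 10) per visual area, fit by E-M."""
    return {area: GaussianMixture(n_components=n_mix, random_state=0)
                  .fit(X_train[y_train == area])
            for area in np.unique(y_train)}

def predict_area_gmm(gmms, X):
    """Maximum-likelihood rule of Eq 9 over the per-area GMMs."""
    areas = list(gmms.keys())
    log_lik = np.column_stack([gmms[a].score_samples(X) for a in areas])
    return np.array(areas)[log_lik.argmax(axis=1)]
```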

2.1.3 Support vector machine

Let $D = \{x_i, y_i\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^d$ represents the training data, $y_i \in \{-1, 1\}$ represents the corresponding labels, N is the total number of data points, and d is the dimension of the feature vectors. In this setup, the support vector machine (SVM) finds a maximum-margin separating hyperplane as follows:

f(x)=wTψ(x)+b (11)

where w is a normal vector to the separating hyperplane and b is its bias. ψ(.) is a transformation function from the input feature space to the kernel space. The optimization problem for obtaining the separating hyperplane is given by:

minimize $\frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i$  (12)
subject to $y_i[w^T \psi(x_i) + b] \ge 1 - \xi_i, \quad i = 1, \dots, N$  (13)
$\xi_i \ge 0, \quad i = 1, \dots, N$  (14)

where C is a hyperparameter and ξi is a slack variable that accounts for non-separable data. The details of the optimization algorithm can be found in [35]. Since a single SVM can solve only a binary classification problem, a one-against-one approach was used for the multi-class problem, and the final class label was determined using a voting strategy. The open-source library LIBSVM [36] was used to implement the visual area classifier using SVM.
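A minimal sketch using scikit-learn's SVC, which wraps LIBSVM and uses the same one-against-one voting scheme for multi-class problems; the kernel, C value, and synthetic data are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-ins for the LDA features and area labels.
rng = np.random.default_rng(0)
X_train, y_train = rng.standard_normal((300, 5)), rng.integers(0, 6, 300)
X_test, y_test = rng.standard_normal((100, 5)), rng.integers(0, 6, 100)

svm = SVC(kernel="rbf", C=1.0, decision_function_shape="ovo")
svm.fit(X_train, y_train)
print(f"rank-1 accuracy: {svm.score(X_test, y_test):.2f}")
```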

2.1.4 Artificial neural networks

Artificial neural networks (ANNs) are non-linear classifiers that can be trained to predict the area label from the response of a pixel. The advantage of using a feed-forward neural network over a conventional GMM is that the ANN is trained discriminatively. Let $o_i = f(x_i)$, where $x_i$ is the input vector representing the neuronal response and f(.) is a differentiable non-linear function modeled by the ANN. Since this is a multi-class classification problem, the softmax function is used as the activation function of the output layer; it outputs the probability of each class. The natural choice for the cost function is therefore the cross-entropy between the target class labels and the output of the softmax function. The ANN is trained using backpropagation of the gradient of this cost function. A simple single-hidden-layer ANN with 30 nodes was chosen to classify the neuronal responses.
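A minimal sketch of such a network using scikit-learn's MLPClassifier, which pairs a softmax output with a cross-entropy loss for multi-class problems; only the 30-node hidden layer comes from the text, everything else is an illustrative assumption:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-ins for the LDA features and area labels.
rng = np.random.default_rng(0)
X, y = rng.standard_normal((400, 5)), rng.integers(0, 6, 400)

# Single hidden layer of 30 nodes; softmax output with cross-entropy loss,
# trained by backpropagation.
ann = MLPClassifier(hidden_layer_sizes=(30,), max_iter=2000, random_state=0)
ann.fit(X, y)
print(f"training accuracy: {ann.score(X, y):.2f}")
```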

For each classifier, the rank-1 classification accuracy of the test data was used as the evaluation metric. The results were averaged across five random initializations of training data. The average and standard deviation of the classification accuracy obtained for the wide-field dataset are given in Table 3.

Table 3. Accuracy of supervised classification on wide-field data.

The results are averaged across random initializations. The entries denote “% accuracy (± standard deviation)”.

Mouse Number Stimulus Supervised
GMM SVM ANN Bayes
1 Directions 94.3 (±0.63) 94.1 (±0.16) 94.3 (±0.32) 93.0 (±0.48)
Spatial-Frequency 94.4 (±0.43) 94.2 (±0.51) 94.4 (±0.65) 94.0 (±0.26)
Temporal-Frequency 90.8 (±0.92) 90.2 (±0.86) 90.8 (±0.81) 90.1 (±0.73)
Natural Movies 97.0 (±0.13) 96.2 (±0.27) 97.6 (±0.44) 92.7 (±0.50)
2 Natural Movies 97.4 (±0.18) 97.4 (±0.15) 97.4 (±0.36) 96.8 (±0.11)
3 Natural Movies 97.2 (±0.14) 97.0 (±0.28) 97.1 (±0.08) 96.4 (±0.12)
4 Natural Movies 94.9 (±0.33) 82.5 (±0.77) 95.9 (±0.82) 87.2 (±0.29)
Resting State 98.4 (±0.15) 98.3 (±0.15) 98.7 (±0.20) 97.7 (±0.10)
5 Natural Movies 96.1 (±0.38) 83.9 (±0.52) 97.4 (±1.01) 84.5 (±0.50)
Resting State 96.7 (±0.32) 96.9 (±0.14) 97.8 (±0.17) 95.4 (±0.35)

From the results of supervised classifiers in Table 3, it can be observed that the classifiers have similar performance across visual stimuli. Further, the unimodal Bayesian classifier gave the poorest classification accuracy, while the non-linear classifiers GMM and ANN gave the best performance.

We obtained areal boundaries by applying supervised GMM classification to neuronal population responses of different visual stimuli (Fig 3A). The visual areas predicted by the supervised models are color-coded. The area borders obtained by the classification are close to the retinotopic boundaries for all visual stimuli and all mice used. These results were verified to be consistent by training and testing responses of different mice to natural movie stimuli (Fig 3B).

Fig 3. Analysis of visual cortex responses to different stimuli using supervised GMM classifiers.


A, B) The boundaries obtained by classifying all the pixels using the GMM classifier. Each color represents the visual area identified by the classifier, and the black boundary within the cortex corresponds to the ground truth retinotopic boundaries. The values within brackets denote the classification accuracy. In A, the results are compared across different visual stimuli; the title of each plot indicates the visual stimulus shown to the mice. In B, the supervised classifier is verified to be consistent across different mice for natural movie stimuli. C) Results on resting state responses for two mice. D) Pixels selected for training the supervised model are limited to the center x% of the radius of each visual area. This x% is given as the sample radius in the titles of the plots in the first row of D, and the pixels used for training are shown as black dots. The corresponding classification boundaries are shown in the second row of D, and the “ACC” values denote the accuracy.

Wide-field imaging captures the aggregated responses of hundreds of neurons, and pixels that are close together are highly correlated. Since the training data for the supervised model are sampled randomly from each visual area, the classification accuracy observed in Fig 3A and 3B can be an artifact of correlated responses. Consequently, this pipeline for the supervised classifiers was further verified with the two-photon dataset described in Section 1.2.2, which is obtained from individual neuron responses. The results are summarized in Table 4. Here, we obtain an average and a maximum accuracy of ≈ 58% and 70%, respectively.

Table 4. Accuracy of supervised classification on two-photon dataset.

The results are averaged across random initializations. The entries denote “% accuracy (± standard deviation)”.

Cre-line (Session) Stimuli Accuracy of Supervised Classifier
GMM SVM ANN Bayes
Emx1-IRES (Session A) Natural Movie 1 54.6 (±0.75) 56.8 (±0.65) 56.6 (±0.88) 54.3 (±0.70)
Natural Movie 3 70.1 (±0.86) 70.8 (±1.15) 70.2 (±1.39) 68.9 (±0.93)
Resting State 57.2 (±1.33) 61.3 (±1.30) 60.6 (±1.73) 48.1 (±1.31)
Emx1-IRES (Session C2) Natural Movie 2 52.6 (±0.58) 55.2 (±0.70) 55.0 (±0.65) 52.0 (±0.18)
Nr5a1 (Session A) Natural Movie 1 53.0 (±1.14) 52.7 (±0.99) 51.18 (±0.92) 45.3 (±0.75)
Natural Movie 3 59.6 (±1.19) 61.0 (±0.98) 60.2 (±0.89) 54.9 (±1.58)
Resting State 45.6 (±2.89) 53.8 (±1.18) 52.7 (±1.10) 44.3 (±0.93)
Nr5a1 (Session C2) Natural Movie 2 55.1 (±1.20) 56.2 (±1.76) 54.3 (±1.45) 46.3 (±0.87)

In Fig 4, we show examples of confusion matrices, obtained using mouse M1 from the wide-field dataset and the Emx1-IRES Cre-line from the two-photon dataset, respectively. In S2 Fig, we show the same for the entire dataset. For the wide-field dataset, responses from other areas were mostly predicted as V1 (Fig 4), which is not unexpected since V1 projects to each of the other areas. Following V1, the areas AL, LM and AM, PM had the next highest confusion. This is consistent with previous studies suggesting that these areas may constitute different processing streams [12, 37–39]. The confusion observed in the two-photon dataset was variable. Importantly, however, for both datasets, the majority of neurons were predicted correctly.

Fig 4. Confusion matrices for test data obtained using supervised classifier.


The diagonal values denote the precision (in %) of each class. Off-diagonal values denote the false prediction rate (in %) for the predicted class given the actual class. A) Confusion matrix obtained using responses of Mouse M1 and Natural Movie stimuli. B) Confusion matrix obtained using the Cre-line Emx1-IRES and Natural Movie 3 stimuli from dataset 2. In S2 Fig, we show the confusion matrices for all the remaining data.

To demonstrate the significance of the results in Tables 3 and 4, we compared them with two random classifiers. First, we considered a random, unbiased six-faced die; this classifier gives a chance-level accuracy of 16.67%, irrespective of the dataset. Second, we considered a six-faced die biased by the proportion of the different area sizes (i.e., the number of pixels/neurons used during training). For the wide-field dataset, this random classifier gives an average chance-level accuracy of 37.6% (averaged across all five mice) and a maximum of 51.1% (for M1). Similarly, for dataset 2, it gives an average chance accuracy of 26.7% and a maximum of 33.9% (for Nr5a1 Session C2). For both datasets, the results in Tables 3 and 4 were much higher than those of the random classifiers, suggesting that the responses are indeed discriminative between the different areas.
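As a worked check of the biased-die baseline: a classifier that guesses area k with probability equal to its share p_k of the data is correct with probability Σ_k p_k². Using the Nr5a1 (Session C2) counts from Table 2:

```python
import numpy as np

# Nr5a1 (Session C2) neuron counts from Table 2: AL, LM, RL, AM, PM, V1.
counts = np.array([106, 267, 1023, 115, 234, 149])
p = counts / counts.sum()
print(f"biased-die chance accuracy: {100 * np.sum(p ** 2):.1f}%")  # ~33.9%
```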

The major difference between the two-photon and the wide-field dataset is that in the latter, neighboring pixels can have correlated responses. To further test the effect of correlated responses in the wide-field dataset, the selection of training pixels was restricted to the center of the visual areas. The training samples were limited to the circle formed by a given sample radius from the center of each visual area (first row of Fig 3D). The corresponding classification accuracies are reported in the second row of Fig 3D. Note that, despite the limited sampling, the classifiers were still able to classify the visual areas accurately. These results thus indicate that “the responses of different visual areas to a range of visual stimuli can be used to reliably and accurately classify their borders”.

These visual responses represent the net effect of the feed-forward, local, long-range, and feedback circuits that drive neurons. We hypothesized that even the background or resting state responses, obtained without any overt visual stimulation, should contain the signatures required to identify each area. We tested this hypothesis using resting state recordings from both datasets. In all experiments with visual stimuli, the responses were averaged across trials to obtain stimulus-specific responses; in the resting state data, however, spontaneous responses changed with time without any explicitly defined trial structure. For the wide-field dataset, 800 secs of non-averaged resting state responses were used, excluding the initial and final 50 secs of the 15 min recording sessions (to avoid non-stationary transients). The results of this analysis are shown in Fig 3C and show consistently high accuracy, comparable to that obtained with visual stimuli (Table 3). Similarly, for the two-photon dataset, 240 secs of non-averaged resting state responses were used after excluding the initial and final 30 secs of the total 5 min recording from “Session A” of the dataset. Table 4 shows the result of classifying the resting state neuronal responses from dataset 2.

Irrespective of stimulus-driven or resting state responses, on both datasets, the proposed classification pipeline gave results that were much better than chance. The results of the supervised classifiers suggest that the activity of each area has a specific statistical characteristic which can be used to identify the area. As a control experiment, keeping the test data labels intact, we shuffled the training data labels randomly. For both datasets, this led to an accuracy of ≈14–18% (similar to the random chance level of 16.67% for six classes). This experiment shows that the observed results are not an artifact of the supervised approach. To further evaluate the influence of supervised labels on the observed results, in Section 2.2 we use a semi-supervised clustering approach for the wide-field dataset.

2.2 Semi-supervised clustering of mouse visual areas

In Section 2.1, the boundaries defined by retinotopy were used as ground truth, and the visual areas were tested for discriminative responses. Even after restricting the training data points to small regions at the center, reliable classification accuracy was obtained. In this section, the supervised retinotopic information was further decreased to a few pixels in the center of each visual area (shown in Fig 5B). In the supervised approach, 5% of pixels were randomly chosen from each visual area for training the classifier; with such sampling, the number of pixels chosen for training is proportional to the size of the area. In semi-supervised clustering, the same amount of training data is used from each area.

Fig 5. Pipeline for semi-supervised clustering of visual areas.


A) Block diagram of the clustering steps. B) Initial clusters that are labeled using the retinotopic map. C) Neighbors of a labeled cluster for area AM. BIC score is computed between these neighbors, and closest few are merged every iteration. D) An intermediate step in the clustering process. E) Final clustering result with accuracy (ACC).

In this semi-supervised clustering approach, a special GMM called a universal background model (UBM) was used. The UBM is widely used for biometric verification, especially in speech [40]; it models person-independent characteristics that serve as a reference against person-specific models. For example, in speaker verification, the UBM is a speaker-independent GMM trained using speech samples obtained from a large set of speakers. Using maximum-a-posteriori (MAP) adaptation, the UBM can be adapted to be speaker-specific [41]. For our application, the UBM refers to a GMM trained using the responses of the entire visual cortex. Since the UBM is expected to model the entire visual cortex, a larger number of Gaussians is required than for a GMM of an individual visual area.

The visual cortex was initially divided into small clusters of equal size such that they each had adequate data for MAP adaptation. Using MAP, UBM means were adapted to obtain cluster-specific GMMs. These cluster-specific GMMs are probability density functions that represent the responses of each cluster with UBM as the reference. The entire visual cortex was then hierarchically clustered, starting from the center clusters of each visual area (Fig 5B).
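A minimal sketch of relevance-MAP adaptation of the UBM means (in the spirit of [41]); `tau` is an assumed relevance factor, and only the means are adapted, with weights and covariances retained from the UBM:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def map_adapt_means(ubm: GaussianMixture, X, tau=16.0):
    """Return UBM means MAP-adapted to the data X (relevance factor tau)."""
    resp = ubm.predict_proba(X)                    # (n, M) posteriors
    n_k = resp.sum(axis=0)                         # soft counts per component
    x_bar = (resp.T @ X) / np.maximum(n_k[:, None], 1e-10)  # weighted means
    alpha = (n_k / (n_k + tau))[:, None]           # data-vs-prior weighting
    return alpha * x_bar + (1.0 - alpha) * ubm.means_
```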

Let a and b represent the two neighboring clusters. The score for merging the clusters is calculated as follows:

$S_{a,b} = \log p(D|\lambda) - \left( \log p(D_a|\lambda_a) + \log p(D_b|\lambda_b) \right)$  (15)

where,

  • $D_a$ and $D_b$ represent the data in each of the two clusters. D is the data of the combined cluster, i.e., $D_a \cup D_b$.

  • $\lambda_a$ represents the parameters of the GMM obtained by MAP-adapting the UBM to $D_a$. Similarly, $\lambda_b$ and $\lambda$ are the parameters obtained by adapting the UBM to $D_b$ and D, respectively.

Eq 15 measures the increase in likelihood after merging the models. It is a modified version of the Bayesian Information Criterion (BIC) [42], first proposed for clustering in [43]. For every iteration, a threshold on the score is determined, and the clusters are merged. Adaptive thresholding at every iteration eliminates the need for the penalty term given in [42]. The center-most cluster of each area is the only supervised information given to the algorithm. The details of the semi-supervised clustering approach used in this paper are given below:

  • Step 1: The dimensions of wide-field responses are reduced using PCA (as described in Section 1.3).

  • Step 2: An UBM is trained using the response of the entire visual cortex.

  • Step 3: The entire visual cortex is divided into small grids of equal size.

  • Step 4: The center cluster/grid of each visual area is labeled using the retinotopic map. This step has been demonstrated in Fig 5B.

  • Step 5: The UBM is adapted to the labeled clusters and their neighbors to form cluster-specific-GMMs.

  • Step 6: The BIC score is calculated between the labeled clusters and their neighbors (as shown in Fig 5C; a sketch of this merge score is given after this list).

  • Step 7: The BIC scores computed in Step 6 are sorted, and the top x% are merged. This x% starts at 80% and reduces to 20% as the clusters are labeled.

  • Step 8: Steps 5 to 7 are repeated until all the grids are labeled.

  • Step 9: Finally, to smoothen the boundaries, a supervised classifier is trained by sampling from the final clusters.
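A minimal sketch of the merge score of Eq 15 used in Steps 6 and 7, assuming `adapt(ubm, X)` returns a cluster-specific GMM (e.g., with means MAP-adapted as sketched above) that exposes scikit-learn's `score_samples`; higher scores favor merging:

```python
import numpy as np

def merge_score(ubm, adapt, X_a, X_b):
    """BIC-like score of Eq 15 for merging two neighboring clusters."""
    X = np.vstack([X_a, X_b])
    lam, lam_a, lam_b = adapt(ubm, X), adapt(ubm, X_a), adapt(ubm, X_b)
    # log p(D|λ) - (log p(Da|λa) + log p(Db|λb)); score_samples gives
    # per-sample log-likelihoods, summed here over each cluster's data.
    return (lam.score_samples(X).sum()
            - lam_a.score_samples(X_a).sum()
            - lam_b.score_samples(X_b).sum())
```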

The result of clustering the visual areas using the semi-supervised approach is given in Table 5. The UBM modeled using the entire visual cortex response can be slightly different for different initializations, which could affect the final clustering result. Hence, the clustering was performed using three different random initializations, and the mean and standard deviation of the obtained accuracy are given in Table 5. The boundaries obtained for different mice and stimuli are shown in Fig 6.

Table 5. Accuracy of semi-supervised segmentation on wide-field dataset.

The results are averaged across random initializations of UBM. The entries denote “% accuracy (± standard deviation)”.

Mouse Number Stimuli Accuracy
1 Direction 64.54 (±4.19)
Spatial-Frequency 56.3 (±3.42)
Temporal-Frequency 59.5 (±4.74)
Natural Movies 61.1 (±2.01)
2 Natural Movies 71.1 (±0.87)
3 Natural Movies 78.0 (±0.90)
4 Natural Movies 78.0 (±3.43)
Resting State 72.1 (±1.71)
5 Natural Movies 77.1 (±1.12)
Resting State 78.2 (±0.40)

Fig 6. Boundaries obtained by clustering visual cortex areas using the semi-supervised pipeline.


A) Boundaries derived using different visual stimuli in one mouse. Visual stimuli and accuracy (in %) are noted. B) Boundaries obtained for different mice using natural movies as stimuli. C) Boundaries obtained for different initializations of UBM for the same mouse and keeping visual stimuli unchanged. D) Boundaries derived from resting state responses.

The semi-supervised clustering resulted in a classification accuracy of about 70% (Table 5). Although these results are inferior to those of the supervised classification, the boundaries were observed to be close to the ground truth retinotopic boundaries in Fig 6. Irrespective of mice and visual stimuli, V1 was the most reliably identified visual area, even though the same amount of labeled data was used from each area (Fig 6).

In Fig 6C, two different boundaries obtained by different random initializations of the UBM are shown. Although the accuracies obtained were different, the boundaries were consistent with each other. The clusters formed using the resting state responses were also consistent with the retinotopic maps (Fig 6D). This result provides additional support for our conjecture that there are intrinsic response characteristics of each visual area that generalize across stimuli and enable classification.

2.3 Resting state vs. stimulus-induced responses

In Sections 2.1 and 2.2, the natural movies and other stimuli were presented multiple times and averaged to obtain the stimulus-induced response. Since a trial structure cannot be defined for resting state responses, the dF/F of the signal was used directly as the input. In this section, we compare resting state responses with single-trial and trial-averaged stimulus-induced responses of various durations.

In Fig 7, the accuracies obtained by supervised and semi-supervised approaches are compared between the resting state responses, the single-trial responses of natural movies, and the trial averaged version of the same stimuli by varying the duration. All three results for the wide-field dataset are shown for responses varying from 4.5 secs to 54 secs (12 movies). In the two-photon dataset, the responses were sampled varying from 10 secs to 110 secs for natural movie 3 (120 secs) and spontaneous responses from “Session A” of the dataset.

Fig 7. Accuracies obtained by the supervised/semi-supervised pipeline with varying response lengths of resting state and stimulus-induced responses.


A, B) Results for Mouse M4 and M5, respectively, using the supervised approach. C, D) Results using the semi-supervised approach. E, F) Results for the two-photon dataset, using the Cre-lines Emx1-IRES and Nr5a1, respectively.

Fig 7 shows that for both datasets, stimulus-driven responses averaged across multiple trials yielded higher accuracy and asymptoted faster than the resting state responses. In the constrained semi-supervised clustering, the responses to natural movies classified the areas much better than the resting state responses. It is interesting to note that the supervised and semi-supervised approaches were able to predict the boundaries accurately with just 4.5 secs of data. For the resting state responses, when the duration was increased to 800 secs (as in Sections 2.1 and 2.2), they also reliably clustered the visual areas, with accuracy close to that of natural movie responses. The same trend was observed with the two-photon dataset. This result suggests that stimulus-driven responses are more discriminative than responses obtained without an overt stimulus. Stimulus-driven responses are trial-averaged and thus have low intra-class variability, whereas resting state responses require a longer duration to model the intra- and inter-class variabilities.

Fig 7 also shows the result of using single-trial natural movie responses. In the wide-field dataset, we observed that the single-trial responses give better accuracy than the resting state responses. However, with the two-photon dataset, the performance of spontaneous and single-trial movie responses was similar.

3 Discussion

In this paper, the responses of six visual areas of mouse cortex to various stimuli were studied using data-driven methods. First, in Section 2.1, the retinotopically defined boundaries were used as ground truth to train supervised classifiers, and the functional uniqueness of the visual areas was evaluated in terms of classification accuracy on held-out test data. The results indicate that the supervised models were able to discriminate the retinotopic borders of different areas using any of the stimuli. Although limiting the training data to the center of each area resulted in a decrease in classification accuracy, the degradation in performance was graceful. It is to be noted that even the retinotopically defined boundaries can vary in detail between recording sessions [9]. Given these results of supervised classifiers trained on a range of stimuli, we conjectured that each area of the mouse brain has specific response patterns that reflect its underlying local and long-range input circuits. The scalability of the proposed approach to the dataset from the Allen Institute and the experiments on resting state responses (with no overt stimuli) support this hypothesis. To further test this hypothesis and to reduce the impact of correlated data, the supervised information was minimized in Section 2.2 using a semi-supervised approach.

In the semi-supervised approach, the center-most pixels were the only supervised information given to the algorithm. From the center-most cluster, each visual area was expanded iteratively. The accuracy of the final clustering was used as a measure of functional uniqueness (and activation pattern) of the visual areas. Even with this small amount of supervised information, the algorithm was able to find boundaries that were consistent with that of retinotopically obtained boundaries. This result, together with the resting state data, adds to the conclusion that visual areas have characteristic responses that can be used to classify their borders.

To analyze the nature of such response features, in Fig 8, we show the inter- and intra-areal correlations between the neuronal responses that were input into the pipeline. Fig 8A and 8B show the correlations computed for wide-field data using natural movies and resting state responses, respectively. Similarly, Fig 8E and 8F show the same result for data from the Emx1-IRES subset of the two-photon dataset. For both datasets, the average intra-area correlation is consistently higher than the inter-area correlation. Even with resting state responses, in the absence of overt visual stimuli, the responses are more correlated within an area. The ratios of average intra- to inter-area correlations calculated on these responses were 1.1 and 2.5 for the wide-field and two-photon datasets, respectively. These raw signals were preprocessed using PCA and LDA before they were given to the classifiers. In Fig 8C, 8D, 8G, and 8H, the correlations computed in the LDA domain are shown. The LDA projection substantially improved the ratios of average intra- to inter-area correlations, to 8.8 and 9.7 for the wide-field and two-photon datasets, respectively. In Fig 9, we show a 2D visualization of the LDA subspace obtained using the wide-field (mouse M4) and two-photon (Emx1-IRES Cre-line) datasets. T-distributed stochastic neighbor embedding (tSNE, [44]) was used to convert the multi-dimensional LDA subspace into a visualizable 2D space. The LDA subspace is able to cluster neurons from different areas owing to correlated examples in the training data, in both visually driven and resting state responses from the wide-field and two-photon datasets (Fig 9). By applying different supervised classifiers to these subspaces, we were thus able to identify the area labels with high accuracy.
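A minimal sketch of this correlation analysis, assuming `features` is an (n_units × n_dims) array (raw responses or LDA features) and `labels` is a NumPy array of area labels:

```python
import numpy as np

def intra_inter_ratio(features, labels):
    """Average intra-area vs. inter-area correlation between units."""
    corr = np.corrcoef(features)                   # pairwise unit correlations
    same_area = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(labels.size, dtype=bool)    # drop self-correlations
    intra = corr[same_area & off_diag].mean()
    inter = corr[~same_area].mean()
    return intra / inter
```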

Fig 8. Intra-area and inter-area correlations computed on input responses and LDA features.


A–D) Correlations computed from mouse M4 of the wide-field dataset. E–H) Correlations computed from the Emx1-IRES Cre-line of the two-photon dataset. The correlations are computed as averages over all unique pairs of neurons/pixels in the test data, which were not used to train the LDA projection matrix. The correlations in the two-photon dataset are computed using data pooled from different mice and multiple sessions. In S3 and S4 Figs, we present a detailed correlation analysis with information about individual mice and sessions. Further examples of correlation analysis are shown in S5–S7 Figs.

Fig 9. Two-dimensional representation of the supervised LDA subspace.


A, B) LDA subspace of the wide-field dataset (mouse M4). C, D) LDA subspace of the two-photon dataset (Cre-line Emx1-IRES). The plots on the left (A, C) are obtained from natural movie responses and those on the right (B, D) from resting state responses.

The results obtained in Tables 3 and 4 are not based on a single classifier. We have shown the proposed supervised methods to work with generative (GMM, unimodal Bayes) and discriminative (SVM, ANN) classifiers, and with linear (unimodal Bayes, SVM) and non-linear (ANN, GMM) classifiers. This shows that the obtained results are mainly due to the proposed dimension reduction using PCA and LDA rather than the choice of classifier (Fig 9). In addition, the results in Tables 3 and 4 and Fig 4 show that the classification accuracy and confusion matrices are poorer for the two-photon dataset than for the wide-field dataset. We note that the pixels in the wide-field dataset represent the pooled response of many neurons, whereas the two-photon dataset captures individual neuronal responses; the higher variability of individual neuronal responses leads to poorer performance. Moreover, for the two-photon dataset, we selected all the neurons available for the given Cre-line, enabling us to study all six areas. However, this hardly represents the entire population response of these areas, as only a few neurons were recorded from each area, and they were pooled from different mice (but see S3 and S4 Figs).

Our findings show that resting state responses are an important complement to visual responses in classifying areas. Visual as well as resting state responses do not arise de novo; rather they both reflect specific underlying input-output connections and circuits in each area. These connections can be activated by internally generated activity without overt visual stimuli, or explicitly by visual stimuli. We show comparable results from two different datasets, each with multiple visual stimuli along with resting state responses, demonstrating that resting state and visual responses can be used to classify area borders (Section 2.1). In Section 2.3, we further analyzed the difference between the resting state and stimulus-induced responses. The result shows that the responses averaged across multiple trials give better classification results than the single-trial or resting state responses for a fixed duration.

These findings demonstrate two important features of visual areas in mice, relevant to the processing of visual stimuli. First, they are consistent with the fact that each cortical area is characterized by a unique pattern of connections and circuits. Some of these may be common to many areas of cortex (e.g., local recurrent excitatory connections) whereas others are crafted by a combination of specificity and plasticity (e.g., local inhibitory connections, long-range excitatory connections). These connection patterns come into play even with internally generated activity, which characterizes resting state responses. Thus, the intra-areal correlations are higher than inter-areal correlations for both visual and resting state responses (Fig 8). Second, visual stimuli are stronger drivers of internal circuits than resting state activity. Thus, the visually driven responses have higher classifier accuracy than resting state responses for given response durations, particularly when averaged across multiple trials, and in some instances the resting state responses never reach the accuracy of visual responses (Fig 7).

Motivated by the results obtained by supervised and semi-supervised classifiers, we attempted to cluster the pixels from the wide-field dataset without using any labeled data. It is a significantly under-constrained problem to arrive at the boundaries without any explicit retinotopic information. The clustering results were not well matched to retinotopically defined areas.

4 Conclusion

In this work, different machine learning techniques were explored to obtain the visual area boundaries of six cortical areas. The boundaries obtained by supervised models are highly consistent with the ground truth obtained by retinotopic imaging. The results of data-driven models degrade as the supervised information is removed. However, our critical observation is that data-driven models can classify these areas accurately based on responses to a range of visual stimuli, which extends as well to responses in the resting state. This result is consistent with the presence of unique area-specific circuitry in the visual cortex, which shapes visually driven or resting state activity in these areas.

Supporting information

S1 Fig. Horizontal and vertical retinotopy along with sign map within six visual areas of all the mice used in the paper.

Cortical areas of the left hemisphere are shown. Azimuth 0° and 90° correspond to the midline and the far periphery of the contralateral visual field, respectively. Negative values of elevation represent lower visual field and positive values represent upper visual field.

(EPS)

S2 Fig. Confusion matrices obtained using supervised GMM classifiers for all animals and data.

In Fig 4, confusion matrices were shown for an example mouse and a Cre-line from the wide-field and two-photon datasets, respectively. Here we show the confusion matrices for all mice and Cre-lines from the wide-field and two-photon datasets, respectively.

(EPS)

S3 Fig. Intra-area and inter-area correlations computed from natural movie responses of individual mice from the two-photon dataset.

In Fig 8, intra-area and inter-area correlations were computed at the Cre-line level with data pooled from different mice and sessions for the two-photon dataset. Here we present the inter- and intra-area correlations for individual mice from the dataset that had responses recorded in three or more sessions.

(EPS)

S4 Fig. Intra-area and inter-area correlations computed from resting state responses of individual mice from the two-photon dataset.

In Fig 8, intra-area and inter-area correlations were computed at the Cre-line level with data pooled from different mice and sessions for the two-photon dataset. Here we present the inter- and intra-area correlations for individual mice from the dataset that had responses recorded in three or more sessions.

(EPS)

S5 Fig. Intra-area and inter-area correlation computed on input responses for all animals and data.

In Fig 8, average intra-area and inter-area correlations computed from input responses were shown for an example mouse and a Cre-line from the wide-field and two-photon datasets, respectively. Here, we show the correlations for all mice and Cre-lines from the wide-field and two-photon datasets, respectively.

(EPS)

S6 Fig. Intra-area and inter-area correlation computed on LDA features for all animals and data.

In Fig 8, average intra-area and inter-area correlations computed from LDA features were shown for an example mouse and a Cre-line from the wide-field and two-photon datasets, respectively. Here, we show the correlations for all mice and Cre-lines from the wide-field and two-photon datasets, respectively.

(EPS)

S7 Fig. Comparison of intra- and inter-area correlations computed using all pixels and using patches of pixels with the same retinotopy.

A, C) Intra- and inter-area correlations computed using all pixels. B, D) Corresponding correlations computed using patches of pixels with approximately the same retinotopy from each area, averaged across the entire visual space. These data show that intra-area pixels are more correlated than inter-area pixels.

(EPS)

S1 Table. Number of neurons available for other Cre-lines in the Allen Institute dataset.

In the text, we analyzed Emx1-IRES and Nr5a1 Cre-lines from dataset 2 (Section 1.2.2). Here we present the number of neurons available from all the other Cre-lines that contain all the six visual areas considered in the paper.

(PDF)

S2 Table. Classification accuracy for other Cre-lines in the Allen Institute dataset.

In the text, we presented the classification accuracy for Emx1-IRES and Nr5a1 Cre-lines from dataset 2 (Table 4). Here we present the same for all the other Cre-lines in dataset 2 that contain all the six visual areas considered in the paper.

(PDF)

Acknowledgments

We thank the Centre for Computational Brain Research (CCBR), IIT Madras for enabling the collaboration between Sur Lab, Massachusetts Institute of Technology and Indian Institute of Technology Madras.

Data Availability

The paper presents results on two datasets, one collected using two-photon imaging and another collected using wide-field imaging. The two-photon dataset is public and can be accessed from http://observatory.brain-map.org/visualcoding. The wide-field dataset used in this study can be accessed from https://doi.org/10.6084/m9.figshare.13476522.v1. For both datasets, the software to reproduce the results is available at https://github.com/CCBR-IITMadras/visual-cortex-response-classification.
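As one illustration of accessing the public two-photon dataset, the sketch below uses the AllenSDK's BrainObservatoryCache. The filtering choices (Cre line, structure, stimulus) mirror those used in the paper, but the snippet is an illustrative sketch of the standard API, not the authors' released pipeline.

```python
# Sketch: fetch dF/F traces from the Allen Brain Observatory two-photon data.
from allensdk.core.brain_observatory_cache import BrainObservatoryCache

boc = BrainObservatoryCache(manifest_file='boc/manifest.json')

# Sessions for one Cre line in primary visual cortex with natural movies.
exps = boc.get_ophys_experiments(
    cre_lines=['Emx1-IRES-Cre'],
    targeted_structures=['VISp'],
    stimuli=['natural_movie_one'],
)

# Download one session and pull the dF/F traces used as classifier input.
data_set = boc.get_ophys_experiment_data(exps[0]['id'])
timestamps, dff = data_set.get_dff_traces()
print(dff.shape)  # (n_neurons, n_timepoints)
```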

Funding Statement

Supported by National Institutes of Health (NIH) grants EY007023 and EY028219 (MS), and the Centre for Computational Brain Research (CCBR), IIT Madras, N.R. Narayanamurthy Chair Endowment (MS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Van Essen DC. Organization of visual areas in macaque and human cerebral cortex. The visual neurosciences. 2003;1:507–521. [Google Scholar]
  • 2. Rosa MG, Tweedale R. Brain maps, great and small: lessons from comparative studies of primate visual cortical organization. Philosophical Transactions of the Royal Society B: Biological Sciences. 2005;360(1456):665–691. 10.1098/rstb.2005.1626 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3. Gattass R, Nascimento-Silva S, Soares JG, Lima B, Jansen AK, Diogo ACM, et al. Cortical visual areas in monkeys: location, topography, connections, columns, plasticity and cortical dynamics. Philosophical Transactions of the Royal Society B: Biological Sciences. 2005;360(1456):709–731. 10.1098/rstb.2005.1629 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4. Allman JM, Kaas JH. The organization of the second visual area (V II) in the owl monkey: a second order transformation of the visual hemifield. Brain research. 1974;76(2):247–265. 10.1016/0006-8993(74)90458-2 [DOI] [PubMed] [Google Scholar]
  • 5. Seabrook TA, Burbridge TJ, Crair MC, Huberman AD. Architecture, function, and assembly of the mouse visual system. Annual review of neuroscience. 2017;40:499–538. [DOI] [PubMed] [Google Scholar]
  • 6. Wang Q, Burkhalter A. Area map of mouse visual cortex. Journal of Comparative Neurology. 2007;502(3):339–357. 10.1002/cne.21286 [DOI] [PubMed] [Google Scholar]
  • 7. Schuett S, Bonhoeffer T, Hübener M. Mapping retinotopic structure in mouse visual cortex with optical imaging. Journal of Neuroscience. 2002;22(15):6549–6559. 10.1523/JNEUROSCI.22-15-06549.2002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Kalatsky VA, Stryker MP. New paradigm for optical imaging: temporally encoded maps of intrinsic signal. Neuron. 2003;38(4):529–545. 10.1016/S0896-6273(03)00286-1 [DOI] [PubMed] [Google Scholar]
  • 9. Garrett ME, Nauhaus I, Marshel JH, Callaway EM. Topography and areal organization of mouse visual cortex. Journal of Neuroscience. 2014;34(37):12587–12600. 10.1523/JNEUROSCI.1124-14.2014 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Polack PO, Contreras D. Long-range parallel processing and local recurrent activity in the visual cortex of the mouse. Journal of Neuroscience. 2012;32(32):11120–11131. 10.1523/JNEUROSCI.6304-11.2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Andermann ML, Kerlin AM, Roumis DK, Glickfeld LL, Reid RC. Functional specialization of mouse higher visual cortical areas. Neuron. 2011;72(6):1025–1039. 10.1016/j.neuron.2011.11.013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12. Marshel JH, Garrett ME, Nauhaus I, Callaway EM. Functional specialization of seven mouse visual cortical areas. Neuron. 2011;72(6):1040–1054. 10.1016/j.neuron.2011.12.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13. Sereno MI, McDonald CT, Allman JM. Analysis of retinotopic maps in extrastriate cortex. Cerebral Cortex. 1994;4(6):601–620. 10.1093/cercor/4.6.601 [DOI] [PubMed] [Google Scholar]
  • 14. Sereno MI, Dale AM, Reppas JB, Kwong KK, Belliveau JW, Brady TJ, et al. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science. 1995;268(5212):889–893. 10.1126/science.7754376 [DOI] [PubMed] [Google Scholar]
  • 15. Dumoulin SO, Hoge RD, Baker CL Jr, Hess RF, Achtman RL, Evans AC. Automatic volumetric segmentation of human visual retinotopic cortex. Neuroimage. 2003;18(3):576–587. 10.1016/S1053-8119(02)00058-7 [DOI] [PubMed] [Google Scholar]
  • 16. Wandell BA, Dumoulin SO, Brewer AA. Visual field maps in human cortex. Neuron. 2007;56(2):366–383. 10.1016/j.neuron.2007.10.012 [DOI] [PubMed] [Google Scholar]
  • 17. Waters J, Lee E, Gaudreault N, Griffin F, Lecoq J, Slaughterbeck C, et al. Biological variation in the sizes, shapes and locations of visual cortical areas in the mouse. PLOS ONE. 2019;14(5):1–13. 10.1371/journal.pone.0213924 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18. Zhuang J, Ng L, Williams D, Valley M, Li Y, Garrett M, et al. An extended retinotopic map of mouse cortex. Elife. 2017;6:e18372 10.7554/eLife.18372 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19. Minderer M, Brown KD, Harvey CD. The spatial structure of neural encoding in mouse posterior cortex during navigation. Neuron. 2019;102(1):232–248. 10.1016/j.neuron.2019.01.029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20. Saxena S, Kinsella I, Musall S, Kim SH, Meszaros J, Thibodeaux DN, et al. Localized semi-nonnegative matrix factorization (LocaNMF) of widefield calcium imaging data. PLOS Computational Biology. 2020;16(4):e1007791 10.1371/journal.pcbi.1007791 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21. de Vries SE, Lecoq JA, Buice MA, Groblewski PA, Ocker GK, Oliver M, et al. A large-scale standardized physiological survey reveals functional organization of the mouse visual cortex. Nature Neuroscience. 2020;23(1):138–151. 10.1038/s41593-019-0550-9 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22. Rikhye RV, Sur M. Spatial correlations in natural scenes modulate response reliability in mouse visual cortex. Journal of Neuroscience. 2015;35(43):14661–14680. 10.1523/JNEUROSCI.1660-15.2015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23. van Hateren JH, Ruderman DL. Independent component analysis of natural image sequences yields spatio-temporal filters similar to simple cells in primary visual cortex. Proceedings of the Royal Society of London B: Biological Sciences. 1998;265(1412):2315–2320. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24. Technical whitepaper: Stimulus set and response analysis. Available from: http://help.brain-map.org/display/observatory/Documentation.
  • 25. Pearson K. LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science. 1901;2(11):559–572. 10.1080/14786440109462720 [DOI] [Google Scholar]
  • 26. Hotelling H. Analysis of a complex of statistical variables into principal components. Journal of educational psychology. 1933;24(6):417 10.1037/h0071325 [DOI] [Google Scholar]
  • 27. Bishop CM. Pattern recognition and machine learning. springer; 2006. [Google Scholar]
  • 28. Pang R, Lansdell BJ, Fairhall AL. Dimensionality reduction in neuroscience. Current Biology. 2016;26(14):R656–R660. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29. Chapin JK, Nicolelis MA. Principal component analysis of neuronal ensemble activity reveals multidimensional somatosensory representations. Journal of neuroscience methods. 1999;94(1):121–140. 10.1016/S0165-0270(99)00130-2 [DOI] [PubMed] [Google Scholar]
  • 30. Cunningham JP, Byron MY. Dimensionality reduction for large-scale neural recordings. Nature neuroscience. 2014;17(11):1500 10.1038/nn.3776 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31. Ahrens MB, Li JM, Orger MB, Robson DN, Schier AF, Engert F, et al. Brain-wide neuronal dynamics during motor adaptation in zebrafish. Nature. 2012;485(7399):471 10.1038/nature11057 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32. Churchland MM, Cunningham JP, Kaufman MT, Ryu SI, Shenoy KV. Cortical preparatory activity: representation of movement or first cog in a dynamical machine? Neuron. 2010;68(3):387–400. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33. Martínez AM, Kak AC. PCA versus LDA. IEEE transactions on pattern analysis and machine intelligence. 2001;23(2):228–233. 10.1109/34.908974 [DOI] [Google Scholar]
  • 34. Hassoun MH. Fundamentals of artificial neural networks. MIT press; 1995. [Google Scholar]
  • 35. Burges CJ. A tutorial on support vector machines for pattern recognition. Data mining and knowledge discovery. 1998;2(2):121–167. 10.1023/A:1009715923555 [DOI] [Google Scholar]
  • 36. Chang CC, Lin CJ. LIBSVM: A library for support vector machines. ACM transactions on intelligent systems and technology (TIST). 2011;2(3):27. [Google Scholar]
  • 37. Juavinett AL, Callaway EM. Pattern and component motion responses in mouse visual cortical areas. Current Biology. 2015;25(13):1759–1764. 10.1016/j.cub.2015.05.028 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38. Smith IT, Townsend LB, Huh R, Zhu H, Smith SL. Stream-dependent development of higher visual cortical areas. Nature neuroscience. 2017;20(2):200–208. 10.1038/nn.4469 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. Wang Q, Sporns O, Burkhalter A. Network analysis of corticocortical connections reveals ventral and dorsal processing streams in mouse visual cortex. Journal of Neuroscience. 2012;32(13):4386–4399. 10.1523/JNEUROSCI.6063-11.2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40. Reynolds DA. Universal background models. Encyclopedia of biometrics. 2015; p. 1547–1550. [Google Scholar]
  • 41. Reynolds DA, Quatieri TF, Dunn RB. Speaker verification using adapted Gaussian mixture models. Digital signal processing. 2000;10(1–3):19–41. [Google Scholar]
  • 42. Ajmera J, Wooters C. A robust speaker clustering algorithm. In: IEEE Workshop on Automatic Speech Recognition and Understanding. IEEE; 2003. p. 411–416.
  • 43.Chen SS, Gopalakrishnan PS. Speaker, environment and channel change detection and clustering via the bayesian information criterion. In: Proc. DARPA broadcast news transcription and understanding workshop. vol. 8. Virginia, USA; 1998. p. 127–132.
  • 44. Van der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of machine learning research. 2008;9(Nov):2579–2605. [Google Scholar]
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008548.r001

Decision Letter 0

Lyle J Graham, Blake A Richards

5 Jun 2020

Dear Mr. Kumar,

Thank you very much for submitting your manuscript "Functional Parcellation of Mouse Visual Cortex Using Statistical Techniques Reveals Clustering and Signatures of Cortical Processing Areas" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account all of the reviewers' comments.

It is particularly important that you attend to the questions the reviewers raised about interpretations/utility. Reviewer 1 argues that, as it stands, the paper provides no insight into visual processing, since similar clusters are derived from resting state activity and no information is given on the signatures of visual responses of different areas. Reviewer 3 made a similar comment, noting that you suggest that each visual area has distinct response signatures, but don't really provide much insight into the nature of those signatures, particularly given that the classifiers can separate these visual areas using just 10 minutes of resting state activity. You should be very sure to address these fundamental concerns regarding interpretation and utility of the study.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Blake A. Richards

Associate Editor

PLOS Computational Biology

Lyle Graham

Deputy Editor

PLOS Computational Biology

***********************


Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Kumar et al. studied wide-field GCaMP signals in 6 cortical visual areas in the mouse. Pixels clustered into 6 groups with boundaries that match retinotopic borders, indicating that each visual area is distinct. Unfortunately, what causes these areas to cluster is unknown. With no information on the basis of clustering it’s difficult to assign significance to the clusters. Furthermore, the distinguishing feature is not related to visual stimuli since the same clusters were derived from spontaneous activity, suggesting that the clusters contain no information about the roles of these areas in processing visual information. In short, the paper reports an observation – that pixels cluster – but doesn’t explain why. The observation would appear to have little significance.

Specific comments

(1) The paper’s littered with sentences that appear disconnected from the results and conclusions. I’ll give two examples. The second sentence of the abstract reads: ‘The extent to which these areas represent discrete processing regions remains unclear.’ The paper brings no clarity. Why suggest the study is of visual processing? And in the results ‘These results indicate that each visual area has a characteristic signature that is represented in the responses to visual stimuli presented and can be revealed with a variety of visual stimuli.’ What signature? Information on this signature is conspicuously missing. And about the only thing we know is that the signature’s not related to presented visual stimuli. This sentence is comprehensively at odds with the results.

The text frequently suggests the paper will be about the different visual stimuli that drive these visual areas, but there’s no information here on this topic. My sense is that the study didn’t lead in the direction the authors had expected. If so, it would be best to let go of the intended direction of the project and address what they can with the results. In particular, the aim and conclusions need to be clear and clearly related to the results.

(2) 1.1.1 What is the genotype of the transgenic GCaMP mice? Also, the breeding scheme.

(3) What’s the eye-to-screen distance and visual angle subtended by the monitor.

(4) What was the size (in degrees) of the visual stimuli?

(5) Figure 1D. Why does the field sign map appear so patchy? It’s different from the maps produced by others with wide-field GCaMP imaging. See for example Zhuang et al. eLife 2017. I would guess perhaps the SNR of the retinotopic maps are poor?

(6) ‘The boundaries of the 6 core visual areas were defined according to criteria described in [9]’ This statement is conspicuously untrue. If the authors intended to replicate the field sign mapping technique of Garrett et al., they have failed. In Garrett et al. the borders are, by definition, where the field sign crosses zero. In figure 1D, the borders are not at zero. Many, perhaps all the borders are at some negative field sign value. The authors need to take a closer look at their code. They also need to provide a more detailed explanation of their mapping procedure and, ideally, the code.

(7) Table 1 provides incomplete information on the stimuli. For 1, give the SF and TF. For 2, the TF and direction. For 3, the SF and direction. We also need luminance, contrast and size of the stimulus.

(8) 2.1 The supervised classifiers are described in numerical detail, but I gained no insight into the differences between the classifiers. Why these classifiers? Was there reason to use several?

(9) Figure 4D and E. The LM and RL labels are swapped.

(10) Figure 5 and later figures. Why are the maps broken down by visual stimulus when earlier results were not. And why break them down by visual stimulus when clustering needs no visual stimulus?

(11) 2.3 Resting vs stimulus induced response. I failed to grasp the aim of this section.

(12) The unsupervised clustering simply fails. Is there a reason to include it? The stated conclusions are weak at best.

Reviewer #2: This paper addresses the question of whether the retinotopic visual cortical areas of the mouse can be discovered from their activity patterns in response to visual stimuli or in the resting state. The study concludes that retinotopically defined areas have unique activity profiles that allows their identification based on supervised and semi-supervised methods. However, unsupervised approaches fail to recover these areas with great accuracy, suggesting that despite differences between areas, there is also a great deal of overlap between areas. The question posed is an interesting one, and overall the results are convincing. I like the general approach and think this approach will be valuable for many future questions, even beyond studies of visual cortex. I therefore support publication of this work if the points below can be addressed.

Specific comments

1. In the discussion and conclusions, a lot of emphasis is placed on the resting state data. The authors emphasize that some of the separation between areas could be due to intrinsic activity rather than visual responses. However, there are only two mice for the resting state data, which seems like too small of a sample size. Either these claims should be lessened or more resting state data should be added.

2. I like the analysis of shuffling the area labels in the supervised analysis to show the chance level. I think similar analyses would be nice for the other parts of the paper too. For example, for the unsupervised clustering, it might be interesting to compare clustering metrics for the real data and data in which the pixel locations are shuffled. This could provide some measure of how much structure can be discovered in the real data relative to what would emerge from random data. In general these comparisons are helpful to provide the reader with a bound on what can be expected by chance.

3. My understanding is that the unsupervised analysis was only performed on the widefield calcium imaging data. It was a bit hard to figure this out in the text, so I apologize if I am incorrect. If my statement is correct, then it would be nice in addition to see the unsupervised analysis on the single cell data. The single cell data lack spatial correlations that are present in the widefield data, as the authors note. It would be interesting to see if similar clusters could be uncovered with the single cell data.

4. The tables of accuracies for the supervised and semi-supervised analyses are nice, but it would also be interesting to see the confusion matrices for these analyses. It would be interesting to some readers to see which areas are more similar to one another and thus get confused with one another more frequently. Such a confusion matrix could support some of the claims about lateral versus medial differences that the authors make using the unsupervised analysis.

5. The raw retinotopic map data for all mice should be shown in addition to the post-processed boundaries. This is important to evaluate the quality of the input to parcellate the areas into retinotopic divisions. In particular I ask about this because I was surprised by how much variance there was in the size and location of the areas. For example, in some mice AM is anterior and medial to PM, whereas in other mice AM is anterior and lateral to PM. Also, sometimes AM is directly bordering V1, and other times it is not. I was surprised by the location of AM in Figure 1D. Typically AM is adjacent to V1. Similar variance is seen for other areas. I am not sure it matters greatly for this study, but I have some concern that the area labels may be inaccurate in some mice, such as the case for AM that is not adjacent to V1 (Figure 1D).

6. Currently all the analyses are done within a mouse, which is sensible. However, I was wondering if the authors tried across mouse analyses. For the supervised analyses, what do the results look like if the classifiers are trained on mouse 1 and tested on mouse 2? For the unsupervised analysis, is there a way to see if the clusters identified in mouse 1 then provide predictive power for mouse 2? Across mouse analysis might further support claims of structure made by the authors.

7. What were the mice doing during the imaging experiments? Were they moving? Could movement contribute to the results? Recent studies have emphasized the importance of movement to visual cortical activity (see PMIDs: 20188652, 31551604, 31000656).

8. Some references to recent papers using related methods were missing and should be added. PMIDs: 32282806, 30772081.

Reviewer #3: The authors use widefield and 2-photon imaging data from the mouse visual cortex to train classifiers to identify the different visual areas. They find that supervised and semi-supervised classifiers perform well, identifying pixels or neurons with high accuracy. The fact that they do so using even just the neural responses to one 4.5 second movie is remarkable. The authors go on to show that unsupervised classifiers do not identify the different visual areas with high accuracy, but do capture some of the functional organization of the visual cortex. These results indicate that there are distinct physiological profiles for the different visual areas – or, from the unsupervised results, at least for groups of visual areas. I found the work to be interesting, and the paper did a great job of explaining the different techniques and conclusions. I do have some concerns that I hope are reasonably addressable.

Major Concerns:

1. The retinotopic maps shown in these figures are somewhat different from the retinotopic maps I see in the literature (namely Zhuang et al 2017 and Garrett et al 2014) – specifically in regards to the location and borders of RL. In the two papers mentioned above, RL sits at the top of V1, in contact with both AL and AM on either side. I understand that this area has difficult retinotopy, however, it seems possible that the pixels at the top of V1 are being mis-assigned to V1 and should really be within RL – which could have had different effects on the different supervised/semi-supervised/un-supervised results. I encourage the authors to look more closely at the assignment for RL, or perhaps to consider excluding it from these analyses (or weighting the accuracy for that area differently).

2. I am wary of the conclusions drawn from the unsupervised classification results. Specifically, it seems that the conclusions drawn result directly from the rules added to the unsupervised clustering. For instance, since clusters can only be merged if they are touching, it seems impossible for the lateral and medial areas to end up in the same clusters, especially given the 40% constraint, so it isn’t clear to me how meaningful that result is. It is possible that the paper could stand without the unsupervised classification results.

3. Throughout the paper, chance is said to be 1/6 given the six visual areas. It is not clear to me that this is the right level of chance to be used. Particularly for the widefield data, when more pixels are in V1 than any of the other visual areas, it seems that the prior should be shifted towards V1. Is there a way to define chance that takes into account the relative proportion of pixels (and neurons for the 2P data) for each of the areas?

4. Related to this, I would like to see what the confusion matrix looks like for these classifications. Do mis-classified pixels(/neurons) tend to be classified as the closest area? To the area that best matches retinotopy for that location? Or do they default to V1?

5. I would like to see a comparison of the semi-supervised area boundaries with retinotopy (eg. Fig 9 of Zhuang et al). It is not clear to me that the boundaries that this method is identifying do not reflect retinotopy. Eg. It appears that the semi-supervised boundaries separate altitude reasonably well (eg. the boundary between LM and RL that extends into V1 seems to match roughly with the horizontal meridian). The authors make the point that they are not using a retinotopic stimulus, but a natural movie stimulus has distinct information in different retinotopic locations – and thus could drive retinotopically distinct responses. If that is not true for the movies used in this study, I’d like to see an analysis to demonstrate the spatial/temporal content of the movies across retinotopy.

6. The biggest question that emerges for me from this work is what are the distinguishing features of the activity from the different areas. The authors conclude that these results suggest that each visual area has distinct response signatures, and some insight into the nature of those signatures would be very valuable. Particularly given that these classifiers can separate these visual areas using as little as a 4.5 second movie or even just 10 minutes of resting state activity. What are the features of the activity that the classifiers are using to separate these areas? An analysis of the classifier weights or features could be really illuminating in this regard. Or perhaps even example traces from the different areas.

7. The analysis in Figure 6 comparing boundaries obtained with different durations of stimulus is very interesting and important. My concern is that the movie responses are averaged across trials while the resting state is not averaged. The nature of an averaged signal and an unaveraged signal is very different, so 20 seconds of average movie activity and 20 seconds of unaveraged resting state are not an equivalent comparison. Why not do the classification of movie responses without averaging the trials, thus allowing a direct duration comparison between the two?

Minor concerns:

1. There are a lot of details regarding the methods that are missing from the manuscript. Each of these is minor, but altogether I do consider this important:

• Table 1 summarizing the stimuli used for widefield imaging needs more information. Namely the spatial and temporal frequencies and contrast used for stimulus 1. The directions, temporal frequency, and contrast for stimulus 2. The directions, spatial frequency, and contrast for stimulus 3.

• Was the stimulus for the widefield imaging warped to account for viewing distance? Where was the monitor positioned relative to the mouse’s center of gaze? As different visual areas cover different regions of retinotopy, if the stimulus wasn’t warped properly, the stimulus could have different content in different regions, and hence for different HVAs.

• What was the mean luminance of the stimulus? Was the monitor gamma corrected?

• What Cre line was used to drive the GCaMP6 expression in the widefield data?

• How did the authors choose to analyze Emx1 and Nr5a1 from the Allen Brain Observatory dataset? (the authors mention these Cre lines were imaged across all six areas, but that is true for Cux2, Rorb, and Rbp4 as well – why were these not analyzed?) Was Emx1 used from all layers or only from specific layers?

• I’d like a bit more information about how the classifier was applied to the 2P data. Were all neural traces (for the chosen Cre lines) used, or subselected? If subselected, how was this done? What was the test/train split? Were equal numbers of neurons used for each area or was this different? How many neurons were used?

2. Figure 3B shows generalization across mice. It’s not clear to me whether this is to show similar results for different mice, or whether it is to show that training on one mouse can predict testing on a different mouse. I believe it’s the former, but it would be very interesting if it were the latter. Please clarify.

3. Figure 4 color labels appear to be mis-assigned

4. Why is the accuracy for widefield pixels so much higher than for the 2P neurons? Given the shorter movie clip, and the single pixel data, I’d expect the widefield data to perform worse than 2P, not better. But perhaps the fact that the widefield signal for a pixel could combine activity from multiple neurons/processes could play a role in this? In a similar vein, why do Emx1 and Nr5a1 perform differently? Are there different numbers of neurons available? Could it be layer specific? I don’t think these can necessarily be conclusively answered, but if possible some discussion of these questions would help.

5. The Allen Brain Observatory is from the Allen Institute for Brain Science (not Allen Brain Institute – line 42). The citation should also match the citation policy for the dataset (https://alleninstitute.org/legal/citation-policy/)

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008548.r003

Decision Letter 1

Lyle J Graham, Blake A Richards

8 Oct 2020

Dear Mr. Kumar,

Thank you very much for submitting your manuscript "Functional Parcellation of Mouse Visual Cortex Using Statistical Techniques Reveals Response-Dependent Clustering of Cortical Processing Areas" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations. Specifically, Reviewer 1 has some remaining concerns about your border analyses and Reviewer 3 has several remaining significant concerns regarding your interpretations of your analyses. If you can address these concerns then we will likely be able to accept this paper.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. 

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Blake A. Richards

Associate Editor

PLOS Computational Biology

Lyle Graham

Deputy Editor

PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately:

[LINK]

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Uploaded as attachment.

Reviewer #2: The authors have adequately addressed my comments.

Reviewer #3: I think the authors have made some significant improvements to this paper that strengthen and clarify the work. There are, however, some outstanding issues I think need to be addressed

The addition of the confusion matrix in Figure 4 is valuable and informative – thank you for adding that. It might be worth pointing out that for the widefield data, ignoring V1, there is a higher (though not high) confusion of AL and LM for each other, and of AM and PM for each other. To me, this is interesting, and is consistent with other studies that show these two pairs of areas have responses more similar to each other than to other areas. This might be worth mentioning as I believe that it supports the idea that the clustering here represents functional differences.

The fact that this is not clear in the 2P data, however, might make this harder to support. Could you comment on why the confusion is greater for the 2P than the widefield? Is it that there is diversity of functional responses of neurons, so when looking at individual neurons the confusion will be higher than for pixels that merge signals from multiple cells?

Another question I have from the confusion matrix is that in the 2P data particularly, areas AM and PM appear to have lower accuracy than the other areas. Perhaps this is more true for the Emx1 data than the Nr5a1 data (the lower and more variable # of neurons in the Nr5a1 data makes me a bit wary of over-interpreting those results). Could you speak to why this might be?

I continue to disagree with the authors’ claim that the natural movie stimulus does not contain retinotopic information. This is not true. For any given frame of the movie, there is different content in different retinotopic regions. This might average out across all the frames so that the spatial/temporal frequency content is similar across space, but for each frame it is different. This is also true of the stimulus used for retinotopic mapping. While across the entire stimulus the content across space is the same, for each frame it is different – and THIS is what enables the authors to use the stimulus to map the retinotopy of the mouse visual cortex. But the same feature in the movies, different content in different regions of space for each individual frame, means that the movies do contain retinotopic information. The authors claim the opposite which is not true. This must be fixed.

And I still would like to see the comparison of the semi-supervised clustering with retinotopy (eg. Fig 9 of Zhuang et al) as the area borders appear to match retinotopy better than the area boundaries (that are derived from retinotopy).

I think Figure 7 improves the previous analysis of stimulus duration and enables a better comparison between the movie and resting state results. I think the authors could use this to emphasize two things in their text:

First, one of the stunning things from this analysis is that only a very short movie is needed to get fairly accurate separation of visual areas in the widefield data. 4.5 seconds contains probably only a few hundred stimulus frames, and they are likely fairly correlated frames at that. The fact that just a few visual stimulus features can drive this level of accuracy is, to me, really surprising, and I think this could be emphasized more in the text.

Second, the difference in the results here for supervised vs semi-supervised, and between widefield and 2p datasets, could be valuable for helping to understand the differences between these datasets. Eg. The fact that 2p accuracy is higher for averaged movie responses than single trial, while widefield shows little difference between them, could point to the impact of the correlated pixels on the results (eg. averaging across pixels vs averaging across trials). I think the fact that the supervised clustering (both widefield and 2p) has similar performance for resting state and single trial movie, while semi-supervised shows a difference between the two, might be similarly revealing for understanding the differences in those methods/results.

I find the new analysis in section 3 (figures 8 &9) very confusing and I don’t know that it helps to address the question that I had in my previous review.

First, the comparison of inter- and intra-areal correlations for the 2P dataset is fraught with problems. Namely, for the inter-area correlations, it must be pointed out (based on my understanding of the 2P dataset) that the areas are imaged in different sessions and different mice, while the intra-areal correlations include at least some data collected within the same session. This will make the intra-areal correlations higher simply because factors such as running or brain state will be in common. And while the correlations in the context of the movie at least have a common stimulus, the correlations of resting state … I just don’t know how to think about that comparison of inter- and intra-areal correlations. I would recommend the authors not include the 2P dataset in this particular analysis because of these issues.

However, my larger concern is that this analysis does not address my original question of what features of the activity are that separate these areas. The authors claim in their reply that this analysis shows that the intra-correlations are the key features. First, this analysis doesn’t convince me of this. I fully expect neighboring pixels/neurons to be more correlated, if only because they are retinotopically adjacent and thus likely receive common inputs, etc. Even in the absence of a patterned stimulus, they should have more correlations. I’d be more convinced if the inter- vs intra- areal correlations took retinotopy into account – eg. compared the same regions of retinotopy across areas.

However, even then, if the answer that the authors have for what features of activity separate the areas is just higher correlations within areas than across areas, it doesn’t tell us what’s different about, e.g., AL and AM. The authors make the conclusion that “visual cortical areas have characteristic activity patterns” – and to me this statement is not supported by this comparison of correlations. It seems to me that this type of analysis is unable to address the question of what distinguishes the areas, in which case I think the authors should walk back such claims.

Minor:

• Please indicate whether “resting state” for the wide field dataset is in the dark or with a gray screen?

• The wide-field dataset methods say the mice are Ai93 and that all mice expressed GCaMP6f or GCaMP6s. Ai93 is exclusively GCaMP6f, so either the “or GCaMP6s” is a mistake, or another reporter line (possibly Ai94) was used? Please fix.

• Mice expressing Emx1-IRES-Cre;Ai93 have been shown to have aberrant activity (Steinmetz et al 2017) – large cortex wide events. Do you think this could impact your results? In my mind, it seems unlikely for two reasons – one my recollection of the paper was that the aberrant activity was weaker in visual cortex than other areas; second, I would expect it to make pixels more similar to one another, so the fact that you are able to distinguish areas indicates that any aberrant activity likely isn’t factoring into the clustering analysis. It might be worth adding a comment on this – but I leave that to the author’s discretion.

• Are you using the ∆F/F traces for the 2P dataset provided through the AllenSDK or the raw fluorescence? I would assume the former, but there are mentions of “raw neuronal activity” that make me wonder. This would be important to add to the methods.

• The citation for the Allen Brain Observatory should not be Lein et al. 2007, but should be de Vries, Lecoq, Buice et al 2020.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-materials-and-methods

Attachment

Submitted filename: review.docx

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008548.r005

Decision Letter 2

Lyle J Graham, Blake A Richards

17 Nov 2020

Dear Mr. Kumar,

We are pleased to inform you that your manuscript 'Functional Parcellation of Mouse Visual Cortex Using Statistical Techniques Reveals Response-Dependent Clustering of Cortical Processing Areas' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Blake A. Richards

Associate Editor

PLOS Computational Biology

Lyle Graham

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008548.r006

Acceptance letter

Lyle J Graham, Blake A Richards

27 Jan 2021

PCOMPBIOL-D-20-00156R2

Functional Parcellation of Mouse Visual Cortex Using Statistical Techniques Reveals Response-Dependent Clustering of Cortical Processing Areas

Dear Dr Kumar,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Alice Ellingham

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
