PLOS Computational Biology
2021 Jun 28;17(6):e1009052. doi: 10.1371/journal.pcbi.1009052

Efficient encoding of spectrotemporal information for bat echolocation

Adarsh Chitradurga Achutha 1, Herbert Peremans 2, Uwe Firzlaff 3, Dieter Vanderelst 4,*
Editor: Emma K Towlson
PMCID: PMC8270447  PMID: 34181643

Abstract

In most animals, natural stimuli are characterized by a high degree of redundancy, limiting the ensemble of ecologically valid stimuli to a significantly reduced subspace of the representation space. Neural encodings can exploit this redundancy and increase sensing efficiency by generating low-dimensional representations that retain all information essential to support behavior. In this study, we investigate whether such an efficient encoding can be found to support a broad range of echolocation tasks in bats. Starting from an ensemble of echo signals collected with a biomimetic sonar system in natural indoor and outdoor environments, we use independent component analysis to derive a low-dimensional encoding of the output of a cochlear model. We show that this compressive encoding retains all essential information. To this end, we simulate a range of psychoacoustic experiments with bats. In these simulations, we train a set of neural networks to perform the experiments using the encoded echoes as input. The results show that the neural networks’ performance is at least as good as that of the bats. We conclude that efficient encoding of echo information is feasible and, given its many advantages, very likely to be employed by bats. Previous studies have demonstrated that low-dimensional encodings can support relatively high task performance. In contrast to that work, we show that high performance can also be achieved when the low-dimensional filters are derived from a data set of realistic echo signals, not tailored to specific experimental conditions.

Author summary

We show that complex (and simple) echoes from real environments can be efficiently and effectively represented using a small set of filters. Critically, we show that high performance across a range of tasks can be achieved when low-dimensional filters are derived from a data set of realistic echo signals, not tailored to specific experimental conditions. The redundancy in echoic information opens up the opportunity for efficient encoding, reducing the computational load of echo processing as well as the memory load for storing the information. Therefore, we predict that the auditory system of bats capitalizes on this opportunity for efficient coding by implementing filters with spectrotemporal properties akin to those hypothesized here. Indeed, the filters we obtain here are similar to those found in other animals and other sensory modalities. Our results indicate that bats could exploit the redundancy in sonar signals to implement an efficient neural encoding of the relevant information.

Introduction

Many natural stimuli encountered by animals are characterized by a high degree of redundancy [1]. Efficient neural encoding retains essential information while reducing this redundancy. By extracting the most crucial aspects of stimuli, the efficiency of sensing is drastically increased [2]. For echolocating bats, the time of arrival of echoes, which conveys the target’s distance, is the most relevant sensory information [3]. In addition, the spectral content and intensity of the echoes also convey essential information for localizing and recognizing targets [4, 5].

Earlier results from robotic sonar suggest that echo signals, like many other natural stimuli, are highly redundant. For example, [6] presented a biomimetic system that could differentiate the head and tail sides of a coin. This binaural system reduced the 2.4 ms long, 60 kHz waveform at each receiver to a 16-value vector. This corresponded to 0.15 samples per millisecond instead of the 120 samples per millisecond prescribed by the Nyquist criterion. Reference [7] collected many broadband echoes (100–30 kHz) in different natural bat habitats and demonstrated successful place and pose recognition based on these echoes. Each echo was represented using a distance-intensity profile of fewer than 100 samples, or less than 3 samples per millisecond. References [8] and [9] presented simulations and a robotic model of obstacle avoidance in bats. Their obstacle avoidance strategy used only the (sign of the) interaural level difference to steer the artificial bats around obstacles, demonstrating successful, albeit simple, sensorimotor control using a highly reduced representation of the echo train.

The work referred to above showed that echoic information can be compressed in the context of a particular task, e.g., scene recognition [7] or object recognition [6, 10]. However, to be truly useful, an efficient encoding must support many different echolocation-based behaviors. Hence, to maximally exploit this apparent redundancy in echo signals, bats can be presumed to have evolved efficient and task-independent neural encoding strategies for extracting relevant echo features. While specific neural encodings in various areas of the bat’s brain have been studied in great detail, to the best of our knowledge, encodings that explicitly take into account the redundancy in echo signals have not yet been studied systematically. Here, we set out to devise such an efficient encoding scheme for echo signals.

First, we collect a large sample of representative stimuli, in this case, echoes from different indoor and outdoor environments, and convert them to cochleograms, as proposed by Lutz Wiegrebe, to whose memory we dedicate this paper [11]. Next, we employ Independent Component Analysis to derive a set of filters that most efficiently encode the ensemble of cochleograms. This same approach has been used to good effect in other sensory domains, including visual [12] and auditory [13] perception. The resulting filters can be interpreted as the spectrotemporal receptive fields of a set of hypothetical neurons [14]. We find that 25 filters can be used to encode 99% of the variance in the cochleograms derived from echoes collected in various environments. Next, we present simulations showing that these filters retain sufficient information to complete several sonar-based tasks on which bats have been tested before. In particular, we show that the filters allow for accurate object recognition, monaural target localization (range and elevation), and scene recognition. Crucially, we show that high performance across a range of tasks can be achieved when low-dimensional filters are derived from echo signals not tailored to specific sonar-based tasks.

These results show that the (lossy) compression performed by the filters provides a substantial reduction in the amount of echoic information that needs to be encoded, processed, and stored while retaining the crucial aspects of the stimuli. The proposed filters have been derived on theoretical grounds, but, as discussed, evidence from auditory and visual encoding in both bats and other animals supports their biological plausibility.

Methods

Overview

The overall approach of the current study is as follows (also depicted in Fig 1). We start by collecting echoes (N = 1014) in 21 natural bat habitats and indoor environments. While not intended to be exhaustive, this dataset covers a broad range of environments containing human-made and natural reflectors of varying complexity that bats might conceivably encounter. Using a functional model of the auditory periphery of the bat [11], we convert the echoes to cochleograms. The information in the cochleogram is a good approximation of that contained in the neural activity at the auditory nerve [15]. Next, based on this database of ecologically relevant cochleograms, we derive a set of filters for encoding the cochleogram using the technique of Independent Component Analysis [16].

Fig 1. General outline of the approach of this paper.


Top Echoes collected in various environments are converted to cochleograms using a model of the auditory periphery [11]. From these cochleograms, we derive a set of 25 independent components (=filters) using Independent Component Analysis (ICA). Using these components as a basis for cochleogram encoding allows compressing each cochleogram into a set of 25 values (=filter outputs). Middle To test whether these 25 filters retain sufficient information to explain the behavior of bats in a range of psychophysical tasks, we model four experiments. For each of these experiments, we generate phantom echoes based on the impulse responses as used in the bat experiments. Next, we convert these phantom echoes to cochleograms and add internal noise. These cochleograms are then encoded with the 25 filters derived from the echo database. Finally, a neural network is trained to assess whether this encoding retains sufficient information to achieve a similar discrimination performance as the bats. Bottom In addition to modeling four bat experiments, we also test whether the encoding retains enough information to memorize and recognize the 21 locations at which the data were collected.

To verify that this encoding retains the relevant spectrotemporal echo information useful to a bat, we simulate several previously published behavioral discrimination experiments with bats. In our experiments, we generate echoes similar to those used in the bat experiments and encode the associated cochleograms with the proposed efficient encoding scheme. We then use neural networks to assess whether the information retained by this compressive encoding is sufficient to attain a performance level comparable to that of the bats in the discrimination experiments. See [17] for a similar approach. As a final test, we also assess whether the information retained by the encoding is sufficient to memorize and discriminate between the cochleograms collected at the 21 different locations, a capability required for place and scene recognition by bats [7].

Deriving an efficient cochlear encoding

We collected echo data using a sonar data acquisition device mounted on a tripod (Fig B in S1 Text). The device consisted of a Sensecomp 7000 broadband emitter. A 1-millisecond FM-pulse sweeping down from 70 kHz to 30 kHz (hyperbolic sweep) was used. This band corresponds to the lower frequency range used by many species of bats in echolocation [18]. The device featured two Knowles microphones. However, in this study, only one of the microphones was used. Echoes recorded by the microphone were sampled at 360 kHz.
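For concreteness, the emission signal can be sketched in code. The sketch below generates the 1 ms hyperbolic (linear-period) sweep at the device's 360 kHz sample rate; the unit amplitude and the absence of any windowing envelope are our simplifying assumptions, as the text does not specify them.

```python
import numpy as np

FS = 360_000                            # device sample rate (Hz)
F0, F1, T = 70_000.0, 30_000.0, 1e-3    # sweep start/end frequency (Hz), duration (s)

def hyperbolic_sweep(fs=FS, f0=F0, f1=F1, T=T):
    """Hyperbolic FM sweep from f0 down to f1 over T seconds.

    Instantaneous frequency: f(t) = f0*f1*T / (f1*T + (f0 - f1)*t),
    so that f(0) = f0 and f(T) = f1. The phase is its time integral.
    """
    t = np.arange(int(fs * T)) / fs
    phase = 2 * np.pi * f0 * f1 * T / (f0 - f1) * np.log1p((f0 - f1) * t / (f1 * T))
    return np.sin(phase)

emission = hyperbolic_sweep()
```

Integrating the instantaneous frequency shows that the sweep contains f0·f1·T/(f0 − f1) · ln(f0/f1) ≈ 44.5 cycles.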

Echoes were collected in several outdoor and indoor environments. Outdoors, we ensonified hedgerows, dense vegetation, plants, tree foliage, and shrubs. Indoors, data were collected in various rooms of a private residence, in lab spaces at the University of Cincinnati, and inside a barn. We selected these locations to include simple (human-made) reflectors and highly complex stochastic reflectors (e.g., leafy foliage). In total, we collected 1014 echoes, 500 of which were collected indoors. Echoes were collected in batches of 30 to 50 at 21 different locations. Some examples of the locations we used are shown in Fig 1. The echoes had a duration of 19 ms, corresponding to a maximal range of 3.2 m (= 343 m/s × 0.019 s / 2) for a speed of sound of 343 m/s. The duration of 19 ms was dictated by the size of the onboard memory of the data acquisition device. Stilz and Schnitzler [19] found that, depending on atmospheric conditions, echolocation frequency, and the dynamic range of the sonar system, the maximum detection range for extended backgrounds such as a forest edge can be as short as 2.4 m. Therefore, we propose that the chosen echo length, while at the lower end, falls within the ecologically relevant range.

While collecting data, the position and orientation of the ensonification device were changed between subsequent emissions. We moved the device pseudo-randomly through each space by displacing it by about 20 centimeters between measurements and turning it up to 90 degrees. For each position and orientation of the device, three measurements were taken in succession, separated by 1 second.

The echoes, for each of the three repeats, were converted into cochleograms using the functional model of middle and inner ear processing in the bat proposed by Wiegrebe [11] (see Fig 1). We averaged the cochleograms across the three repeats to increase the signal-to-noise ratio. The model consists of a bank of gammatone filters followed by half-wave rectification and exponential compression. Finally, each frequency channel’s output is low-pass filtered with a cut-off frequency of 1 kHz. As the bat has knowledge of its own emission, and as we ignore possible Doppler shifts (hyperbolic FM sweeps are maximally Doppler-shift resilient [20, 21]), the frequency modulation present in each sub-echo of the cochleogram can be compensated for by a ‘dechirping’ operation. Through this ‘dechirping’ mechanism, we shift the response in each cochlear frequency channel in time to align the responses (see Fig 1). A similar compensation mechanism, implemented through autocorrelation, is included in both the SCAT model proposed by Saillant et al. [22] and the model proposed by Wiegrebe [11]. In the current study, the 20 center frequencies of the gammatone filter bank were spaced by Equivalent Rectangular Bandwidths [23]; they are listed in S1 Text. An example of a cochleogram is shown in Fig 1.
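As an illustration of this processing chain (not the authors' implementation), the sketch below passes a signal through a gammatone filterbank followed by half-wave rectification, power-law compression, and a 1 kHz low-pass. The filter order (4), the human-derived Glasberg–Moore ERB formula, the compression exponent (0.4), and the second-order Butterworth low-pass are illustrative assumptions; the dechirping step is omitted.

```python
import numpy as np
from scipy.signal import butter, lfilter

FS = 360_000  # sample rate (Hz)

def gammatone_ir(fc, fs=FS, order=4, duration=0.002):
    """Impulse response of a gammatone filter centred on fc.

    Bandwidth follows the (human-derived) ERB formula of Glasberg & Moore;
    the exact parameters used in the paper are not specified, so these are
    illustrative assumptions.
    """
    t = np.arange(int(fs * duration)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
    b = 1.019 * erb
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.abs(g).sum()

def cochleogram(echo, centre_freqs, fs=FS, compress=0.4, cutoff=1000.0):
    """Functional middle/inner-ear model sketch: gammatone filterbank,
    half-wave rectification, power-law compression, 1 kHz low-pass."""
    b_lp, a_lp = butter(2, cutoff / (fs / 2))       # 1 kHz low-pass per channel
    channels = []
    for fc in centre_freqs:
        y = np.convolve(echo, gammatone_ir(fc, fs), mode="same")
        y = np.maximum(y, 0.0) ** compress          # rectify + compress
        channels.append(lfilter(b_lp, a_lp, y))
    return np.vstack(channels)                      # (n_channels, n_samples)
```

Feeding a pure tone through this sketch produces the expected place code: the channel tuned to the tone's frequency responds most strongly.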

Next, each cochleogram Sj, j = 1, ⋯, N is converted into a vector xj by concatenating the columns of the cochleogram. The efficient encoding we propose assumes that this observed vector xj can be written as a linear mixture of basic components,

xj = Σ_{i=1}^{N} cj,i · Ψi = A · cj (1)

with the basic components Ψi, i = 1, ⋯, N making up the columns of the matrix A = [Ψ1 Ψ2 ⋯ ΨN] and cj a vector of statistically independent weights, with the i-th component of this vector denoted by cj,i. Given the dataset of cochleograms, the ICA technique determines the matrix A that minimizes the multi-information (a generalization of mutual information measuring the statistical dependence between multiple variables) of the weights cj. By inverting the concatenation operation performed on the cochleograms, we can interpret the basic components Ψi, i = 1, ⋯, N, just like the cochleograms, as functions of time t and frequency f. In this paper, we use the FastICA algorithm [24] as implemented by the scikit-learn Python package [25] to derive the basic components from the set of collected cochleograms.

In principle, the set of weights cj encodes the cochleograms without loss, i.e., the dimensions of the vectors xj and cj are the same. However, in this paper, our goal is to assess whether a reduced set of basic components Ψi, i = 1, ⋯, M with M ≪ N can capture sufficient spectrotemporal information to successfully support a broad range of typical echolocation tasks. Hence, before the ICA proper is applied, the cochleograms first undergo a preprocessing step consisting of the removal of the mean cochleogram and principal component analysis (PCA). Projecting the cochleograms onto their principal components removes linear correlations and allows, by dropping dimensions with low variance, a dimensional reduction. The PCA performed on the cochleogram dataset showed that 25 components could capture over 99% of the variance in the cochleograms. Next, after mapping the cochleograms onto this reduced 25-dimensional Principal Components space, the 25 independent components representing the data best are determined. As this preprocessing step is included in the FastICA implementation, we refer to both these processing steps as ICA in Fig 1.
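A minimal, self-contained sketch of this step using scikit-learn's FastICA, which performs the mean removal and PCA whitening internally. The synthetic data below stand in for the real flattened cochleograms; the 20 × 60 cochleogram size and the sparse latent model are placeholders for illustration only.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Stand-in for the real data: N "cochleograms" of 20 channels x 60 time bins,
# flattened to vectors x_j (the real cochleograms give much longer vectors).
N, n_chan, n_time = 300, 20, 60
latent = rng.laplace(size=(N, 25))              # sparse latent causes
mixing = rng.normal(size=(25, n_chan * n_time))
X = latent @ mixing                             # x_j = A . c_j

# FastICA whitens the data (mean removal + PCA to 25 dimensions) internally,
# then rotates to find the 25 maximally independent components.
ica = FastICA(n_components=25, whiten="unit-variance", random_state=0)
codes = ica.fit_transform(X)       # the 25-value encodings c_j
filters = ica.mixing_              # columns ~ the basic components Psi_i

# Each component can be reshaped back into a spectrotemporal filter:
psi_0 = filters[:, 0].reshape(n_chan, n_time)
```

Because the synthetic data occupy exactly a 25-dimensional subspace, the 25-value codes reconstruct the inputs essentially without loss, mirroring the 99%-variance result reported above.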

An example of a cochleogram Sj converted to its 25 dimensional representation cj is given in Fig 2.

Fig 2. Example of a cochleogram from the ensonification data.


Its 25D representation is shown in the center of the figure. At the bottom, the reconstructed cochleogram (from the 25D representation) is shown. The reconstructed version is a smoothed version of the original. (m.u.: model units).

Modeling behavioral data

To demonstrate that this compressive encoding retains sufficient information to explain bats’ behavior in various experiments, we modeled four previously published behavioral experiments. In modeling each of these, we employed the same approach outlined in Fig 1.

We generated artificial echoes according to the same procedure used in each behavioral experiment. Three of the four behavioral experiments considered used a phantom target paradigm. In these experiments, the bat’s emission was recorded using a microphone and convolved in real-time with a target impulse response. The result was played back to the bat. We generated impulse responses mimicking those used in the experiments and convolved them with the same emission signal used to collect the real sonar echoes, i.e., a 1-millisecond FM-pulse sweeping down from 70 to 30 kHz. The fourth experiment used real targets (i.e., small beads). We approximated those as reflecting a simple copy of the incident emission.
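The phantom-echo construction amounts to a convolution of the emission with a target impulse response. The sketch below illustrates it; the two-glint impulse response (delays and relative level) is a hypothetical example, not one of the published stimuli.

```python
import numpy as np
from scipy.signal import chirp

FS = 360_000                                  # sample rate (Hz)
t = np.arange(int(FS * 0.001)) / FS           # 1 ms emission
# Hyperbolic FM sweep, 70 kHz down to 30 kHz, as used for the real recordings
emission = chirp(t, f0=70_000, t1=0.001, f1=30_000, method="hyperbolic")

def phantom_echo(impulse_response, emission=emission):
    """Phantom-target echo: the emission convolved with a target impulse response."""
    return np.convolve(emission, impulse_response)

# Hypothetical two-glint target: two reflections 50 us apart, the second one -6 dB
ir = np.zeros(int(FS * 0.012))                # 12 ms impulse-response window
ir[int(FS * 0.011)] = 1.0                     # first glint at 11 ms delay
ir[int(FS * 0.011) + int(FS * 50e-6)] = 10 ** (-6 / 20)
echo = phantom_echo(ir)
```

The resulting echo is silent before the first glint's delay and contains two overlapping copies of the sweep thereafter.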

These artificial echoes were converted to cochleograms employing the same model [11] used to process the ensonification data. Internal noise (see below for details) was added to the cochleograms before encoding them using the 25 independent components derived from the echo database. Next, we ascertained that this encoding retained essential spectrotemporal information. We did this by training neural networks on the compressed encoding of the cochleograms to determine discrimination thresholds that can be compared to the behavioral findings reported in the earlier studies.

For each experiment modeled, we constructed a separate training and testing data set. The training set was always used exclusively to train the neural network, whereas all performances reported here are exclusively based on the separate test set. For some experiments, we wanted to test whether the neural network could generalize from the training set. In this case, the training and the test set have been generated with different parameter settings. We will discuss this where appropriate.

The networks consisted of a 25 node input layer matching the dimension of the 25 independent component space. We used two hidden layers, each with 50 nodes. The number of nodes was selected for computational convenience, and we did not attempt to optimize the networks’ size. The hidden layer neurons were Rectified Linear Units. The networks had one output node, with a sigmoid activation function. The experiments we modeled used two-alternative forced-choice (2AFC) tasks. Therefore, we trained the networks to generate an output = 1 for inputs corresponding with a rewarded stimulus and output = 0 for inputs corresponding with a non-rewarded stimulus. The loss function was the absolute difference between the desired and actual activation of the output node. The training was done using RMSProp (Root Mean Square Propagation), as implemented in Keras [26].
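To make the architecture concrete, here is a plain-NumPy re-implementation sketch of this network (the paper used Keras); the He weight initialization and the learning-rate settings are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch of the 2AFC network described above:
# 25 inputs -> 50 ReLU -> 50 ReLU -> 1 sigmoid, absolute-error loss, RMSProp.
sizes = [25, 50, 50, 1]
W = [rng.normal(0.0, np.sqrt(2.0 / m), (m, n)) for m, n in zip(sizes, sizes[1:])]
b = [np.zeros(n) for n in sizes[1:]]
vW = [np.zeros_like(w) for w in W]   # RMSProp running mean-square gradients
vb = [np.zeros_like(x) for x in b]

def forward(X):
    h1 = np.maximum(X @ W[0] + b[0], 0.0)
    h2 = np.maximum(h1 @ W[1] + b[1], 0.0)
    out = 1.0 / (1.0 + np.exp(-(h2 @ W[2] + b[2])))
    return h1, h2, out[:, 0]

def train_step(X, y, lr=1e-2, rho=0.9, eps=1e-8):
    """One full-batch RMSProp step on the absolute-error loss |y - out|."""
    h1, h2, out = forward(X)
    # d|y - out| / d(pre-sigmoid) = -sign(y - out) * out * (1 - out)
    d3 = (-np.sign(y - out) * out * (1.0 - out))[:, None]
    d2 = (d3 @ W[2].T) * (h2 > 0)
    d1 = (d2 @ W[1].T) * (h1 > 0)
    grads_W = [X.T @ d1, h1.T @ d2, h2.T @ d3]
    grads_b = [d1.sum(0), d2.sum(0), d3.sum(0)]
    for i in range(3):
        gW, gb = grads_W[i] / len(X), grads_b[i] / len(X)
        vW[i][:] = rho * vW[i] + (1 - rho) * gW ** 2
        vb[i][:] = rho * vb[i] + (1 - rho) * gb ** 2
        W[i] -= lr * gW / (np.sqrt(vW[i]) + eps)
        b[i] -= lr * gb / (np.sqrt(vb[i]) + eps)
    return np.mean(np.abs(y - out))
```

On a toy, linearly separable 2AFC problem, a few hundred such steps suffice to drive the absolute-error loss down and separate the two stimulus classes.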

Noise model

We added Gaussian (zero mean) noise to the cochleograms before applying the compressive encoding to model the effects of internal noise on the bat’s decision process [11, 27]. To derive the variance of the noise, we employed the calibration procedure proposed by [27], i.e., we established the noise level that resulted in similar echo intensity discrimination performance as reported for bats. Reference [28] summarizes intensity discrimination threshold experiments, citing Just-Noticeable Difference (JND) values for echo intensity ranging from 1 to 5 dB. We chose to use a value at the upper end of that range, i.e., 5 dB, as this same value (for 70% correct decisions) was reported in phantom target experiments similar to the ones we will be simulating [29].

In particular, we generated a cochleogram Sref corresponding with a reference echo amplitude. A second cochleogram S+5 was generated with an echo amplitude 5 dB higher than the reference amplitude. Next, we iteratively searched for the level of Gaussian noise, N(0,σ), that allowed telling S+5 apart from Sref for 70% of the noise realizations. The value of σ that allowed for 70% correct decisions was approximately 0.1 (S1 Text). A cochleogram with added noise is shown in Fig 1. Using σ = 0.1 gives the cochleograms in this paper a dynamic range of about 42 dB, i.e., the maximum value across all cochleograms is approximately 15.
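The calibration search can be sketched as follows. The nearest-template decision rule and the flat example templates used in verification are our simplifications; the paper's procedure operates on full cochleograms.

```python
import numpy as np

rng = np.random.default_rng(2)

def pct_correct(s_ref, s_plus5, sigma, trials=2000):
    """Fraction of noise realizations for which a nearest-template observer
    correctly attributes a noisy version of s_plus5 to s_plus5 (not s_ref)."""
    noise = rng.normal(0.0, sigma, (trials,) + np.shape(s_plus5))
    obs = s_plus5 + noise
    d_ref = ((obs - s_ref) ** 2).sum(axis=-1)
    d_p5 = ((obs - s_plus5) ** 2).sum(axis=-1)
    return float(np.mean(d_p5 < d_ref))

def calibrate_sigma(s_ref, s_plus5, target=0.70, lo=1e-3, hi=10.0, iters=30):
    """Bisect for the noise level sigma yielding `target` fraction correct;
    pct_correct decreases monotonically with sigma."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pct_correct(s_ref, s_plus5, mid) > target:
            lo = mid          # noise too weak: performance above target
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Running the bisection on a pair of templates differing by 5 dB returns a sigma at which the observer is correct for approximately 70% of the noise realizations, mirroring the calibration criterion above.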

In the remainder of this paper, all simulated echoes were generated with the same reference amplitude. In those cases where amplitudes of echoes were varied, the amplitude roving was done around the reference value.

Experiment 1: Encoding temporal information

To assess whether the compressive encoding retained sufficient temporal information, we mimicked the experiments that quantified the just noticeable difference in echo delay in bats (see [30] for references). For example, in the experiments described by Denzinger and Schnitzler [31], the bats were rewarded for discriminating a phantom target echo at a fixed delay from echoes with a longer and variable delay.

We modeled this absolute delay discrimination experiment by generating target impulse responses consisting of a single impulse. The rewarded target impulse response was fixed at a delay of 11 ms. Unrewarded target impulse responses were shifted backward from 1000 μs to 50 μs relative to the fixed target. We applied amplitude roving by randomly varying the echo amplitudes over a 30 dB range to exclude overall echo-level cues. The resulting echoes were converted to cochleograms, noise was added, and the result was encoded with the 25 filters derived above.

Experiment 2: Encoding simple reflector descriptions

To assess whether the compressive encoding retained sufficient information about simple target impulse responses, we investigated the discrimination of stimuli containing two echoes separated by varying time delays, as proposed in [32]. In these experiments, Megaderma lyra were presented with phantom targets defined by an impulse response consisting of two impulses separated by a variable time delay ranging from 1.3 to 26 μs. In some of the experimental conditions, these two echoes had unequal strength. These short time delays, falling within the cochlear frequency channels’ integration time, result in a notch in the phantom targets’ spectral image. Indeed, assuming the received echo x(t) consists of two delayed copies of the call e(t) with the second one possibly amplified/attenuated with respect to the first one,

x(t) = e(t) + a · e(t − τ), (2)

the spectrum of such an echo can be written as

X(f) = E(f) · (1 + a · exp(−j2πfτ)), (3)

with E(f) the spectrum of the emission. Assuming no phase shift between the echoes, this spectrum contains notches at frequencies fm that depend on the time delay τ

fm = (2m + 1) / (2τ), (4)

with m an integer. The depth of the notch depends on the ratio a of the leading and trailing echo amplitudes, with maximum depth achieved for a = 1. In the original experiments, bats were trained to discriminate a stimulus with a reference time delay of 7.77 μs (or reference notch frequency of 64.4 kHz) from stimuli with different time delays (or notch frequencies).
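Eqs (2)–(4) can be verified numerically: for the reference delay τ = 7.77 μs, the first notch (m = 0) of the two-glint spectrum falls near the reported 64.4 kHz. A quick check, scanning the magnitude of the transfer function in Eq (3) with a = 1:

```python
import numpy as np

tau = 7.77e-6                       # reference two-glint delay (s)

def notch_freq(m, tau):
    """Notch frequencies of Eq (4): f_m = (2m + 1) / (2 tau)."""
    return (2 * m + 1) / (2 * tau)

first = notch_freq(0, tau)          # first notch, ~64.35 kHz

# Locate the first spectral minimum of |1 + a exp(-j 2 pi f tau)| (Eq (3), a = 1)
freqs = np.linspace(1e3, 100e3, 100_000)
mag = np.abs(1 + np.exp(-2j * np.pi * freqs * tau))
deepest = freqs[mag.argmin()]
```

The grid search recovers the same frequency as the closed-form expression, confirming the correspondence between delay and notch position.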

In keeping with the study by Schmidt [32], we generated stimuli consisting of two echoes with the delay τ between them varied such that the corresponding notch falls in the interval from 16 to 70 kHz (2 kHz steps), corresponding with the passband of our sonar system. As in the most challenging of the original study’s experimental conditions, the leading echo’s amplitude was set 6 dB lower than the trailing echo’s, resulting in less pronounced notches. Similar to the previous experiment, we applied amplitude roving by randomly varying the two echoes’ amplitudes over a 30 dB range. The delayed echoes were again converted to cochleograms, noise was added, and the results were mapped onto the same 25 independent components (or filters).

Note that irrespective of whether the spectral image is used directly by the bats to solve this task, as concluded by Schmidt [32], or whether this spectral image is first transformed into a time-domain representation of the target impulse response, as suggested in references [22, 33], the loss of spectral information will harm discrimination performance in this experiment. Hence, while the previous experiment studied how the proposed compressive encoding preserves temporal information, this experiment tests how well spectral information is preserved.

Experiment 3: Encoding scale-invariant reflector descriptions

To assess whether the compressive encoding retained sufficient information about more complex target impulse responses as well, we mimicked an experiment on scale-invariant reflector recognition. In this experiment, Firzlaff et al. [34] trained bats to discriminate echoes resulting from two different impulse responses consisting of 12 impulses each. After the bats had been trained to distinguish between these two successfully, they presented the bats with scaled versions of the target impulse responses mimicking decreased or increased reflector sizes. Scaling entailed multiplying the impulse amplitudes with the square of the scale factor (area of target scales as the square of linear scale factor) and compressing or expanding the target impulse responses along the time axis with the scale factor. As the bats could still discriminate between scaled versions of the original target impulse responses, the experiments showed that the bats could generalize from the trained target impulse responses to their scaled versions.

Similar to the approach taken by Firzlaff et al. [34], we generated two impulse responses consisting of 12 impulses randomly distributed over a 1.86 ms time interval. Scaled versions of these two impulse responses were generated using the same scaling procedure and the same range of scale factors applied in the bat experiments, i.e., a scale factor ranging from 0.65 to 1.5. We applied 15 different scalings in this range. Again, the impulse responses were converted to noisy cochleograms before encoding them using the same 25 filters. To mimic the original study’s experimental design, we trained the neural network to discriminate the impulse responses at scale 1. Next, we tested the neural network’s ability to generalize this discrimination to the scaled versions.
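The scaling rule can be sketched as follows; the glint amplitude distribution is a placeholder, as the amplitudes of the original impulse responses are not specified above.

```python
import numpy as np

rng = np.random.default_rng(4)

def random_target_glints(n_glints=12, window_s=1.86e-3):
    """Glint times and strengths of a random target: 12 impulses uniformly
    distributed over a 1.86 ms interval (the amplitude range is a placeholder)."""
    times = rng.uniform(0.0, window_s, n_glints)
    amps = rng.uniform(0.2, 1.0, n_glints)
    return times, amps

def scale_target(times, amps, s):
    """Scale a reflector by linear factor s: glint times stretch with s,
    amplitudes scale with s**2 (reflecting area goes as the square of size)."""
    return times * s, amps * s ** 2
```

Applying `scale_target` with factors from 0.65 to 1.5 reproduces the range of reflector sizes used in the bat experiments.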

Experiment 4: Encoding spectral information

To assess whether the compressive encoding retained sufficient spectral information for spatial localization of targets, we modeled the elevation discrimination experiments reported in references [5, 35]. As in other mammals, the pinnae of bats generate acoustic cues that aid the localization of echo-producing reflectors. Experiments have shown that the spectral cues imposed by the pinnae are particularly important for localization in the vertical plane, for example, [5, 3537].

To the best of our knowledge, no phantom target experiments have been used to test bats’ elevation discrimination performance. Therefore, we cannot exactly duplicate a particular experimental setup. Instead, we generated cochleograms of echoes from virtual targets at an azimuth of 0 degrees and elevation angles varying between -20 and 20 degrees, using the head-related transfer function (HRTF) of Phyllostomus discolor [38, 39]. This interval represents the interval of best elevation discrimination found by Lawrence, Simmons and Wotton [5, 35]. We trained the neural network to return a 0 when the 25D input vector was derived from echoes filtered with the HRTF for negative elevations and a 1 for echoes filtered with the HRTF for positive elevations. While this setup differs from the bat experiments referred to above, we propose that it also provides an estimate of the vertical angular acuity. Furthermore, even if the performances in the real and simulated experiments are not directly comparable, solving this discrimination task for target positions close to the horizontal plane requires the neural network to distinguish subtle location-dependent HRTF cues, showing that the compressive encoding retains those cues.

Memorizing acoustic signatures

In previous work, we established that cochleograms contain sufficient information to recognize scenes and sonar poses (location and orientation) [7]. In this experiment, we assessed whether the measured cochleograms would still allow place recognition after being projected into the compressed independent components space. To test this, we trained a neural network to associate each of the 25D-vector encodings (see Fig 1) of the empirically collected echoes with the location at which they were collected.

The network used for memorizing acoustic signatures differed from those used in modeling the 2AFC behavioral experiments. This network had three hidden layers with 50, 100, and 50 Rectified Linear Units. The output layer had 21 units, each corresponding to one of the 21 locations where acoustic data was collected. The output layer had a softmax activation function. This activation function normalizes the neurons’ output into a probability distribution. Therefore, the output of this network could be interpreted as a probability distribution across the 21 locations. Categorical Cross Entropy was used as a loss function, and optimization was performed using the Adaptive Moment Estimation (adam) algorithm [40].
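The output stage of this network can be illustrated with the softmax and cross-entropy definitions it relies on (a definitional sketch, not the Keras code; the example activations are hypothetical).

```python
import numpy as np

def softmax(z):
    """Normalize a vector of 21 output activations into a probability
    distribution over the 21 locations (numerically stable form)."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def categorical_cross_entropy(p, label):
    """Loss for one example: negative log-probability of the true location."""
    return -np.log(p[..., label])

activations = np.zeros(21)
activations[7] = 3.0               # hypothetical network output
probs = softmax(activations)
```

Because the softmax output sums to one, the loss is smallest when the network assigns most probability mass to the correct location.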

Results

Basic components

Fig 3A–3D visualizes four of the basic components Ψi, i = 1, ⋯, 25, derived from the echo database. As is clear from this figure, the components show Gabor-like properties along the time dimension. Each component is sensitive to a given delay (or distance) and shows some suppression for shorter and longer delays. Note that because of the ‘dechirping’ operation performed before calculating the cochleograms, all the filters’ Gabor-like responses are vertically oriented.

Fig 3. The basic components and their properties.


a-d Selected examples of independent components. e The frequency response of the 25 components (ordered according to most sensitive frequency). For each component, its time-average as a function of frequency is plotted. f The frequency response of two selected components (i.e., two columns from panel e). g The temporal response of the 25 components (ordered according to most sensitive delay or distance (=343 × delay/2), with speed of sound 343m/sec). For each component, its frequency-average as a function of distance (time) is plotted. h Histogram of the best distances (times) of the 25 components.

To investigate the spectral properties of this set of components, we calculate for each component and each frequency channel the mean value along the time dimension (see Fig 3E), and likewise, for the temporal properties, we calculate for each component and for each time the mean value along the frequency dimension (see Fig 3G). As can be seen from Fig 3E, the frequency responses of the components are centered on the range of frequencies most salient in the echoes, i.e., a band around 35 kHz to which the sonar is most sensitive. However, individual components differ somewhat in the frequency to which they are most sensitive. Some components exhibit a center-surround characteristic along the frequency axis similar to that apparent along the time axis: they are most (or least) sensitive to specific frequencies, some having multiple peaks in their frequency response, with a region of suppression (or activation) above or below this range, as illustrated by the time-averages of the two selected components shown in Fig 3F. These two components are most sensitive to different frequencies and each have a suppressed frequency region as well. Note that these differences in frequency response allow the set of independent components to encode the spectral properties of the cochleograms.

Fig 3G shows that the components also differ in the delay or distance to which they are most sensitive. This figure shows that the 25 components each have a slightly different best time or distance response. There is a tendency for the components to be most sensitive to shorter time delays. A histogram (Fig 3H) of the best distances (delays) of the 25 components confirms this. Also, the components respond at a faster time-scale at shorter distances and tend to become slower at longer distances, as can be seen in Fig 3A–3D. This can be understood by noting that absorption in air is higher for higher frequencies. Hence, high-frequency contributions to real echoes will be more pronounced at shorter distances and become less so as the distance increases. These high-frequency contributions will stimulate the high-frequency channels of the cochleogram, and because of the larger bandwidths of these channels, this will result in a faster overall cochlear response. Indeed, from the Gammatone filterbank responses shown in Fig 1 (red = high frequency channels, blue = low frequency channels), we note that the low-frequency channels have a much slower response time than the high-frequency channels, see [11]. This indicates that the independent components we derived from the real echoes have correctly captured the physical constraints shaping the spectrotemporal cues present in the cochleograms.
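The bandwidth argument can be made concrete with the equivalent rectangular bandwidth (ERB) formula of Glasberg and Moore [23]; note that this formula was derived for human hearing, and its use at ultrasonic frequencies is an extrapolation common in cochlear models such as [11]. The frequencies below are illustrative:

```python
def erb_hz(f_hz):
    """Equivalent rectangular bandwidth (Hz) of an auditory filter
    centred at f_hz, using the Glasberg & Moore formula."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

# A low and a high channel center frequency (illustrative values).
low, high = erb_hz(20e3), erb_hz(80e3)

# Higher-frequency channels are wider; since the duration of a filter's
# impulse response scales roughly with 1/bandwidth, they ring more briefly.
ring_low, ring_high = 1.0 / low, 1.0 / high
```

Because the high-frequency channels are several times wider, their responses decay several times faster, consistent with the faster time-scale of the short-distance components.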

The derived components encode the spectrotemporal information in the cochleograms by being most sensitive to different time delays and frequencies. The filters exhibit evident center-surround characteristics along the time axis: they have a best delay surrounded by suppression regions. Similar, though somewhat less pronounced, features also emerge along the frequency axis.

Modeling behavioral experiments

Fig 4 shows the results from modeling the four behavioral tasks. As can be ascertained from Fig 4A, when mimicking echo delay discrimination experiments by feeding the compressed encodings into a neural network, the just-noticeable difference (JND) in echo delay is about 12 μs.

Fig 4. All red curves depict neural network performances.


Grey data shows behavioral data taken from previously published experiments. (a) The results from modeling a delay discrimination task, e.g., as described by Denzinger and Schnitzler [31]. In gray, we depict the range of discrimination thresholds found in the literature ([30] and references therein). (b) Results from mimicking the frequency notch (or internal delay) discrimination experiments by Schmidt [32]. This panel shows the behavioral results of the two experimental conditions tested. In one condition, the leading and the trailing echo had equal amplitudes. In a second condition, the amplitudes differed by 6 dB, making discrimination harder for the bats (see text). (c) Firzlaff et al. [34] trained bats to discriminate two phantom objects (scale 1). Next, the bats were able to generalize and discriminate between scaled versions of these objects. The neural network was able to make the same generalization, except for scale = 1.5. A second network trained on a subset of scaled phantom objects (red dots) could better interpolate to intermediate scales. (d) Results from training a network to discriminate between phantom echoes coming from above and below the horizon. This results in an estimate of the vertical angle acuity that is better than, but comparable to, the value measured in the behavioral experiments conducted by Lawrence, Simmons and Wotton [5, 35].

The JND in echo delay of the model is smaller than the JND values reported in the literature. However, it should be noted that the range of reported values is substantial and depends on the specific experimental conditions. Goerlitz et al. [30] quote a range of 36 to 167 μs across experiments. In their own experiments, Goerlitz et al. [30] found values of about 20 to 25 μs for the bat Glossophaga soricina. The neural network's performance shows that the compressive encoding retains sufficient temporal information to perform temporal discriminations at least as well as the bats do.
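A JND of this kind is typically read off a psychometric curve as the stimulus difference at which performance crosses a criterion (here, 75% correct). A minimal sketch using linear interpolation; the performance values below are hypothetical, not the actual network data:

```python
import numpy as np

def jnd_from_psychometric(deltas, p_correct, criterion=0.75):
    """Estimate the JND as the stimulus difference at which performance
    crosses the criterion, by linear interpolation along the curve.
    Assumes p_correct is monotonically increasing."""
    return float(np.interp(criterion, p_correct, deltas))

# Hypothetical proportion-correct values for echo-delay differences (in us).
deltas = np.array([0, 5, 10, 15, 20, 30])
p_correct = np.array([0.50, 0.58, 0.70, 0.82, 0.90, 0.97])

jnd = jnd_from_psychometric(deltas, p_correct)
```

With these illustrative numbers, the 75% crossing falls between 10 and 15 μs, in the same range as the model's JND reported above.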

Fig 4B shows that the neural network was also able to discriminate between different notch locations (or time delays between two impulses) when mimicking the experiments of Schmidt [32]. In particular, the network's performance was closest to the experimental condition in which the bats scored best, i.e., when the trailing and leading echo amplitudes were the same. However, in our simulations, these amplitudes differed by 6 dB. The bats' performance for this, more challenging, experimental condition is also plotted in Fig 4B. Note that the notch's reference location (=rewarded stimulus) falls in the center of the blue shaded region, which shows the mean spectrum of all echoes from the database. To achieve this, we shifted the corresponding rewarded delay from 7.77 μs in the bat experiment to 10 μs in the simulation. This was done to ensure that the notch fell within the sonar system's frequency range for the entire interval of tested delays. The neural network's discrimination curve shows that encoding the cochleogram using only 25 components does not hamper the successful completion of this discrimination task. Indeed, the performance of the neural network is similar to, but uniformly better than, that of the bats.

From the results shown in Fig 4C, we conclude that, similar to the bats in the scale-invariant object recognition experiments described in [34], the neural network learned to reliably discriminate the two (complex) target impulse responses at scale 1. This discrimination capability generalizes to scaled versions of the same target impulse responses, except for scale = 1.5. Only at this largest scale is the network's performance notably lower than that observed in bats.

The variability in performance as a function of scale indicates that the independent components we derived are not scale-invariant representations. However, even when cochleograms (and the representations derived from them) are not scale invariant, bats can still perform scale-invariant object recognition as long as the relevant information is retained in these representations. To demonstrate that further processing could still extract scale-invariant reflector information from our compressive encoding, we trained a second neural network to discriminate between target impulse responses scaled with factors 0.65, 1, and 1.5. Note that the only difference with the previous simulation is the presentation of a broader range of examples during the neural network's learning phase. As is clear from the results shown in Fig 4C, the network trained on these three scaled variants of the original target impulse responses can generalize its discrimination capability to intermediate scales. This suggests that the encoded cochleograms do indeed retain sufficient information to derive a scale-invariant representation. How bats might accomplish this is beyond the scope of this paper.

Finally, the encoded cochleograms also retained the spectral information required to perform elevation discrimination. As shown in Fig 4D, the neural network was able to discriminate (75% criterion) angles as small as 1–2 degrees above or below the horizon. This is somewhat better than, but corresponds well with, the 3-degree discrimination threshold observed by Lawrence, Simmons and Wotton [5, 35].

Place recognition

The results shown in Fig 5 indicate that the encoded cochleograms also retained sufficient information to allow a neural network to classify each echo as belonging to the location at which it was collected. The network did experience some difficulties assigning a few encoded cochleograms to their respective locations. Analysis of the confusion matrix reveals that the network mostly confuses somewhat similar locations. For example, echoes collected in one room of a private residence were assigned to another room. Also, echoes from one field site were classified as another field site.
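This analysis rests on a standard confusion matrix, in which row i, column j counts echoes recorded at location i but classified as location j. A minimal sketch with hypothetical labels for three locations (the actual matrix is computed over all collected locations):

```python
import numpy as np

def confusion_matrix(true_labels, pred_labels, n_classes):
    """Row i, column j counts echoes from location i classified as j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, pred_labels):
        cm[t, p] += 1
    return cm

# Hypothetical labels: locations 0 and 1 (e.g., two rooms of the same
# residence) are occasionally confused; location 2 is distinct.
true_ = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pred_ = [0, 0, 1, 1, 1, 0, 2, 2, 2]

cm = confusion_matrix(true_, pred_, 3)
```

Off-diagonal mass concentrated between similar locations, as in this toy example, is the pattern observed in Fig 5.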

Fig 5. Performance of the neural network memorizing the class membership (=location) of each encoded cochleogram.


The bottom panel visualizes the confusion between classes. The arrows show how often members of one class were erroneously attributed to another class. The color and brightness of each arrow reflect the number of errors. From this graph, it can be seen, for example, that members of Field 3 were most commonly mistakenly classified as Garden 1.

Discussion

From cochleograms of ensonification data collected in various bat habitats and indoor environments, we derived an efficient encoding based on 25 independent components. These independent components can be conceptualized as neurons with particular spectrotemporal receptive fields whose output yields a compressed description of the cochleograms [14]. Applying this encoding to a cochleogram is not lossless, as was shown in Fig 2. Reconstructing a cochleogram from its encoded representation results in a smoothed version of the original. However, this loss of information does not impede performance on the tasks modeled in this paper. Despite being highly compressive, the encoding retains essential information to support several critical sonar-based tasks: precise ranging, spectral discrimination, target discrimination, and scene recognition.
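The lossy encode/reconstruct cycle amounts to a linear projection onto the component basis and back. In the sketch below, a random orthonormal basis stands in for the ICA-derived components and a random vector stands in for a flattened cochleogram; the dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 200  # stand-in for the (flattened, downsampled) cochleogram dimension
K = 25   # number of independent components

# Stand-in mixing matrix: K orthonormal basis vectors (the real ones are ICA
# components derived from the echo database).
basis, _ = np.linalg.qr(rng.standard_normal((D, K)))

x = rng.standard_normal(D)  # a "cochleogram" flattened to a vector
code = basis.T @ x          # 25-coefficient compressed encoding
x_hat = basis @ code        # reconstruction: projection onto the basis
```

The reconstruction x_hat differs from x (the encoding is lossy), but re-encoding x_hat returns the same 25 coefficients: everything the encoding can represent is preserved, and only the component off the basis is lost.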

A similar conclusion, i.e., echoic information can be highly compressed while still allowing for good task performance, was reached by several other studies. One bat echolocation study [17] describes a PCA-based compression of the spectral information contained in the emission and the external ear transfer functions of Eptesicus fuscus. Following the same approach as here, the compressed encoding of monaural echo information (a mapping onto 8 PCA components) was used as input into an artificial neural network that estimated azimuth and elevation of the echo direction. The authors conclude that the vertical acuity reached by the network was close to that of bats and mostly limited by the coarsely sampled emission and hearing patterns used to derive the PCA encoding. Also, as already mentioned in the introduction, several robotic sonar studies similarly concluded that significant compression could be achieved without compromising task performance [8, 9, 41, 42].

The main difference between this study and these previous studies is that the set of filters used in the proposed compressive encoding is not custom-built for one particular task but is derived from a database of ecologically valid echo signals. The reasoning behind this approach is that a sonar system, whether biological or artificial, can profit from learning the statistical structure of the ensemble of echo signals it receives from its environment, as this allows it to represent that structure with optimal efficiency. As shown by the results above, a sufficiently accurate representation of this structure is all that is required to complete the same tasks that can be performed using the information contained in the raw received signals. Our finding that a limited set of filters can efficiently and effectively encode echoic information has implications for the efficiency of information processing and suggests a potential efficient neural implementation for sonar processing. Below, we discuss both implications of the current results.

Information processing implications

To estimate the data reduction rate accomplished by the encoding, we estimate the number of bits required to encode a cochleogram and compare it with the number of bits required to encode the outputs of the 25 filters. In this study, a cochleogram consists of 140000 floating-point numbers (20 frequency channels × 7000 samples). However, the low-pass filter in the model of Wiegrebe [11] allows the effective sample rate needed to represent the cochleograms to be reduced. Assuming that the 1 kHz low-pass filter in the cochlear model is an ideal low-pass filter, completely removing all frequency components above 1 kHz, only 2 kSamples/sec would be required to fully encode the information in each frequency channel of the cochleogram. Hence, at 360 kSamples/sec, the cochleograms are oversampled by a factor of 180. Note that the low-pass filter in the model is not ideal and does not remove all frequency content above 1 kHz. Therefore, in practice, sampling somewhat above the Nyquist rate is required.
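The sample-rate bookkeeping in this paragraph amounts to the following arithmetic (values taken from the text):

```python
# Sample-rate bookkeeping from the cochlear model (values from the text).
fs = 360_000       # cochleogram sample rate, samples/s
f_lowpass = 1_000  # cut-off of the low-pass filter in the cochlear model, Hz

nyquist_rate = 2 * f_lowpass      # samples/s needed for an ideal low-pass
oversample_factor = fs / nyquist_rate  # 360 kS/s / 2 kS/s = 180
```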

Next, we need to determine how many bits are required to encode the samples of a cochleogram. Shannon's source coding theorem specifies that a cochleogram containing a large number N of independent and identically distributed samples, each sample S(tj, fk) having an entropy H(S) = −∑i pi log2(pi), can be compressed into N·H(S) bits with negligible risk of information loss [43]. From the empirically derived distribution pi of the cochleogram values (Fig D in S1 Text) we estimated H(S) = 4.40 bits. As this value depends on the quantization used, we chose to encode the sample values as doubles. The information capacity IS of a cochleogram would then be about 3422 bits,

I_S = \frac{140000\,\text{samples} \times 4.40\,\text{bits per sample}}{180\,\text{oversample factor}} \approx 3422\,\text{bits} \quad (5)

However, not all cochleogram samples are identically distributed, as can be seen from Fig C in S1 Text, which shows the average cochleogram derived from the database. Indeed, many samples contain hardly any energy (and, thus, information). As a first, rough approximation to the real value of IS, we limited the samples to be encoded to those belonging to the region of the average cochleogram where the values attained at least 20% of the maximal value (see also Fig C in S1 Text). On average, this retained 114,212 samples per cochleogram, resulting in a more accurate estimate of IS = 2792 bits,

I_S = \frac{114212\,\text{samples} \times 4.40\,\text{bits per sample}}{180\,\text{oversample factor}} \approx 2792\,\text{bits} \quad (6)

This estimate assumes temporal and spectral independence between the samples of the cochleogram. However, nearby samples of the cochleograms are highly correlated. Removing these redundancies allows for a more efficient encoding. Indeed, in this study, we showed that the cochleograms can be encoded using only 25 independent and identically distributed coefficients without compromising performance in a broad range of echolocation tasks. Using the same quantization as for the cochleogram samples, we can again calculate the entropy H(C) of each of those coefficients based on the empirically derived distribution of their values (see Fig D in S1 Text). With H(C) = 13.29 bits, this results in an information capacity IC of the compressed encoding given by IC = 25 × 13.29 ≈ 332 bits.
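The capacity figures above can be reproduced in a few lines. The entropy function below implements H = −∑ pi log2(pi); the capacities use the entropies H(S) and H(C) reported in the text.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy H = -sum p_i log2 p_i of a discrete distribution."""
    p = np.asarray(p)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Sanity check: a uniform 16-level quantization has exactly 4 bits of entropy.
H_uniform16 = entropy_bits(np.full(16, 1 / 16))

# Information capacities using the entropies reported in the text.
H_S, H_C = 4.40, 13.29
I_S_full = 140_000 * H_S / 180  # Eq (5): all samples        -> ~3422 bits
I_S_trim = 114_212 * H_S / 180  # Eq (6): energetic samples  -> ~2792 bits
I_C = 25 * H_C                  # compressed encoding        -> ~332 bits
```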

While several approximations were introduced to derive these values, they indicate that the informational load of the cochlear output can be reduced by roughly 90% through compressive encoding. Despite this large reduction in information content, the simulations reported above show that this compressed representation is sufficient to solve a broad range of sonar-based discrimination tasks. The possibility of reducing the informational load while retaining essential information should allow highly efficient processing, memorization, and retrieval of sonar-based percepts in bats. Because processing more (complex) information is expensive, both in terms of metabolism and the required neural substrate [2, 44], the ability to extract only the relevant information also reduces costs to the animal.

Physiological implications

Our modeling results show that complex and simple echoes can be effectively represented as the sum of a few components (filters). This was demonstrated by showing that the components retain sufficient information to address several echolocation tasks. This indicates that, at least in principle, the bat could encode the echoic information present at the level of the cochlear nucleus (modeled here by the cochleograms) using a set of similar components at some higher stage in the auditory pathway, and use the result to address the echolocation tasks we model.

Evidence from other sensory domains shows that filters are a common way to encode sensory input efficiently. Most famously, so-called simple cells in the primary visual cortex of mammals have been described as applying filters to an image, the output of which determines its response (spike rate) [45]. The receptive fields of these cells have been likened to Gabor filters [46] having optimal localization in both the spatial and the spatial-frequency domains [47], and, therefore, provide an efficient edge encoding scheme [46].

Filters for efficient encoding of sensory information have been found not only in the visual pathway but also in the auditory pathway of several mammalian species. In particular, Andoni et al. [48] reported on inferior colliculus cells in the bat Tadarida brasiliensis whose receptive fields have some similarity to the filters we derived. The cells' responses to communication calls could be predicted from filters with similar spectrotemporal receptive fields, or non-linear combinations thereof [49]. In other mammals as well, neurons with similar spectrotemporal receptive fields have been found (see [50] for references).

If a set of filters can be used for efficient processing of echoic information, could these filters be implemented neurophysiologically? An extreme interpretation of the current results would be that bats require only 25 neurons (or a similarly small number) with the right spectrotemporal responses to echolocate successfully. A more biologically plausible suggestion is that the bat approximates this sparse coding of echoic information by having populations of neurons that, as an ensemble, extract components from the echoic input. The pooled output of these ensembles could be used by other centers to support decision-making or flight control. In this view, motor control or decision-making could be based on reading out the activity of 25 (or a similarly small number of) neural populations in the auditory pathway.

Assuming that the filters could be implemented as populations of cells allows for considerable freedom in the response properties of the individual neurons in each ensemble. For example, [48] performed an analysis of bat social calls similar to the one we performed on sonar data. These authors derived a set of theoretical filters and found that some individual cells responded as predicted by those filters. However, in their follow-up paper [49], they found other cells to act as if they implemented non-linear combinations of the theoretical filters. This indicates that, at least in the encoding of social calls, the auditory system exploits the mathematical properties of the input signals, decomposing them into independent components, which is an effective encoding strategy. However, their results also indicate that this decomposition is less straightforward than suggested by the theoretical filters derived through Independent Component Analysis or similar techniques.

The receptive fields of the neurons observed by Andoni et al. [48] and others do not encode delay or distance information as these aspects are irrelevant for interpreting communication sounds. Instead, they capture those spectrotemporal features relevant to represent the ensemble of communication calls. In contrast, the filters we derive here encode the time-of-arrival (distance) of the echoes and their spectral content. As such, we consider the proposed filters to be analogous to those observed in Tadarida brasiliensis (and other species) but optimized for encoding sonar information by representing the primary cues for a sonar system: delay and spectral information. While such filters have not been directly observed, their required predecessors exist in echolocating bats’ auditory pathway.

Frequency selective cells are prevalent throughout the auditory pathway of bats, which is largely tonotopically organized [51]. Poon et al. [52] reported the inferior colliculus of the FM-bat Eptesicus fuscus to be tonotopically organized. See also [53, 54] for frequency tuned IC cells in Eptesicus fuscus. Frequency selective cells are present up to the cortex, where tonotopically organized areas have been found in several species (see Kossl et al. [55] for references). Target-distance selective cells have been found in the inferior colliculus of Pteronotus parnellii, a CF-FM bat [56], as well as in the inferior colliculus of the FM bat Eptesicus fuscus [54]. The auditory cortex of several CF-FM bats contains echo distance maps [55]. Such organized maps have also been found in some species of FM bats, Carollia perspicillata and Phyllostomus discolor [57, 58], but not in others [59–61]; see Kossl et al. [62] for a review. Interestingly, target-distance selective neurons in both CF-FM and FM bats show broader delay-tuning when selective for longer target-distances [63]. This feature is also present in the basic components reported in the present paper, which show increasing response time (i.e., respond to a broader range of target-distances) with increasing distance (see Fig 3E–3H).

Cells with selective delay and frequency responses are sufficient predecessors for establishing populations of cells with spectrotemporal receptive fields mimicking those of the filters we propose here. Moreover, as both frequency selectivity (already present at the level of the cochlea) and delay selectivity co-exist from the IC upwards, the filters could be established at various processing stages. This would also be in line with concepts of neural processing along the ascending auditory pathway [64]. According to these concepts, auditory feature extraction occurs at lower stages of the auditory pathway, up to the inferior colliculus. In the auditory cortex, these features are then organized into auditory objects or sensory maps. However, it should be mentioned here that top-down feedback could influence feature extraction and spectrotemporal filter properties. Typically, frontal cortical areas are thought to be involved in top-down control of sensory processing [65–68] and decision making [69].

Conclusion

We have shown that complex echoes from real environments can be efficiently and effectively represented using a small set of filters. The redundancy in echoic information opens up the opportunity for efficient encoding, reducing the computational load of echo processing and the memory load for storing the information. Therefore, we predict the auditory system of bats to capitalize on this opportunity for efficient coding by implementing filters with spectrotemporal properties akin to those hypothesized here.

Supporting information

S1 Text

Fig A. Results from the procedure used to determine a realistic level of internal noise, as also used in references [7, 11, 27]. As described in the main text, for each level of noise σ we determined P[C6+Nσ>C0+Nσ] using a Monte Carlo approach. This graph gives the probability P[⋅] as a function of noise level σ. The value of σ at which P[⋅] = 0.75 was taken as the noise level throughout this paper. Fig B. Ensonification device. The custom-built device consisted of two Knowles microphones embedded in a 3D printed housing and a Sensecomp 7000 emitter. The device was mounted on a tripod. Fig C. a Average cochleogram, derived from the ensonification data. b Binarized average cochleogram showing where samples are at least 20% of the maximum value. Fig D. (Left) Distribution of the values of the cochleograms collected in this paper. (Right) Distribution of the 25 filter output values for all ensonification data echoes used in this paper. As the value of the entropy H(S) depends on the quantization used, we encoded all samples as doubles for this calculation.

(PDF)

Data Availability

All data and code are available at https://doi.org/10.6084/m9.figshare.14573652.v1.

Funding Statement

The author(s) received no specific funding for this work.

References

  • 1. Weber F, Machens CK. Sensory Coding, Efficiency. In: Jaeger D, Jung R, editors. Encyclopedia of Computational Neuroscience. New York, NY: Springer New York; 2014. p. 1–12. [Google Scholar]
  • 2. Warrant EJ. Matched Filtering and the Ecology of Vision in Insects. In: von der Emde G, Warrant E, editors. The Ecology of Animal Senses. Springer International Publishing; 2016. p. 143–167. [Google Scholar]
  • 3. Griffin DR. Listening in the dark; the acoustic orientation of bats and men. Yale University Press, New Haven; 1958. [Google Scholar]
  • 4. Von Helversen D, Von Helversen O. Object recognition by echolocation: a nectar-feeding bat exploiting the flowers of a rain forest vine. Journal of Comparative Physiology A. 2003;189(5):327–336. doi: 10.1007/s00359-003-0405-3 [DOI] [PubMed] [Google Scholar]
  • 5. Wotton JM, Simmons JA. Spectral Cues and Perception of the Vertical Position of Targets by the Big Brown Bat, Eptesicus Fuscus. The Journal of the Acoustical Society of America. 2000;107(2):1034–1041. doi: 10.1121/1.428283 [DOI] [PubMed] [Google Scholar]
  • 6. Kuc R. Biomimetic Sonar Recognizes Objects Using Binaural Information. The Journal of the Acoustical Society of America. 1997;102(2):689–696. doi: 10.1121/1.419658 [DOI] [Google Scholar]
  • 7. Vanderelst D, Steckel J, Boen A, Peremans H, Holderied MW. Place recognition using batlike sonar. Elife. 2016;5:e14188. doi: 10.7554/eLife.14188 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8. Vanderelst D, Peremans H, Holderied MW. PLoS Computational Biology. 2015;. doi: 10.1371/journal.pcbi.1004484 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9. Mansour CB, Koreman E, Steckel J, Peremans H, Vanderelst D. Avoidance of non-localizable obstacles in echolocating bats: A robotic model. PLoS Computational Biology. 2019;15(12). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10. Yovel Y, Franz MO, Stilz P, Schnitzler HU. Plant classification from bat-like echolocation signals. PLoS Computational Biology. 2008;4(3). doi: 10.1371/journal.pcbi.1000032 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11. Wiegrebe L. An autocorrelation model of bat sonar. Biological cybernetics. 2008;98(6):587–595. doi: 10.1007/s00422-008-0216-2 [DOI] [PubMed] [Google Scholar]
  • 12. Olshausen BA, Field DJ. Emergence of Simple-Cell Receptive Field Properties by Learning a Sparse Code for Natural Images. Nature; London. 1996;381(6583):607–9. http://dx.doi.org.proxy.libraries.uc.edu/10.1038/381607a0 [DOI] [PubMed] [Google Scholar]
  • 13. Lewicki MS. Efficient Coding of Natural Sounds. Nature Neuroscience. 2002;5(4):356–363. doi: 10.1038/nn831 [DOI] [PubMed] [Google Scholar]
  • 14. Mély DA, Serre T. Towards a Theory of Computation in the Visual Cortex. In: Zhao Q, editor. Computational and Cognitive Neuroscience of Vision. Singapore: Springer Singapore; 2017. p. 59–84. [Google Scholar]
  • 15. Reijniers J, Peremans H. On population encoding and decoding of auditory information for bat echolocation. Biological Cybernetics. 2010;102(4):311–326. doi: 10.1007/s00422-010-0368-8 [DOI] [PubMed] [Google Scholar]
  • 16. Hyvärinen A, Karhunen J, Oja E. Independent component analysis, adaptive and learning systems for signal processing, communications, and control. John Wiley & Sons, Inc. 2001;1:11–14. [Google Scholar]
  • 17. Wotton JM, Jenison RL. A Backpropagation Network Model of the Monaural Localization Information Available in the Bat Echolocation System. The Journal of the Acoustical Society of America. 1997;101(5):2964–2972. doi: 10.1121/1.418524 [DOI] [PubMed] [Google Scholar]
  • 18. Schnitzler HU, Moss CF, Denzinger A. From spatial orientation to food acquisition in echolocating bats. Trends in Ecology & Evolution. 2003;18(8):386–394. doi: 10.1016/S0169-5347(03)00185-X [DOI] [Google Scholar]
  • 19. Stilz WP, Schnitzler HU. Estimation of the acoustic range of bat echolocation for extended targets. The Journal of the Acoustical Society of America. 2012;132(3):1765–1775. doi: 10.1121/1.4733537 [DOI] [PubMed] [Google Scholar]
  • 20. Altes RA. Sonar for generalized target description and its similarity to animal echolocation systems. The Journal of the Acoustical Society of America. 1976;59(1):97–105. doi: 10.1121/1.380831 [DOI] [PubMed] [Google Scholar]
  • 21. Holderied MW, Baker CJ, Vespe M, Jones G. Understanding signal design during the pursuit of aerial insects by echolocating bats: tools and applications. Integrative and comparative biology. 2008;48(1):74–84. doi: 10.1093/icb/icn035 [DOI] [PubMed] [Google Scholar]
  • 22. Saillant PA, Simmons JA, Dear SP, McMullen TA. A computational model of echo processing and acoustic imaging in frequency-modulated echolocating bats: The spectrogram correlation and transformation receiver. The Journal of the Acoustical Society of America. 1993;94(5):2691–2712. doi: 10.1121/1.407353 [DOI] [PubMed] [Google Scholar]
  • 23. Moore BC, Glasberg BR. Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. The journal of the acoustical society of America. 1983;74(3):750–753. doi: 10.1121/1.389861 [DOI] [PubMed] [Google Scholar]
  • 24. Hyvärinen A, Oja E. Independent component analysis: algorithms and applications. Neural networks. 2000;13(4-5):411–430. doi: 10.1016/S0893-6080(00)00026-5 [DOI] [PubMed] [Google Scholar]
  • 25. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830. [Google Scholar]
  • 26.Chollet F, et al. Keras; 2015. Available from: https://github.com/fchollet/keras.
  • 27. Dau T, Püschel D, Kohlrausch A. A quantitative model of the “effective” signal processing in the auditory system. I. Model structure. The Journal of the Acoustical Society of America. 1996;99(6):3615–3622. doi: 10.1121/1.414960 [DOI] [PubMed] [Google Scholar]
  • 28. Schnitzler HU, Henson WO. Performance of airborne animal sonar systems: I. Microchiroptera. In: Animal sonar systems. Springer; 1980. p. 109–181. [Google Scholar]
  • 29. Heinrich M, Warmbold A, Hoffmann S, Firzlaff U, Wiegrebe L. The sonar aperture and its neural representation in bats. Journal of Neuroscience. 2011;31(43):15618–15627. doi: 10.1523/JNEUROSCI.2600-11.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30. Goerlitz HR, Geberl C, Wiegrebe L. Sonar Detection of Jittering Real Targets in a Free-Flying Bat. The Journal of the Acoustical Society of America. 2010;128(3):1467. doi: 10.1121/1.3445784 [DOI] [PubMed] [Google Scholar]
  • 31. Denzinger A, Schnitzler HU. Echo SPL influences the ranging performance of the big brown bat, Eptesicus fuscus. Journal of Comparative Physiology A. 1994;175(5):563–571. doi: 10.1007/BF00199477 [DOI] [PubMed] [Google Scholar]
  • 32. Schmidt S. Perception of Structured Phantom Targets in the Echolocating Bat, Megaderma Lyra. The Journal of the Acoustical Society of America. 1992;91(4):2203–2223. doi: 10.1121/1.403654 [DOI] [PubMed] [Google Scholar]
  • 33. Matsuo I, Kunugiyama K, Yano M. An echolocation model for range discrimination of multiple closely spaced objects: Transformation of spectrogram into the reflected intensity distribution. The Journal of the Acoustical Society of America. 2004;115(2):920–928. doi: 10.1121/1.1642626 [DOI] [PubMed] [Google Scholar]
  • 34. Firzlaff U, Schuchmann M, Grunwald JE, Schuller G, Wiegrebe L. Object-Oriented Echo Perception and Cortical Representation in Echolocating Bats. PLOS Biology. 2007;5(5):e100. doi: 10.1371/journal.pbio.0050100 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35. Lawrence B, Simmons J. Echolocation in bats: the external ear and perception of the vertical positions of targets. Science. 1982;218(4571):481–483. doi: 10.1126/science.7123247 [DOI] [PubMed] [Google Scholar]
  • 36. Aytekin M, Grassi E, Sahota M, Moss CF. The bat head-related transfer function reveals binaural cues for sound localization in azimuth and elevation. The Journal of the Acoustical Society of America. 2004;116(6):3594–3605. doi: 10.1121/1.1811412 [DOI] [PubMed] [Google Scholar]
  • 37. Chiu C, Moss CF. The Role Of The External Ear In Vertical Sound Localization In The Free Flying Bat, EPtesicus Fuscus. Journal of the Acoustical Society of America. 2007;121(4):2227–35. doi: 10.1121/1.2434760 [DOI] [PubMed] [Google Scholar]
  • 38. Vanderelst D, De Mey F, Peremans H, Geipel I, Kalko E, Firzlaff U. What noseleaves do for fm bats depends on their degree of sensorial specialization. PLOS ONE. 2010;5(8):e11893. doi: 10.1371/journal.pone.0011893 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39. De Mey F, Reijniers J, Peremans H, Otani M, Firzlaff U. Simulated head related transfer function of the phyllostomid bat Phyllostomus discolor. The Journal of the Acoustical Society of America. 2008;124(4):2123–2132. doi: 10.1121/1.2968703 [DOI] [PubMed] [Google Scholar]
  • 40. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.
  • 41. Kuc RB. Biomimetic sonar differentiates coin head from tail. The Journal of the Acoustical Society of America. 1997;101:3198. doi: 10.1121/1.419348 [DOI] [Google Scholar]
  • 42. Mansour CB, Koreman E, Laurijssen D, Steckel J, Peremans H, Vanderelst D. Robotic models of obstacle avoidance in bats. In: The 2018 Conference on Artificial Life: A Hybrid of the European Conference on Artificial Life (ECAL) and the International Conference on the Synthesis and Simulation of Living Systems (ALIFE). MIT Press; 2019. p. 463–464.
  • 43. MacKay DJ. Information theory, inference and learning algorithms. Cambridge university press; 2003. [Google Scholar]
  • 44. Sterling P, Laughlin S. Principles of neural design. MIT Press; 2015. [Google Scholar]
  • 45. Carandini M. What Simple and Complex Cells Compute: Classical Perspectives. The Journal of Physiology. 2006;577(2):463–466. doi: 10.1113/jphysiol.2006.118976 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46. Marĉelja S. Mathematical Description of the Responses of Simple Cortical Cells. Journal of the Optical Society of America. 1980;70(11):1297. doi: 10.1364/JOSA.70.001297 [DOI] [PubMed] [Google Scholar]
  • 47. Daugman JG. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. JOSA A. 1985;2(7):1160–1169. doi: 10.1364/JOSAA.2.001160 [DOI] [PubMed] [Google Scholar]
  • 48. Andoni S, Li N, Pollak GD. Spectrotemporal Receptive Fields in the Inferior Colliculus Revealing Selectivity for Spectral Motion in Conspecific Vocalizations. Journal of Neuroscience. 2007;27(18):4882–4893. doi: 10.1523/JNEUROSCI.4342-06.2007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49. Andoni S, Pollak GD. Selectivity for spectral motion as a neural computation for encoding natural communication signals in bat inferior colliculus. Journal of Neuroscience. 2011;31(46):16529–16540. doi: 10.1523/JNEUROSCI.1306-11.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50. Carlson NL, Ming VL, DeWeese MR. Sparse codes for speech predict spectrotemporal receptive fields in the inferior colliculus. PLoS Comput Biol. 2012;8(7):e1002594. doi: 10.1371/journal.pcbi.1002594 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51. Pollak GD, Casseday JH. The Neural Basis of Echolocation in Bats. vol. 25 of Zoophysiology. Bradshaw SD, Burggren W, Heller HC, Ishii S, Langer H, Neuweiler G, et al., editors. Berlin, Heidelberg: Springer Berlin Heidelberg; 1989. [Google Scholar]
  • 52. Poon P, Sun X, Kamada T, Jen PS. Frequency and space representation in the inferior colliculus of the FM bat, Eptesicus fuscus. Experimental brain research. 1990;79(1):83–91. doi: 10.1007/BF00228875 [DOI] [PubMed] [Google Scholar]
  • 53. Ferragamo M, Haresign T, Simmons J. Frequency tuning, latencies, and responses to frequency-modulated sweeps in the inferior colliculus of the echolocating bat, Eptesicus fuscus. Journal of Comparative Physiology A. 1997;182(1):65–79. doi: 10.1007/s003590050159 [DOI] [PubMed] [Google Scholar]
  • 54. Macías S, Luo J, Moss CF. Natural Echolocation Sequences Evoke Echo-Delay Selectivity in the Auditory Midbrain of the FM Bat, Eptesicus Fuscus. Journal of Neurophysiology. 2018;120(3):1323–1339. doi: 10.1152/jn.00160.2018 [DOI] [PubMed] [Google Scholar]
  • 55. Kössl M, Hechavarria J, Voss C, Schaefer M, Vater M. Bat auditory cortex–model for general mammalian auditory computation or special design solution for active time perception? European Journal of Neuroscience. 2015;41(5):518–532. doi: 10.1111/ejn.12801 [DOI] [PubMed] [Google Scholar]
  • 56. Wenstrup JJ, Portfors CV. Neural processing of target distance by echolocating bats: functional roles of the auditory midbrain. Neuroscience & Biobehavioral Reviews. 2011;35(10):2073–2083. doi: 10.1016/j.neubiorev.2010.12.015 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57. Hagemann C, Esser KH, Kössl M. Chronotopically organized target-distance map in the auditory cortex of the short-tailed fruit bat. Journal of neurophysiology. 2010;103(1):322–333. doi: 10.1152/jn.00595.2009 [DOI] [PubMed] [Google Scholar]
  • 58. Greiter W, Firzlaff U. Echo-acoustic flow shapes object representation in spatially complex acoustic scenes. Journal of neurophysiology. 2017;117(6):2113–2124. doi: 10.1152/jn.00860.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59. Dear SP, Fritz J, Haresign T, Ferragamo M, Simmons JA. Tonotopic and Functional Organization in the Auditory Cortex of the Big Brown Bat, Eptesicus Fuscus. Journal of Neurophysiology. 1993;70(5):1988–2009. doi: 10.1152/jn.1993.70.5.1988 [DOI] [PubMed] [Google Scholar]
  • 60. Dear SP, Simmons JA, Fritz J. A Possible Neuronal Basis for Representation of Acoustic Scenes in Auditory Cortex of the Big Brown Bat. Nature. 1993;364(6438):620–623. doi: 10.1038/364620a0 [DOI] [PubMed] [Google Scholar]
  • 61. Wong D, Shannon SL. Functional zones in the auditory cortex of the echolocating bat, Myotis lucifugus. Brain research. 1988;453(1-2):349–352. doi: 10.1016/0006-8993(88)90176-X [DOI] [PubMed] [Google Scholar]
  • 62. Kössl M, Hechavarria J, Voss C, Macias S, Mora E, Vater M. Neural Maps for Target Range in the Auditory Cortex of Echolocating Bats. Current Opinion in Neurobiology. 2014;24:68–75. doi: 10.1016/j.conb.2013.08.016 [DOI] [PubMed] [Google Scholar]
  • 63. Hechavarría JC, Macías S, Vater M, Voss C, Mora EC, Kössl M. Blurry topography for precise target-distance computations in the auditory cortex of echolocating bats. Nature communications. 2013;4(1):1–11. [DOI] [PubMed] [Google Scholar]
  • 64. Nelken I. Processing of complex stimuli and natural scenes in the auditory cortex. Current opinion in neurobiology. 2004;14(4):474–480. doi: 10.1016/j.conb.2004.06.005 [DOI] [PubMed] [Google Scholar]
  • 65. Duncan J. An adaptive coding model of neural function in prefrontal cortex. Nature reviews neuroscience. 2001;2(11):820–829. doi: 10.1038/35097575 [DOI] [PubMed] [Google Scholar]
  • 66. Miller EK, Cohen JD. An integrative theory of prefrontal cortex function. Annual review of neuroscience. 2001;24(1):167–202. doi: 10.1146/annurev.neuro.24.1.167 [DOI] [PubMed] [Google Scholar]
  • 67. Everling S, Tinsley CJ, Gaffan D, Duncan J. Selective representation of task-relevant objects and locations in the monkey prefrontal cortex. European Journal of Neuroscience. 2006;23(8):2197–2214. doi: 10.1111/j.1460-9568.2006.04736.x [DOI] [PubMed] [Google Scholar]
  • 68. Fritz JB, David SV, Radtke-Schuller S, Yin P, Shamma SA. Adaptive, behaviorally gated, persistent encoding of task-relevant auditory information in ferret frontal cortex. Nature neuroscience. 2010;13(8):1011. doi: 10.1038/nn.2598 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69. Gold JI, Shadlen MN. The neural basis of decision making. Annual review of neuroscience. 2007;30. doi: 10.1146/annurev.neuro.29.051605.113038 [DOI] [PubMed] [Google Scholar]
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009052.r001

Decision Letter 0

Natalia L Komarova, Emma K Towlson

11 Feb 2021

Dear Dr. Vanderelst,

Thank you very much for submitting your manuscript "Efficient encoding of spectrotemporal information for bat echolocation" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Emma K Towlson

Guest Editor

PLOS Computational Biology

Natalia Komarova

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The paper by Achutha et al., titled "Efficient encoding of spectrotemporal information for bat echolocation," analyzes how much information is contained in complex real-world echoes. The work is novel and important, revealing how a low-dimensionality version of an echo can maintain the majority of the information contained in the original echoes. This is highly relevant for the central nervous system, which has to represent the high-dimensionality world in much lower dimensions. The authors examine four fundamental tasks in echolocation – temporal accuracy, resolution, IR classification and localization (elevation) – showing that the low-dimensional representation of echoes can be used to explain bats' performance in these tasks. The work is thus very comprehensive. In general, I found the paper well written and I had rather few comments.

One criticism I have is that the authors could use the neural representation of the inner ear, which already reduces the high-dimensional correlogram to a lower dimension in a way that is more biological than an ICA analysis. I know of no evidence that the brain transforms sensory information into independent axes – in fact this was the view in the vision literature ca. 2 centuries ago, but it is not generally accepted today. This criticism is related to the authors' statement that: "The main difference between this study and these previous studies is that the set of filters used in the proposed compressive encoding is not custom-built for one particular task but is derived from a database of ecologically valid echo signals."

From my understanding, the authors used ICA which has some compressive definitions, but one could say the same thing about PCA which has been used before. I thus do not understand why this statement is correct.

My second main major concern is the use of neural networks and their tendency to overfit the data, even when training and testing sets are supposedly separated. For example, in many studies samples are divided randomly between training and testing sets, even though samples are often highly correlated, e.g., when two echoes are collected from nearly the same angle – they will be very similar. In such cases, there is essentially no separation between the training and the testing sets. From reading the manuscript, it seems to me that a similar approach was taken here. If this is correct, the performance of the network shows an ability to learn a set of examples rather than to generalize. To test this, or prove it wrong, the authors should measure similarity between examples in the training and testing sets and show that there are no or few examples which are very similar.
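The similarity check suggested above could be sketched as follows. This is a minimal NumPy illustration only; the array shapes and the function name are hypothetical and not taken from the manuscript:

```python
import numpy as np

def max_train_similarity(train, test):
    """For each test example, return its highest cosine similarity to any
    training example; values near 1 flag near-duplicates across the split."""
    # Normalize rows to unit length so dot products become cosine similarities.
    tr = train / np.linalg.norm(train, axis=1, keepdims=True)
    te = test / np.linalg.norm(test, axis=1, keepdims=True)
    return (te @ tr.T).max(axis=1)

# Illustrative stand-ins for encoded echoes (rows = examples, shapes assumed).
rng = np.random.default_rng(0)
train = rng.normal(size=(100, 25))
test = rng.normal(size=(20, 25))
sims = max_train_similarity(train, test)
```

A histogram of `sims` would then show whether test echoes have near-identical counterparts in the training set, which is the overfitting scenario the reviewer describes.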

Minor comments:

L119 – The authors write: “We averaged across these three repeats to increase the signal to noise ratio”

Was the averaging done at the correlogram level? Averaging at the level of the time-signal would change the signal dramatically

I think that the notation of the math (starting around line 135) is confusing. In theory the ICA could have been run directly on the correlograms, but from what I understood the authors first use PCA to reduce the correlogram dimensionality to 25 and only then ran an ICA analysis. The application of a PCA in the middle is not shown in figure 1, so it is not fully clear to me if and when it was performed. Moreover, the authors use S (capital or not) to notate three different things. I think this could be much improved.
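The two-stage reduction described here (PCA to 25 dimensions, then ICA) could be sketched as follows. The data shapes, the stand-in data, and the from-scratch symmetric FastICA are illustrative assumptions, not the authors' actual implementation:

```python
import numpy as np

def pca_whiten(X, n_components):
    """Center X (samples x features), keep the top principal components,
    and scale them to unit variance (whitening)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Columns of U * sqrt(n) have unit variance: whitened PCA scores.
    return U[:, :n_components] * np.sqrt(X.shape[0])

def fastica(X, n_iter=200, seed=0):
    """Symmetric FastICA with tanh nonlinearity on whitened data X (n x d)."""
    n, d = X.shape
    W = np.random.default_rng(seed).normal(size=(d, d))
    for _ in range(n_iter):
        S = X @ W.T                      # current source estimates
        G = np.tanh(S)
        Gp = 1.0 - G ** 2
        W = (G.T @ X) / n - Gp.mean(axis=0)[:, None] * W
        U, _, Vt = np.linalg.svd(W)      # symmetric decorrelation
        W = U @ Vt
    return X @ W.T, W

# Illustrative stand-in for flattened correlograms (shapes assumed).
rng = np.random.default_rng(1)
correlograms = rng.laplace(size=(1000, 200))
whitened = pca_whiten(correlograms, n_components=25)
components, W = fastica(whitened)
```

Because the decorrelation step keeps `W` orthonormal and the input is whitened, the 25 output components remain mutually decorrelated while the ICA iterations rotate them toward statistical independence.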

Note that Yovel et al. 2008 already showed that a much lower-dimensional representation of the echo can be used to perform object classification. If I am not mistaken, they used PCA to reduce the data to ca. 200 dimensions, essentially using a similar approach to that described here.

Figures are not numbered according to their appearance in the paper

Reviewer #2: The paper by Achutha et al. investigates how an efficient encoding of sensory representations could support echolocation tasks in bats. The authors obtain cochleograms from an ensemble of echo signals collected in various outdoor and indoor environments, and use these representations to derive a low-dimensional, efficient encoding of the spectrotemporal characteristics of the outputs of the cochlear models. In order to show that such compressive encoding retains sufficient information to support high echolocation performance in bats, neural networks were trained in tasks which mimicked previous experimental approaches using real animals. The experiments conducted in this study show that neural networks can exploit the information contained in the low-dimensional representation of echoes to solve two-alternative forced-choice tasks requiring high temporal and spectral resolutions. The authors conclude that the efficient encoding of echo information is feasible, and that it could likely find neural correlates in the bat’s auditory system.

Previous studies have demonstrated that low-dimensional encodings allow for task resolution at a relatively high level. One of the advantages of the approach taken here is that the authors were able to show that high performance can also be achieved when low-dimensional filters are derived from a dataset of realistic echo signals, not tailored to specific experimental conditions. The manuscript is well-written, and the conclusions well supported. However, there are a few things that require clarification and amendment before publication. My specific comments are below:

Introduction

• The first citation in the paper is [55]. What’s happening is that references are listed alphabetically in the Reference list first, and only then referenced in the main text. This can be confusing at times for the reader who is trying to check the citations.

• Line 17-18: the sentence is unclear. Probably related to the use of the word “foremost” in this context? For example, do the authors mean that the timing of echo arrival is the most relevant information?

• Line 29: “Pose” recognition by Vanderelst et al [52]?

Methods

• Line 105: “just over 1000 echoes”. Is it above 1000 echoes, as this suggests, or were there N=1000 echoes, as stated above (l. 70)?

• Line 108: This part is unclear to me. Echoes were 23 ms long. Were all echoes equally long, regardless of the reflector (21 locations, etc.), or was the echo length defined as 23 ms for convenience? It is also unclear how the maximal range is estimated from the echo duration; if this is derived by the authors, then it should be explained in the Methods. If it’s based on previous literature (e.g. [49]), it should be specified.

• Line 127: hyperbolic FM sweeps are maximally Doppler-shift resilient. Can the authors provide a citation for this?

• As far as I understand, the de-chirping requires knowledge of a reference signal (the way it is done, for example, in radar detection). The authors claim that the “bat has knowledge of its emission”, yet no reference is provided for this, nor a mechanism suggested (e.g. efference copies?). Other models that compensate the FM effect (like that of Wiegrebe, 2008) base this compensation on computing autocorrelations. Autocorrelations could be calculated from neural activity given, for example, a plausible architecture such as described by Shamma (2001, Trends in Cognitive Sciences). Such an architecture requires no knowledge of a “reference” signal, as the de-chirping does. Why was the de-chirping implemented in the model? Would the FM compensation via autocorrelation produce the same results? Could the authors substantiate the phrase “the bat has knowledge of its emission” with a speculation of how this “knowledge” occurs in the bat’s brain?

• Line 208: JND is not defined as an abbreviation.

• Line 221: where instead of were.

Results

• Fig. 3e-h are mentioned first in the main text. Why not swap panels a-d (top) with e-h (bottom)? This way, the order would be consistent in the main text, and also individual component examples would be shown first, before exploring details about components in general.

• Fig. 3a: x-axis and label missing?

• Fig. 3d: although distance and time are linearly dependent, this relationship is not made explicit in the manuscript. The authors write in the main text (l. 360) of “best delay”, while the figure shows “Filter best distance (m)”. I suggest either making the relationship [ms] -> [m] explicit, or modifying Fig. 3d to show best delays rather than best distances, for consistency.

• Lines 386-387: JND is not defined in the text; it would be better to define it and to specify the parameter you are referring to, i.e. sometimes the JND is in ms, sometimes in dB.

Discussion

• As enjoyable as the Information processing implications subsection is, I have my reservations as to how it fits into the Discussion. Given its reliance on calculations and data (e.g. Figs. S3, S4), could it be better placed in the Results? It is, at any rate, unusual to bring up new data and figures for the first time in the Discussion section.

• Line 594-96: The authors say, based on ref. 20, that bat’s neurons show broader (frequency) tuning when being selective to longer target distance. Yet, they also say that this feature is reflected in the paper, in which increasing response time occurs with increasing target distance (from Fig. 3e-h, I suppose). Unless I miss something, broader frequency tuning (as per ref. 20) is not necessarily equivalent to increasing response times, although both seem to correlate with large delays. Perhaps it would be useful to clarify this.

• Line 599: frequency selectivity does not appear from the IC onwards but from the cochlea onwards, and persists along the pathway.

• Lines 593-95: what centers would support decision making during echolocation in bats? I think the literature is very conflicting about this. Would it involve cortex or only subcortical structures?

• Can the 25 filters reported here suffice also for communication call coding in this species?

• General: The neurophysiology part of the discussion is very “feedforward” focused. Yet it is almost certain that feedback pathways play a role when the bat vocalizes. For example feedback activity could change the way the 25 filters reported here behave in different contexts. This should be pointed out to prevent the naïve reader from getting the impression that everything is feedforward in the bat brain.

Others

• Fig. S4 is missing y-axis labels.

• Also Fig. S4 (right). The figure caption reads that the distribution shown here is for 15 filter output values. Does that mean that there were 15 values? Or that this is the distribution of values from 15 filters? If the latter is correct, and there were 25 filters from the ICA analysis, is this a typo?

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: None

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Julio Hechavarria

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009052.r003

Decision Letter 1

Natalia L Komarova

7 May 2021

Dear Dr. Vanderelst,

We are pleased to inform you that your manuscript 'Efficient encoding of spectrotemporal information for bat echolocation' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Natalia L. Komarova

Deputy Editor

PLOS Computational Biology


***********************************************************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have answered my questions.

I accept the point regarding performing vs. generalizing, but I think that this should be emphasized in the paper. The authors show that the task can be solved with the compressed data, but they show no generalization, which bats are known to possess (even if not demonstrated in these tasks).

Reviewer #2: The authors have dealt with all my comments and concerns in a proper way. I think the paper is ready for publication as is. I can't but congratulate the authors on this excellent work.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: None

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Julio C. Hechavarria

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1009052.r004

Acceptance letter

Natalia L Komarova

21 Jun 2021

PCOMPBIOL-D-20-01765R1

Efficient encoding of spectrotemporal information for bat echolocation

Dear Dr Vanderelst,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Olena Szabo

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text

    Fig A. Results from the procedure used to determine a realistic level of internal noise, also used in references [7, 11, 27]. As described in the main text, for each level of noise σ we determined P[C6+Nσ>C0+Nσ] using a Monte Carlo approach. This graph gives the probability P[⋅] as a function of the noise level σ. The value of σ at which P[⋅] = 0.75 was taken as the noise level throughout this paper. Fig B. Ensonification device. The custom-built device consisted of two Knowles microphones embedded in a 3D-printed housing and a Sensecomp 7000 emitter. The device was mounted on a tripod. Fig C. (a) Average cochleogram, derived from the ensonification data. (b) Binarized average cochleogram showing where samples are at least 20% of the maximum value. Fig D. (Left) Distribution of the values of the cochleograms collected in this paper. (Right) Distribution of the 25 filter output values for all ensonification data echoes used in this paper. As the value of the entropy H(S) depends on the quantization used, we encoded all samples as doubles for this calculation.

    (PDF)

    Attachment

    Submitted filename: reply to reviewers plos comp bio.pdf

    Data Availability Statement

    All data and code are available at https://doi.org/10.6084/m9.figshare.14573652.v1.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS

    RESOURCES