Precise control of neural activity using dynamically optimized electrical stimulation

Nishal Pradeepbhai Shah; AJ Phillips; Sasidhar Madugula; Amrith Lotlikar; Alex R Gogliettino; Madeline Rose Hays; Lauren Grosberg; Jeff Brown; Aditya Dusi; Pulkit Tandon; Pawel Hottowy; Wladyslaw Dabrowski; Alexander Sher; Alan M Litke; Subhasish Mitra; EJ Chichilnisky

eLife. 2024 Nov 7;13:e83424. doi:10.7554/eLife.83424

Editors: Michael Beyeler, Joshua I Gold

PMCID: PMC11542921 PMID: 39508555

Abstract

Neural implants have the potential to restore lost sensory function by electrically evoking the complex naturalistic activity patterns of neural populations. However, it can be difficult to predict and control evoked neural responses to simultaneous multi-electrode stimulation due to nonlinearity of the responses. We present a solution to this problem and demonstrate its utility in the context of a bidirectional retinal implant for restoring vision. A dynamically optimized stimulation approach encodes incoming visual stimuli into a rapid, greedily chosen, temporally dithered and spatially multiplexed sequence of simple stimulation patterns. Stimuli are selected to optimize the reconstruction of the visual stimulus from the evoked responses. Temporal dithering exploits the slow time scales of downstream neural processing, and spatial multiplexing exploits the independence of responses generated by distant electrodes. The approach was evaluated using an experimental laboratory prototype of a retinal implant: large-scale, high-resolution multi-electrode stimulation and recording of macaque and rat retinal ganglion cells ex vivo. The dynamically optimized stimulation approach substantially enhanced performance compared to existing approaches based on static mapping between visual stimulus intensity and current amplitude. The modular framework enabled parallel extensions to naturalistic viewing conditions, incorporation of perceptual similarity measures, and efficient implementation for an implantable device. A direct closed-loop test of the approach supported its potential use in vision restoration.

Research organism: Rhesus macaque, Long-Evans rat

Introduction

A major goal of sensory neuroscience is to leverage our understanding of neural circuits to develop implantable devices that can artificially control neural activity for restoring senses such as vision (Humayun et al., 2012; Stingl et al., 2013; Palanker et al., 2020; Beauchamp et al., 2020; Chen et al., 2020), audition (Gaylor et al., 2013), and somatosensation (Johnson et al., 2013; Flesher et al., 2016; Salas et al., 2018). Recent innovations in large-scale and high-resolution electrical stimulation hardware hold great promise for such applications. However, the effect of stimulation with many closely spaced electrodes on neural activity is generally complex and nonlinear, which severely limits the ability to produce the desired spatiotemporal patterns of neural activity for sensory restoration.

This paper presents a novel approach to this problem in the context of an epiretinal implant for restoring vision in people blinded by photoreceptor degeneration (Humayun et al., 2012; Beyeler et al., 2019; Bloch, 2020). Epiretinal implants electrically activate retinal ganglion cells (RGCs) that have survived degeneration, causing them to send artificial visual signals to the brain. After initial successes, progress toward restoring high-fidelity natural vision using this approach has slowed, likely in part due to indiscriminate activation of many RGCs of different cell types and a resulting inaccurate neural representation of the target stimulus. One reason for this indiscriminate activation is the difficulty of predicting the neural activity evoked by multi-electrode stimulation based on the activity evoked by single-electrode stimulation.

Here, we present a data-driven optimization approach to bypass this problem by dynamically combining simpler stimulation patterns (Choi et al., 2016; Beauchamp et al., 2020; Tafazoli et al., 2020; Haji Ghaffari et al., 2021; Vasireddy et al., 2023) using temporal dithering and spatial multiplexing. The presented solution is divided into three steps, allowing it to be modified for a wide range of neural systems and implants. First, we develop a simple, explicit model of how the visual image can be reconstructed from the activity of many RGCs of diverse types accessed by the multi-electrode array. Second, we avoid the complexity of nonlinear electrical stimulation by empirically calibrating RGC responses to a collection of simple single-electrode stimuli which can then be combined asynchronously and sparsely to reproduce patterns of neural activity. Finally, we optimize visual scene reconstruction by greedily selecting a sequence of these simple stimuli, temporally dithered to exploit the high speed of electrically evoked neural responses, and spatially multiplexed to avoid interactions between nearby electrodes. These three steps result in a dynamically optimized stimulation paradigm: the visual stimulus is transformed into a spatiotemporal pattern of electrical stimuli designed to produce a pattern of neural activity that is most effective for vision restoration given the measured limitations of the neural interface.

This dynamic optimization approach was tested and evaluated using large-scale multi-electrode stimulation and recording ex vivo from the macaque and rat retinas, a lab prototype for a future implantable system. The method produced substantial improvements in stimulus reconstruction compared to existing methods, by appropriately activating ON and OFF RGCs over space. The algorithm was useful in identifying a subset of the most effective electrodes for a particular retina, which could substantially reduce power consumption in an implant. Extensions of the algorithm can in principle be used to translate the approach to naturalistic viewing conditions with eye movements, and to exploit perceptual metrics to further enhance the quality of reconstruction.

Results

First, we frame the translation of a visual stimulus into electrical stimulation as an optimization problem and present a greedy temporal dithering algorithm to solve it efficiently. Then, we use the ex vivo lab prototype to evaluate the performance of the approach. We compare it with existing methods and develop extensions for spatial multiplexing, natural viewing with eye movements and perceptual quality measures.

The ex vivo lab prototype consists of electrical recording and stimulation of the macaque and rat retinas with a large-scale high-density multi-electrode array (512 electrodes, 60 μm spacing). The visual and electrical response properties of all recorded cells are estimated by direct measurements, using experimental methods described previously (Field and Chichilnisky, 2007; Jepson et al., 2013; Grosberg et al., 2017) (see Methods). These data provide reliable experimental access to complete populations of ON and OFF parasol cell types in macaque retina, so these two cell types are the focus of the empirical analysis.

Dynamic optimization to approximately replicate neural code

Converting a visual stimulus into effective electrical stimulation can be framed as an optimization problem. Using the terminology of optimization, the three key components are the objective function, the constraints, and the algorithm (Figure 1A). The objective function to be minimized is identified as the difference between the target visual stimulus and a reconstruction of the stimulus from the neural responses, as a proxy for how the brain could use the signal for visual inference. However, certain constraints are imposed by electrical stimulation, which provides imperfect control over the activity of a population of cells. Hence, the optimization algorithm must convert incoming visual stimuli into electrical stimuli, such that the stimulus reconstructed from electrically evoked responses matches the true stimulus as closely as possible.

Figure 1. — (A) In a healthy retina, the visual stimulus is encoded in the neural response pattern of retinal ganglion cells (RGCs; top row). In a retina with an implant, the visual stimulus is encoded into current patterns, which generate neural response patterns (bottom row). In either case, the neural responses are eventually processed by the brain to elicit perception, through a process assumed to involve reconstruction of the image. Selecting the appropriate electrical stimulation can be framed as an optimization problem, in which the goal is to identify an *algorithm* (prosthesis encoding) that achieves an *objective* (reconstruction error) while operating under *constraints* (electrical stimulation). (B) *Objective*: Linear reconstruction of the visual stimulus by summing cell-specific spatial filters, weighted by spike counts. Receptive fields of ON (blue) and OFF (red) parasol cells in a population are shown. (C) *Constraint*: Characterizing electrically evoked RGC responses with a dictionary of stimulation patterns. Example dictionary elements, with cells shaded according to evoked response probability. A single electrode stimulated multiple cells, indicating poor selectivity. (D) *Algorithm*: Run-time usage of the artificial retina. Exploiting the slow visual integration time, distant electrodes are stimulated in fast sequence. The resulting neural response is the summation of spikes elicited in each time step.

Objective: reconstructing the visual stimulus from neural responses

The objective of electrical stimulation in this context is to reproduce, as closely as possible, a visual sensation that would be produced by normal light-evoked responses. However, it is not known how the brain interprets RGC light responses, and thus how to frame the problem computationally. As a simple proxy, the objective is defined by reconstructing the visual image as accurately as possible from evoked RGC spikes (see Discussion) and then evaluating the difference between the reconstruction and the original image. For simplicity, linear reconstruction is assumed, and the objective is the minimum squared error between the target image and the reconstructed image.

Specifically, for a target image shown to the retina in the experimental lab prototype setting, the reconstructed stimulus is modeled as the linear superposition of spatial filters, each associated with a particular RGC, weighted by the corresponding RGC response (Figure 1B; Warland et al., 1997; Brackbill et al., 2020). The optimal linear reconstruction filter for each ON and OFF parasol cell was approximated using the measured spatial receptive field of the cell obtained with white noise stimulation, scaled to predict the average spike count recorded within 50 ms of the onset of a flashed checkerboard stimulus. Note that in an implanted blind retina, the reconstruction filter for each cell would have to be estimated in a different way (Zaidi et al., 2022, see Discussion).
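The linear reconstruction described above can be illustrated with a minimal sketch. It assumes the image is flattened into a pixel vector and that each column of a matrix holds one cell's reconstruction filter; the function names and the toy numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def linear_reconstruction(filters, spike_counts):
    """Reconstruct a flattened image as a spike-count-weighted sum of filters.

    filters: array (pixels, cells), one reconstruction filter per column.
    spike_counts: array (cells,), spikes recorded from each cell.
    """
    filters = np.asarray(filters, dtype=float)
    spike_counts = np.asarray(spike_counts, dtype=float)
    return filters @ spike_counts

def squared_error(target, reconstruction):
    """Sum of squared pixel differences between target and reconstruction."""
    diff = np.asarray(target, dtype=float) - np.asarray(reconstruction, dtype=float)
    return float(diff @ diff)

# Toy example (hypothetical numbers): an ON-like cell with a positive filter
# covering pixels 0-1 and an OFF-like cell with a negative filter on pixels 2-3.
A_toy = np.array([[0.5, 0.0],
                  [0.5, 0.0],
                  [0.0, -0.5],
                  [0.0, -0.5]])
r_toy = np.array([2, 1])
s_hat_toy = linear_reconstruction(A_toy, r_toy)
```

In this toy case, spikes from the ON-like cell brighten its pixels and spikes from the OFF-like cell darken its pixels, which is the sense in which the filters are weighted by the responses.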

Constraint: calibrating the collection of neural responses that can be electrically evoked

The limited precision of electrical stimulation constrains our ability to produce desired response patterns in the RGC population. To optimize stimulation under this constraint, it would be ideal to have a model that characterizes how RGCs respond to arbitrary electrical stimulus patterns produced with the electrode array. Unfortunately, estimating this model is difficult due to nonlinear interactions in neural activation resulting from current passed simultaneously through multiple electrodes (Jepson et al., 2014).

An alternative is to use a limited dictionary of responses evoked by simple current patterns. This dictionary was calibrated empirically in advance (Figure 1C). Specifically, current was passed through each of the 512 electrodes individually at each of 40 current levels (logarithmically spaced over the range 0.1–4 µA) and the response probability for each recorded cell was estimated using the fraction of trials in which an evoked spike was recorded from the cell. The evoked spikes were identified using a custom spike sorting algorithm that estimated the electrical artifact produced by stimulation and matched the residual recorded voltage to template waveforms of cells previously identified during visual stimulation (Mena et al., 2017). In general, even though some electrical stimuli selectively activated one cell, many stimuli simultaneously activated two or more cells, often due to axonal stimulation (Figure 1C; Grosberg et al., 2017). Also, high-amplitude stimuli tended to evoke spikes in cells with receptive fields off the electrode array via their axons. These cases were detected by identifying bidirectional spike propagation to the edge of the electrode array and were removed from the dictionary (Grosberg et al., 2017; Tandon et al., 2021).
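As a rough illustration of the calibration step, the sketch below builds a logarithmically spaced set of current levels and estimates response probabilities as the fraction of trials with a detected spike. It assumes spike sorting has already produced a boolean array of detections; the array layout and function names are hypothetical, and the removal of axonally activated off-array cells is not shown.

```python
import numpy as np

def current_levels(n_levels=40, low_ua=0.1, high_ua=4.0):
    """Logarithmically spaced current amplitudes in microamps."""
    return np.geomspace(low_ua, high_ua, n_levels)

def response_probabilities(spike_detected):
    """Estimate spike probability as the fraction of trials with a spike.

    spike_detected: boolean array (electrodes, amplitudes, trials, cells).
    Returns an array (electrodes, amplitudes, cells).
    """
    return np.asarray(spike_detected, dtype=float).mean(axis=2)

def build_dictionary(prob):
    """Flatten (electrodes, amplitudes, cells) into (stimuli, cells).

    Row e * n_amplitudes + a holds the response probabilities for
    electrode e at amplitude index a.
    """
    n_electrodes, n_amplitudes, n_cells = prob.shape
    return prob.reshape(n_electrodes * n_amplitudes, n_cells)
```

Each row of the resulting dictionary is one single-electrode stimulus, which is the unit that later selection steps choose among.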

Algorithm: greedy temporal dithering to approximate the optimal spatiotemporal electrical stimulus

Because a single-electrode stimulus generally cannot create a pattern of activity across the RGC population that accurately encodes a visual image, multiple stimuli must be combined, while also avoiding the nonlinear interactions mentioned above. This was achieved by rapid interleaving or temporal dithering of a diverse collection of single-electrode stimuli. The effectiveness of this method relies on assuming that if many such stimuli are provided in rapid succession (e.g. 0.1-ms interval), they evoke visual sensations similar to those that would be produced by simultaneous stimulation (Figure 1D), because of long visual integration times in the brain (e.g. tens of ms, see Discussion).

Under this assumption, the optimization problem reduces to finding a sequence of dictionary elements $\{c_{1}, \dots, c_{T}\}$ that minimizes the expected squared error between the target visual stimulus and the stimulus reconstructed from the total evoked spike count:

$$c_{1}, \dots, c_{T} = \underset{c_{1}, \dots, c_{T}}{\operatorname{argmin}} \; \mathbb{E}_{r_{i} \sim \mathrm{Bernoulli}(p_{c_{i}})} \left\| s - A (r_{1} + \dots + r_{T}) \right\|^{2} \tag{1}$$

where $s \in \mathbb{R}^{\text{pixels}}$ is the target visual stimulus, $A \in \mathbb{R}^{\text{pixels} \times \text{cells}}$ is the stimulus reconstruction filter, and $r_{i} \in \mathbb{R}^{\text{cells}}$ is a vector of the spike counts in the population of cells generated by stimulation $c_{i}$, with spikes being drawn according to Bernoulli processes with probabilities $p_{c_{i}} \in \mathbb{R}^{\text{cells}}$.
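As a worked step, the expectation in Equation 1 can be separated into a term that depends only on expected responses and a term that accumulates response variance. This assumes that spikes are independent across cells and across stimuli. The symbols $a_{j}$ (column $j$ of $A$) and $p_{c_{i},j}$ (entry $j$ of $p_{c_{i}}$) are introduced here for illustration only.

$$\mathbb{E} \left\| s - A \sum_{i=1}^{T} r_{i} \right\|^{2} = \left\| s - A \sum_{i=1}^{T} p_{c_{i}} \right\|^{2} + \sum_{i=1}^{T} \sum_{j} p_{c_{i},j} \, (1 - p_{c_{i},j}) \, \| a_{j} \|^{2}$$

Under these assumptions, the contribution of each additional stimulus to the expected error can be evaluated from its calibrated response probabilities alone, without sampling spikes.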

In order to create an effective stimulation sequence, a straightforward, real-time method is to greedily select a stimulus at each step that minimizes the predicted error between the reconstruction and the target image. Although the greedy approach is not necessarily optimal (as explained later), it does allow for effective real-time implementation.

A crucial assumption of the algorithm is that the total expected spike count evoked by a sequence of electrical stimuli is the sum of the expected spikes for all the individual stimuli. However, when a cell is repeatedly stimulated, the activation probabilities associated with later stimuli are reduced, because of biophysical refractoriness. To avoid this non-independence, the stimulus at each time step is chosen from a ‘valid’ subset of the dictionary that does not include cells that were targeted recently (see Methods).
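The greedy selection with a refractory 'valid' subset can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes independent Bernoulli spiking, so the expected error splits into a mean term plus a variance term. It stores the dictionary as `D` with one row per stimulus, treats a cell as "targeted" when its spike probability exceeds a hypothetical threshold `p_min`, and uses a hypothetical `refractory_steps` window. It skips a time step (index -1) when no valid stimulus lowers the expected error.

```python
import numpy as np

def greedy_temporal_dithering(s, A, D, n_steps, refractory_steps=0, p_min=0.0):
    """Greedily choose one dictionary element per time step.

    s: (pixels,) target image; A: (pixels, cells) reconstruction filters;
    D: (stimuli, cells) calibrated spike probabilities.
    Returns (chosen indices, expected reconstruction); -1 means no stimulation.
    """
    s = np.asarray(s, dtype=float)
    A = np.asarray(A, dtype=float)
    D = np.asarray(D, dtype=float)
    mean_effect = D @ A.T                            # row k is A @ p_k
    var_cost = (D * (1.0 - D)) @ (A ** 2).sum(axis=0)  # Bernoulli variance term
    targets = D > p_min                              # cells targeted by each stimulus
    last_hit = np.full(A.shape[1], -np.inf)          # step when each cell was last targeted
    residual = s.copy()                              # s minus expected reconstruction
    chosen = []
    for t in range(n_steps):
        recent = (t - last_hit) <= refractory_steps  # cells targeted too recently
        valid = ~(targets & recent).any(axis=1)
        # Change in expected squared error if stimulus k is added at this step.
        delta = ((residual - mean_effect) ** 2).sum(axis=1) + var_cost - residual @ residual
        delta[~valid] = np.inf
        k = int(np.argmin(delta))
        if not delta[k] < 0:                         # nothing valid reduces the error
            chosen.append(-1)
            continue
        chosen.append(k)
        residual -= mean_effect[k]
        last_hit[targets[k]] = t
    return chosen, s - residual
```

Because the variance term is charged every time a stimulus is used, unreliable dictionary elements (probabilities near 0.5) are penalized relative to reliable ones in this sketch.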

Greedy temporal dithering outperforms existing static methods

The greedy temporal dithering algorithm was evaluated on data collected using the laboratory experimental prototype for a retinal implant. After calibrating the responses of all recorded RGCs to all available single-electrode stimuli, the greedy temporal dithering stimulation sequence corresponding to a specific visual target (usually, a random checkerboard image) was calculated, as described above (Equation 1). Then, the visual target was linearly reconstructed from the stimulation sequence using samples drawn from the single-electrode calibration data, under the assumption that temporal dithering maintains the independence of the responses evoked by single-electrode stimuli. Note that this assumption was later tested (see below).

The results of the analysis obtained by sampling from calibration data reveal the inferred visual reconstruction that is possible with greedy temporal dithering. During the stimulation sequence, the reconstructed image slowly built up to a spatially smooth version of the target image (Figure 2). Not surprisingly, ON (OFF) parasol cells were stimulated more than OFF (ON) parasol cells in bright (dark) regions of the target stimulus. Moreover, the reconstruction for individual trials was similar to the average across multiple trials, indicating that the noise from inter-trial response variation was relatively small.

Figure 2. — White noise target image shown on left. First column: cumulative stimulation count across electrodes after 500, 3000, and 10,000 electrical stimuli (A, B, and C, respectively). Second column: responses for ON (blue) and OFF (red) parasol cells, sampled according to the single-electrode calibration data. Shade indicates the cumulative number of spikes. Third column: single-trial and trial-averaged reconstruction of the target stimulus.
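The evaluation by sampling from calibration data can be sketched as below. It assumes the responses to successive stimuli are independent Bernoulli draws and that the dictionary `D` has one row per stimulus. The normalization used in `relative_mse` (squared error divided by the squared norm of the target) is an assumption made for illustration and is not specified in the text above.

```python
import numpy as np

def sample_reconstruction(A, D, chosen, rng):
    """Draw one trial of spikes for a stimulation sequence and reconstruct.

    A: (pixels, cells) filters; D: (stimuli, cells) spike probabilities;
    chosen: sequence of dictionary indices (negative entries are skipped);
    rng: a numpy.random.Generator.
    """
    A = np.asarray(A, dtype=float)
    D = np.asarray(D, dtype=float)
    idx = np.asarray([k for k in chosen if k >= 0], dtype=int)
    probs = D[idx]                                # (steps, cells)
    spikes = rng.random(probs.shape) < probs      # independent Bernoulli draws
    spike_counts = spikes.sum(axis=0)             # total spikes per cell
    return A @ spike_counts

def relative_mse(target, reconstruction):
    """Squared error normalized by the squared norm of the target (assumed)."""
    target = np.asarray(target, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    return float(((target - reconstruction) ** 2).sum() / (target ** 2).sum())
```

Averaging `sample_reconstruction` over many calls gives a trial-averaged reconstruction, which can be compared with single-trial reconstructions to gauge inter-trial variability.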

To quantify the performance of the greedy temporal dithering algorithm, its reconstruction error was compared to the error of an optimized approach for existing retinal implants, which map the intensity of the visual stimulus near each electrode to its stimulation current amplitude or frequency (Humayun et al., 2012; Stingl et al., 2013; Palanker et al., 2020). The performance of this static pixel-wise mapping was simulated with the lab prototype. Specifically, the current passed through each electrode was determined by a sigmoidal function of the intensity of the visual stimulus near the electrode, optimized at each electrode to minimize the reconstruction error across a training set of random checkerboard images (see Methods). This approach provides a generous upper bound to the performance of existing implants, because it uses actual response probabilities to optimize each sigmoidal function rather than relying on much more limited patient feedback, as is the case in existing retinal implants. Even using this generous upper bound, static pixel-wise mapping resulted in significantly less accurate reconstruction of the target image (Figure 3D, H), likely due to coactivation of overlapping ON and OFF cell types.
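A static pixel-wise mapping of this kind might look like the sketch below. The sigmoid parameterization (`max_current_ua`, `slope`, `midpoint`) and the snapping of the requested current to the nearest calibrated level are assumptions for illustration; the fitting of the per-electrode parameters on training images is described in Methods and is not reproduced here.

```python
import numpy as np

def sigmoid_current(intensity, max_current_ua, slope, midpoint):
    """Map local stimulus intensity to a current amplitude with a sigmoid."""
    intensity = np.asarray(intensity, dtype=float)
    return max_current_ua / (1.0 + np.exp(-slope * (intensity - midpoint)))

def nearest_level_index(current_ua, levels_ua):
    """Index of the calibrated amplitude closest to each requested current."""
    current = np.atleast_1d(np.asarray(current_ua, dtype=float))
    levels = np.asarray(levels_ua, dtype=float)
    return np.abs(current[:, None] - levels[None, :]).argmin(axis=1)
```

The key contrast with the dynamic approach is that this mapping depends only on the intensity near each electrode, not on which cells the electrode actually activates.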

Figure 3. — (A) A sample target checkerboard image. (B) ON and OFF receptive fields shaded with the expected summed response from greedy temporal dithering. Achieved reconstructions are shown for (C) greedy temporal dithering using calibrated responses to single-electrode stimulation (8448 electrical stimuli), (D) static pixel-wise mapping approximating existing open-loop systems (5023 stimuli), (E) a lower error bound on the optimal algorithm for a single-electrode dictionary (1850 stimuli), and (F) perfect control with the available reconstruction filters. (G) Reconstruction error (relative mean squared error) between target and the expected achieved perception for 20 different targets (blue lines), with the example from C indicated with the green line. (H) Histogram of relative performance of the above approaches across 20 target images.

Greedy temporal dithering is nearly optimal given the interface constraints

What factors could improve the performance of the dynamically optimized stimulation approach? Broadly, performance could be limited either by the algorithm (greedy selection of electrical stimuli) or by the constraints of the interface (limited control of neural activity afforded by single-electrode stimulation).

To test whether performance could be improved with a different algorithm, the greedy approach was compared with a nearly optimal algorithm. The original optimization problem can be reformulated as:

$$\underset{w \geq 0}{\text{minimize}} \; \left\| s - A D w \right\|^{2} + v^{T} w \quad \text{such that } w \in \mathbb{Z}_{+}$$

where $s \in \mathbb{R}^{\text{pixels}}$ is the target stimulus, $A \in \mathbb{R}^{\text{pixels} \times \text{cells}}$ is the reconstruction filter, $D \in \mathbb{R}^{\text{cells} \times \text{stimuli}}$ is a matrix of all response probabilities in the dictionary, $v \in \mathbb{R}^{\text{stimuli}}$ is the variance associated with dictionary elements and $w \in \mathbb{R}^{\text{stimuli}}_{\geq 0}$ indicates the number of times each dictionary element is used. Because $w$ is constrained to integer values, this optimization problem is difficult to solve. However, an upper bound on the performance gap between the greedy algorithm and the optimal algorithm can be obtained by allowing non-integer values of $w$. Across multiple target images, the gap was low (<10%, ‘optimal algorithm’ in Figure 3E, H), suggesting that the approximate nature of the greedy algorithm is not a substantial source of error in the present conditions.

To test whether performance could be improved with a more precise neural interface, the reconstruction error was compared with perfect control, in which any desired response pattern can be produced. The performance with perfect control was estimated by solving the following optimization problem:

$$\underset{r \geq 0}{\text{minimize}} \; \left\| s - A r \right\|^{2} \quad \text{such that } r \in \mathbb{Z}_{+}$$

where $r \in \mathbb{R}^{\text{cells}}$ is the vector of spike counts. This optimization problem was solved by relaxing the integer constraint on $r$ to obtain an upper bound on the performance gap between a single-electrode dictionary and an ideal dictionary. Across multiple targets, this gap was substantial (>40%, ‘perfect control’ in Figure 3F, H).

Thus, although the greedy algorithm is nearly optimal for a single-electrode dictionary, the reconstruction performance of an artificial retina could be improved by enhancing the dictionary (e.g. by using calibrated, optimized multi-electrode stimulation patterns; Jepson et al., 2014; Vasireddy et al., 2023).
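Both relaxed problems in this section are non-negative quadratic programs, so one small solver can serve as an illustration for each. The sketch below uses projected gradient descent. This is a choice made here for self-containment, and the solver actually used in the study is not stated above. The helper `dictionary_variance` assumes that $v$ collects the Bernoulli variance of each dictionary element weighted by the squared filter norms, with $D$ stored as cells by stimuli.

```python
import numpy as np

def relaxed_nonnegative_solve(M, s, linear_cost=None, n_iter=2000):
    """Minimize ||s - M x||^2 + linear_cost @ x over x >= 0.

    Integer constraints are dropped, so the optimum is a lower bound on the
    error achievable with integer x. Uses projected gradient descent.
    """
    M = np.asarray(M, dtype=float)
    s = np.asarray(s, dtype=float)
    c = np.zeros(M.shape[1]) if linear_cost is None else np.asarray(linear_cost, dtype=float)
    gram = M.T @ M
    Mts = M.T @ s
    lipschitz = 2.0 * np.linalg.norm(gram, 2)   # largest curvature of the quadratic
    step = 1.0 / lipschitz if lipschitz > 0 else 1.0
    x = np.zeros(M.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * (gram @ x - Mts) + c
        x = np.maximum(x - step * grad, 0.0)    # project onto x >= 0
    return x

def dictionary_variance(A, D):
    """v[k] = sum_j D[j, k] * (1 - D[j, k]) * ||A[:, j]||^2 (assumed form of v)."""
    A = np.asarray(A, dtype=float)
    D = np.asarray(D, dtype=float)
    return (A ** 2).sum(axis=0) @ (D * (1.0 - D))
```

With these helpers, the perfect-control bound would correspond to `relaxed_nonnegative_solve(A, s)` and the single-electrode dictionary bound to `relaxed_nonnegative_solve(A @ D, s, dictionary_variance(A, D))`.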

Closed-loop experimental validation of greedy temporal dithering

The performance of the greedy temporal dithering algorithm was next tested empirically using closed-loop recording and stimulation in the isolated rat retina (see Methods). We compared the reconstruction of the visual stimulus from RGC responses evoked by the stimulation sequence to the reconstruction obtained using samples drawn from the single-electrode calibration data. First, reconstruction filters were obtained using visual stimulation and recording, and the responses to single-electrode stimulation were calibrated as described above. Then, the greedy temporal dithering stimulation sequence was computed during the experiment and delivered to the retina at an expanded 3-ms stimulation interval to facilitate spike sorting (see Methods). The evoked RGC responses to this sequence were then recorded and analyzed. The reconstructions obtained with the evoked RGC responses captured much of the spatial structure in each target image (Figure 4A–D). Notably, the spatial structure of reconstructions using the calibrated responses to single-electrode stimulation (‘calibrated’ in Figure 4, similar to Figures 2 and 3) and the RGC responses evoked by the stimulation sequence (‘evoked’ in Figure 4) were similar. These reconstructions reached asymptotic quality with 225 ± 29 stimulations (not shown), a significantly smaller number than in tests obtained with macaque retina (Figure 3G), likely in part because of the smaller number of cells in the rat recordings (see Discussion). Importantly, the reconstruction performance benefited from differential activation of ON and OFF cells over space in a way that reflected the spatial distribution of intensities in each target image (Figure 4E, F). This observation highlights the importance of electrical stimulation which approximates naturalistic RGC responses, in comparison with the static pixel-wise approaches used in existing implants.

Figure 4. — (A) Four sample target checkerboard images. Achieved reconstructions for these images are shown (B) assuming perfect control of retinal ganglion cell (RGC) firing with the available reconstruction filters, (C) using greedy temporal dithering based on calibrated single-electrode responses, and (D) using the RGC responses evoked during electrical stimulation with greedy temporal dithering. (E, F) ON and OFF receptive fields shaded with the total number of evoked spikes. (G) Reconstruction error (relative mean squared error) across 20 target images for perfect control vs. greedy temporal dithering using calibrated responses (top) and for greedy temporal dithering using calibrated responses vs. evoked responses (bottom). Red points correspond to the four targets shown. Evoked RGC responses are averaged over 25 trials for each target (D–G).

The limitations on the performance of dynamically optimized stimulation were further explored with two comparisons. First, the reconstruction error obtained using calibrated responses to single-electrode stimulation was higher than the error obtained under an assumption of ‘perfect control’ (i.e. that all recorded cells can be activated independently), reflecting the limitations of single-electrode stimulation (Figure 4G, top), as shown earlier (Figure 3H). Second, the reconstruction error obtained using the RGC responses evoked by the stimulation sequence was higher than the error obtained using calibrated responses to single-electrode stimulation (Figure 4G, bottom), presumably reflecting non-stationarity in the ex vivo recordings and/or failures of independence due to temporal dithering (see Discussion).

Spatial multiplexing increases throughput within the visual integration window

The main requirement of temporal dithering – the independence of responses generated by individual electrical stimuli, and their summation within a time window for visual perception – could limit the throughput of electrical stimulation. Although independence can be ensured by spacing single-electrode stimuli widely in time, this approach could make it difficult to deliver many electrical stimuli within a visual integration window (e.g. tens of ms). One approach to maximizing the number of stimuli that can be delivered is spatial multiplexing, in which multiple electrodes are used simultaneously for stimulation if they are known to affect the firing probabilities of disjoint sets of cells. A simple example would be if electrodes separated by more than a particular distance D always influenced the firing of disjoint sets of cells. In this case, implementing a circular exclusion zone with radius D around each electrode at each time step of temporal dithering would be expected to produce the same cellular activation as was obtained with calibrated single-electrode stimulation. As in the original temporal dithering approach, at the subsequent time step, electrical stimuli with nonzero activation probability for any recently targeted cell would be omitted from consideration (Figure 5A).

Figure 5. — (A) Visualization of temporally dithered and spatially multiplexed stimulation. At each time step, multiple single-electrode stimuli are chosen greedily (gray circles) across the electrode array (black dots), separated by a spatial exclusion radius (red circles). (B) Estimation of the spatial exclusion radius using 754 total electrode pairs across 7 parasol cells from 4 peripheral macaque retina preparations (see Methods). Interaction between electrodes is measured by fractional deviation in activation threshold for a given cell on a primary electrode (ordinate) resulting from simultaneous stimulation of another electrode with identical current amplitude at varying separations (abscissa). Baseline represents the variability associated with estimating single-electrode activation thresholds.

To identify a spatial exclusion zone and test its effectiveness, activation curves corresponding to single-electrode stimulation were compared to activation curves obtained with additional simultaneous stimulation using a nearby secondary electrode at the same current level (see Methods). Examination of many electrode pairs over a range of distances and multiple cells (Figure 5B) revealed a systematic decrease of the interaction between stimulating electrodes as a function of distance. On average, the activation probability of a cell in response to single-electrode stimulation was affected relatively little (<5% fractional change in threshold) for a secondary electrode 200 µm away. Thus, two electrodes more than this distance apart were unlikely to substantially influence the activation probability of the same cell(s). This suggests that spatial multiplexing of stimuli outside a spatial exclusion radius is a practical strategy for high-throughput stimulation with temporal dithering.
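One way to picture spatial multiplexing within a single time step is the sketch below. It repeatedly picks the stimulus that most reduces the expected error, then blocks every stimulus whose electrode lies within the exclusion radius of an electrode already chosen. The inputs `mean_effect` (the expected reconstruction change per stimulus), `var_cost` (a per-stimulus variance penalty), and `stim_electrode` (the electrode index of each stimulus) are hypothetical names, and the refractory check across time steps is omitted for brevity.

```python
import numpy as np

def multiplexed_step(residual, mean_effect, var_cost, stim_electrode,
                     positions_um, exclusion_radius_um=200.0):
    """Choose several single-electrode stimuli for one time step.

    residual: (pixels,) target minus expected reconstruction so far.
    mean_effect: (stimuli, pixels); var_cost: (stimuli,).
    stim_electrode: (stimuli,) electrode index of each stimulus.
    positions_um: (electrodes, 2) electrode coordinates in micrometers.
    Returns (chosen stimulus indices, updated residual).
    """
    residual = np.array(residual, dtype=float)       # copy, do not mutate input
    mean_effect = np.asarray(mean_effect, dtype=float)
    var_cost = np.asarray(var_cost, dtype=float)
    stim_electrode = np.asarray(stim_electrode, dtype=int)
    positions_um = np.asarray(positions_um, dtype=float)
    allowed = np.ones(len(var_cost), dtype=bool)
    chosen = []
    while True:
        # Change in expected squared error for each candidate stimulus.
        delta = ((residual - mean_effect) ** 2).sum(axis=1) + var_cost - residual @ residual
        delta[~allowed] = np.inf
        k = int(np.argmin(delta))
        if not delta[k] < 0:                          # no allowed stimulus helps
            break
        chosen.append(k)
        residual -= mean_effect[k]
        # Block stimuli on electrodes inside the exclusion zone of the chosen one.
        dist = np.linalg.norm(positions_um - positions_um[stim_electrode[k]], axis=1)
        allowed &= (dist > exclusion_radius_um)[stim_electrode]
    return chosen, residual
```

The loop always terminates because each selection blocks at least the chosen electrode itself, so the set of allowed stimuli shrinks on every pass.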

Dynamic optimization framework enables data-driven hardware design

The dynamic optimization framework suggests further optimizations for hardware efficiency. Because the greedy temporal dithering algorithm chooses electrodes in a spatially non-uniform manner over the array (Figure 6A), restricting stimulation to a more frequently chosen subset of electrodes could enhance efficiency. To test this possibility, the algorithm was applied with dictionaries restricted to a subset of the most frequently used electrodes, and calibrated performance was evaluated on twenty visual targets. In general, restricted dictionaries would be expected to reduce reconstruction performance. However, a minimal (<5%) increase in reconstruction error was observed if the number of available electrodes was reduced by up to 50% (Figure 6B–F). Note that this increase was not due to the greedy nature of stimulation, because a lower bound computed for an optimal algorithm showed similar behavior (Figure 6B). These observations suggest a strategy for efficient implant operation in a retina-specific manner: identify the most frequently used ~50% of electrodes during calibration and permanently turn off the remaining electrodes during run-time usage. Such a reduction in the set of stimulated electrodes could lead to reduced memory access and power consumption, with little loss in performance. Thus, applying the algorithmic framework to the ex vivo lab prototype leads to insights relevant to the development of an in vivo implant.
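The electrode-subset strategy can be sketched as a usage count followed by a dictionary restriction. The function names, the tie-breaking by electrode index, and the representation of the dictionary as rows of stimuli are assumptions for illustration.

```python
import numpy as np

def most_used_electrodes(chosen_stimuli, stim_electrode, n_electrodes, keep_fraction=0.5):
    """Rank electrodes by how often greedy selection used them.

    chosen_stimuli: sequence of dictionary indices (negative entries ignored).
    stim_electrode: (stimuli,) electrode index of each dictionary element.
    Returns (sorted indices of kept electrodes, usage count per electrode).
    """
    stim_electrode = np.asarray(stim_electrode, dtype=int)
    chosen = np.asarray([k for k in chosen_stimuli if k >= 0], dtype=int)
    counts = np.bincount(stim_electrode[chosen], minlength=n_electrodes)
    n_keep = max(1, int(round(keep_fraction * n_electrodes)))
    order = np.argsort(-counts, kind="stable")    # most used first, ties by index
    return np.sort(order[:n_keep]), counts

def restrict_dictionary(D, stim_electrode, kept_electrodes):
    """Keep only dictionary rows whose electrode is in the retained subset."""
    D = np.asarray(D, dtype=float)
    stim_electrode = np.asarray(stim_electrode, dtype=int)
    mask = np.isin(stim_electrode, kept_electrodes)
    return D[mask], stim_electrode[mask]
```

As in the analysis above, the usage counts should come from one collection of target images and the restricted dictionary should be evaluated on a different collection.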

Figure 6. — (A) Frequency of stimulating different electrodes (size of gray circles), overlaid with axons (lines), and somas (colored circles) inferred from spatiotemporal spike waveform across the electrode array recorded from each cell. (B) Reconstruction error as a function of the fraction of electrodes included in the dictionary (black, thin lines correspond to different target images) and average over 20 target images (black, thick line). Different collections of target stimuli were used for electrode selection and reconstruction performance evaluation. Lower bound on error of any algorithm for the subsampled dictionaries for individual targets (green, thin lines) and averaged across targets (green, thick line). (C) Example target image. (D–F) Reconstructed images using the dictionary with most frequently used 20%, 60%, and 100% of electrodes, respectively.

Dynamic optimization framework extends to naturalistic viewing conditions

For practical application, the dynamically optimized stimulation approach must be extended to naturalistic viewing conditions, in which saccadic and fixational eye movements move the fovea over the scene for high-resolution vision (Figure 7A). Similarly, an implant fixed on the retina would move over the scene as the eye moves, and would only transmit the information about its restricted view of the image. The dynamic optimization framework extends naturally to this situation.

Figure 7. — (A) Conversion of a visual scene into dynamic stimulus. A target visual scene (left), with sample eye movement trajectory (blue). For each eye position, the population of ganglion cells accessible by the implant views a small portion of the visual scene (top right). The reconstructed stimulus for each patch captures the local stimulus information (bottom right). (B) Spike trains passed through a spatiotemporal reconstruction filter of the dynamic stimulus video. For simplicity, a rank one filter was used, which spatially filtered each spike bin independently, and then filtered the reconstructed stimulus video in time. (C) Final reconstruction performance over a sequence of saccades, in the absence (left) and the presence (right) of small fixational eye movements. (D) Reduction in reconstruction error of the visual scene as a function of the number of saccades, in the absence (blue) and the presence (orange) of fixational eye movements.

First, the objective, the constraint, and an algorithm for the corresponding optimization problem are identified. The objective function is modified to minimize the error between the original stimulus video and a video reconstructed from the RGC spike trains. For simplicity, a spatiotemporal reconstruction filter is used with separable spatial and temporal components and the same time course (with opposite polarity) for ON and OFF parasol cells (Figure 7B, see Methods). The constraints (measured electrical stimulation properties) are unchanged. Finally, the algorithm is adapted by choosing a dictionary element for each time step greedily to minimize the average error between the recent frames of the target stimulus seen by the implant and the corresponding frames of the reconstruction (see Methods).

This modified algorithm was evaluated using simulations of naturalistic viewing. For a given scene, a dynamic visual stimulus was generated by simulating saccadic eye movements with random inter-saccade intervals and random fixation locations with a preference for regions of the scene with high spatial-frequency content (Yarbus, 1967). Optionally, fixational eye movements were simulated by jittering the visual stimulus with Brownian motion (see Methods). As before, stimulation patterns were determined using greedy temporal dithering, and reconstruction was performed from calibrated responses to single-electrode stimulation. The dynamic visual stimulus covering the collection of recorded cells was reconstructed from the evoked spikes and the full visual scene was then assembled by stitching together the parts of the scene covering the cells at each time step.

The assembled visual scene closely matched the target, capturing many of its fine details (Figure 7C, Figure 7—video 1). Interestingly, the reconstructed visual scene was smoother and more accurate (lower reconstruction error) when fixational eye movements were simulated along with saccades (Figure 7C). Specifically, for the same final reconstruction error, a ~4× reduction in the number of required saccades and hence the number of required electrical stimuli was observed in the presence of fixational eye movements (Figure 7D). Hence, the performance of greedy temporal dithering translates to natural viewing conditions and reveals that more accurate image reconstruction is possible with fixational eye movements (Wu et al., 2024).

Optimizing stimulation using a perceptual similarity measure

The framework provides a natural way to use alternative metrics to optimize visual perception evoked by electrical stimulation. Specifically, the mean squared error (MSE) measure of reconstruction accuracy, while convenient, does not accurately capture perceived differences in image content, whereas error metrics such as Structural Similarity (SSIM) more closely parallel perception (Wang et al., 2004). To identify a nearly optimal sequence of stimuli with SSIM as the objective, an exhaustive approach was used to optimize across all possible stimulation sequences for every eye location (see Methods). The SSIM and MSE metrics produced similar reconstructions when the number of electrical stimulation patterns was unlimited (Figure 8B). This suggests that the choice of reconstruction error metric may not be important for an implant that can stimulate at high rates. However, SSIM optimization produced higher-quality reconstructions when the number of electrical stimuli was limited (Figure 8C), a constraint that may be relevant with short visual integration times or low power limits in an implant. Thus, the greedy dithering approach with a perceptually accurate reconstruction metric could lead to higher performance in an implanted device, though additional developments will be needed before such an optimization can be performed in real time.

Figure 8. — (A) Two target images. (B) Reconstruction with MSE and SSIM error metrics for greedy temporal dithering, with a high budget. (C) Same as B, with a low budget.

Discussion

This paper presents a dynamically optimized electrical stimulation approach to improve the performance of sensory electronic implants. Greedy temporal dithering and spatial multiplexing address the challenges of precisely controlling the activity of diverse cell types in a neural population by rapidly delivering a sequence of simple electrical stimuli with independent effects within a visual integration window. This approach avoids nonlinear interactions resulting from simultaneous multi-electrode stimulation while providing enough flexibility to elicit rich spatiotemporal response patterns using a sequence of single-electrode stimulation patterns. The greedy temporal dithering and spatial multiplexing approach outperforms existing approaches (Figures 2–5) and enables efficient neural interface development (Figure 6), potentially incorporating naturalistic viewing with eye movements (Figure 7) and/or perceptual similarity metrics (Figure 8).

The performance of the temporal dithering algorithm was primarily evaluated using calibrated responses from single-electrode stimulation (Figures 2, 3, and 6–8). In addition, validation of temporal dithering (Figure 4) was performed in closed-loop experiments by delivering the optimized electrical stimulation sequence and analyzing the evoked RGC responses. The reconstructions using these evoked RGC responses captured much of the spatial structure in each target image. However, these reconstructions were less accurate than the reconstructions based on the calibrated responses to single-electrode stimulation (which were used to compute the temporal dithering stimulation sequence during the experiment). At least two factors could contribute to this discrepancy: (1) non-stationarities in the electrical responses of RGCs in the ex vivo retina preparation, due to physiological changes and/or movement, and (2) failures of independence of interleaved stimulation over time relative to isolated single-electrode stimulation. Further experimental work will be needed to distinguish these possibilities. In addition, validation of temporal dithering at shorter stimulation intervals more relevant for in vivo application will require the development of spike sorting approaches that reliably operate in the presence of complex electrical artifacts.

Assumptions underlying the temporal dithering and spatial multiplexing approach

The presented approach relies on several assumptions regarding how the brain uses RGC responses for vision. A major assumption is that downstream processing of RGC responses is slow, integrating over tens of milliseconds, so that perception depends primarily on the total number of evoked spikes within this time interval. The assumption of slow visual processing is based on evidence ranging from flicker fusion experiments to the time scale of synaptic transfer to neurophysiological tests of temporal integration (Wandell, 1995; Tadin et al., 2010; Samaha and Postle, 2015; Wutz et al., 2016; Borghuis et al., 2019). However, there is also empirical evidence that the temporal precision of spikes in RGCs in certain conditions is on the order of 1 ms (Berry et al., 1997; Reich et al., 1997; Berry and Meister, 1998; Keat et al., 2001; Uzzell and Chichilnisky, 2004) and that downstream mechanisms could potentially read out these temporally precise RGC signals (Alonso et al., 1996). It remains unclear from these studies exactly how this high temporal precision would be important for vision. Studies of readout of RGC signals from the macaque retina have shown that for reconstruction of images from responses to flashed stimuli, and for speed and direction discrimination with moving stimuli, ~10 ms temporal resolution of readout from RGC signals is optimal (Chichilnisky and Kalmar, 2002; Frechette et al., 2005; Brackbill et al., 2020; Wu et al., 2024). Nonetheless, finer temporal precision could be part of the neural code of RGCs under certain visual stimulus conditions, such as compensation for fixational eye drift (Wu et al., 2024). In sum, much evidence supports the idea that the temporal resolution of RGC signal readout in the brain is likely to be on the order of tens of milliseconds for many visual tasks, but this may not be true for all conditions.

Another important assumption is that visual sensations produced in the brain are based on linear reconstruction of the incident image from retinal inputs. This is a first-order approximation to facilitate real-time optimization of electrical stimulation, and is almost certainly wrong in detail. A more accurate model could involve replacing this with a nonlinear and/or biologically realistic reconstruction (Parthasarathy et al., 2017; Kim et al., 2021; Wu et al., 2022). However, these approaches are far more complex and cannot yet be optimized in real time.

This work also relies on several empirically tested assumptions about electrical stimulation. First, the brief (150 µs), low-amplitude (<4 µA) current pulses used in this study typically evoke a single spike with short and precise latencies (e.g. 0.73 ± 0.05 ms; Sekirnjak et al., 2011), in part because the mechanism of activation is direct depolarization rather than network-mediated excitation. Second, electrical stimuli at the same electrode separated by at least 10 ms generate approximately independent responses (Talaminos-Barroso et al., 2020), making temporal dithering possible. Third, distant electrodes generate independent responses (Figure 5), making spatial multiplexing possible. While these electrical stimulation properties may be widely applicable, they should be tested and quantified in each neural circuit before applying the temporal dithering and spatial multiplexing approach.

Finally, the approach relies on the assumption that it is possible to deliver a sufficiently large number of electrical stimuli within a visual integration time to produce high quality artificial vision. The total number of electrical stimuli required depends on the number of cells targeted, their expected firing rates for the visual image, and distribution of RGC activation probabilities in the electrical stimulus dictionary. Future work should identify how these factors vary across individuals, species, and neural circuits.

Extensions and broader applicability of the proposed approach

The modular nature of the dynamic optimization approach enables several potential extensions. The single-electrode dictionary could be enhanced with multi-electrode stimulation patterns designed to optimize cellular selectivity, response diversity, or ideally the overall expected algorithm performance (Jepson et al., 2014; Fan et al., 2019; Vilkhu et al., 2021; Vasireddy et al., 2023). Additionally, the efficient and real-time greedy algorithm could be replaced with an algorithm that identifies the optimal stimulation sequence for multiple time steps, perhaps accounting for predicted future saccade locations. Finally, each module could be optimized for metrics such as performance, hardware efficiency (e.g. Figure 6B), or stability/robustness for chronic function.

The greedy temporal dithering and spatial multiplexing approach relies on the ability to efficiently compute and deliver optimal electrical stimuli in a bidirectional electronic implant. Although this procedure exceeds the capabilities of current devices, two features of the approach support implementation on implantable hardware with a limited size and power budget.

First, the hardware requirements of the proposed closed-loop procedure – calibration of single-electrode stimuli followed by dynamically optimized stimulation for encoding visual scenes – are less stringent than those of a real-time closed-loop system. In the latter, one records and analyzes the results of every electrical stimulus to determine the stimulus for the next time step. Instead, the present approach relies only on identifying the average electrical response properties of each cell in advance. After this initial calibration, the stimulation sequence is decided in an open-loop manner by optimizing the expected visual reconstruction associated with each electrical stimulus.

Second, the approach can effectively exploit non-selective activation. Selective activation of every cell in a region of the retina, if achievable, would make it possible to create arbitrary patterns of neural activity, but in practice this is difficult with real neural interfaces. The presented framework uses all available stimulation patterns as efficiently as possible by directly optimizing for stimulus reconstruction. In fact, the approach frequently exploits non-selective activation, in order to evoke desired spiking activity in fewer time steps (Figure 6).

Translational potential

The physiological similarities between human and macaque retina (Cowan et al., 2019; Kling et al., 2020; Soto et al., 2020; Rodieck, 1998), including their responses to electrical stimulation (Madugula et al., 2022), suggest that the benefits of the present stimulation approach could translate from the ex vivo lab prototype with healthy macaque retina to an in vivo implant in the degenerated human retina. However, several technical innovations are required to enable chronic in vivo recording and stimulation. First, new surgical methods must be developed to implant a tiny and high-density chip on the surface of the retina with close and stable contact, to create a lasting, stable interface with RGCs. Second, modifications to the dynamic optimization approach are necessary to mitigate non-stationarities in electrical response properties, which are common in chronic recordings with multi-electrode array implants (Perge et al., 2013; Downey et al., 2018). Third, receptive field locations, cell types, and reconstruction filters must be inferred using spike waveform features and spontaneous activity rather than light-evoked responses in a blind retina (Sekirnjak et al., 2011; Li et al., 2015; Richard et al., 2015; Zaidi et al., 2022). Fourth, the stimulation approach must be modified to account for changes in spontaneous/oscillatory activity in the degenerated retina (Sekirnjak et al., 2011; Goo et al., 2015; Trenholm and Awatramani, 2015). Finally, the approach must be tested with the visual and electrical properties in the central retina (Gogliettino et al., 2023), the most clinically relevant location for a retinal implant.

The present approach could also leverage other methods developed for improving the performance of existing, low-resolution implants. Examples of these methods are context-dependent image preprocessing (Cha et al., 1992; McCarthy et al., 2011; Lieby et al., 2011; Vergnieux et al., 2017; Ho et al., 2019), limiting to sparse stimulation (Loudin et al., 2007), exploiting the adaptation of sensory systems (Rouger et al., 2007; Merabet and Pascual-Leone, 2010), and exploiting perceived phosphenes due to axon bundle activation for optimizing stimulation (Granley et al., 2022; de Ruyter van Steveninck et al., 2022; Relic et al., 2022). Ideally, a unified framework such as the one presented here would include these and potentially other approaches to optimal stimulation.

Electrical stimulation of the visual cortex has also been tested for vision restoration (Chen et al., 2020), and one study (Beauchamp et al., 2020) deployed a dynamic stimulation approach that demonstrated impressive performance in human participants. However, both studies only considered simple visual stimuli which can be described by lines (such as English letters and numbers) or a few dots. The dynamic optimization framework presented here could provide a way to precisely control neural activity for arbitrarily complex stimuli and improve the performance of a range of cortical sensory implants.

Methods

Retinal preparation

Extracellular multi-electrode recording and stimulation in macaque retina were performed as described previously (Jepson et al., 2013; Grosberg et al., 2017). Briefly, eyes were obtained from terminally anesthetized macaque monkeys used for experiments in other laboratories, in accordance with Institutional Animal Care and Use Committee guidelines. After enucleation, the eyes were hemisected and the vitreous humor was removed. The hemisected eye cups were stored in oxygenated bicarbonate-buffered Ames’ solution (Sigma) during transport to the laboratory. The retina was then isolated from the pigment epithelium under infrared illumination and held RGC side down on a custom multi-electrode array (see below). Throughout the experiments, the retina was superfused with Ames’ solution at 35°C.

For experimental validation of temporal dithering, eyes were obtained from adult Long-Evans rats in accordance with Institutional Animal Care and Use Committee guidelines. Immediately after enucleation, the anterior portion of the eye and vitreous humor were removed under infrared illumination and the eye cup placed in oxygenated bicarbonate-buffered Ames’ solution. The retina was then isolated under infrared illumination and held RGC side down on a custom multi-electrode array. Throughout the experiments, the retina was superfused with Ames’ solution at 31°C.

Electrical recordings

A custom 512-electrode stimulation and recording system (Hottowy et al., 2008; Hottowy et al., 2012) was used to deliver electrical stimuli and record spikes from RGCs. The electrodes were organized in a 16 × 32 isosceles triangular lattice arrangement, with 60 μm spacing between electrodes (Litke et al., 2004). Electrodes were 10 μm in diameter and electroplated with platinum black. For recording, raw voltage signals from the electrodes were amplified, filtered (43–5000 Hz), and multiplexed with custom circuitry. These voltage signals were sampled with commercial data acquisition hardware (National Instruments) at 20 kHz per channel. For recording and stimulation, a platinum ground wire circling the recording chamber served as a distant ground.

Electrical stimulation

For electrical stimulation, custom hardware (Hottowy et al., 2012) was controlled by commercial multifunction cards (National Instruments). Current was passed through each of the 512 electrodes individually, with 40 different amplitudes (0.1–4 μA, logarithmically spaced), 27 times each. For each amplitude, charge-balanced triphasic current pulses with relative amplitudes of 2:−3:1 and phase widths of 50 μs (total duration 150 μs) were delivered through the stimulating electrode (amplitude corresponds to the magnitude of the second, cathodal phase of the pulse). This pulse shape was chosen to reduce stimulation artifacts in the recordings. Custom circuitry disconnected the recording amplifiers during stimulation, reducing stimulation artifacts and making it possible to identify elicited spikes on the stimulating electrode as well as nearby electrodes (Hottowy et al., 2012; Jepson et al., 2013).
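The stimulus parameters above can be made concrete with a short sketch. It assumes that "logarithmically spaced" means a geometric series that includes both endpoints, and it represents a pulse as a list of `(duration in µs, current in µA)` pairs; the function names are illustrative and do not describe the control software used in the experiments.

```python
def log_spaced_amplitudes(lowest=0.1, highest=4.0, n_levels=40):
    """Geometric series of current amplitudes (µA), endpoints included."""
    ratio = (highest / lowest) ** (1.0 / (n_levels - 1))
    return [lowest * ratio ** k for k in range(n_levels)]

def triphasic_pulse(amplitude_ua, phase_us=50.0):
    """Triphasic pulse with relative phase amplitudes 2 : -3 : 1.

    amplitude_ua is the magnitude of the second (cathodal) phase, so the
    three phases carry +2/3, -1 and +1/3 of that amplitude.
    """
    unit = amplitude_ua / 3.0
    return [(phase_us, 2.0 * unit), (phase_us, -3.0 * unit), (phase_us, unit)]

def net_charge_pc(pulse):
    """Net injected charge in picocoulombs (µA x µs = pC)."""
    return sum(duration * current for duration, current in pulse)
```

Because the three phases have equal width and their relative amplitudes sum to zero (2 − 3 + 1 = 0), `net_charge_pc` returns zero for any amplitude, which is the charge-balance property stated above.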

Visual stimulation

Recordings obtained with visual stimulation were analyzed to identify spike waveforms of distinct RGCs recorded, using spike sorting methods described previously (Field and Chichilnisky, 2007; Litke et al., 2004). Specifically, the spike times of each cell were typically identified using relatively large spikes detected near the soma. Then, the complete spatiotemporal signature of the spikes from each cell over all electrodes (the electrical image) was computed by averaging the voltage waveforms on all electrodes at and near the times of its recorded spikes (Litke et al., 2004). This electrical image provided a template of the cell’s spatiotemporal spike waveform, which was then used to identify spikes evoked from cells by electrical stimulation.

Distinct RGC types were identified by their visual responses to a 30-min long white noise stimulus (80 × 40 pixel grid, ~44 µm pixels, refresh rate 120 Hz, low photopic light level). After the stimulus presentation the average stimulus that preceded a spike in each RGC was computed, producing the spike-triggered average (STA) stimulus (Chichilnisky and Kalmar, 2002). The STA summarizes the spatial, temporal, and chromatic properties of light responses. Spatial receptive fields were obtained using the spatial sensitivity profile of the STA (Chichilnisky and Kalmar, 2002). Features of the STA were used to segregate functionally distinct RGC types (Rhoades et al., 2019). For each identified RGC type, the receptive fields formed a regular mosaic covering the region of retina recorded (Devries and Baylor, 1997; Field and Chichilnisky, 2007), confirming the correspondence to a morphologically distinct RGC type (Dacey, 1993; Wässle et al., 1981), and in some cases revealing complete recordings from the population. The density and light responses of the five most frequently recorded RGC types uniquely identified them as ON and OFF midget, ON and OFF parasol and small bistratified cells. Subsequent analysis was restricted to two numerically dominant RGC types in the macaque retina – ON and OFF parasol cells – which were sampled efficiently in our experiment and formed nearly complete mosaics covering the region recorded. In the rat retina, analysis was restricted to two RGC types – putative ON brisk transient and OFF brisk transient cells identified by their receptive fields and spiking autocorrelation (Ravi et al., 2018) – which formed reasonably complete mosaics covering the region recorded and had properties broadly resembling those of ON and OFF parasol cells in the macaque retina.

Temporal dithering algorithm

Given the stimulus reconstruction filter and electrical stimulation dictionary, the goal of the greedy temporal dithering algorithm is to identify a sequence of electrical stimuli that encodes a target visual stimulus.

Let $s \in \mathbb{R}^{\text{pixels}}$ be the target visual stimulus, $A \in \mathbb{R}^{\text{pixels} \times \text{cells}}$ be the stimulus reconstruction filter, and $r_{i} \in \mathbb{R}^{\text{cells}}$ be the observed response vector (consisting of a zero or a one for each cell) produced in the population of cells stimulated using electrical stimulation dictionary element $c_{i}$ with the associated probability vector $p_{c_{i}} \in \mathbb{R}^{\text{cells}}$.

Multiple dictionary elements must be combined to generate rich spatiotemporal population responses that capture the visual information in a target visual stimulus. We therefore define the objective as finding a sequence of dictionary elements $\{c_{1}, \dots, c_{T}\}$ that minimizes the expected mean squared error between the target visual stimulus and the image reconstructed from the sequence of responses $\{r_{1}, \dots, r_{T}\}$. The responses are stochastic with $r_{i} \sim \mathrm{Bernoulli}(p_{c_{i}})$ and we assume that responses are generated independently across time steps. The resulting objective function is:

$$c_{1}, \dots, c_{T} = \underset{c_{1}, \dots, c_{T}}{\operatorname{argmin}} \; \mathbb{E}_{r_{i} \sim \mathrm{Bernoulli}(p_{c_{i}})} \left\| s - A (r_{1} + \dots + r_{T}) \right\|^{2} \qquad (1)$$

To efficiently solve this optimization problem, the right-hand side can be decomposed into bias and variance terms as follows:

$$
\begin{aligned}
& \mathbb{E}_{r_{i} \sim \mathrm{Bernoulli}(p_{c_{i}})} \left\| s - A (p_{c_{1}} + \dots + p_{c_{T}}) + A (p_{c_{1}} + \dots + p_{c_{T}}) - A (r_{1} + \dots + r_{T}) \right\|^{2} && (2) \\
&= \left\| s - A (p_{c_{1}} + \dots + p_{c_{T}}) \right\|^{2} + \mathbb{E} \left\| A (p_{c_{1}} + \dots + p_{c_{T}}) - A (r_{1} + \dots + r_{T}) \right\|^{2} && (3) \\
&= \left\| s - A (p_{c_{1}} + \dots + p_{c_{T}}) \right\|^{2} + \mathbb{E} \left\| A (p_{c_{1}} - r_{1}) \right\|^{2} + \dots + \mathbb{E} \left\| A (p_{c_{T}} - r_{T}) \right\|^{2} && (4) \\
&= \left\| s - A (p_{c_{1}} + \dots + p_{c_{T}}) \right\|^{2} + \mathrm{tr}(\mathrm{Cov}(A r_{1})) + \dots + \mathrm{tr}(\mathrm{Cov}(A r_{T})) && (5)
\end{aligned}
$$

where Equation 2 follows from adding and subtracting $A (p_{c_{1}} + \dots + p_{c_{T}})$; Equation 3 expands the squared norm and uses the fact that $\mathbb{E}[A r_{i}] = A p_{c_{i}}$ to zero out the cross term; Equation 4 uses the fact that neural responses at different time steps are independent of each other; and Equation 5 rewrites each expected squared deviation of $A r_{i}$ from its mean as the trace of its covariance.

The expression $\mathrm{tr}(\mathrm{Cov}(A r_{i}))$ corresponds to the total variance across all pixels of the visual image for the dictionary element $c$ chosen at step $i$. When cells respond independently to electrical stimulation, this variance term simplifies to $\sum_{n} \| a_{n} \|^{2} p_{n, c} (1 - p_{n, c})$, where $p_{n, c}$ is the activation probability of cell $n$ with dictionary element $c$, and $a_{n}$ is the reconstruction filter for cell $n$ (the $n$th column of $A$).
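For completeness, the simplification of the variance term can be checked in two lines, assuming (as stated above) that cells respond independently, so that $\mathrm{Cov}(r)$ is diagonal with Bernoulli variances $p_{n, c} (1 - p_{n, c})$ on the diagonal:

$$
\begin{aligned}
\mathrm{tr}(\mathrm{Cov}(A r)) &= \mathrm{tr}\left(A \, \mathrm{Cov}(r) \, A^{\top}\right) = \mathrm{tr}\left(A^{\top} A \, \mathrm{Cov}(r)\right) \\
&= \sum_{n} (A^{\top} A)_{nn} \, \mathrm{Var}(r_{n}) = \sum_{n} \| a_{n} \|^{2} \, p_{n, c} (1 - p_{n, c}).
\end{aligned}
$$

The first line uses the linearity of covariance and the cyclic property of the trace; the second uses the diagonal form of $\mathrm{Cov}(r)$ and the identity $(A^{\top} A)_{nn} = \| a_{n} \|^{2}$.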

Below, we present two methods to solve the above optimization problem and two additional methods to solve relaxed optimization problems for comparison purposes. The first method provides a greedy solution which can be deployed in real time and handles dependencies between successive stimuli. The second method provides an upper bound on the performance of the optimal solution (equivalently, a lower bound on its reconstruction error) by jointly optimizing a vector of the number of times each dictionary element is used. A third method relaxes the above objective to provide an upper bound on performance with no electrical stimulation constraints by directly optimizing the number of spikes for each cell. Finally, a fourth method approximates the function of present-day retinal implants by optimizing a mapping between visual stimulus intensity and current amplitude for each electrode.

For the last three methods which generate solutions irrespective of the order of stimulation, a temporal dithering strategy for ordering the electrical stimuli was assumed so that the stimuli would not interfere with one another. The performance for all methods was evaluated in the same manner, by linearly reconstructing the target stimulus from the identified electrical stimulation sequence using samples drawn from the single-electrode calibration data.

Greedy optimization

Instead of jointly optimizing for the whole stimulation sequence, which is difficult, a greedy approach is used for efficiency. This approach optimizes the choice of stimulation at time step $t$ after fixing the stimulation sequence up to step $t - 1$ . The choice of dictionary element $c_{t}$ at time step $t$ is only affected by the first and last terms of Equation 5. Hence, the greedy objective function for choosing the dictionary element at time step $t$ is given by:

$$c_{t} = \underset{c \in D_{t}}{\operatorname{argmin}} \; \left\| s - A (p_{c_{1}} + \dots + p_{c_{t - 1}} + p_{c}) \right\|^{2} + \sum_{n} \| a_{n} \|^{2} p_{n, c} (1 - p_{n, c})$$

Instead of using a fixed dictionary of stimulation patterns $D$ for all time steps, biological and hardware constraints on the stimulation sequence can be incorporated by changing the dictionary elements $D_{t}$ available at each time step. For example, interactions between time steps produced by refractoriness were avoided by removing dictionary elements that activate cells with probability >0.1 if those cells were targeted with probability >0.1 in the last 100 steps.
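The greedy selection rule and the refractoriness constraint can be sketched as follows. This is a minimal, unoptimized illustration in plain Python, assuming small dense inputs stored as nested lists; the function name, argument layout, and the handling of the case where every element is excluded (recorded as `None`) are assumptions for the example rather than the implementation used in the study.

```python
def greedy_dithering(s, A, P, n_steps, refractory_steps=100, p_thresh=0.1):
    """Greedy temporal dithering sketch.

    s: target stimulus (list of pixel values).
    A: reconstruction filter, pixels x cells (list of rows).
    P: dictionary; P[c][n] is the activation probability of cell n for element c.
    Returns the list of chosen dictionary indices, one per time step.
    """
    n_pix, n_cells = len(s), len(A[0])
    # ||a_n||^2: squared norm of the n-th column of A
    col_norm2 = [sum(A[i][n] ** 2 for i in range(n_pix)) for n in range(n_cells)]
    # Expected reconstruction A p_c and variance penalty for every element
    Ap = [[sum(A[i][n] * p[n] for n in range(n_cells)) for i in range(n_pix)]
          for p in P]
    var = [sum(col_norm2[n] * p[n] * (1.0 - p[n]) for n in range(n_cells))
           for p in P]

    residual = list(s)            # s - A (p_c1 + ... + p_c(t-1))
    last_hit = [None] * n_cells   # last step at which each cell was targeted
    chosen = []
    for t in range(n_steps):
        best, best_cost = None, float("inf")
        for c, p in enumerate(P):
            # D_t: skip elements that would re-target a cell that was
            # targeted within the last `refractory_steps` steps.
            if any(p[n] > p_thresh and last_hit[n] is not None
                   and t - last_hit[n] <= refractory_steps
                   for n in range(n_cells)):
                continue
            cost = sum((residual[i] - Ap[c][i]) ** 2 for i in range(n_pix)) + var[c]
            if cost < best_cost:
                best, best_cost = c, cost
        chosen.append(best)
        if best is None:          # every element was excluded at this step
            continue
        residual = [residual[i] - Ap[best][i] for i in range(n_pix)]
        for n in range(n_cells):
            if P[best][n] > p_thresh:
                last_hit[n] = t
    return chosen
```

Tracking the residual $s - A(p_{c_1} + \dots + p_{c_{t-1}})$ means each step only has to compare the candidate contributions $A p_c$ against it, which is what makes the per-step cost independent of the length of the sequence so far.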

Approximate joint optimization

Instead of selecting the dictionary elements step by step, the optimal number of times each dictionary element should ideally be selected (irrespective of the order of stimulation) can be identified by reformulating the objective function as follows:

$$\underset{w}{\text{minimize}} \; \left\| s - A D w \right\|^{2} + v^{T} w \quad \text{such that} \quad w \in \mathbb{Z}_{+}^{\text{stimuli}}$$

Here, $w \in \mathbb{Z}_{+}^{\text{stimuli}}$ is a vector of non-negative integers corresponding to the number of times each dictionary element is stimulated, $D \in \mathbb{R}^{\text{cells} \times \text{stimuli}}$ is a matrix of activation probabilities for each cell and dictionary element, and $v \in \mathbb{R}^{\text{stimuli}}$ is the variance in decoding corresponding to each dictionary element (given by $v_{c} = \sum_{n} \| a_{n} \|^{2} p_{n, c} (1 - p_{n, c})$).

The optimization problem is NP-complete due to the integer constraints on $w$. However, it can be approximately solved by relaxing the integer constraint, resulting in the following optimization problem and upper bound on performance:

$$\underset{w}{\text{minimize}} \; \left\| s - A D w \right\|^{2} + v^{T} w \quad \text{such that} \quad w \geq 0$$
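The relaxed problem is a convex quadratic program with non-negativity constraints, so many standard solvers apply. The sketch below uses projected gradient descent purely as a simple, dependency-free stand-in; the source does not state which solver was used, and the function names, iteration count, and step-size rule are assumptions for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def relaxed_joint_optimization(s, A, D, v, n_iter=500):
    """Minimize ||s - A D w||^2 + v^T w subject to w >= 0.

    Assumes A D is not identically zero (the step size divides by its norm).
    """
    M = matmul(A, D)                         # pixels x stimuli
    n_pix, n_stim = len(M), len(M[0])
    # The gradient is Lipschitz with constant at most 2 * ||M||_F^2,
    # so a step of 1 / (2 * ||M||_F^2) is safe (if conservative).
    step = 1.0 / (2.0 * sum(x * x for row in M for x in row))
    w = [0.0] * n_stim
    for _ in range(n_iter):
        resid = [s[i] - sum(M[i][j] * w[j] for j in range(n_stim))
                 for i in range(n_pix)]
        grad = [-2.0 * sum(M[i][j] * resid[i] for i in range(n_pix)) + v[j]
                for j in range(n_stim)]
        # Gradient step followed by projection onto the constraint w >= 0
        w = [max(0.0, w[j] - step * grad[j]) for j in range(n_stim)]
    return w
```

Because the relaxed feasible set contains every integer solution, the error at the relaxed optimum can be no larger than the error of the best integer-valued $w$, which is why it serves as a bound.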

Perfect control optimization

To give an estimate of the best possible reconstruction when there is no constraint on electrical stimulation, the number of spikes for each cell can be directly optimized (irrespective of the order of stimulation) using the following objective function:

$$\underset{r \geq 0}{\text{minimize}} \; \left\| s - A r \right\|^{2} \quad \text{such that} \quad r \in \mathbb{Z}_{+}^{\text{cells}}$$

Here, $r \in \mathbb{Z}_{+}^{\text{cells}}$ is a vector of non-negative integers corresponding to the number of times each cell spikes. Again, this optimization problem can be approximately solved by relaxing the integer constraint on $r$, resulting in the following optimization problem and upper bound on performance:

$$\underset{r}{\text{minimize}} \; \left\| s - A r \right\|^{2} \quad \text{such that} \quad r \geq 0$$

Static pixel-wise optimization

To approximate the function of present-day retinal implants, a mapping was learned between the intensity of the visual stimulus near each electrode and the intensity of the current passed through that electrode, to determine which electrical stimuli to deliver (irrespective of the order of stimulation).

First, an affine transformation mapped the visual stimulus onto the electrode array. Second, the average visual stimulus intensity was identified over an approximately 130 µm × 130 µm region around the electrode location. Third, the average visual stimulus intensity on the electrode $(s)$ was mapped to the current amplitude $(i)$ using a scaled sigmoid:

$$i = a + \frac{b}{1 + \exp (c s + d)}$$

This electrical stimulus was then delivered $n$ times at that electrode. All parameters $\{a, b, c, d, n\}$ for each electrode were simultaneously optimized to minimize reconstruction error across a training set of random checkerboard images.
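As a small illustration of the static mapping, the sketch below evaluates the scaled sigmoid for one electrode and expands it into the list of stimuli to deliver. The function names and the `(electrode, amplitude)` output layout are hypothetical, and the parameter fitting across the training set is not shown.

```python
import math

def current_amplitude(s, a, b, c, d):
    """Scaled sigmoid mapping mean stimulus intensity s to current amplitude."""
    return a + b / (1.0 + math.exp(c * s + d))

def static_stimuli(electrode, mean_intensity, a, b, c, d, n):
    """Stimuli for one electrode: the mapped amplitude, repeated n times."""
    amplitude = current_amplitude(mean_intensity, a, b, c, d)
    return [(electrode, amplitude)] * n
```

With $c < 0$ and $b > 0$ the amplitude increases with stimulus intensity, saturating at $a$ for dark regions and at $a + b$ for bright ones; the opposite sign of $c$ gives a decreasing mapping.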

Analysis of temporal dithering using calibrated responses

Greedy temporal dithering was analyzed in Figures 2, 3, and 6 using calibrated responses to visual stimulation (stimulus reconstruction filter) and electrical stimulation (dictionary of single-electrode response probabilities).

The stimulus reconstruction filter for each RGC was approximated using the scaled receptive field (Brackbill et al., 2020). Briefly, the receptive field was obtained by computing the spatial component of the rank 1 approximation of the STA. The receptive field was then denoised by computing the robust standard deviation ($\sigma$) of the magnitudes of all pixels, zeroing out pixels with absolute value less than $2.5\sigma$, and retaining the largest spatially contiguous component. Finally, the receptive fields were scaled such that a linear-rectified response model most accurately predicted the average spike count recorded within 50 ms of the onset of static flashed checkerboard stimuli. Note that because the stimulus reconstruction filter is proportional to the receptive field, this approximation matches an optimal linear decoder only if the receptive fields of the cells are orthogonal. This is approximately true: RGC receptive fields form a uniform mosaic sampling the visual field with little overlap (Devries and Baylor, 1997; Chichilnisky and Kalmar, 2002).
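The denoising step can be sketched as below. Two choices here are assumptions, since the text does not specify them: the robust standard deviation is estimated as 1.4826 times the median absolute deviation, and "spatially contiguous" is taken to mean 4-connected pixels. The function name is illustrative.

```python
from collections import deque
from statistics import median

def denoise_receptive_field(rf, n_sigma=2.5):
    """Threshold a 2D receptive field and keep its largest contiguous blob."""
    n_rows, n_cols = len(rf), len(rf[0])
    mags = [abs(x) for row in rf for x in row]
    med = median(mags)
    # Robust standard deviation of pixel magnitudes (MAD-based; an assumption)
    sigma = 1.4826 * median([abs(m - med) for m in mags])
    keep = [[abs(rf[i][j]) >= n_sigma * sigma for j in range(n_cols)]
            for i in range(n_rows)]

    seen = [[False] * n_cols for _ in range(n_rows)]
    largest = []
    for i in range(n_rows):
        for j in range(n_cols):
            if not keep[i][j] or seen[i][j]:
                continue
            # Breadth-first search over 4-connected supra-threshold pixels
            component, queue = [], deque([(i, j)])
            seen[i][j] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    if (0 <= yy < n_rows and 0 <= xx < n_cols
                            and keep[yy][xx] and not seen[yy][xx]):
                        seen[yy][xx] = True
                        queue.append((yy, xx))
            if len(component) > len(largest):
                largest = component

    out = [[0.0] * n_cols for _ in range(n_rows)]
    for y, x in largest:
        out[y][x] = rf[y][x]
    return out
```

Retaining only the largest component discards isolated supra-threshold pixels, which are more likely to be noise in the STA than part of the receptive field.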

Response probabilities for each single-electrode stimulation pattern were identified after removing electrical artifacts using custom spike sorting software (Mena et al., 2017). Briefly, the spike sorting software estimates the electrical stimulation artifacts by modeling the artifact change across amplitudes with a Gaussian process, subtracts the estimated electrical artifacts from the recording, and then matches the residual spikes to cell waveforms obtained from recordings made with a visual stimulus. The cell activation probabilities for each of the 40 different amplitudes were replaced by values from a sigmoid fitted to all levels, and collected into a dictionary. Each element in the dictionary consisted of an electrode, a stimulus current level, and the evoked spike probability for all recorded cells (typically, for any given electrode and current level, only a few nearby cells had nonzero spike probability).

Dictionary elements that involved activating distant cells along their axons were removed due to the unknown receptive field locations and thus uncertain contribution to stimulus reconstruction (Grosberg et al., 2017). Briefly, the responses to electrical stimulation were mapped to a collection of weighted graphs, and graph partitioning and graph traversal algorithms were applied to identify axon bundle activity. The focus was on two characteristic features of axon bundle signals: bidirectional propagation, and growth of signal amplitude with stimulation current (Tandon et al., 2021).

Finally, only dictionary elements that activated at least one cell with probability at least 0.01 were retained, resulting in 1000–5000 dictionary elements. A single dictionary element that does not activate any cell (probability = 0) was added to allow the greedy algorithm to avoid stimulation when no available stimulation pattern would decrease error.
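The final pruning step can be written compactly. The sketch below assumes the dictionary is stored as a matrix with one row per cell and one column per dictionary element; the function name and layout are illustrative only.

```python
import numpy as np

def prune_dictionary(D, min_prob=0.01):
    """Keep elements that activate some cell, then append a no-stimulation element.

    D: (cells, elements) matrix of evoked spike probabilities.
    Returns the pruned matrix and the indices of the retained columns.
    """
    keep = D.max(axis=0) >= min_prob
    no_stim = np.zeros((D.shape[0], 1))      # element with zero probability for all cells
    return np.hstack([D[:, keep], no_stim]), np.flatnonzero(keep)
```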

Once visual and electrical responses were calibrated, the greedy temporal dithering sequence was applied to 20 static black and white random checkerboard targets, each for up to 10,000 steps. Responses were randomly sampled using calibrated dictionary probabilities and used to reconstruct the stimulus. Reconstruction error was reported as the squared error between the target and the reconstruction, normalized by the squared magnitude of the target (relative mean squared error).
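A simulation of this kind can be sketched as below. The sketch assumes that each step selects the dictionary element minimizing the expected squared reconstruction error, and that cells respond independently as Bernoulli variables with the dictionary probabilities. The function names and matrix layouts are illustrative.

```python
import numpy as np

def greedy_dithering(target, A, D, n_steps, rng):
    """Greedily pick one dictionary element per step and sample responses.

    target: (pixels,) image; A: (pixels, cells) reconstruction filters;
    D: (cells, elements) spike probabilities for each dictionary element.
    """
    mean_shift = A @ D                                   # expected change in reconstruction
    var_cost = np.sum(A ** 2, axis=0) @ (D * (1.0 - D))  # error added by response variability
    recon = np.zeros(A.shape[0])
    chosen = []
    for _ in range(n_steps):
        resid = target - recon
        expected_err = np.sum((resid[:, None] - mean_shift) ** 2, axis=0) + var_cost
        e = int(np.argmin(expected_err))
        spikes = rng.random(D.shape[0]) < D[:, e]        # sample Bernoulli responses
        recon += A @ spikes.astype(float)
        chosen.append(e)
    return chosen, recon

def relative_mse(target, recon):
    """Squared error normalized by the squared magnitude of the target."""
    return np.sum((target - recon) ** 2) / np.sum(target ** 2)
```

With a zero-probability element in the dictionary, the selection naturally falls back to it once no other element lowers the expected error.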

Validation of temporal dithering using experimentally evoked responses

Responses evoked by the greedy temporal dithering approach during a closed-loop experiment with the rat retina were analyzed in Figure 4. A 15-min recording of responses to a white noise visual stimulus (80 × 40 pixel grid, ~44 µm pixels, refresh rate 30 Hz, low photopic light level) was used to identify cell locations, types, and spike waveforms as described above. Subsequent analysis was restricted to two numerically dominant ON and OFF cell types, which each formed nearly complete mosaics across the array. Next, an optimal linear reconstruction filter was computed for these cells (Gogliettino et al., 2023). Briefly, a linear-nonlinear cascade model was used to simulate RGC responses to white noise images by half-rectifying the inner product of the STAs and the visual stimuli. The response model was scaled to predict the average spike count recorded within 250 ms of the onset of static flashed checkerboard stimuli presented to the retina during the experiment. The optimal linear reconstruction filter was computed using linear least-squares regression of the model responses against the stimuli for a training set of checkerboard stimuli with varying pixel sizes (352, 220, 176, 110, 88, and 55 μm), using 10,000 training images each (60,000 images total).
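The filter computation can be sketched in two steps: simulate responses with a half-rectified linear model, then regress stimuli on those responses. Array layouts, the scalar `gain` parameter, and function names are assumptions made for illustration.

```python
import numpy as np

def ln_responses(stimuli, stas, gain=1.0):
    """Half-rectified inner product of STAs and stimuli.

    stimuli: (images, pixels); stas: (cells, pixels); returns (images, cells).
    """
    return gain * np.maximum(stimuli @ stas.T, 0.0)

def fit_reconstruction_filter(responses, stimuli):
    """Least-squares filter mapping responses to stimuli; returns (cells, pixels)."""
    filt, *_ = np.linalg.lstsq(responses, stimuli, rcond=None)
    return filt
```

A stimulus estimate is then `responses @ filt`. In the toy case of paired ON and OFF cells with identical single-pixel STAs of opposite sign, the regression recovers the stimulus exactly.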

A single-electrode stimulation scan (42 amplitudes at each of the 512 electrodes individually, 0.1–4 μA, linearly spaced, 15 times each) was then used to compute an electrical stimulus dictionary. Response probabilities for each electrical stimulus were identified using a custom template matching approach (Gogliettino et al., 2023) and axon bundle activation thresholds were determined by automated methods as described above (Tandon et al., 2021). Briefly, the custom template matching approach performs unsupervised clustering on the voltage traces for each stimulation pattern, then iteratively compares the difference signals between the clusters to cell waveforms estimated from visual stimulus recordings. Response probabilities for each stimulation pattern were smoothed with a sigmoid fitted across amplitudes and collected in the dictionary. Dictionary elements that elicited axon bundle activity or had at least one cell with a poor sigmoid fit were permanently removed from the dictionary. A single dictionary element that did not activate any cell was included to avoid stimulation when every real stimulation pattern would increase error.

After calibration of visual and electrical response properties, a greedy stimulation sequence was computed for each of a collection of 20 random checkerboard visual targets (10 × 5 pixel grid, ~352 µm pixels) during the closed-loop experiment using a greedy temporal dithering implementation optimized for speed (Lotlikar et al., 2023). This implementation evaluates dictionary elements in parallel across several distinct regions involving disjoint groups of cells in order to speed up selection of the electrical stimulation sequence. Each stimulation sequence was assembled until the objective could not be decreased further, resulting in 171–288 stimulations. The stimulation sequence for each target was delivered 25 times at an expanded 3-ms stimulation interval to separate the recorded voltage signals from prior and future electrical stimulation artifacts. The expanded stimulation interval was necessary to determine evoked RGC response probabilities for each target, which were identified using the same custom template matching approach used for the single-electrode stimulation scan.

Characterizing spatial exclusion radius for spatial multiplexing

The spatial exclusion radius was estimated using a bi-electrode stimulation experiment. The initial response dictionary was characterized using single-electrode stimulation as described above. A target cell was chosen, and the activation curve over the standard current range (42 amplitudes, 0.1–4 µA, linearly spaced) was determined for this cell using the electrode that recorded the largest amplitude spike waveform (primary electrode). Stimulation characterization was then repeated on this electrode with equal current passed simultaneously through a secondary electrode, and the changes in the activation curve relative to the original curve were examined. All secondary electrodes within 400 μm of the primary electrode were tested. The single-electrode activation curve was also re-estimated using secondary electrodes more than 800 μm from the primary electrode and not overlapping the axon of the target cell. Evoked spikes were identified using the spike sorting approach described for the validation experiment above.

The two-electrode stimulation produced an activation curve for each electrode pair, from which the activation threshold (estimated current amplitude producing 50% spike probability) was determined. The fractional change from the single-electrode activation threshold was computed for each secondary electrode, revealing the degree to which the presence of a secondary stimulating electrode influences the responses generated by a particular primary electrode. Figure 5B summarizes the absolute change in threshold with increasing distance between stimulating electrodes, generated by computing the weighted mean and the resampled standard error of the weighted mean for test pairs near each distance. The weighting for each electrode pair was inversely proportional to the variance of the single-electrode activation threshold for that cell.
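The summary statistic for one distance bin can be sketched as an inverse-variance weighted mean with a bootstrap standard error. The resampling scheme (resampling electrode pairs with replacement) and the function names are assumptions. The text states only that the standard error was obtained by resampling.

```python
import numpy as np

def fractional_change(pair_threshold, single_threshold):
    """Fractional change of the two-electrode threshold from the single-electrode one."""
    return (pair_threshold - single_threshold) / single_threshold

def weighted_mean_and_se(values, variances, n_boot=1000, rng=None):
    """Inverse-variance weighted mean and its bootstrap standard error."""
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    mean = np.sum(weights * values) / np.sum(weights)
    # Resample electrode pairs with replacement and recompute the weighted mean.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot = np.sum(weights[idx] * values[idx], axis=1) / np.sum(weights[idx], axis=1)
    return mean, boot.std(ddof=1)
```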

Extension of greedy dithering to natural scenes with eye movements

The greedy temporal dithering approach was extended to natural viewing by modifications to visual stimulus target generation and reconstruction. For a given natural image, a dynamic visual target was generated by simulating eye movements. A sequence of 500 fixation locations was sampled, preferentially in the high spatial-frequency regions of the image, with a mean duration of 300 ms (SD 100 ms) between saccades. A patch of size 40 × 80 was taken around each saccade location to generate the dynamic visual stimulus. In some cases, fixational eye movements were also simulated by perturbing the fixation location with Brownian motion (3 pixel SD).
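The target generation can be sketched as follows. Several details here are assumptions not given in the text: local gradient magnitude stands in for "high spatial-frequency regions", fixation durations are drawn from a Gaussian floored at 1 ms, and fixational jitter is omitted. The function name is illustrative.

```python
import numpy as np

def simulate_fixations(image, n_fix=500, patch=(40, 80), rng=None):
    """Sample fixation patches, favoring regions with strong local gradients."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = patch
    n_rows, n_cols = image.shape[0] - h + 1, image.shape[1] - w + 1
    gy, gx = np.gradient(image.astype(float))
    energy = np.hypot(gx, gy)
    # Gradient energy at the centre of every patch that fits inside the image.
    centre_energy = energy[h // 2 : h // 2 + n_rows, w // 2 : w // 2 + n_cols]
    prob = centre_energy.ravel() + 1e-12
    prob /= prob.sum()
    idx = rng.choice(prob.size, size=n_fix, p=prob)
    rows, cols = np.unravel_index(idx, (n_rows, n_cols))
    corners = np.column_stack([rows, cols])              # top-left corner of each patch
    patches = np.stack([image[r : r + h, c : c + w] for r, c in corners])
    durations_ms = np.maximum(rng.normal(300.0, 100.0, size=n_fix), 1.0)
    return patches, corners, durations_ms
```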

The greedy algorithm was modified such that the stimulation choice at each step considered multiple recent frames of the target. The dynamic target was discretized on the display at 120 Hz, and 83 stimulation choices were made within each frame (corresponding to a stimulation every 0.1 ms). To accommodate the dynamic stimulus, the spatial reconstruction filter was replaced with a spatiotemporal reconstruction filter. For efficiency, the spatiotemporal reconstruction filter was modeled as rank 1 (space–time separable), with the identical time course for all cells and opposite polarity for ON and OFF cells. Hence, each evoked spike influences the reconstruction at multiple subsequent time steps. The straightforward extension of the greedy algorithm is then to choose a stimulation pattern at each time step such that it minimizes the total error over multiple time steps.
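The timing arithmetic and the separable filter can be made concrete with a short sketch. It assumes unsigned spatial filters with polarity applied per cell type, and it uses hypothetical function names and array layouts.

```python
import numpy as np

FRAME_RATE_HZ = 120
STIMS_PER_FRAME = 83
# One frame lasts 1000 / 120 ms, so each stimulation slot is about 0.1 ms.
interval_ms = 1000.0 / FRAME_RATE_HZ / STIMS_PER_FRAME

def separable_filter(spatial, time_course, is_on):
    """Build a (cells, lags, pixels) filter from one shared time course.

    spatial: (cells, pixels); time_course: (lags,); is_on: (cells,) booleans.
    """
    sign = np.where(is_on, 1.0, -1.0)                    # opposite polarity for OFF cells
    return sign[:, None, None] * time_course[None, :, None] * spatial[:, None, :]

def reconstruct_movie(spikes, filt):
    """Sum filter contributions of spikes; spikes: (cells, steps) -> (steps, pixels)."""
    n_steps = spikes.shape[1]
    movie = np.zeros((n_steps, filt.shape[2]))
    for lag in range(min(filt.shape[1], n_steps)):
        # A spike at step t contributes to the reconstruction at step t + lag.
        movie[lag:] += spikes[:, : n_steps - lag].T @ filt[:, lag, :]
    return movie
```

Because a spike contributes at every lag of the time course, a greedy choice at one step changes the error at several later steps, which is why the objective sums over multiple time steps.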

For a given stimulation sequence, the image is assembled by first reconstructing each frame of the dynamic visual stimulus using the spatiotemporal reconstruction filter. Then, each frame of the reconstructed dynamic stimulus is ‘pasted’ at the fixation location at the time of the spike. The intensity for each pixel in the final reconstructed image is estimated by averaging the intensity across all fixation locations in which the recorded cells have reconstruction filters that include the pixel.
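The pasting and averaging step can be sketched with a running sum and a coverage count. Here `support` is a hypothetical boolean mask marking the patch pixels covered by at least one cell's reconstruction filter, and patch positions are given by their top-left corners. Both conventions are assumptions.

```python
import numpy as np

def assemble_image(shape, patches, corners, support):
    """Average reconstructed patches pasted at their fixation locations.

    patches: iterable of (h, w) arrays; corners: matching (row, col) positions;
    support: (h, w) boolean mask of pixels covered by reconstruction filters.
    """
    total = np.zeros(shape)
    count = np.zeros(shape)
    for patch, (r, c) in zip(patches, corners):
        h, w = patch.shape
        total[r : r + h, c : c + w] += patch * support
        count[r : r + h, c : c + w] += support
    # Pixels never covered by any filter are left at zero.
    return np.divide(total, count, out=np.zeros(shape), where=count > 0)
```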

Incorporating perceptual similarity metrics

Possible improvements to the approach that could be produced by optimizing perceptual similarity (rather than mean squared error) in the stimulation objective were analyzed after simplifying modifications. First, instead of image-dependent and random fixation locations, all possible saccade locations were considered. This corresponds to a uniform distribution of fixation locations, and the visual scene is reconstructed by averaging the reconstruction of image patches corresponding to various fixation locations. Next, for each fixation location, the corresponding image patch was reconstructed using expected responses (rather than measured, stochastic responses). Note that unlike the algorithm presented above, this formulation does not account for inter-trial variability. Finally, instead of greedily optimizing the stimulation sequence, the stimulation counts for all dictionary elements and fixation locations were jointly optimized. Given these simplifications, the following optimization problem was solved:

$$\min_{\{w_{i}\} \geq 0} \; d\left(s, G\left(\{A D w_{i}\}_{i=1}^{i=\#\text{patches}}\right)\right) + λ \sum_{i} \| w_{i} \|_{1}$$

where $d$ is the measure of similarity, $s \in \mathbb{R}^{\text{pixels}}$ is the target visual stimulus, $A \in \mathbb{R}^{\text{pixels} \times \text{cells}}$ is the reconstruction filter, $D \in \mathbb{R}^{\text{cells} \times \text{stimuli}}$ is the dictionary, $w_{i} \in \mathbb{R}^{\text{stimuli}}$ is the number of times each dictionary element is stimulated for patch $i$, and $G$ is an operator that averages the reconstruction of individual patches to assemble the entire image. To explore the reconstruction under different stimulation budgets, $λ$ is varied to penalize stimulating a large number of dictionary elements.
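To make the structure of this problem concrete, the sketch below solves a simplified instance by proximal gradient descent. It is not the procedure used in the paper: squared error is swapped in for the perceptual measure $d$, a single patch is used so that $G$ is the identity, and the stimulation counts are relaxed to nonnegative real numbers. The function name is hypothetical.

```python
import numpy as np

def solve_stimulation_counts(s, A, D, lam, n_iter=500):
    """Minimize ||s - A D w||^2 + lam * sum(w) subject to w >= 0.

    s: (pixels,); A: (pixels, cells); D: (cells, stimuli); returns w: (stimuli,).
    """
    M = A @ D
    step = 1.0 / (2.0 * np.linalg.norm(M, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    w = np.zeros(M.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * M.T @ (M @ w - s)
        # Gradient step on the smooth term, then shrink by lam and clip at zero.
        w = np.maximum(w - step * (grad + lam), 0.0)
    return w
```

Increasing `lam` drives more entries of `w` to zero, which mirrors how $λ$ trades reconstruction quality against the stimulation budget.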

Acknowledgements

We thank J Carmena, K Bankiewicz, T Moore, W Newsome, M Taffe, T Albright, E Callaway, H Fox, R Krauzlis, S Moriarty, and the California National Primate Research Center for access to macaque retinas. We thank the Stanford Artificial Retina team for helpful discussions. We thank ALS Association Milton Safenowitz fellowship (NPS), NSF Graduate Research Fellowship Grant No. 2146755 and NSF Grant No. 1828993 (AJP), NIH NEI F30-EY-030776-03 (SM), NIH NIMH T32MH-020016, NIH NEI F31-EY-033636, the Fondation Bertarelli, the Stanford Neurosciences Graduate Program (AG), Polish Academy of Sciences DEC-2013/10/M/NZ4/00268 (PH), Research to Prevent Blindness Stein Innovation Award, Wu Tsai Neurosciences Institute Big Ideas, NIH NEI R01-EY021271, NIH NEI R01-EY029247, NIH NEI P30-EY019005, and NSF/CRCNS grant (EJC) for funding this work.

Funding Statement

The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.

Contributor Information

Nishal Pradeepbhai Shah, Email: bhaishahster@gmail.com.

AJ Phillips, Email: andrewjp@stanford.edu.

Michael Beyeler, University of California, Santa Barbara, United States.

Joshua I Gold, University of Pennsylvania, United States.

Funding Information

This paper was supported by the following grants:

ALS Association to Nishal Pradeepbhai Shah.
National Science Foundation Graduate Research Fellowship 2146755 to AJ Phillips.
National Science Foundation 1828993 to AJ Phillips.
National Eye Institute F30-EY-030776-03 to Sasidhar Madugula.
National Institute of Mental Health T32MH-020016 to Alex R Gogliettino.
National Eye Institute F31-EY-033636 to Alex R Gogliettino.
The Fondation Bertarelli to Alex R Gogliettino.
The Stanford Neurosciences Graduate Program to Alex R Gogliettino.
Polish Academy of Sciences DEC-2013/10/M/NZ4/00268 to Pawel Hottowy.
National Eye Institute R01-EY021271 to EJ Chichilnisky.
National Eye Institute R01-EY029247 to EJ Chichilnisky.
National Eye Institute P30-EY019005 to EJ Chichilnisky.
National Science Foundation NSF/CRCNS to EJ Chichilnisky.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Conceptualization, Resources, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Investigation, Methodology, Writing – review and editing.

Resources, Data curation, Software, Investigation.

Methodology, Writing – review and editing.

Resources, Software, Investigation, Methodology.

Methodology, Writing – review and editing.

Investigation, Writing – review and editing.

Methodology, Writing – review and editing.

Resources.

Methodology, Writing – review and editing.

Supervision, Writing – review and editing.

Conceptualization, Resources, Supervision, Funding acquisition, Validation, Investigation, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Additional files

MDAR checklist

elife-83424-mdarchecklist1.docx (99.6 KB, docx)

Data availability

Data and code are available on Dryad at https://doi.org/10.5061/dryad.pk0p2ngrv.

The following dataset was generated:

Shah N, Phillips AJ, Madugula S, Lotlikar A, Gogliettino A, Hays M, Grosberg L, Brown J, Dusi A, Tandon P, Hottowy P, Dabrowski W, Sher A, Litke A, Mitra S, Chichilnisky EJ. 2024. Data from: Precise control of neural activity using dynamically optimized electrical stimulation. Dryad Digital Repository.

References

Alonso JM, Usrey WM, Reid RC. Precisely correlated firing in cells of the lateral geniculate nucleus. Nature. 1996;383:815–819. doi: 10.1038/383815a0.
Beauchamp MS, Oswalt D, Sun P, Foster BL, Magnotti JF, Niketeghad S, Pouratian N, Bosking WH, Yoshor D. Dynamic stimulation of visual cortex produces form vision in sighted and blind humans. Cell. 2020;181:774–783. doi: 10.1016/j.cell.2020.04.033.
Berry MJ, Warland DK, Meister M. The structure and precision of retinal spike trains. PNAS. 1997;94:5411–5416. doi: 10.1073/pnas.94.10.5411.
Berry MJ, Meister M. Refractoriness and neural precision. The Journal of Neuroscience. 1998;18:2200–2211. doi: 10.1523/JNEUROSCI.18-06-02200.1998.
Beyeler M, Nanduri D, Weiland JD, Rokem A, Boynton GM, Fine I. A model of ganglion axon pathways accounts for percepts elicited by retinal implants. Scientific Reports. 2019;9:9199. doi: 10.1038/s41598-019-45416-4.
Bloch E. The argus ii retinal prosthesis system. Prosthesis. 2020;01:e4947. doi: 10.5772/intechopen.84947.
Borghuis BG, Tadin D, Lankheet MJM, Lappin JS. Temporal limits of visual motion processing: psychophysics and neurophysiology. Vision. 2019;01:e0005. doi: 10.3390/vision3010005.
Brackbill N, Rhoades C, Kling A, Shah NP, Sher A, Litke AM, Chichilnisky EJ. Reconstruction of natural images from responses of primate retinal ganglion cells. eLife. 2020;9:e58516. doi: 10.7554/eLife.58516.
Cha K, Horch KW, Normann RA. Mobility performance with a pixelized vision system. Vision Research. 1992;32:1367–1372. doi: 10.1016/0042-6989(92)90229-c.
Chen X, Wang F, Fernandez E, Roelfsema PR. Shape perception via a high-channel-count neuroprosthesis in monkey visual cortex. Science. 2020;370:1191–1196. doi: 10.1126/science.abd7435.
Chichilnisky EJ, Kalmar RS. Functional asymmetries in ON and OFF ganglion cells of primate retina. The Journal of Neuroscience. 2002;22:2737–2747. doi: 10.1523/JNEUROSCI.22-07-02737.2002.
Choi JS, Brockmeier AJ, McNiel DB, von Kraus L, Príncipe JC, Francis JT. Eliciting naturalistic cortical responses with a sensory prosthesis via optimized microstimulation. Journal of Neural Engineering. 2016;13:056007. doi: 10.1088/1741-2560/13/5/056007.
Cowan CS, Renner M, Gross-Scherf B, Goldblum D, Munz M, Krol J, Szikra T, Papasaikas P, Cuttat R, Waldt A, Diggelmann R, Patino-Alvarez CP, Gerber-Hollbach N, Schuierer S, Hou Y, Srdanovic A, Balogh M, Panero R, Hasler PW, Kusnyerik A, Szabo A, Stadler MB, Orgül S, Hierlemann A, Scholl HPN, Roma G, Nigsch F, Roska B. Cell types of the human retina and its organoids at single-cell resolution: developmental convergence, transcriptomic identity, and disease map. Cell. 2019;182:1623–1640. doi: 10.1016/j.cell.2020.08.013.
Dacey DM. The mosaic of midget ganglion cells in the human retina. The Journal of Neuroscience. 1993;13:5334–5355. doi: 10.1523/JNEUROSCI.13-12-05334.1993.
de Ruyter van Steveninck J, Güçlü U, van Wezel R, van Gerven M. End-to-end optimization of prosthetic vision. Journal of Vision. 2022;22:20. doi: 10.1167/jov.22.2.20.
Devries SH, Baylor DA. Mosaic arrangement of ganglion cell receptive fields in rabbit retina. Journal of Neurophysiology. 1997;78:2048–2060. doi: 10.1152/jn.1997.78.4.2048.
Downey JE, Schwed N, Chase SM, Schwartz AB, Collinger JL. Intracortical recording stability in human brain-computer interface users. Journal of Neural Engineering. 2018;15:046016. doi: 10.1088/1741-2552/aab7a0.
Fan VH, Grosberg LE, Madugula SS, Hottowy P, Dabrowski W, Sher A, Litke AM, Chichilnisky EJ. Epiretinal stimulation with local returns enhances selectivity at cellular resolution. Journal of Neural Engineering. 2019;16:025001. doi: 10.1088/1741-2552/aaeef1.
Field GD, Chichilnisky EJ. Information processing in the primate retina: circuitry and coding. Annual Review of Neuroscience. 2007;30:1–30. doi: 10.1146/annurev.neuro.30.051606.094252.
Flesher SN, Collinger JL, Foldes ST, Weiss JM, Downey JE, Tyler-Kabara EC, Bensmaia SJ, Schwartz AB, Boninger ML, Gaunt RA. Intracortical microstimulation of human somatosensory cortex. Science Translational Medicine. 2016;8:361ra141. doi: 10.1126/scitranslmed.aaf8083.
Frechette ES, Sher A, Grivich MI, Petrusca D, Litke AM, Chichilnisky EJ. Fidelity of the ensemble code for visual motion in primate retina. Journal of Neurophysiology. 2005;94:119–135. doi: 10.1152/jn.01175.2004.
Gaylor JM, Raman G, Chung M, Lee J, Rao M, Lau J, Poe DS. Cochlear implantation in adults: a systematic review and meta-analysis. JAMA Otolaryngology–Head & Neck Surgery. 2013;139:265–272. doi: 10.1001/jamaoto.2013.1744.
Gogliettino AR, Madugula SS, Grosberg LE, Vilkhu RS, Brown J, Nguyen H, Kling A, Hottowy P, Dąbrowski W, Sher A, Litke AM, Chichilnisky EJ. High-fidelity reproduction of visual signals by electrical stimulation in the central primate retina. The Journal of Neuroscience. 2023;43:4625–4641. doi: 10.1523/JNEUROSCI.1091-22.2023.
Goo YS, Park DJ, Ahn JR, Senok SS. Spontaneous oscillatory rhythms in the degenerating mouse retina modulate retinal ganglion cell responses to electrical stimulation. Frontiers in Cellular Neuroscience. 2015;9:512. doi: 10.3389/fncel.2015.00512.
Granley J, Relic L, Beyeler M. A Hybrid Neural Autoencoder for Sensory Neuroprostheses and Its Applications in Bionic Vision. arXiv. 2022 http://arxiv.org/abs/2205.13623
Grosberg LE, Ganesan K, Goetz GA, Madugula SS, Bhaskhar N, Fan V, Li P, Hottowy P, Dabrowski W, Sher A, Litke AM, Mitra S, Chichilnisky EJ. Activation of ganglion cells and axon bundles using epiretinal electrical stimulation. Journal of Neurophysiology. 2017;118:1457–1471. doi: 10.1152/jn.00750.2016.
Haji Ghaffari D, Akwaboah AD, Mirzakhalili E, Weiland JD. Real-time optimization of retinal ganglion cell spatial activity in response to epiretinal stimulation. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2021;29:2733–2741. doi: 10.1109/TNSRE.2021.3138297.
Ho E, Boffa J, Palanker D. Performance of complex visual tasks using simulated prosthetic vision via augmented-reality glasses. Journal of Vision. 2019;19:22. doi: 10.1167/19.13.22.
Hottowy P, Dąbrowski W, Skoczeń A, Wiącek P. An integrated multichannel waveform generator for large-scale spatio-temporal stimulation of neural tissue. Analog Integrated Circuits and Signal Processing. 2008;55:239–248. doi: 10.1007/s10470-007-9125-x.
Hottowy P, Skoczeń A, Gunning DE, Kachiguine S, Mathieson K, Sher A, Wiącek P, Litke AM, Dąbrowski W. Properties and application of a multichannel integrated circuit for low-artifact, patterned electrical stimulation of neural tissue. Journal of Neural Engineering. 2012;9:066005. doi: 10.1088/1741-2560/9/6/066005.
Humayun MS, Dorn JD, da Cruz L, Dagnelie G, Sahel JA, Stanga PE, Cideciyan AV, Duncan JL, Eliott D, Filley E, Ho AC, Santos A, Safran AB, Arditi A, Del Priore LV, Greenberg RJ, Argus II Study Group Interim results from the international trial of Second Sight's visual prosthesis. Ophthalmology. 2012;119:779–788. doi: 10.1016/j.ophtha.2011.09.028.
Jepson LH, Hottowy P, Mathieson K, Gunning DE, Dabrowski W, Litke AM, Chichilnisky EJ. Focal electrical stimulation of major ganglion cell types in the primate retina for the design of visual prostheses. The Journal of Neuroscience. 2013;33:7194–7205. doi: 10.1523/JNEUROSCI.4967-12.2013.
Jepson LH, Hottowy P, Mathieson K, Gunning DE, Dąbrowski W, Litke AM, Chichilnisky EJ. Spatially patterned electrical stimulation to enhance resolution of retinal prostheses. The Journal of Neuroscience. 2014;34:4871–4881. doi: 10.1523/JNEUROSCI.2882-13.2014.
Johnson LA, Wander JD, Sarma D, Su DK, Fetz EE, Ojemann JG. Direct electrical stimulation of the somatosensory cortex in humans using electrocorticography electrodes: a qualitative and quantitative report. Journal of Neural Engineering. 2013;10:036021. doi: 10.1088/1741-2560/10/3/036021.
Keat J, Reinagel P, Reid RC, Meister M. Predicting every spike: a model for the responses of visual neurons. Neuron. 2001;30:803–817. doi: 10.1016/s0896-6273(01)00322-1.
Kim YJ, Brackbill N, Batty E, Lee J, Mitelut C, Tong W, Chichilnisky EJ, Paninski L. Nonlinear decoding of natural images from large-scale primate retinal ganglion recordings. Neural Computation. 2021;33:1719–1750. doi: 10.1162/neco_a_01395.
Kling A, Gogliettino AR, Shah NP, Wu EG, Brackbill N, Sher A, Litke AM, Silva RA, Chichilnisky EJ. Functional organization of midget and parasol ganglion cells in the human retina. Neuroscience. 2020;01:e0762. doi: 10.1101/2020.08.07.240762.
Li PH, Gauthier JL, Schiff M, Sher A, Ahn D, Field GD, Greschner M, Callaway EM, Litke AM, Chichilnisky EJ. Anatomical identification of extracellularly recorded cells in large-scale multielectrode recordings. The Journal of Neuroscience. 2015;35:4663–4675. doi: 10.1523/JNEUROSCI.3675-14.2015.
Lieby P, Barnes N, McCarthy C, Dennett H, Walker JG, Botea V, Scott AF. Substituting Depth for Intensity and Real-Time Phosphene Rendering: Visual Navigation under Low Vision Conditions. Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2011. pp. 8017–8020.
Litke AM, Bezayiff N, Chichilnisky EJ, Cunningham W, Dabrowski W, Grillo AA, Grivich M, Grybos P, Hottowy P, Kachiguine S, Kalmar RS, Mathieson K, Petrusca D, Rahman M, Sher A. What does the eye tell the brain?: Development of a system for the large-scale recording of retinal output activity. IEEE Transactions on Nuclear Science. 2004;51:1434–1440. doi: 10.1109/TNS.2004.832706.
Lotlikar A, Shah NP, Gogliettino AR, Vilkhu R, Madugula S, Grosberg L, Hottowy P, Sher A, Litke A, Chichilnisky EJ, Mitra S. Partitioned Temporal Dithering for Efficient Epiretinal Electrical Stimulation. 11th International IEEE/EMBS Conference on Neural Engineering; 2023. pp. 1–5.
Loudin JD, Simanovskii DM, Vijayraghavan K, Sramek CK, Butterwick AF, Huie P, McLean GY, Palanker DV. Optoelectronic retinal prosthesis: system design and performance. Journal of Neural Engineering. 2007;4:S72–S84. doi: 10.1088/1741-2560/4/1/S09.
Madugula SS, Gogliettino AR, Zaidi M, Aggarwal G, Kling A, Shah NP, Brown JB, Vilkhu R, Hays MR, Nguyen H, Fan V, Wu EG, Hottowy P, Sher A, Litke AM, Silva RA, Chichilnisky EJ. Focal electrical stimulation of human retinal ganglion cells for vision restoration. Journal of Neural Engineering. 2022;19 doi: 10.1088/1741-2552/aca5b5.
McCarthy C, Barnes N, Lieby P. 33rd Annual International Conference. 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2011. pp. 4457–4460.
Mena GE, Grosberg LE, Madugula S, Hottowy P, Litke A, Cunningham J, Chichilnisky EJ, Paninski L. Electrical stimulus artifact cancellation and neural spike detection on large multi-electrode arrays. PLOS Computational Biology. 2017;13:e1005842. doi: 10.1371/journal.pcbi.1005842.
Merabet LB, Pascual-Leone A. Neural reorganization following sensory loss: the opportunity of change. Nature Reviews. Neuroscience. 2010;11:44–52. doi: 10.1038/nrn2758.
Palanker D, Le Mer Y, Mohand-Said S, Muqit M, Sahel JA. Photovoltaic restoration of central vision in atrophic age-related macular degeneration. Ophthalmology. 2020;127:1097–1104. doi: 10.1016/j.ophtha.2020.02.024.
Parthasarathy N, Batty E, Falcon W, Rutten T, Rajpal M, Chichilnisky EJ, Paninski L. Neural networks for efficient bayesian decoding of natural images from retinal neurons. Neuroscience. 2017;01:e3759. doi: 10.1101/153759.
Perge JA, Homer ML, Malik WQ, Cash S, Eskandar E, Friehs G, Donoghue JP, Hochberg LR. Intra-day signal instabilities affect decoding performance in an intracortical neural interface system. Journal of Neural Engineering. 2013;10:036004. doi: 10.1088/1741-2560/10/3/036004.
Ravi S, Ahn D, Greschner M, Chichilnisky EJ, Field GD. Pathway-specific asymmetries between on and off visual signals. The Journal of Neuroscience. 2018;38:9728–9740. doi: 10.1523/JNEUROSCI.2008-18.2018.
Reich DS, Victor JD, Knight BW, Ozaki T, Kaplan E. Response variability and timing precision of neuronal spike trains in vivo. Journal of Neurophysiology. 1997;77:2836–2841. doi: 10.1152/jn.1997.77.5.2836.
Relic L, Zhang B, Tuan YL, Beyeler M. Deep Learning–Based Perceptual Stimulus Encoder for Bionic Vision. arXiv. 2022 https://arxiv.org/abs/2203.05604
Rhoades CE, Shah NP, Manookin MB, Brackbill N, Kling A, Goetz G, Sher A, Litke AM, Chichilnisky EJ. Unusual physiological properties of smooth monostratified ganglion cell types in primate retina. Neuron. 2019;103:658–672. doi: 10.1016/j.neuron.2019.05.036.
Richard E, Goetz GA, Chichilnisky EJ. In Advances in Neural Information Processing Systems 28. Curran Associates, Inc; 2015. Recognizing Retinal Ganglion Cells in the Dark; pp. 2476–2484.
Rodieck RW. The First Steps in Seeing. Sinauer Associates Incorporated; 1998.
Rouger J, Lagleyre S, Fraysse B, Deneve S, Deguine O, Barone P. Evidence that cochlear-implanted deaf patients are better multisensory integrators. PNAS. 2007;104:7295–7300. doi: 10.1073/pnas.0609419104.
Salas A, Michelle LB, Kellis S, Jafari M, Jo H, Kramer D, Shanfield K. Proprioceptive and cutaneous sensations in humans elicited by intracortical microstimulation. eLife. 2018;07:e2904. doi: 10.7554/eLife.32904.
Samaha J, Postle BR. The speed of alpha-band oscillations predicts the temporal resolution of visual perception. Current Biology. 2015;25:2985–2990. doi: 10.1016/j.cub.2015.10.007.
Sekirnjak C, Jepson LH, Hottowy P, Sher A, Dabrowski W, Litke AM, Chichilnisky EJ. Changes in physiological properties of rat ganglion cells during retinal degeneration. Journal of Neurophysiology. 2011;105:2560–2571. doi: 10.1152/jn.01061.2010.
Soto F, Hsiang JC, Rajagopal R, Piggott K, Harocopos GJ, Couch SM, Custer P, Morgan JL, Kerschensteiner D. Efficient coding by midget and parasol ganglion cells in the human retina. Neuron. 2020;107:656–666. doi: 10.1016/j.neuron.2020.05.030.
Stingl K, Bartz-Schmidt KU, Besch D, Braun A, Bruckmann A, Gekeler F, Greppmaier U, Hipp S, Hörtdörfer G, Kernstock C, Koitschev A, Kusnyerik A, Sachs H, Schatz A, Stingl KT, Peters T, Wilhelm B, Zrenner E. Artificial vision with wirelessly powered subretinal electronic implant alpha-IMS. Proceedings. Biological Sciences. 2013;280:20130077. doi: 10.1098/rspb.2013.0077.
Tadin D, Lappin JS, Blake R, Glasser DM. High temporal precision for perceiving event offsets. Vision Research. 2010;50:1966–1971. doi: 10.1016/j.visres.2010.07.005.
Tafazoli S, MacDowell CJ, Che Z, Letai KC, Steinhardt CR, Buschman TJ. Learning to control the brain through adaptive closed-loop patterned stimulation. Journal of Neural Engineering. 2020;17:056007. doi: 10.1088/1741-2552/abb860.
Talaminos-Barroso A, Reina-Tosina J, Roa-Romero LM. In Control Applications for Biomedical Engineering Systems. Elsevier; 2020. Models Based on Cellular Automata for the Analysis of Biomedical Systems; pp. 405–445.
Tandon P, Bhaskhar N, Shah N, Madugula S, Grosberg L, Fan VH, Hottowy P, Sher A, Litke AM, Chichilnisky EJ, Mitra S. Automatic identification of axon bundle activation for epiretinal prosthesis. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2021;29:2496–2502. doi: 10.1109/TNSRE.2021.3128486.
Trenholm S, Awatramani GB. Origins of spontaneous activity in the degenerating retina. Frontiers in Cellular Neuroscience. 2015;9:277. doi: 10.3389/fncel.2015.00277.
Uzzell VJ, Chichilnisky EJ. Precision of spike trains in primate retinal ganglion cells. Journal of Neurophysiology. 2004;92:780–789. doi: 10.1152/jn.01171.2003.
Vasireddy PK, Gogliettino AR, Brown JB, Vilkhu RS, Madugula SS, Phillips AJ, Mitra S, Hottowy P, Sher A, Litke A, Shah NP, Chichilnisky EJ. Efficient Modeling and Calibration of Multi-Electrode Stimuli for Epiretinal Implants. 2023 11th International IEEE/EMBS Conference on Neural Engineering (NER); Baltimore, MD, USA. 2023.
Vergnieux V, Macé MJM, Jouffrais C. Simplification of visual rendering in simulated prosthetic vision facilitates navigation. Artificial Organs. 2017;41:852–861. doi: 10.1111/aor.12868.
Vilkhu RS, Madugula SS, Grosberg LE, Gogliettino AR, Hottowy P, Dabrowski W, Sher A, Litke AM, Mitra S, Chichilnisky EJ. Spatially patterned bi-electrode epiretinal stimulation for axon avoidance at cellular resolution. Journal of Neural Engineering. 2021;18 doi: 10.1088/1741-2552/ac3450.
Wandell BA. Foundations of Vision. Sinauer Associates, Incorporated; 1995.
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing. 2004;13:600–612. doi: 10.1109/tip.2003.819861.
Warland DK, Reinagel P, Meister M. Decoding visual information from a population of retinal ganglion cells. Journal of Neurophysiology. 1997;78:2336–2350. doi: 10.1152/jn.1997.78.5.2336.
Wässle H, Peichl L, Boycott BB. Dendritic territories of cat retinal ganglion cells. Nature. 1981;292:344–345. doi: 10.1038/292344a0.
Wu EG, Brackbill N, Sher A, Litke AM, Simoncelli EP, Chichilnisky EJ. Maximum a posteriori natural scene reconstruction from retinal ganglion cells with deep denoiser priors. Neuroscience. 2022;01:e2737. doi: 10.1101/2022.05.19.492737.
Wu EG, Brackbill N, Rhoades C, Kling A, Gogliettino AR, Shah NP, Sher A, Litke AM, Simoncelli EP, Chichilnisky EJ. Fixational Eye Movements Enhance the Precision of Visual Information Transmitted by the Primate Retina. bioRxiv. 2024 doi: 10.1101/2023.08.12.552902.
Wutz A, Muschter E, van Koningsbruggen MG, Weisz N, Melcher D. Temporal Integration Windows in Neural Processing and Perception Aligned to Saccadic Eye Movements. Current Biology. 2016;26:1659–1668. doi: 10.1016/j.cub.2016.04.070.
Yarbus AL. Eye Movements and Vision. Springer; 1967.
Zaidi M, Aggarwal G, Shah NP, Karniol-Tambour O, Goetz G, Madugula S, Gogliettino AR, Wu EG, Kling A, Brackbill N, Sher A, Litke AM, Chichilnisky EJ. Inferring Light Responses of Primate Retinal Ganglion Cells Using Intrinsic Electrical Signatures. J Neural Eng. 2022;20:e3858. doi: 10.1101/2022.05.29.493858.

eLife. doi: 10.7554/eLife.83424.sa0

Editor's evaluation

Michael Beyeler ¹

This valuable study proposes a new algorithm for determining the electrical stimulation delivered through a sensory-neural/retinal implant with the aim of improving the perceptual benefit to implant users. The evidence supporting the conclusions is solid, with additional experiments and analyses submitted during the revision having significantly strengthened the study. The work will be of interest to both neuroscientists and neuroengineers.

eLife. doi: 10.7554/eLife.83424.sa1

Decision letter

Editor: Michael Beyeler¹

Reviewed by: Mohit Shivdasani²

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Decision letter after peer review:

Thank you for submitting your article "Precise control of neural activity using temporally dithered and spatially multiplexed electrical stimulation" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Joshua Gold as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Mohit Shivdasani (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

Included here is a brief evaluation summary and list of revisions the reviewers and review editor deem essential for the authors to address. The public summaries and full, individual reviewers' recommendations for the authors are also appended below. The authors are advised to address the public summaries briefly, and the individual recommendations in a detailed, point-by-point manner.

As you will be able to read below, reviewers appreciated the importance of the study and its potentially broad interest. The approach to formulating the problem of choosing electrical stimuli for visual prostheses as a data-driven optimization problem holds promise for several sensory-neural prostheses. The writing was relatively clear, the figures appropriate, and the methods mostly rigorous. However, reviewers raised concerns with regard to some of the claims made, particularly as it pertains to the full greedy, dithering, multiplexed algorithm and its potential to greatly improve the quality of vision delivered by a retinal implant. The key points that need to be addressed can be summarized as follows:

1) Please provide more experimental data (specifically: reconstructed images and reconstruction errors) to substantiate the claim that the algorithm can improve the quality of vision in an ex vivo setting. The main evidence that is presented about the quality of vision that might be achieved is a computer simulation; that is, the image reconstructions and reconstruction errors given in Figures 2 & 3 with the dithered, but not multiplexed, version of the algorithm. However, the same cannot be said about the ex vivo experiment. While the outputs of the dithered and multiplexed version were indeed applied to ex vivo retina, the output is all simulated and no experimental validation data is presented. It is possible that the experimentally observed retinal output might differ from (the assumption of) a linear sum of dictionary elements. All reviewers agreed that if the authors could report on the results of the experiments in which algorithmic stimulation is applied to ex vivo retina, and report this in terms of image reconstructions and reconstruction error, this would greatly improve the strength of evidence.

2) Please expand the discussion on the theoretical assumptions regarding visual processing in the brain and the perception of phosphenes through electrical stimulation that the study is based on, as it may limit the translational impact of the work. All reviewers agreed that the study relies on several significant assumptions about neural coding in the retina, the visual brain, and interactions between electrodes, some of which have been recently challenged. This includes the assumption that neural coding in the retina is solely based on a firing-rate code, that visual perception is solely based on the number of spikes within a slow temporal integration window, and that non-simultaneous interleaved electrical stimulation does not lead to neural interactions. At a minimum, the authors should address these limitations clearly in their Discussion, and comment on the potential implications of the failure of these assumptions on their algorithm performance.

3) Please clarify which figures/results are from simulations and which are from experimental data.

Reviewer #1:

Shah et al. propose an algorithm to precisely control RGC activation using electrical stimulation, using temporal dithering, and spatial multiplexing. The main assumption is that the brain has perceptual integration windows, during which a visual percept can be built up by stimulating single (or small groups of) neurons in rapid succession. Which electrodes to stimulate to achieve a desired percept is dictated by a dictionary of stimulation patterns. The authors demonstrate the effectiveness of their method on ex vivo recordings of ON and OFF parasol cells.

The biggest strengths of the study are the theoretical contributions and the experimental recordings used to demonstrate the effectiveness of their algorithm. The thinking follows a number of recent efforts in the field to think about visual prosthetic stimulation as a closed-loop data-driven optimization problem. This may have benefits over other open-loop stimulation techniques.

However, the biggest weakness of the study is a reliance on a number of controversial assumptions about the neural code of vision. The first is the existence of a slow temporal integration window during which the brain cannot distinguish the order of stimuli presented and/or sums up RGC activity to decode the presented stimulus. The paper presents only limited (and dated) evidence for this. Second, the assumption of a Bernoulli distribution is at the very least limiting, as neurons may respond with multiple spikes to a stimulation pattern and some spatial features may be encoded by the relative timing of spikes across neurons. Third, even though the temporal dithering may avoid electrical crosstalk, there may still be neuronal crosstalk on longer timescales, thus challenging the independence assumption. Extending the delay between stimuli in order to avoid neuronal crosstalk may severely limit the utility of the proposed algorithm since it would cap the number of stimuli that could be delivered in one of the assumed temporal integration windows.

Even if the assumptions hold, the presented evidence of the ex vivo recordings would need to provide some additional detail before the practical utility of the proposed algorithm could be judged accordingly. Although the study reports reconstruction errors and example reconstructed images for the simulation experiment, the same cannot be said about the ex vivo experiment. In addition, real-world implementation of the algorithm would presumably include not just stimulus delivery but also in-the-loop stimulus decoding, and it is not clear how quickly that could be done. Lastly, the linear filters would have to be estimated in a degenerated retina, where one could not rely on responses to light stimuli. The paper notes that this could be done by considering the spontaneous activity of cells – I can see how that could allow you to distinguish between ON and OFF cells, for instance, but it is not clear to me how that would allow you to determine the linear filter for each cell. For those reasons, it is somewhat hard to judge the practical utility and potential significance of the proposed algorithm.

Recommendations for the authors:

Methods: The assumption of a Bernoulli distribution seems limiting, as neurons may respond with more than one spike. The decoding overall does not take into account that (at least some) visual information may be encoded in the relative timing of spikes across neurons.

p.7 algorithm section: The major underlying assumption here is that temporally dithered stimuli will be linearly integrated by the retina. Dithering may avoid electrical crosstalk, but neurons may exhibit "crosstalk" on longer timescales due to the relatively slow (as compared to electrical stimulation) temporal dynamics of ion channels. The authors seem to be aware of that as they say "presumably" and mention the idea of a perceptual integration window. But it would have been great to refer to some existing (and more recent) literature on the topic (if any).

p.8 algorithm section: Greedy algorithms are often not guaranteed to find the global optimum. Can the authors show that their proposed algorithm does not get stuck in local optima? How is the final optimization performed at time t, given the dictionary elements?
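To make the question about the per-step optimization concrete, the following is a minimal sketch of one possible greedy selection loop. It is an illustration under stated assumptions, not the authors' implementation: the function name `greedy_dither`, the toy decoder `A`, the toy `dictionary` of per-cell spike probabilities, and the stopping rule are all hypothetical. At each step it scores every dictionary element by expected squared reconstruction error (squared bias plus response variance for independent 0/1 responses) and keeps the best one; it omits the temporary exclusion of recently used elements that the restricted dictionary D_t implies.

```python
def greedy_dither(A, dictionary, target, n_steps):
    """Greedy selection sketch.

    A          : pixels x cells linear decoder (list of rows), hypothetical toy values
    dictionary : list of per-cell spike probabilities, one list per stimulus
    target     : target image as a flat list of pixel values
    """
    n_pix, n_cells = len(A), len(A[0])
    # Squared norm of each decoder column, used in the variance term.
    col_norm_sq = [sum(A[k][j] ** 2 for k in range(n_pix)) for j in range(n_cells)]
    recon = [0.0] * n_pix  # expected reconstruction accumulated so far
    chosen = []
    for _ in range(n_steps):
        current_err = sum((t - x) ** 2 for t, x in zip(target, recon))
        best = None
        for i, p in enumerate(dictionary):
            # Expected contribution of stimulus i to the reconstruction: A p
            mean = [sum(A[k][j] * p[j] for j in range(n_cells)) for k in range(n_pix)]
            bias = sum((t - x - m) ** 2 for t, x, m in zip(target, recon, mean))
            # Variance of A R for independent Bernoulli responses
            var = sum(c * q * (1 - q) for c, q in zip(col_norm_sq, p))
            cost = bias + var
            if best is None or cost < best[0]:
                best = (cost, i, mean)
        if best is None or best[0] >= current_err:
            break  # no stimulus is expected to reduce the error further
        chosen.append(best[1])
        recon = [x + m for x, m in zip(recon, best[2])]
    return chosen, recon


# Toy usage: two pixels, two cells, three candidate stimuli.
A = [[1.0, 0.0], [0.0, 1.0]]
dictionary = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
chosen, recon = greedy_dither(A, dictionary, [2.0, 1.0], n_steps=10)
```

Each step costs one pass over the dictionary, which is the O(N) search raised in these comments; because each choice is locally optimal only, the sketch gives no guarantee of reaching a global optimum.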

Ex vivo experiments: It would be helpful to see reconstruction errors and reconstructed images, similar to how they were presented for the simulation study. How long does the decoding/stimulus selection take? Presumably, the dictionary is quite large given the number of neurons. Is there a more efficient way to search the dictionary than O(N)? Or how is that done, and how quickly could it be done in a real-world implementation? I am concerned that this step may severely limit the number of stimuli that could be realistically delivered in a temporal integration window.

In all equations, it would help if the authors used proper formatting to label vectors vs. matrices. For example, is the stimulus reconstruction filter in Eq. 1 a vector and the product is elementwise? What is the size of (and what are the rows/columns in) D on page 8? Etc.

Reviewer #2:

This study proposes a new algorithm for determining the electrical stimulation delivered through a sensory neural implant with the aim of improving the perceptual benefit to implant users. The algorithm is evaluated using data from an ex vivo prototype of a retinal prosthesis to computer-simulate the retinal responses expected from applying the algorithm and later by applying stimuli from the full temporally dithered, spatially multiplexed algorithm to ex vivo retina.

Presently, stimulation algorithms used clinically are calibrated using limited perceptual data from the user. In contrast, the proposed algorithm uses detailed measurements of retinal responses to electrical stimulation to optimize the stimulation. This is achieved by minimizing the error between a target image and a version of that image reconstructed from the evoked response that is predicted by the algorithm based on the detailed measurements. The use of a data-driven, optimization approach is similar to several other recently proposed neural stimulation algorithms (which are not cited by the study). The distinguishing feature of the algorithm proposed in this study is that it seeks to stimulate in a way that minimizes the interactions between electrodes that can occur when stimulating neural responses. This avoids the need for the algorithm to account for such interactions.

Overall, the main advantage of the proposed approach is that it frames the problem of how to deliver perceptually beneficial electrical stimulation with an implant as a closed-loop/data-driven optimization problem. This has the potential to improve over presently used open-loop strategies. It then provides an algorithm for solving this optimization problem to a good approximation. Applying the algorithm using data recorded from ex vivo retina with a prototype implant is a strength. However, the evaluation of the efficacy of the algorithm is limited. In the first instance, it is limited to computer simulation of the retinal response for the version of the algorithm that uses just temporal dithering. While this analysis supports the conclusion that the proposed algorithm could provide improved visual perception relative to the clinical open-loop strategy, much stronger evidence would be provided by applying the optimal stimuli from the proposed algorithm directly to the ex vivo retinal preparation and measuring the retinal response. Such direct testing on the ex vivo retina is done for the full version of the algorithm that combines spatial multiplexing with temporal dithering. However, in contrast to the simulated results, the study does not report on the reconstructed images that result from applying the algorithm to ex vivo retina, nor on the reconstruction errors. This makes it difficult to evaluate the efficacy of the algorithm.

Section: Introduction.

The motivation for using a temporally dithered, spatially multiplexed algorithm to optimize stimulation stems from the desire to minimize the interactions caused by simultaneous stimulation of the electrodes in evoking a neural response. While this is an important strategy to investigate, the interactions are not typically as "complex" as claimed in the manuscript. Indeed, previous studies in several labs (including the Chichilnisky lab) show that these interactions can typically be described by a linear, weighted sum of the electrode currents followed by a simple static nonlinearity to predict the probability of spiking (a small minority of retinal ganglion cells require more complex nonlinear descriptions) [1-3]. This model and others have been the basis for alternative data-driven, closed-loop stimulation strategies that optimize the stimulation in a way that seeks to take advantage of the interactions between electrodes to improve the spatial resolution of evoked retinal activity through "current steering".

Section: Greedy temporal dithering to replicate neural code.

Data-driven optimization: The data required for the proposed algorithm is of two types. An exhaustive dictionary of response probabilities to single electrode stimulation across all current amplitudes, and a set of responses used to reconstruct the target image from the predicted response to electrical stimulation. For the latter, reconstruction of the image is achieved by applying linear filters to the predicted response. In the study, these linear filters were derived from cells' receptive fields, obtained by measured responses in the retina to light stimulation. It is noted that this would not be possible in a clinical implant, as the retina is degenerate. However, it is not clear how a set of filters would be obtained in this case. The authors mention that distinct cell types can be identified from spontaneous activity. However, this does not explain how receptive field size and location would be estimated in this situation.

The reconstruction of the image is achieved through linear filtering with a matrix A, with columns, A_j, that are the (scaled) receptive field filters (Eq. 1). However, this is only correct if the receptive field filters of the different cells are orthogonal, i.e. the inner product of each pair of receptive fields is zero. More generally, appropriate linear filtering should be performed by applying the pseudo inverse of the transpose of A. This is because the retinal spike rates are being approximated as the inner product of the receptive field and the image (A_j transposed, matrix-multiplied by the image vector), half-wave rectified. For the receptive fields of ON and OFF parasol cells given in the study, it appears that the receptive fields are approximately orthogonal for the two separate populations due to the non-overlapping tiling of the visual field by each population (e.g. Figure 2). However, it is not clear whether this situation would prevail in the blind retina, as the filters have not been specified in that case.
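The orthogonality point can be checked on a toy case. The sketch below is illustrative only and makes several simplifying assumptions that are not in the manuscript: two pixels and two cells, no rectification, no noise, and a square invertible A so that the pseudo-inverse of A^T reduces to an ordinary inverse. The helper names and the numerical values are hypothetical.

```python
def matvec(M, v):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def decode_both_ways(A, s):
    """Encode s with r = A^T s, then decode with A r and with pinv(A^T) r."""
    At = transpose(A)
    r = matvec(At, s)                        # r_j = A_j . s (rectification ignored)
    direct = matvec(A, r)                    # filtering with A itself
    via_pinv = matvec(inverse_2x2(At), r)    # pinv(A^T) equals inv(A^T) here
    return direct, via_pinv


s = [2.0, 1.0]                               # toy two-pixel image
overlapping = [[1.0, 0.5], [0.0, 1.0]]       # columns (1, 0) and (0.5, 1): not orthogonal
orthonormal = [[1.0, 0.0], [0.0, 1.0]]       # columns orthogonal with unit norm

direct_overlap, pinv_overlap = decode_both_ways(overlapping, s)
direct_ortho, pinv_ortho = decode_both_ways(orthonormal, s)
```

With overlapping filters the direct decoder returns a distorted image while the pseudo-inverse decoder recovers the input; with orthonormal filters the two decoders agree.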

The greedy optimization algorithm is insufficiently explained in the Methods, including the following points:

• A derivation justifying splitting the objective function into the terms due to the mean and variance is required.

• The terminology for the terms tr(var(A R_i)) is not clearly explained. I assume it is the matrix trace of the covariance matrix of the random variable A R_i.

• The assumption of a Bernoulli random variable for the response, i.e. 1 or 0 spikes, is limiting, given there may be multiple spikes in response to electrical stimulation, especially for activation via the retinal network.

• The expression that was derived for the term tr(var(A R_i)) in the case of Bernoulli random variables should be given.

• It is not explained how the algorithm performs the final optimization at time step t, given elements in a restricted dictionary D_t.
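For reference, the split into mean and variance terms requested above follows from the standard bias-variance decomposition. The block below is a sketch under assumed notation that does not appear in the manuscript excerpt: s is the target image, x is the reconstruction accumulated from earlier stimuli (treated as fixed), R_i is the random response vector to dictionary element i, and p_{ij} is the probability that cell j spikes. The closed form for the variance term additionally assumes that cells respond independently with 0 or 1 spikes.

```latex
% Cross term vanishes because E[R_i - E[R_i]] = 0 and s - x - A E[R_i] is deterministic.
\begin{align}
\mathbb{E}\,\lVert s - x - A R_i \rVert^2
  &= \lVert s - x - A\,\mathbb{E}[R_i] \rVert^2
     + \mathbb{E}\,\lVert A\,(R_i - \mathbb{E}[R_i]) \rVert^2 \\
  &= \lVert s - x - A\,\mathbb{E}[R_i] \rVert^2
     + \operatorname{tr}\!\big(A \operatorname{Cov}(R_i)\, A^{\top}\big), \\
\operatorname{tr}\!\big(\operatorname{var}(A R_i)\big)
  &= \sum_j \lVert A_j \rVert^2 \, p_{ij}\,(1 - p_{ij})
  \qquad \text{(independent Bernoulli responses)}.
\end{align}
```

Under these assumptions tr(var(A R_i)) is indeed the trace of the covariance matrix of A R_i, and it is largest for cells activated with probability near 0.5.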

Section: Greedy temporal dithering outperforms open loop methods.

The image reconstruction shown in Figure 2 uses 500, 3000, and 10000 electrical stimuli (shown in A, B and C respectively). However, these are unrealistically large numbers of stimuli: given the temporal perceptual window of 50 ms, mentioned in the Introduction as the time over which retinal responses would be perceptually integrated, and the pulse duration of 0.15 ms used in the study, a maximum of 333 stimuli could be applied during the window. Consequently, the use of 3000 and 10,000 electrical stimuli in the simulations provides unrealistic estimates of the degree to which the image can be reconstructed.

A full comparison of the proposed greedy, closed-loop algorithm to the conventional open-loop algorithm is difficult to evaluate based on the results presented. First, the number of electrical stimuli applied in making the comparison (Figure 3H) is not given. However, it seems likely, given the data in Figure 3G, that an unrealistically large 10,000 stimuli were used. If instead a realistic 300-400 stimuli were used, there may be little difference between the greedy, closed-loop algorithm and the conventional open-loop algorithm.

A second limitation is that, in this subsection, the greedy, closed-loop algorithm appears to have only been tested in simulation. E.g. "For random checkerboard visual stimulus targets, the greedy dithering stimulation sequence was calculated, neural responses were sampled using measured response probabilities evoked by the individual selected stimuli, and then the target image was linearly reconstructed from these responses." Given that all the relevant data required to run the algorithm for the ex vivo retina and implant prototype had been collected during the experiment, it is unclear why the algorithm was not applied to test it by directly measuring responses to the algorithm's stimulation. This would have tested a critical assumption of the greedy-temporal dithering algorithm: that the responses to successive stimuli are statistically independent. Instead, the simulation assumes this to be the case.

A third limitation is that the reconstructed image for the conventional open-loop algorithm does not resemble the phosphene images reported by most retinal implant users. Most implant users report predominantly bright, rather than dark, localized phosphenes [4]. The open-loop reconstruction shown in Figure 3d appears to be largely a gray averaging of light and dark phosphenes, likely due to the linear reconstruction method used.

Some details of the implementation of the open-loop strategy are unclear including:

• How the area that was "near" the electrode was selected when calculating the intensity of the visual stimulus.

• How the temporal sequence of the electrodes was chosen. It seems that the open-loop strategy is also likely temporally dithered, but without the benefit of data-driven optimization.

Section: Greedy temporal dithering is nearly optimal given the interface constraints.

The comparison of the greedy, closed-loop approximately optimal algorithm to truly optimal algorithms is an important comparison in principle. However, again it is not clear if a realistic number of stimulation pulses were used in performing this comparison (i.e. < 400).

Some details of the implementation of the optimal comparison strategy are unclear including:

• The meaning and purpose of the term V^T w in the objective function.

• Whether w>=0 was required after the integer requirement was relaxed in the optimization.

Section: Spatial multiplexing for fitting multiple stimuli in a visual integration window.

The idea to use spatial multiplexing of stimuli to overcome the limitation in the number of stimuli that can be delivered during a perceptual temporal window is a good idea to investigate. The aim is to choose stimuli on different electrodes that affect neural response independently. However, the initial formulation of what is meant by independence is not correct. This is stated as: "For independence to hold, the following condition must be met: if p1 is the activation probability of a given cell with stimulation on electrode 1, and p2 is the activation probability of the same cell with electrode 2, then the activation probability with simultaneous stimulation must be p1+p2." That this is incorrect can be seen because this formulation could give a probability greater than 1. However, the subsequent description of what is actually implemented appears correct. A general, in-principle way of describing what independence means is that if p1 is the probability of stimulating one cell with electrode 1 and p2 is the probability of stimulating a different cell with electrode 2, then the probability of stimulating both cell 1 and cell 2 using simultaneous stimulation with electrodes 1 and 2 is the product of those probabilities, p1.p2.
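The difference between the two formulations can be seen with a small numerical check. This is an illustrative sketch with hypothetical function names and probabilities; the last function (the probability that one cell is activated by at least one of two independent activation events) is an additional textbook identity included for contrast and is not taken from the manuscript or the review.

```python
def summed_probability(p1, p2):
    # The formulation criticized above: simply add the two probabilities.
    return p1 + p2


def both_cells_activated(p1, p2):
    # Independence across two different cells: joint probability is the product.
    return p1 * p2


def same_cell_activated_by_either(p1, p2):
    # Extra illustration (assumption): one cell, two independent activation events.
    return 1.0 - (1.0 - p1) * (1.0 - p2)


p1, p2 = 0.7, 0.6                              # hypothetical activation probabilities
total = summed_probability(p1, p2)             # exceeds 1, so it cannot be a probability
joint = both_cells_activated(p1, p2)           # about 0.42
either = same_cell_activated_by_either(p1, p2) # about 0.88
```

Only the last two quantities are guaranteed to stay within [0, 1] for every pair of input probabilities.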

In contrast to greedy dithering alone, the use of both greedy dithering and spatial multiplexing was tested in a closed-loop experiment by recording responses to stimuli produced by the algorithm. However, the paper does not report on the image reconstructions, nor the reconstruction errors that were obtained.

Instead, the reported results of the greedy dithering-plus-multiplexing (Figure 4) show only that it is possible to select eight multiplexed electrodes with sufficient separation to ensure minimal interference. This could potentially increase the number of electrodes stimulated with the greedy, closed-loop algorithm by a factor of 8, bringing it to around 2,700 stimuli. This is closer to the 3000 electrode stimulations used in Figure 2b that gave errors that approached the asymptotic limit. However, the results in Figure 4 were obtained using stimulation every 2 ms, not every 0.15 ms (= pulse duration). With this limitation, this reduces the number of electrode stimuli to 200 in a 50 ms perceptual window, which again is not likely to give a good reconstruction error according to the simulations.
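The stimulus budgets quoted in these comments follow from simple arithmetic, reproduced in the sketch below. The function name is hypothetical, durations are expressed in integer microseconds to avoid floating-point rounding, and the only inputs are the figures already cited: a 50 ms window, a 0.15 ms pulse, a 2 ms stimulation interval, and 8 multiplexed electrodes.

```python
def stimuli_per_window(window_us, interval_us, electrodes_per_step=1):
    """Number of electrode stimuli that fit in one integration window."""
    steps = window_us // interval_us       # whole stimulation steps in the window
    return steps * electrodes_per_step


WINDOW_US = 50_000                         # 50 ms perceptual integration window

back_to_back = stimuli_per_window(WINDOW_US, 150)            # 0.15 ms pulses, one electrode
multiplexed = stimuli_per_window(WINDOW_US, 150, 8)          # same timing, 8 electrodes
multiplexed_2ms = stimuli_per_window(WINDOW_US, 2_000, 8)    # one step every 2 ms, 8 electrodes

print(back_to_back, multiplexed, multiplexed_2ms)            # 333 2664 200
```

The value 2664 is the "around 2,700" figure mentioned above, and 200 is the budget when stimulation occurs every 2 ms.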

Other Results sections.

The sections on hardware constraints, naturalistic viewing conditions, and the use of perceptual similarity measures make useful observations about the potential benefits of the optimization framework for algorithmically determining the electrical stimulation.

Discussion.

The discussion covers many important points well. Regarding the translational potential, I would agree that an important point is "First, new surgical methods must be developed to implant a tiny chip on the surface of the retina with stable contact." But add that it must also be in extremely close contact for retinal ganglion cell spikes to be recorded. Further, a very high-density array (~ 60 μm pitch) and associated electronics for both stimulation and recording must be developed which is suitable in size, form factor, and power consumption for clinical use.

References

[1] Jepson, L. H., Hottowy, P., Mathieson, K., Gunning, D. E., Dąbrowski, W., Litke, A. M., & Chichilnisky, E. J. (2014). Spatially patterned electrical stimulation to enhance resolution of retinal prostheses. Journal of Neuroscience, 34(14), 4871-4881.

[2] Lorach, H., Goetz, G., Smith, R., Lei, X., Mandel, Y., Kamins, T., … & Palanker, D. (2015). Photovoltaic restoration of sight with high visual acuity. Nature Medicine, 21(5), 476-482.

[3] Maturana, M. I., Apollo, N. V., Hadjinicolaou, A. E., Garrett, D. J., Cloherty, S. L., Kameneva, T., … & Meffin, H. (2016). A simple and accurate model to predict responses to multi-electrode stimulation in the retina. PLoS Computational Biology, 12(4), e1004849.

[4] Humayun, M. S., Weiland, J. D., Fujii, G. Y., Greenberg, R., Williamson, R., Little, J., et al. (2003). Visual perception in a blind subject with a chronic microelectronic retinal prosthesis. Vision Research, 43, 2573-2581.

Recommendations for the authors:

Overall, it appears that the approach may offer some important benefits for sensory-neural implant users. However, the reporting of results is not sufficiently complete to draw strong conclusions about the potential benefits. In addition to the Public Review, I have some related suggestions below.

Reconstruction model in the blind retina.

• It would be helpful to provide more detail about how the image reconstruction would work in the blind retinas, beyond what is mentioned regarding the identification of ON and OFF retinal ganglion cell type. How would the size and location of receptive fields be estimated?

• The assumptions underlying the reconstruction model should be described, especially with respect to the orthogonality of the receptive field filters. It would be helpful to describe an approach in the Methods that does not rely on this assumption, as I describe in my public comments.

Greedy optimization algorithm: There are several aspects of this that could be better explained. These include:

• A derivation justifying splitting the objective function into the terms due to the mean and variance is required.

• The terminology for the terms tr(var(A R_i)) is not clearly explained. I assume it is the matrix trace of the covariance matrix of the random variable A R_i.

• The assumption of a Bernoulli random variable, i.e. 1 or 0 spikes, is limiting, given there may be multiple spikes in response to electrical stimulation, especially for activation via the retinal network.

• The expression derived for the term tr(var(A R_i)) in the case of Bernoulli random variables should be given.

• It is not explained how the algorithm performs the final optimization at time step t, given elements in a restricted dictionary D_t.

• It is not explained how to determine the time for which recently used dictionary elements are excluded from current use.

Section: Greedy temporal dithering outperforms open loop methods

Regarding the number of single-electrode stimuli used in image reconstruction, it would be better to place the numbers used in the context of what is possible in the perceptual time window. I would recommend using the value of 333 instead of 500, as this corresponds to the number of 0.15 ms pulses that could be fit into a 50 ms window. The value of 3000 roughly corresponds to what might be achieved with spatial multiplexing. The value of 10,000 corresponds to the upper limit that is achievable through this algorithm.

I think it would be beneficial to make it clearer that the results in Figure 3 are simulated. It would also strengthen the study to perform validation in ex vivo retina to apply the greedy temporal dithering stimuli to the retina and reconstruct the image from the responses. If there is a good reason not to do this, this should be explained.

It would improve the study if a reconstruction algorithm that provides an image with a better match to the perception of phosphenes by retinal implant users was used. If this cannot be done, it should be discussed as a limitation of the study.

It would be helpful to clarify some details of the implementation of the open-loop strategy including:

• How the area that was "near" the electrode was selected when calculating the intensity of the visual stimulus.

• How the temporal sequence of the electrodes was chosen.

Section: Greedy temporal dithering is nearly optimal given the interface constraints

A realistic number of stimulation pulses should be used in performing this comparison e.g. < 400 for the pure temporal dithering or < 3000 for the spatially multiplexed, temporal dithering.

It would be helpful to clarify some details of the implementation of the optimal comparison strategy including:

• The meaning and purpose of the term V^T.w in the objective function.

• Whether w>=0 was required after the integer requirement was relaxed in the optimization.

Section: Spatial multiplexing for fitting multiple stimuli in a visual integration window

As described in my public review, the description of independence is not correct. I have suggested an alternative description that I believe accords with what was actually implemented.

It was surprising that the results of the validation experiments on ex vivo retina with the spatially multiplexed, temporally dithered algorithm were not reported more thoroughly. It is important to provide figures showing the image reconstruction that was achieved and the statistics for the reconstruction error.

Reviewer #3:

In this study, Shah and colleagues propose an interesting solution to the non-linear interactions caused by simultaneously stimulating multiple electrodes within a retinal implant. Through high-resolution recordings of ON and OFF parasol retinal ganglion cells, the authors demonstrate that a greedy dithering and spatially multiplexed algorithm, which can also work in the presence of saccadic eye movements, is able to faithfully reconstruct images represented by total numbers of spikes in a given time window across multiple retinal ganglion cells. Essentially, Shah and colleagues propose and demonstrate a method to only stimulate single or groups of 8 electrodes at a time from a pre-established dictionary, but then interleave stimulation of multiple electrodes or groups rapidly across the dictionary to additively build an image. Through their very rigorous and elegant ex vivo recordings in 180 ON and OFF parasol cells across four primate retina preparations, the authors compellingly demonstrate that (i) their greedy algorithm performs better than an open loop algorithm, similar to an optimal algorithm considering the interface constraints, and close but not equal to an ideal control using only a single-electrode dictionary; (ii) that groups of electrodes can be simultaneously activated with a high-resolution neural interface without any retinal interactions provided that they are at least 160 μm apart; (iii) that the algorithm performs just as well even with only 50% of the electrodes on the interface and (iv) that the algorithm can work in the presence of saccadic eye movements and performs better when both saccadic and fixational eye movements are made as opposed to saccadic movements alone.

The experimental recordings and performance of the algorithm in various conditions are the biggest strengths of this study and the authors certainly demonstrate that their algorithm can reproduce spiking numbers across an array of cells that resemble closely spiking numbers evoked by visual stimuli for these conditions. In other words, the authors' primary claim that the neural code for visual images in the retina (in the form of spiking numbers) can be faithfully reproduced with electrical stimulation using such an algorithm, is well supported by evidence.

A major weakness in the study however is the reliance of this algorithm on several significant assumptions about neural coding in the retina, neural coding in the visual brain, and interactions between electrodes even with non-simultaneous stimulation. Some of these assumptions have already been highly challenged in several studies in the visual neuroscience field and in studies involving the perception of phosphenes with interleaved stimulation of single electrodes. Therefore, in light of what is currently known about visual encoding and artificial vision, the study whilst showcasing an elegant computational tool perhaps provides only little hope that such an algorithm will actually work in practice to recreate the perception of images with electrical stimulation but instead does lay a foundation for further work to be done with the assessment of future algorithms. The main assumptions that the authors rely on include:

1) That neural coding in the retina is simply based on a number of spikes evoked by populations of cells ignoring any temporal patterns of responses. A plethora of studies has indicated that relative spike timing between groups of retinal ganglion cells for example can encode complex visual features but the greedy algorithm does not aim to mimic these spike timing features.

2) That perception within the brain is solely based on a number of spikes within a slow temporal integration window (the authors cite a 1995 reference for this). Since 1995 though, this has also been challenged, therefore extending the authors' claims of reproducing spike numbers in the retina to reproducing perception in the brain would be contentious.

3) That neural interactions with non-simultaneous interleaved electrical stimulation are absent. There is in silico, electrophysiological and perceptual evidence with retinal implants that interleaving of electrodes still results in neural interactions and that perception with interleaved stimulation with multiple electrodes does not result in a linear summed perception of phosphenes evoked by single electrodes i.e. dictionary elements. Therefore, the algorithm would only work if such interactions are minimal or absent, for example with larger than 0.1 ms intervals between stimulations or more than 160 μm electrode separation. Note, interactions with interleaving also exist with cochlear implants as the current spread is large.

4) That even if the above 3 assumptions were applied and true, the algorithm can faithfully extrapolate to reconstruct moving images at 24 per second. This seems unlikely, as presumably the total time required to linearly reconstruct a single static image would extend to many tens or even hundreds of ms, given that the number of times each dictionary element needs to be accessed to reproduce similar spike counts between visual and electrical stimulation runs in the thousands.

In spite of major reliance on these assumptions, the authors do demonstrate a very useful tool in the form of the greedy algorithm for situations perhaps other than the visual system, where perception with artificial stimulation may be more predictable and interactions with non-simultaneous stimulation may be simpler.

Recommendations for the authors:

It may be possible to address at least some of the limitations in particular (1) and (4) mentioned in the public review. For limitation (1), the authors could try and experiment with their algorithm and reanalyse data to examine if and how well spike timing features (perhaps relative first spike latencies between RGCs or other temporal patterns of spikes) are reproducible. For limitation (4) the authors could at least perform calculations of time taken by the algorithm in each of the situations and targets presented, to examine if these times are realistic.

For limitations (2) and (3), the authors at a minimum should address these clearly in their discussion and the potential implications of the failure of these assumptions on their algorithm performance.

Another thing the authors should consider is including, as a figure, some example raw data from their retinas before and after artifact subtraction, in response to both visual targets and their greedy algorithm.

eLife. 2024 Nov 7;13:e83424. doi: 10.7554/eLife.83424.sa2

Author response

Essential revisions:

Included here is a brief evaluation summary and list of revisions the reviewers and review editor deem essential for the authors to address. The public summaries and full, individual reviewers' recommendations for the authors are also appended below. The authors are advised to address the public summaries briefly, and the individual recommendations in a detailed, point-by-point manner.

As you will be able to read below, reviewers appreciated the importance of the study and its potentially broad interest. The approach to formulating the problem of choosing electrical stimuli for visual prostheses as a data-driven optimization problem holds promise for several sensory-neural prostheses. The writing was relatively clear, the figures appropriate, and the methods mostly rigorous. However, reviewers raised concerns with regard to some of the claims made, particularly as it pertains to the full greedy, dithering, multiplexed algorithm and its potential to greatly improve the quality of vision delivered by a retinal implant. The key points that need to be addressed can be summarized as follows:

1) Please provide more experimental data (specifically: reconstructed images and reconstruction errors) to substantiate the claim that the algorithm can improve the quality of vision in an ex vivo setting. The main evidence that is presented about the quality of vision that might be achieved is a computer simulation; that is, the image reconstructions and reconstruction errors given in Figures 2 & 3 with the dithered, but not multiplexed version of the algorithm. However, the same cannot be said about the ex vivo experiment. While the outputs of the dithered & multiplex version were indeed applied to ex vivo retina, the output is all simulated and no experimental validation data is presented. However, it is possible that the experimentally observed retinal output might differ from (the assumption of) a linear sum of dictionary elements. All reviewers agreed that if the authors could report on the results of the experiments in which algorithmic stimulation is applied to ex vivo retina, and report this in terms of image reconstructions and reconstruction error, this would greatly improve the strength of evidence.

We agree that this was the main missing element in the submitted manuscript and have spent recent months addressing this issue directly with experiments. The new results are given in the new section Experimental validation of greedy temporal dithering accompanied by the new Figure 4. In this analysis, we apply greedy dithered sequences to the retina ex vivo and directly compute image reconstructions and errors from the measured, evoked neural responses. We also compare these results to our previous approach of using measured responses from the electrical stimulus calibration phase of the experiment, and show that the experimental results align with these expectations. Due to pandemic-era limitations on the availability of monkey retinas, we performed the new experimental validation in the rat retina, and have thus added sections corresponding to this analysis to Methods. We think the addition of these experiments has substantially increased the impact of the paper and are grateful to the reviewers for highlighting its importance. We have also outlined the limitations of this experimental validation in the Discussion.

2) Please expand the discussion on the theoretical assumptions regarding visual processing in the brain and the perception of phosphenes through electrical stimulation that the study is based on, as it may limit the translational impact of the work. All reviewers agreed that the study relies on several significant assumptions about neural coding in the retina, the visual brain, and interactions between electrodes, some of which have been recently challenged. This includes the assumption that neural coding in the retina is solely based on a firing-rate code, that visual perception is solely based on the number of spikes within a slow temporal integration window, and that non-simultaneous interleaved electrical stimulation does not lead to neural interactions. At a minimum, the authors should address these limitations clearly in their Discussion, and comment on the potential implications of the failure of these assumptions on their algorithm performance.

We appreciate the reviewers’ focus on what is assumed/tested in our approach, something we have attempted to be very up-front about. As requested, we have provided additional clarification of our assumptions regarding neural coding in the visual brain and electrical stimulation in the Discussion. Here, we summarize the rationale for the three specific assumptions raised by the reviewers, paralleling the new Discussion text:

Firing rate code in the retina: We agree that in addition to firing rate, other features of the retinal code such as relative latency have been shown to carry information about the visual stimulus in some conditions and species (e.g. (Gollisch & Meister, 2008)). However, firing rate is thought to be the dominant feature of the neural code in the primate visual system (Shadlen & Newsome, 1994), so we made a first-order approximation to focus on it in this work. The importance of the firing rate code is also supported by recent work in which we found that macaque RGC spike counts integrated over ~150 ms provide greater image reconstruction accuracy than spike latencies (Brackbill et al., 2020). Based on this work, we think that while our approach may not capture all of the information normally present in RGC visual signals, it likely captures a large fraction of what is useful for vision. However, we agree that features of RGC responses other than firing rate could be important in certain circumstances (e.g. (Meister, 1996)), and have clarified this in the Discussion.

Visual perception based on slow temporal integration: We appreciate that the reviewers raised this important and subtle point, which has valid arguments on both sides. We try to describe our perspective on it more fully here. There is ample evidence that visual signals in the brain are integrated over tens of milliseconds to produce perception. This evidence ranges from flicker fusion experiments to the time scale of synaptic signal transfer to neurophysiological tests of temporal integration (Borghuis et al., 2019; Samaha & Postle, 2015; Tadin et al., 2010; Wutz et al., 2016). It is also consistent with the widespread use of display technology with ~60-100 Hz refresh rates. This known coarse temporal resolution of vision likely arises from long time constants in phototransduction and synaptic transfer. However, in principle, signals transmitted by RGCs to the brain could have finer temporal precision than signals in the photoreceptors, and visual centers in the brain could be sensitive to the precise timing of RGC spikes (as described in modeling studies, e.g. (Gütig et al., 2013)). Indeed, there is some empirical evidence that the temporal precision of spikes in RGCs can, in certain conditions, be on the order of 1 ms (Berry et al., 1997; Berry & Meister, 1998; Keat et al., 2001; Reich et al., 1997; Uzzell & Chichilnisky, 2004). Furthermore, the idea that downstream mechanisms in the brain could “read out” RGC signals with millisecond temporal precision is supported to a limited degree by empirical studies of precisely correlated activity (e.g. (Alonso et al., 1996)). However, it is unclear from these studies exactly how this high temporal precision would be useful for vision. We have performed several in-depth studies of the temporal resolution of readout of RGC signals from the macaque retina. 
This work has shown that for reconstruction of images from flashed stimuli, and for speed/direction discrimination with moving stimuli, the optimal temporal resolution of primate RGC signal readout is ~10 ms or coarser (Brackbill et al., 2020; Chichilnisky & Kalmar, 2003; Frechette et al., 2005; Wu et al., 2023). On the other hand, high temporal precision could be part of the neural code of RGCs in other specific visual stimulus conditions. For example, we recently found that image reconstruction from macaque RGC spikes in the presence of fixational eye drift is sensitive to spike timing in the 2-5 ms range (Wu et al., 2023). In sum, much evidence leans toward the idea that the temporal resolution of RGC signal readout in the brain is likely to be on the order of tens of milliseconds for many visual tasks. However, this may not be true for all conditions. We have clarified this in the Discussion.

Neural interactions for non-simultaneous interleaved electrical stimulation: While such interactions have been observed in previous studies of electrical stimulation (Ho et al., 2020; Sekhar et al., 2020; Yoon et al., 2020), the current work relies on very low current levels (<4 µA) delivered in brief pulses (150 µsec) that typically produce a single spike, directly evoked by membrane depolarization, with a latency of a few milliseconds (Sekirnjak et al., 2006, 2008) and no network-mediated activation. This suggests that the primary temporal interactions are likely to be the relative refractory period of neurons (~10 ms). This interpretation is supported by the findings in our new experimental validation. We have clarified these considerations in the Discussion.

3) Please clarify which figures/results are from simulations and which are from experimental data.

None of the figures in the paper are based on simulations of retinal responses – all figures use real recorded data. Importantly, however, Figures 2, 3, 6, 7, and 8 use calibrated responses to single-electrode stimulation to compute the reconstruction that is possible if dithered and multiplexed stimulation is used — the data are from real recordings, but the independence and stability of evoked responses is assumed, and samples are drawn from the measured spike probabilities to compute the reconstruction quality. To test whether these assumptions are valid, the new manuscript also includes closed-loop experiments (Figures 4 and 5) which validate the entire calibration-stimulation pipeline. We have added clarifying text throughout the Results and also summarized this information in the Discussion.
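To make the distinction from a simulation of retinal responses concrete, the sampling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the paper: `D` is assumed to hold the calibrated spike probability of each cell for each dictionary element, `A` is an assumed linear reconstruction filter, and the function name `sample_reconstruction` is hypothetical. Responses are treated as independent Bernoulli draws, which is exactly the independence and stability assumption mentioned above.

```python
import numpy as np


def sample_reconstruction(sequence, D, A, rng):
    """Sample spikes for a stimulation sequence and reconstruct linearly.

    sequence : indices of the dictionary elements delivered, in order
    D        : (n_elements, n_cells) calibrated spike probabilities
    A        : (n_pixels, n_cells) linear reconstruction filter (assumed)
    rng      : numpy random Generator
    """
    probs = D[sequence]                         # (n_stimuli, n_cells)
    spikes = rng.random(probs.shape) < probs    # independent Bernoulli draws
    counts = spikes.sum(axis=0)                 # spikes per cell in the window
    return A @ counts, counts


rng = np.random.default_rng(0)
D = np.array([[1.0, 0.0],
              [0.0, 1.0]])
A = 2.0 * np.eye(2)
image, counts = sample_reconstruction([0, 0, 1], D, A, rng)
```

Repeating the draw many times for a fixed sequence gives a distribution of reconstructions, from which an average error can be estimated.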

Reviewer #1:

1.1) Shah et al. propose an algorithm to precisely control RGC activation using electrical stimulation, using temporal dithering, and spatial multiplexing. The main assumption is that the brain has perceptual integration windows, during which a visual percept can be built up by stimulating single (or small groups of) neurons in rapid succession. Which electrodes to stimulate to achieve a desired percept is dictated by a dictionary of stimulation patterns. The authors demonstrate the effectiveness of their method on ex vivo recordings of ON and OFF parasol cells.

The biggest strengths of the study are the theoretical contributions and the experimental recordings used to demonstrate the effectiveness of their algorithm. The thinking follows a number of recent efforts in the field to think about visual prosthetic stimulation as a closed-loop data-driven optimization problem. This may have benefits over other open-loop stimulation techniques.

Thank you for recognizing the strengths of the study.

1.2) However, the biggest weakness of the study is a reliance on a number of controversial assumptions about the neural code of vision. The first is the existence of a slow temporal integration window during which the brain cannot distinguish the order of stimuli presented and/or sums up RGC activity to decode the presented stimulus. The paper presents only limited (and dated) evidence for this.

We agree that this was limited in the original submission, and have clarified this important issue in our response to Essential Revisions 2 paragraph 3 above as well as in changes to the Discussion.

1.3) Second, the assumption of a Bernoulli distribution is at the very least limiting, as neurons may respond with multiple spikes to a stimulation pattern and some spatial features may be encoded by the relative timing of spikes across neurons.

Please see our response to Recommendations for Authors 1.8 below.

1.4) Third, even though the temporal dithering may avoid electrical crosstalk, there may still be neuronal crosstalk on longer timescales, thus challenging the independence assumption. Extending the delay between stimuli in order to avoid neuronal crosstalk may severely limit the utility of the proposed algorithm since it would cap the number of stimuli that could be delivered in one of the assumed temporal integration windows.

Please see our response to Recommendations for Authors 1.9 below.

1.5) Even if the assumptions hold, the presented evidence of the ex vivo recordings would need to provide some additional detail before the practical utility of the proposed algorithm could be judged accordingly. Although the study reports reconstruction errors and example reconstructed images for the simulation experiment, the same cannot be said about the ex vivo experiment.

We have now performed the key closed-loop validation experiment and provided the results in the paper. Please see our response to Essential Revisions 1 above and the changes to the manuscript.

1.6) In addition, real-world implementation of the algorithm would presumably include not just stimulus delivery but also in-the-loop stimulus decoding, and it is not clear how quickly that could be done.

While the real-time, in-the-loop stimulus decoding (which would use the actual evoked spikes rather than the probability of an evoked spike) could improve performance, it is computationally prohibitive as the reviewer points out. The approach in this paper does not require real-time, in-the-loop stimulus decoding, but instead uses the expected stimulus decoding based on a fixed set of calibration measurements. We show that inter-trial variability in total number of spikes is minimal, so that expected spike decoding is sufficient. We have now clarified this in the Discussion.

1.7) Lastly, the linear filters would have to be estimated in a degenerated retina, where one could not rely on responses to light stimuli. The paper notes that this could be done by considering the spontaneous activity of cells – I can see how that could allow you to distinguish between ON and OFF cells, for instance, but it is not clear to me how that would allow you to determine the linear filter for each cell. For those reasons, it is somewhat hard to judge the practical utility and potential significance of the proposed algorithm.

These are excellent points. A recent paper from our group (Zaidi et al., 2023) addresses these issues directly. In that paper, the firing rate and autocorrelation function are first used to classify ON and OFF parasol cell types, as the reviewer suggests. Then, the average linear spatiotemporal filter for each cell type is translated in space to an estimated receptive field location for each recorded cell using its electrical image, which we have shown provides an accurate estimate of its physical location (Li et al., 2015). This procedure was evaluated quantitatively in the aforementioned publication and shown to work well, accurately reproducing the actual measured linear filters of the cells using the autocorrelation and electrical image location. Future work will be needed for additional cell types, including additional electrical features that we measure routinely which can be used to identify more cell types, but the overall approach is expected to be similar. We have clarified this important issue in the Discussion.

Recommendations for the authors:

1.8) Methods: The assumption of a Bernoulli distribution seems limiting, as neurons may respond with more than one spike. The decoding overall does not take into account that (at least some) visual information may be encoded in the relative timing of spikes across neurons.

With the low amplitude electrical stimulation that we use (150µs long pulses with peak current amplitude 4µA), we typically observe zero or one directly-evoked spikes for each stimulation pulse (see (Sekirnjak et al., 2006)). This is in part because the pulses are short and the mechanism of activation is direct depolarization (Sekirnjak et al., 2006), rather than a network-mediated excitation. Under these conditions, the Bernoulli assumption is reasonable. We agree that our approach would not translate to other electrical stimulation patterns, such as pulse trains or high current levels or network-mediated activation, which could elicit many spikes. We note that, unlike the present approach, those stimulation paradigms make it difficult/impossible to replicate the neural code, and that evoking one spike at a time is therefore a singular advantage of our approach. We have clarified this in the Discussion.

Please see our response to Essential Revisions 2 paragraph 2 for a discussion of relative spike timing. We also note that with the very high temporal precision of our stimulation (evoked spike time variation of roughly 0.1 ms), if there were some degree of stimulus coding in the relative timing of spikes, that relative timing could certainly be reproduced by the stimulation sequences that we provide, with a suitable modification of the optimization approach.

1.9) p.7 algorithm section: The major underlying assumption here is that temporally dithered stimuli will be linearly integrated by the retina. Dithering may avoid electrical crosstalk, but neurons may exhibit "crosstalk" on longer timescales due to the relatively slow (as compared to electrical stimulation) temporal dynamics of ion channels. The authors seem to be aware of that as they say "presumably" and mention the idea of a perceptual integration window. But it would have been great to refer to some existing (and more recent) literature on the topic (if any).

To clarify, the assumption is not that temporally dithered stimuli are linearly integrated by the retina. The assumption is that temporally dithered evoked spikes are linearly integrated by the visual system downstream of the retina, on time scales of tens of ms. The evidence for this is discussed in our response to Essential Revision 2 paragraph 3. In short, while this assumption is not necessarily correct in all conditions, and testing it thoroughly will require an implanted device that does not yet exist, there is ample evidence to support this assumption in many stimulus conditions.

We agree that neuronal “crosstalk” on longer timescales than the temporal dithering is a possibility. Please see our response to Essential Revisions (2) paragraph 4. To avoid the primary temporal interactions due to the relative refractory period of neurons, the stimuli at each timestep are chosen from a ‘valid’ subset of the dictionary that disallows stimulation of any recently targeted cell within its relative refractory period. Our new validation experiments directly test the possibility of crosstalk in the retina ex vivo, and the results are encouraging (Figures 4 and 5). We have now highlighted the importance of independent responses for the temporal dithering approach in the Assumptions subsection of the Discussion.
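The restriction to a 'valid' subset of the dictionary can be illustrated with a simplified sketch. Everything named here is an assumption made for illustration: the function `greedy_dither`, the threshold `p_min` used to decide which cells an element "targets", the refractory window measured in time steps (`refractory_steps`), and the cost, which is the expected squared reconstruction error under independent Bernoulli responses. It is not claimed to match Equation 5 of the Methods term by term.

```python
import numpy as np


def greedy_dither(target, A, D, n_steps, refractory_steps, p_min=0.05):
    """Greedy selection of dictionary elements with a refractory exclusion.

    target : (n_pixels,) target image
    A      : (n_pixels, n_cells) linear reconstruction filter (assumed)
    D      : (n_elements, n_cells) calibrated spike probabilities
    Returns the chosen element indices (-1 = no valid element) and the
    residual between the target and the expected reconstruction.
    """
    n_elems, n_cells = D.shape
    # Variance added by each element: sum_c p(1-p) * ||A[:, c]||^2
    var_cost = (D * (1.0 - D)) @ np.sum(A ** 2, axis=0)
    targeted = D > p_min                       # cells each element may activate
    last_stim = np.full(n_cells, -np.inf)      # last step each cell was targeted
    residual = np.asarray(target, dtype=float).copy()
    chosen = []
    for t in range(n_steps):
        refractory = (t - last_stim) <= refractory_steps
        # An element is valid only if it targets no refractory cell
        valid = ~(targeted & refractory).any(axis=1)
        if not valid.any():
            chosen.append(-1)
            continue
        new_res = residual[None, :] - D @ A.T  # residual after each candidate
        cost = np.sum(new_res ** 2, axis=1) + var_cost
        cost[~valid] = np.inf
        k = int(np.argmin(cost))
        chosen.append(k)
        residual = residual - A @ D[k]
        last_stim[targeted[k]] = t
    return chosen, residual


A = np.eye(2)
D = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
chosen, residual = greedy_dither(np.array([2.0, 1.0]), A, D,
                                 n_steps=3, refractory_steps=1)
```

In this toy example the first element is the best choice at every step, but the exclusion forces the sequence to alternate between elements, which is the intended effect of the valid subset.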

1.10) p.8 algorithm section: Greedy algorithms are often not guaranteed to find the global optimum. Can the authors show that their proposed algorithm does not get stuck in local optima? How is the final optimization performed at time t, given the dictionary elements?

While we do not show that the greedy algorithm does not get stuck in local optima analytically, we do show empirically that the gap between lower bound on the optimal solution and greedy solution is small (Figure 3, histograms colored orange and blue). This finding indicates that even if the greedy algorithm does sometimes get stuck in local optima, the degradation in performance is insignificant compared to the benefits in reconstruction performance that the algorithm provides.

The formula for the optimization performed at time t over the available set of dictionary elements is given as Equation 5 in the Methods.

1.11) Ex vivo experiments: It would be helpful to see reconstruction errors and reconstructed images, similar to how they were presented for the simulation study.

We agree that direct closed-loop validation with the temporally dithered stimulation is important and have now performed these experiments. Please see our response to Essential Revision (1). In brief, the temporally dithered stimulation conveys very substantial image structure, as predicted using the measurements at the start of the experiment.

1.12) How long does the decoding/stimulus selection take? Presumably, the dictionary is quite large given the number of neurons. Is there a more efficient way to search the dictionary than O(N)? Or how is that done, and how quickly could it be done in a real-world implementation? I am concerned that this step may severely limit the number of stimuli that could be realistically delivered in a temporal integration window.

These are important considerations. The dictionary can indeed potentially be large (the number of elements is equal to the number of stimulation patterns tested), and searching this dictionary could thus be a computationally prohibitive step. We have recently addressed this problem in our group (Lotlikar et al., 2023) using the insight that the greedy search can be decomposed into multiple smaller searches, because far-away electrodes stimulate a disjoint collection of cells. This approach gives a drastic increase in the speed of the algorithm. Other engineering insights (such as finding the right embedded processor, building custom chips, etc.) can further increase the speed. We have decided to not focus on engineering implementation for this conceptual/theoretical paper because it is already fairly long.
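One possible way to picture the decomposition is to group dictionary elements whose targeted cells do not overlap, so that each group can be searched independently. The sketch below is an assumption-laden illustration and is not taken from Lotlikar et al. (2023): the function name `disjoint_groups` and the threshold `p_min` are hypothetical, and the grouping uses a simple union-find over cells.

```python
import numpy as np


def disjoint_groups(D, p_min=0.05):
    """Partition dictionary elements into groups that share no cells.

    D : (n_elements, n_cells) calibrated spike probabilities
    Elements that activate no cell above p_min are left out.
    """
    targeted = D > p_min
    n_elems, n_cells = targeted.shape
    parent = list(range(n_cells))

    def find(c):
        # Root lookup with path halving
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Cells activated by the same element belong to the same group
    for k in range(n_elems):
        cells = np.flatnonzero(targeted[k])
        for c in cells[1:]:
            parent[find(int(c))] = find(int(cells[0]))

    groups = {}
    for k in range(n_elems):
        cells = np.flatnonzero(targeted[k])
        if cells.size == 0:
            continue
        groups.setdefault(find(int(cells[0])), []).append(k)
    return sorted(groups.values())


D = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])
groups = disjoint_groups(D)
```

Because elements in different groups affect disjoint sets of cells, a search over the full dictionary can in principle be replaced by separate, smaller searches over each group.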

1.13) In all equations, it would help if the authors used proper formatting to label vectors vs. matrices. For example, is the stimulus reconstruction filter in Eq. 1 a vector and the product is elementwise? What is the size of (and what are the rows/columns in) D on page 8? Etc.

Thank you for the suggestion. We have clarified vector and matrix dimensions and fixed the equation formatting in the Results.

Reviewer #2:

2.1) This study proposes a new algorithm for determining the electrical stimulation delivered through a sensory neural implant with the aim of improving the perceptual benefit to implant users. The algorithm is evaluated using data from an ex vivo prototype of a retinal prosthesis to computer-simulate the retinal responses expected from applying the algorithm and later by applying stimuli from the full temporally dithered, spatially multiplexed algorithm to ex vivo retina.

Presently, stimulation algorithms used clinically are calibrated using limited perceptual data from the user. In contrast, the proposed algorithm uses detailed measurements of retinal responses to electrical stimulation to optimize the stimulation. This is achieved by minimizing the error between a target image and a version of that image reconstructed from the evoked response that is predicted by the algorithm based on the detailed measurements. The use of a data-driven, optimization approach is similar to several other recently proposed neural stimulation algorithms (which are not cited by the study). The distinguishing feature of the algorithm proposed in this study is that it seeks to stimulate in a way that minimizes the interactions between electrodes that can occur when stimulating neural responses. This avoids the need for the algorithm to account for such interactions.

Overall, the main advantage of the proposed approach is that it frames the problem of how to deliver perceptually beneficial electrical stimulation with an implant as a closed-loop/data-driven optimization problem. This has the potential to improve over presently used open-loop strategies. It then provides an algorithm for solving this optimization problem to a good approximation. Applying the algorithm using data recorded from ex vivo retina with a prototype implant is a strength.

Thank you for summarizing the work and recognizing its strengths. We have referenced additional data-driven optimization approaches for neural stimulation in the manuscript (Choi et al., 2016; Haji Ghaffari et al., 2021; Tafazoli et al., 2020; Vasireddy et al., 2023).

2.2) However, the evaluation of the efficacy of the algorithm is limited. In the first instance, it is limited to computer simulation of the retinal response for the version of the algorithm that uses just temporal dithering. While this analysis supports the conclusion that the proposed algorithm could provide improved visual perception relative to the clinical open-loop strategy, much stronger evidence would be provided by applying the optimal stimuli from the proposed algorithm directly to the ex vivo retinal preparation and measuring the retinal response. This approach to testing the algorithm directly to the ex vivo retina is done for the full version of the algorithm that combines spatial multiplexing with temporal dithering. However, in contrast to simulated results, the study does not report on the reconstructed images that result from applying the algorithm to ex vivo retina, nor on the reconstruction errors. This makes it difficult to evaluate the efficacy of the algorithm.

This is a crucial point. We note that the original analysis was not based on computer simulations, but samples drawn from calibration measurements at the start of each experiment. We have now clarified this, and more importantly added reconstructed images and errors using an actual closed-loop validation experiment as suggested. Please see our response to Essential Revisions 1 above and the changes to the manuscript.

2.3) Section: Introduction.

The motivation for using a temporally dithered, spatially multiplexed algorithm to optimize stimulation stems from the desire to minimize the interactions caused by simultaneous stimulation of the electrodes in evoking a neural response. While this is an important strategy to investigate, the interactions are not typically as "complex" as claimed in the manuscript. Indeed, previous studies in several labs (including the Chichilnisky lab) show that these interactions can typically be described by a linear, weighted sum of the electrode currents followed by a simple static nonlinearity to predict the probability of spiking (a small minority of retinal ganglion cells require more complex nonlinear descriptions) [1-3]. This model and others have been the basis for alternative data-driven, closed-loop stimulation strategies that optimize the stimulation in a way that seeks to take advantage of the interactions between electrodes to improve the spatial resolution of evoked retinal activity through "current steering".

We agree that there has been some progress in understanding interactions during electrical stimulation (though we emphasize that this is distinct from interactions obtained with visual stimulation, where the LN models the reviewer describes have been fairly successful), and indeed some of this work has come from our lab. However, current studies of electrical stimulation, including our own, typically only examine one or a few electrodes at a time, and even in these situations we’ve shown that nonlinear interactions are often more complex than LN (Jepson et al., 2014). Our recent work (not shown) indicates that these more complex interactions occur more often with particular electrode configurations. In related work, we have recently made progress toward a closed-loop calibration strategy that uses “current steering” on 3 neighboring electrodes to improve the selectivity of stimulation (Vasireddy et al., 2023), and this work certainly requires dealing with the substantial nonlinearity that is present in some cases. However, using the above approaches to replicate the complex spatio-temporal pattern of RGC activity with hundreds or thousands of electrodes is far more complex. The approach presented here scales naturally to large numbers of electrodes, and can also include well-calibrated simultaneous-stimulation patterns (e.g. 3-electrode current steering) as “dictionary elements”, within the exact same optimization framework. We have attempted to clarify this a bit more in the Extensions subsection of the Discussion.

2.4) Section: Greedy temporal dithering to replicate neural code.

Data-driven optimization: The data required for the proposed algorithm are of two types: an exhaustive dictionary of response probabilities to single-electrode stimulation across all current amplitudes, and a set of responses used to reconstruct the target image from the predicted response to electrical stimulation. For the latter, reconstruction of the image is achieved by applying linear filters to the predicted response. In the study, these linear filters were derived from cells' receptive fields, obtained from measured responses in the retina to light stimulation. It is noted that this would not be possible in a clinical implant, as the retina is degenerate. However, it is not clear how a set of filters would be obtained in this case. The authors mention that distinct cell types can be identified from spontaneous activity. However, this does not explain how receptive field size and location would be estimated in this situation.

These points are exactly correct. Please see our response to Public Review 1.7 above.

2.5) The reconstruction of the image is achieved through linear filtering with a matrix A, with columns, A_j, that are the (scaled) receptive field filters (Eq. 1). However, this is only correct if the receptive field filters of the different cells are orthogonal, i.e. the inner product of each pair of receptive fields is zero. More generally, appropriate linear filtering should be performed by applying the pseudo inverse of the transpose of A. This is because the retinal spike rates are being approximated as the inner product of the receptive field and the image (A_j transposed, matrix-multiplied by the image vector), half-wave rectified. For the receptive fields of ON and OFF parasol cells given in the study, it appears that the receptive fields are approximately orthogonal for the two separate populations due to the non-overlapping tiling of the visual field by each population (e.g. Figure 2). However, it is not clear whether this situation would prevail in the blind retina, as the filters have not been specified in the case.

Again, excellent point. Please see our response to Recommendations for Authors 2.22 below.

2.6) The greedy optimization algorithm is insufficiently explained in the Methods, including the following points:

A derivation justifying splitting the objective function into the terms due to the mean and variance is required.

We have added this derivation to the Methods.
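
For readers without the Methods at hand, the split requested here follows from the standard bias-variance identity. The lines below are a sketch under assumed notation, not a reproduction of the manuscript's derivation: S is the target image vector, A the linear reconstruction matrix, and R the random response vector.

```latex
\begin{aligned}
\mathbb{E}\,\lVert S - A R \rVert^2
  &= \mathbb{E}\,\lVert (S - A\,\mathbb{E}[R]) - A\,(R - \mathbb{E}[R]) \rVert^2 \\
  &= \lVert S - A\,\mathbb{E}[R] \rVert^2
     - 2\,(S - A\,\mathbb{E}[R])^{\top} A\,\mathbb{E}\big[R - \mathbb{E}[R]\big]
     + \mathbb{E}\,\lVert A\,(R - \mathbb{E}[R]) \rVert^2 \\
  &= \lVert S - A\,\mathbb{E}[R] \rVert^2
     + \operatorname{tr}\big(\operatorname{Cov}(A R)\big)
\end{aligned}
```

The cross term vanishes because the expectation of R minus its mean is zero, which leaves a term that depends only on the mean response and a term that depends only on its variability.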

2.7) The terminology for the terms tr(var(A R_i)) is not clearly explained. I assume it is the matrix trace of the covariance matrix of the random variable A R_i.

Yes, it is the trace of the covariance matrix. We have corrected the term in the Methods.

2.8) The assumption of a Bernoulli random variable for the response, i.e. 1 or 0 spikes, is limiting, given there may be multiple spikes in response to electrical stimulation, especially for activation via the retinal network.

This is a good point that needed to be clarified in the text. Please see our response to Recommendations for Authors 1.8 above.

2.9) The expression that was derived for the term tr(var(A R_i)) in the case of Bernoulli random variables should be given.

We have added this expression in the Methods.
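
As an illustration of what such an expression looks like, the sketch below computes tr(Cov(A R)) under the assumption that the cells respond as independent Bernoulli variables with probabilities p_j, in which case the trace reduces to the sum over cells of p_j (1 - p_j) times the squared norm of column A_j. The independence assumption and the function names are ours for illustration; the exact form given in the Methods may differ.

```python
import numpy as np

def bernoulli_trace_variance(A, p):
    """Closed form of tr(Cov(A R)) for independent R_j ~ Bernoulli(p_j).

    A: (n_pixels, n_cells) reconstruction matrix (hypothetical layout).
    p: (n_cells,) spike probabilities.
    """
    A = np.asarray(A, dtype=float)
    p = np.asarray(p, dtype=float)
    col_norms_sq = np.sum(A ** 2, axis=0)  # ||A_j||^2 for each cell j
    return float(np.sum(p * (1.0 - p) * col_norms_sq))

def trace_variance_explicit(A, p):
    """Same quantity computed by building the covariance matrix explicitly."""
    A = np.asarray(A, dtype=float)
    p = np.asarray(p, dtype=float)
    cov_r = np.diag(p * (1.0 - p))  # independent cells: diagonal covariance
    return float(np.trace(A @ cov_r @ A.T))
```

The closed form avoids building the pixel-by-pixel covariance matrix, which matters when the term is evaluated for every dictionary element at every step.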

2.10) It is not explained how the algorithm performs the final optimization at time step t, given elements in a restricted dictionary D_t.

We have clarified this point in the Methods.

2.11) Section: Greedy temporal dithering outperforms open loop methods.

The image reconstruction shown in Figure 2 uses 500, 3000, and 10000 electrical stimuli (shown in A, B and C respectively). However, these are unrealistically large numbers of stimuli: given the temporal perceptual window of 50 ms, mentioned in the Introduction as the time over which retinal responses would be perceptually integrated, and the pulse duration of 0.15 ms used in the study, a maximum of 333 stimuli could be applied during the window. Consequently, the use of 3000 and 10,000 electrical stimuli in the simulations provides unrealistic estimates of the degree to which the image can be reconstructed.

Please see our response to Recommendations for Authors 2.29 below.

2.12) A full comparison of the proposed greedy, closed-loop algorithm to the conventional open-loop algorithm is difficult to evaluate based on the results presented. First, the number of electrical stimuli applied in making the comparison (Figure 3H) is not given. However, it seems likely, given the data in Figure 3G that an unrealistically large 10,000 stimuli were used. If instead a realistic 300-400 stimuli were used there may be little difference between the greedy-closed loop algorithm and the conventional open-loop algorithm.

We have clarified the number of electrical stimuli applied for the reconstructions in Figure 3. Please see our response to Recommendations for Authors 2.29 below.

2.13) A second limitation is that, in this subsection, the greedy, closed-loop algorithm appears to have only been tested in simulation. E.g. "For random checkerboard visual stimulus targets, the greedy dithering stimulation sequence was calculated, neural responses were sampled using measured response probabilities evoked by the individual selected stimuli, and then the target image was linearly reconstructed from these responses." Given that all the relevant data required to run the algorithm for the ex vivo retina and implant prototype had been collected during the experiment, it is unclear why the algorithm was not applied to test it by directly measuring responses to the algorithm's stimulation. This would have tested a critical assumption of the greedy-temporal dithering algorithm: that the responses to successive stimuli are statistically independent. Instead, the simulation assumes this to be the case.

This is a crucial point. We have now performed the closed-loop validation experiment and shown reconstructions. Please see our responses to Essential Revisions 1 above and the changes to the manuscript.

2.14) A third limitation is that the reconstructed image for the conventional open-loop algorithm does not resemble the phosphene images reported by most retinal implant users. Most implant users report predominantly bright, rather than dark, localized phosphenes [4]. The open-loop reconstruction shown in Figure 3d appears to be largely a gray averaging of light and dark phosphenes, likely due to the linear reconstruction method used.

Another good point. Please see our response to Recommendations for Authors 2.31 below.

2.15) Some details of the implementation of the open-loop strategy are unclear including:

How the area that was "near" the electrode was selected when calculating the intensity of the visual stimulus.

How the temporal sequence of the electrodes was chosen. It seems that the open-loop strategy is also likely, temporally dithered, but without the benefit of data-driven optimization.

We appreciate the suggestion to clarify. Please see our response to Recommendations for Authors 2.32 below.

2.16) Section: Greedy temporal dithering is nearly optimal given the interface constraints.

The comparison of the greedy, closed-loop approximately optimal algorithm to truly optimal algorithms is an important comparison in principle. However, again it is not clear if a realistic number of stimulation pulses were used in performing this comparison (i.e. < 400). Some details of the implementation of the optimal comparison strategy are unclear including:

The meaning and purpose of the term V^T w in the objective function.

Whether w>=0 was required after the integer requirement was relaxed in the optimization.

We have clarified the number of electrical stimuli applied for the reconstructions in Figure 3. Please see our response to Recommendations for Authors 2.34 below.

2.17) Section: Spatial multiplexing for fitting multiple stimuli in a visual integration window.

The idea to use spatial multiplexing of stimuli to overcome the limitation in the number of stimuli that can be delivered during a perceptual temporal window is a good idea to investigate. The aim is to choose stimuli on different electrodes that affect neural response independently. However, the initial formulation of what is meant by independence is not correct. This is stated as: "For independence to hold, the following condition must be met: if p1 is the activation probability of a given cell with stimulation on electrode 1, and p2 is the activation probability of the same cell with electrode 2, then the activation probability with simultaneous stimulation must be p1+p2." That this is incorrect can be seen because this formulation could give a probability greater than 1. However, the subsequent description of what is actually implemented appears correct. A general, in-principle way of describing what independence means is that if p1 is the probability of stimulating one cell with electrode 1 and p2 is the probability of stimulating a different cell with electrode 2, then the probability of stimulating both cell 1 and cell 2 using simultaneous stimulation with electrodes 1 and 2 is the product of those probabilities, p1.p2.

Please see our response to Recommendations for Authors 2.35 below.

2.18) Instead, the reported results of the greedy dithering-plus-multiplexing (Figure 4) show only that it is possible to select eight multiplexed electrodes with sufficient separation to ensure minimal interference. This could potentially increase the number of electrodes stimulated with the greedy, closed-loop algorithm by a factor of 8, bringing it to around 2,700 stimuli. This is closer to the 3000 electrode stimulations used in Figure 2b that gave errors that approached the asymptotic limit. However, the results in Figure 4 were obtained using stimulation every 2 ms, not every 0.15 ms (= pulse duration). With this limitation, this reduces the number of electrode stimuli to 200 in a 50 ms perceptual window, which again is not likely to give a good reconstruction error according to the simulations.

Please see our response to Recommendations for Authors 2.29 below.
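
The stimulus budgets quoted in comments 2.11 and 2.18 can be reproduced with a short calculation. This is only a sketch of the reviewer's arithmetic; the function name is illustrative, and the values (50 ms window, 0.15 ms or 2 ms step interval, 8 electrodes per step) are the ones stated in the comments.

```python
import math

def max_stimuli_in_window(window_ms, interval_ms, electrodes_per_step=1):
    """Number of single-electrode stimuli that fit in an integration window."""
    # Small epsilon guards against floating-point error just below an integer.
    steps = math.floor(window_ms / interval_ms + 1e-9)
    return steps * electrodes_per_step

dithering_only = max_stimuli_in_window(50, 0.15)       # back-to-back 0.15 ms pulses
multiplexed = max_stimuli_in_window(50, 0.15, 8)       # 8 electrodes per step
multiplexed_2ms = max_stimuli_in_window(50, 2.0, 8)    # one step every 2 ms
```

These evaluate to 333, 2664 and 200 stimuli respectively, matching the figures of 333, "around 2,700" and 200 used in the comments.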

2.19) Other results sections.

The sections on hardware constraints, naturalistic viewing conditions, and the use of perceptual similarity measures make useful observations about the potential benefits of the optimization framework for algorithmically determining the electrical stimulation.

Thank you.

2.20) Discussion.

The discussion covers many important points well. Regarding the translational potential, I would agree that an important point is "First, new surgical methods must be developed to implant a tiny chip on the surface of the retina with stable contact." I would add, however, that it must also be in extremely close contact for retinal ganglion cell spikes to be recorded. Further, a very high-density array (~ 60 μm pitch) and associated electronics for both stimulation and recording must be developed which is suitable in size, form factor, and power consumption for clinical use.

For the first point, we have added this to the Discussion as suggested. For the second point, please see the updated section on hardware design in the Discussion.

References

Jepson, L. H., Hottowy, P., Mathieson, K., Gunning, D. E., Dąbrowski, W., Litke, A. M., & Chichilnisky, E. J. (2014). Spatially patterned electrical stimulation to enhance resolution of retinal prostheses. Journal of Neuroscience, 34(14), 4871-4881.

Lorach, H., Goetz, G., Smith, R., Lei, X., Mandel, Y., Kamins, T., ... & Palanker, D. (2015). Photovoltaic restoration of sight with high visual acuity. Nature Medicine, 21(5), 476-482.

Maturana, M. I., Apollo, N. V., Hadjinicolaou, A. E., Garrett, D. J., Cloherty, S. L., Kameneva, T., ... & Meffin, H. (2016). A simple and accurate model to predict responses to multi-electrode stimulation in the retina. PLoS Computational Biology, 12(4), e1004849.

Humayun, M. S., Weiland, J. D., Fujii, G. Y., Greenberg, R., Williamson, R., Little, J., et al. (2003). Visual perception in a blind subject with a chronic microelectronic retinal prosthesis. Vision Research, 43, 2573-2581.

Recommendations for the authors:

2.21) Overall, it appears that the approach may offer some important benefits for sensory-neural implant users. However, the reporting of results is not sufficiently complete to draw strong conclusions about the potential benefits. In addition to the Public Review, I have some related suggestions below.

Reconstruction model in the blind retina.

It would be helpful to provide more detail about how the image reconstruction would work in the blind retinas, beyond what is mentioned regarding the identification of ON and OFF retinal ganglion cell type. How would the size and location of receptive fields be estimated?

Please see our response to Public Review 1.7 above.

2.22) The assumptions underlying the reconstruction model should be described, especially with respect to the orthogonality of the receptive field filters. It would be helpful to describe an approach in the methods that do not rely on this assumption, as I describe in my public comments.

We do indeed approximate the optimal linear reconstruction filters using the measured receptive fields of the cells. We agree that the decoder used is not the ‘inverse’ of the receptive fields, except in the case that the receptive fields of the cells are orthogonal. We can use the standard least-squares solution to move away from the assumption of orthogonality. We have clarified the important point that this is an approximation in the Results and Methods.
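
The difference between the two decoders can be seen in a small numerical example. This is a sketch under simplifying assumptions that are ours, not the manuscript's: responses are taken as purely linear (the half-wave rectification mentioned in comment 2.5 is omitted), the receptive fields are random vectors, and all names are illustrative.

```python
import numpy as np

def encode(A, image):
    """Linear responses r = A^T s (rectification omitted for simplicity)."""
    return A.T @ image

def decode_with_receptive_fields(A, responses):
    """Reconstruction that uses the receptive fields themselves as filters."""
    return A @ responses

def decode_least_squares(A, responses):
    """Reconstruction with the pseudo-inverse of A^T (least-squares decoder)."""
    return np.linalg.pinv(A.T) @ responses

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))        # 20 pixels, 5 overlapping (non-orthogonal) filters
image = A @ rng.normal(size=5)      # an image lying in the span of the filters
r = encode(A, image)

err_rf = np.linalg.norm(image - decode_with_receptive_fields(A, r))
err_ls = np.linalg.norm(image - decode_least_squares(A, r))

# With orthonormal filters the two decoders coincide.
Q, _ = np.linalg.qr(A)
decoders_match = np.allclose(np.linalg.pinv(Q.T), Q)
```

For the non-orthogonal filters the least-squares decoder recovers this image to numerical precision while the receptive-field decoder does not, and for the orthonormalized filters `decoders_match` is true. This is the sense in which using the receptive fields directly is an approximation.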

2.23) Greedy optimization algorithm: There are several aspects of this that could be better explained. These include:

A derivation justifying splitting the objective function into the terms due to the mean and variance is required.

Thank you. We have added this derivation to the Methods.

2.24) The terminology for the terms tr(var(A R_i)) is not clearly explained. I assume it is the matrix trace of the covariance matrix of the random variable A R_i.

Yes, it is the trace of the covariance matrix. We have corrected the term in the Methods.

2.25) The assumption of a Bernoulli random variable, i.e. 1 or 0 spikes, is limiting, given there may be multiple spikes in response to electrical stimulation, especially for activation via the retinal network.

Please see our response to Recommendations for Authors 1.8 above.

2.26) The expression derived for the term tr(var(A R_i)) in the case of Bernoulli random variables should be given.

We have added this expression in the Methods.

2.27) It is not explained how the algorithm performs the final optimization at time step t, given elements in a restricted dictionary D_t.

We have clarified this point in the Methods.

2.28) It is not explained how to determine the time for which recently used dictionary elements are excluded from current use.

Thank you for pointing out this missing information. We exclude dictionary elements to disallow stimulation of any recently targeted cell for 100 steps, which covers the relative refractory period (10 ms) when the stimulation frequency is less than 100Hz. We have now included this number in the Methods.
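
To make the exclusion rule concrete, the sketch below shows one way a greedy selection step with a restricted dictionary could be organized. It is a minimal illustration under stated assumptions, not the implementation described in the Methods: responses are treated as independent Bernoulli variables, a cell counts as "targeted" by an element when its activation probability exceeds a hypothetical `target_threshold`, and a step is left empty when no allowed element lowers the expected error. All names are illustrative.

```python
import numpy as np

def greedy_dithering(A, P, target, n_steps, exclusion_steps=100, target_threshold=0.1):
    """Greedy choice of one dictionary element per step.

    A: (n_pixels, n_cells) reconstruction matrix.
    P: (n_elements, n_cells) activation probabilities per dictionary element.
    Returns the chosen element index per step (None = no stimulation)
    and the final residual between target and expected reconstruction.
    """
    A = np.asarray(A, dtype=float)
    P = np.asarray(P, dtype=float)
    mean_images = P @ A.T                                # expected image added by each element
    variances = (P * (1.0 - P)) @ np.sum(A ** 2, axis=0)  # tr(Cov(A R)) for each element
    targets_cell = P > target_threshold
    last_used = np.full(P.shape[1], -np.inf)             # step at which each cell was last targeted
    residual = np.asarray(target, dtype=float).copy()
    sequence = []
    for t in range(n_steps):
        # Restricted dictionary D_t: drop elements that target a recently stimulated cell.
        refractory = (t - last_used) <= exclusion_steps
        allowed = ~np.any(targets_cell & refractory, axis=1)
        # Change in expected squared error if each element were added now.
        delta = np.sum((residual - mean_images) ** 2, axis=1) + variances - np.sum(residual ** 2)
        delta[~allowed] = np.inf
        best = int(np.argmin(delta))
        if not np.isfinite(delta[best]) or delta[best] >= 0:
            sequence.append(None)
            continue
        sequence.append(best)
        residual -= mean_images[best]
        last_used[targets_cell[best]] = t
    return sequence, residual
```

With `exclusion_steps=100` a cell targeted at step t becomes available again at step t + 101, which corresponds to the 100-step exclusion described above.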

2.29) Section: Greedy temporal dithering outperforms open loop methods.

Regarding the number of single-electrode stimuli used in image reconstruction, it would be better to place the numbers used in the context of what is possible in the perceptual time window. I would recommend using the value of 333 instead of 500, as this corresponds to the number of 0.15 ms pulses that could fit into a 50 ms window. The value of 3000 roughly corresponds to what might be achieved with spatial multiplexing. The value of 10,000 corresponds to the upper limit that is achievable through this algorithm.

This is an important and somewhat subtle point. While these calculations are correct for the primate recording, we found that 225±29 electrical stimulations were needed for asymptotic reconstruction performance of checkerboard targets in the rat retina. The total number of electrical stimulations needed depends on many factors including the number of cells targeted, their expected firing rates for the visual image, and the distribution of RGC activation probabilities in the electrical stimulation dictionary. Future work will be needed to identify how these factors vary across individuals, species and neural circuits. We have added text in the Discussion highlighting this issue.

2.30) I think it would be beneficial to make it clearer that the results in Figure 3 are simulated. It would also strengthen the study to perform validation in ex vivo retina by applying the greedy temporal dithering stimuli to the retina and reconstructing the image from the responses. If there is a good reason not to do this, this should be explained.

We agree that direct closed-loop experimental validation with the temporally dithered stimulation is important. We have now performed these experiments and clarified our language throughout the manuscript. Please see our responses to Essential Revisions 1 and 3.

2.31) It would improve the study if a reconstruction algorithm that provides an image with a better match to the perception of phosphenes by retinal implant users was used. If this cannot be done, it should be discussed as a limitation of the study.

This would be a highly relevant point if our electrical stimulation approaches had the same coarse level of control as existing implants. But in fact, the situation is quite different in the present work. For the short (150 µs) and low current (<4 µA) pulses we use, only single or small groups of cells tend to be electrically activated (Figure 1C) (Sekirnjak et al., 2006, 2008). This is in part because the mechanism of activation is direct depolarization, rather than a network-mediated excitation, and in part because we explicitly avoid activation of axons (which produces large phosphenes in existing implants (Beyeler et al., 2019)) by pre-calibration. Due to these fundamental differences in stimulation compared to existing retinal implants, we do not expect the perception of phosphenes of the kind seen in present-day implants. We have clarified these considerations in the Results and Discussion.

2.32) It would be helpful to clarify some details of the implementation of the open-loop strategy including:

How the area that was "near" the electrode was selected when calculating the intensity of the visual stimulus.

How the temporal sequence of the electrodes was chosen.

We have replaced “open-loop” with “static pixel-wise mapping” to more accurately reflect the calculation we are performing which approximates the function of present-day retinal implants. Specifically, we used a mapping between the intensity of the visual stimulus incident on an electrode and the intensity of the current passed through that electrode to determine which electrical stimuli to deliver. The intention of this analysis is to provide a generous benchmark for present-day devices – it is generous in the sense that we assume a much greater degree of calibration precision than these devices can actually achieve.

First, an affine transformation mapped the visual stimulus onto the electrode array. Second, the average visual stimulus intensity was identified over an approximately 130 µm x 130 µm region around the electrode location. Third, the average visual stimulus intensity on the electrode (s) was mapped to the current amplitude (i) using a scaled sigmoid:

$i = a + b / (1 + \exp (c s + d))$

This electrical stimulus was then delivered n times at that electrode. All parameters (i.e. five parameters {a, b, c, d, n} for each electrode) were simultaneously optimized to minimize reconstruction error across a training set of random checkerboard images.

A temporal dithering strategy for ordering the electrical stimuli was assumed so that the stimuli would not interfere with one another. The performance of this static pixel-wise mapping approach was evaluated in the same way as the dynamically optimized stimulation approach. In particular, the target visual stimulus was linearly reconstructed from the identified electrical stimulation sequence using samples drawn from the single-electrode calibration data.

The static mapping at each electrode captures the common aspect of existing approaches and highlights the crucial improvement of our dynamic approach. Again, we note that this procedure provides a generous interpretation of existing methods because it uses actual measured neural responses for optimizing the mapping rather than relying on more limited patient feedback. We have updated the text in the Results and clarified these implementation details in the Methods.
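
The per-electrode mapping described above can be written out as a short sketch. The sigmoid form follows the equation given in this response; the function names, the data layout and any parameter values used with it are hypothetical, and the joint optimization of {a, b, c, d, n} across electrodes is not shown.

```python
import numpy as np

def static_current_amplitude(s, a, b, c, d):
    """Scaled sigmoid i = a + b / (1 + exp(c*s + d)) from mean intensity s to current."""
    return a + b / (1.0 + np.exp(c * s + d))

def static_stimulus_list(mean_intensities, params):
    """Expand the static mapping into a flat list of (electrode, amplitude) pulses.

    mean_intensities: mean visual intensity near each electrode.
    params: one (a, b, c, d, n) tuple per electrode; n is the repeat count.
    """
    pulses = []
    for electrode, (s, (a, b, c, d, n)) in enumerate(zip(mean_intensities, params)):
        amplitude = static_current_amplitude(s, a, b, c, d)
        pulses.extend([(electrode, amplitude)] * int(n))
    return pulses
```

Because the amplitude at each electrode depends only on the local stimulus intensity, the mapping cannot adapt to which cells have already been driven, which is the contrast with the dynamically optimized approach.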

2.33) Section: Greedy temporal dithering is nearly optimal given the interface constraints.

A realistic number of stimulation pulses should be used in performing this comparison e.g. < 400 for the pure temporal dithering or < 3000 for the spatially multiplexed, temporal dithering.

Please see our response to Recommendations for Authors 2.29 above.

2.34) It would be helpful to clarify some details of the implementation of the optimal comparison strategy including:

The meaning and purpose of the term V^T.w in the objective function.

Whether w>=0 was required after the integer requirement was relaxed in the optimization.

These details of the implementation of the optimal comparison strategy have been clarified in the Methods. Yes, the non-negativity constraint keeps the approximate objective closer to the original formulation.

2.35) Section: Spatial multiplexing for fitting multiple stimuli in a visual integration window.

As described in my public review, the description of independence is not correct. I have suggested an alternative description that I believe accords with what was actually implemented.

Thank you for pointing out the inconsistency in the description of spatial independence. We have corrected and simplified the description in the Methods.

2.36) It was surprising that the results of the validation experiments on ex vivo retina with the spatially multiplexed, temporally dithered algorithm were not reported more thoroughly. It is important to provide figures showing the image reconstruction that was achieved and the statistics for the reconstruction error.

We have now performed the closed-loop validation experiment, and added reconstructed images from it. Please see our responses to Essential Revisions 1 above and the changes to the manuscript.

Reviewer #3

3.1) In this study, Shah and colleagues propose an interesting solution to the non-linear interactions caused by simultaneously stimulating multiple electrodes within a retinal implant. Through high-resolution recordings of ON and OFF parasol retinal ganglion cells, the authors demonstrate that a greedy dithering and spatially multiplexed algorithm, which can also work in the presence of saccadic eye movements, is able to faithfully reconstruct images represented by total numbers of spikes in a given time window across multiple retinal ganglion cells. Essentially, Shah and colleagues propose and demonstrate a method to only stimulate single or groups of 8 electrodes at a time from a pre-established dictionary, but then interleave stimulation of multiple electrodes or groups rapidly across the dictionary to additively build an image. Through their very rigorous and elegant ex vivo recordings in 180 ON and OFF parasol cells across four primate retina preparations, the authors compellingly demonstrate that (i) their greedy algorithm performs better than an open loop algorithm, similar to an optimal algorithm considering the interface constraints, and close but not equal to an ideal control using only a single-electrode dictionary; (ii) that groups of electrodes can be simultaneously activated with a high-resolution neural interface without any retinal interactions provided that they are at least 160 μm apart; (iii) that the algorithm performs just as well even with only 50% of the electrodes on the interface and (iv) that the algorithm can work in the presence of saccadic eye movements and performs better when both saccadic and fixational eye movements are made as opposed to saccadic movements alone.

The experimental recordings and performance of the algorithm in various conditions are the biggest strengths of this study and the authors certainly demonstrate that their algorithm can reproduce spiking numbers across an array of cells that resemble closely spiking numbers evoked by visual stimuli for these conditions. In other words, the authors' primary claim that the neural code for visual images in the retina (in the form of spiking numbers) can be faithfully reproduced with electrical stimulation using such an algorithm, is well supported by evidence.

A major weakness in the study however is the reliance of this algorithm on several significant assumptions about neural coding in the retina, neural coding in the visual brain, and interactions between electrodes even with non-simultaneous stimulation. Some of these assumptions have already been highly challenged in several studies in the visual neuroscience field and in studies involving the perception of phosphenes with interleaved stimulation of single electrodes.

Therefore, in light of what is currently known about visual encoding and artificial vision, the study, whilst showcasing an elegant computational tool, perhaps provides only little hope that such an algorithm will actually work in practice to recreate the perception of images with electrical stimulation; instead, it lays a foundation for further work on the assessment of future algorithms. The main assumptions that the authors rely on include:

1) That neural coding in the retina is simply based on a number of spikes evoked by populations of cells ignoring any temporal patterns of responses. A plethora of studies has indicated that relative spike timing between groups of retinal ganglion cells for example can encode complex visual features but the greedy algorithm does not aim to mimic these spike timing features.

Please see our response to Recommendations for Authors 3.2 below.

2) That perception within the brain is solely based on a number of spikes within a slow temporal integration window (the authors cite a 1995 reference for this). Since 1995 though, this has also been challenged, therefore extending the authors' claims of reproducing spike numbers in the retina to reproducing perception in the brain would be contentious.

Please see our response to Essential Revisions 2 paragraph 3 above.

3) That neural interactions with non-simultaneous interleaved electrical stimulation are absent. There is in silico, electrophysiological and perceptual evidence with retinal implants that interleaving of electrodes still results in neural interactions and that perception with interleaved stimulation with multiple electrodes does not result in a linear summed perception of phosphenes evoked by single electrodes i.e. dictionary elements. Therefore, the algorithm would only work if such interactions are minimal or absent, for example with larger than 0.1 ms intervals between stimulations or more than 160 μm electrode separation. Note, interactions with interleaving also exist with cochlear implants as the current spread is large.

Please see our response to Essential Revisions 2 paragraph 4 above.

4) That even if the above 3 assumptions were applied and true, the algorithm can faithfully extrapolate to reconstruct moving images at 24 per second. This seems unlikely, as presumably the total time required to linearly reconstruct a single static image would extend to many tens or even hundreds of ms, given that the number of times each dictionary element needs to be accessed to reproduce similar spiking numbers between visual and electrical stimulation runs in the thousands.

Please see our response to Recommendations for Authors 3.3 below.

In spite of major reliance on these assumptions, the authors do demonstrate a very useful tool in the form of the greedy algorithm for situations perhaps other than the visual system, where perception with artificial stimulation may be more predictable and interactions with non-simultaneous stimulation may be simpler.

Thank you for recognizing the strength of the work.

Recommendations for the authors:

3.2) It may be possible to address at least some of the limitations in particular (1) and (4) mentioned in the public review. For limitation (1), the authors could try and experiment with their algorithm and reanalyse data to examine if and how well spike timing features (perhaps relative first spike latencies between RGCs or other temporal patterns of spikes) are reproducible.

This is an important point. Please see our response to Essential Revisions 2 paragraph 2. We also note that with the very high temporal precision of our stimulation (evoked spike time variation of roughly 0.1 ms), if there were some degree of stimulus coding in the relative timing of spikes, that relative timing could certainly be reproduced by the stimulation sequences that we provide, with a suitable modification of the optimization approach. However, this would substantially increase the overall complexity of the algorithm and we think it is beyond the scope of this paper.

3.3) For limitation (4) the authors could at least perform calculations of time taken by the algorithm in each of the situations and targets presented, to examine if these times are realistic.

In fact, realistic numbers of electrical stimulations were required for the closed-loop experimental validation of greedy temporal dithering using rat retina (225±29 stimulations for asymptotic reconstruction performance, requiring 62±9 ms to deliver the temporally dithered sequence). For the thousands of stimulations reported for the macaque retina, spatial multiplexing could reduce the delivery time to a realistic tens of milliseconds. For example, delivering 3000 stimuli at a 0.15 ms interval would require 450 ms without spatial multiplexing, but could require as little as 57 ms with spatial multiplexing delivering an average of 8 electrical stimuli per time step. While the former duration exceeds visual integration time, the latter duration approximately matches it. The exact number of electrical stimulations needed for each situation depends on many factors, including the number of cells targeted, their expected firing rates for the particular visual image, and the distribution of RGC activation probabilities in the electrical stimulation dictionary. Future work will be needed to identify how these factors vary across individuals, species and neural circuits. We have added text in the Discussion highlighting this issue.
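The delivery-time arithmetic in this response can be checked with a short calculation. The following is a minimal Python sketch and is not taken from the authors' code: the function name `delivery_time_ms` and its parameters are hypothetical, and it assumes that every time step lasts exactly 0.15 ms and carries exactly the stated average of 8 stimuli. Under those assumptions the multiplexed estimate comes out at roughly 56 ms, close to the 57 ms quoted in the response; the small gap presumably reflects rounding or an average slightly different from 8.

```python
import math


def delivery_time_ms(n_stimuli, interval_ms, stimuli_per_step=1.0):
    """Estimate the time needed to deliver a stimulation sequence.

    n_stimuli        -- total number of electrical stimuli in the sequence
    interval_ms      -- duration of one time step, in milliseconds
    stimuli_per_step -- average number of stimuli delivered per time step
                        (1 means no spatial multiplexing; assumed constant)
    """
    # Number of time steps needed to fit all stimuli, rounded up.
    n_steps = math.ceil(n_stimuli / stimuli_per_step)
    return n_steps * interval_ms


# Values taken from the response: 3000 stimuli, 0.15 ms interval.
sequential_ms = delivery_time_ms(3000, 0.15)       # no spatial multiplexing
multiplexed_ms = delivery_time_ms(3000, 0.15, 8)   # average of 8 stimuli per step

print(f"sequential:  {sequential_ms:.2f} ms")
print(f"multiplexed: {multiplexed_ms:.2f} ms")
```

Because the speed-up in this simplified model is just the average number of stimuli per time step, the same function can be reused to explore how many simultaneous stimuli would be needed to bring a given sequence within a chosen integration window.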

3.4) For limitations (2) and (3), the authors at a minimum should address these clearly in their discussion and the potential implications of the failure of these assumptions on their algorithm performance.

Please see our response to Essential Revisions 2 paragraphs 3 and 4 for a discussion of limitations (2) and (3), respectively. We have clarified these considerations in the Discussion.

3.5) Another thing that the authors should consider is including, as a figure, some example raw data from their retinas before and after artifact subtraction in response to both visual targets and their greedy algorithm.

Thank you for this suggestion. While this could be beneficial, we ultimately decided against it because spike sorting in the presence of electrical artifacts is a very involved topic and has been covered extensively in other papers from our group (Gogliettino et al., 2023; Jepson et al., 2013, 2014; Madugula et al., 2022; Sekirnjak et al., 2006, 2008). We cite some of these papers in the Methods section.

References

Alonso, J. M., Usrey, W. M., & Reid, R. C. (1996). Precisely correlated firing in cells of the lateral geniculate nucleus. Nature, 383(6603), 815–819. https://doi.org/10.1038/383815a0

Berry, M. J., & Meister, M. (1998). Refractoriness and Neural Precision. Journal of Neuroscience, 18(6), 2200–2211. https://doi.org/10.1523/JNEUROSCI.18-06-02200.1998

Berry, M. J., Warland, D. K., & Meister, M. (1997). The structure and precision of retinal spike trains. Proceedings of the National Academy of Sciences, 94(10), 5411–5416. https://doi.org/10.1073/pnas.94.10.5411

Beyeler, M., Nanduri, D., Weiland, J. D., Rokem, A., Boynton, G. M., & Fine, I. (2019). A model of ganglion axon pathways accounts for percepts elicited by retinal implants. Scientific Reports, 9(1), 9199. https://doi.org/10.1038/s41598-019-45416-4

Borghuis, B. G., Tadin, D., Lankheet, M. J. M., Lappin, J. S., & van de Grind, W. A. (2019). Temporal Limits of Visual Motion Processing: Psychophysics and Neurophysiology. Vision, 3(1), 5. https://doi.org/10.3390/vision3010005

Brackbill, N., Rhoades, C., Kling, A., Shah, N. P., Sher, A., Litke, A. M., & Chichilnisky, E. J. (2020). Reconstruction of natural images from responses of primate retinal ganglion cells. eLife, 9, e58516. https://doi.org/10.7554/eLife.58516

Chichilnisky, E. J., & Kalmar, R. S. (2003). Temporal Resolution of Ensemble Visual Motion Signals in Primate Retina. The Journal of Neuroscience, 23(17), 6681–6689. https://doi.org/10.1523/JNEUROSCI.23-17-06681.2003

Choi, J. S., Brockmeier, A. J., McNiel, D. B., Kraus, L. M. von, Príncipe, J. C., & Francis, J. T. (2016). Eliciting naturalistic cortical responses with a sensory prosthesis via optimized microstimulation. Journal of Neural Engineering, 13(5), 056007. https://doi.org/10.1088/1741-2560/13/5/056007

Frechette, E. S., Sher, A., Grivich, M. I., Petrusca, D., Litke, A. M., & Chichilnisky, E. J. (2005). Fidelity of the Ensemble Code for Visual Motion in Primate Retina. Journal of Neurophysiology, 94(1), 119–135. https://doi.org/10.1152/jn.01175.2004

Gogliettino, A. R., Madugula, S. S., Grosberg, L. E., Vilkhu, R. S., Brown, J., Nguyen, H., Kling, A., Hottowy, P., Dąbrowski, W., Sher, A., Litke, A. M., & Chichilnisky, E. J. (2023). High-Fidelity Reproduction of Visual Signals by Electrical Stimulation in the Central Primate Retina. Journal of Neuroscience, 43(25), 4625–4641. https://doi.org/10.1523/JNEUROSCI.1091-22.2023

Gollisch, T., & Meister, M. (2008). Rapid Neural Coding in the Retina with Relative Spike Latencies. Science, 319(5866), 1108–1111. https://doi.org/10.1126/science.1149639

Gütig, R., Gollisch, T., Sompolinsky, H., & Meister, M. (2013). Computing Complex Visual Features with Retinal Spike Times. PLoS ONE, 8(1), e53063. https://doi.org/10.1371/journal.pone.0053063

Haji Ghaffari, D., Akwaboah, A. D., Mirzakhalili, E., & Weiland, J. D. (2021). Real-Time Optimization of Retinal Ganglion Cell Spatial Activity in Response to Epiretinal Stimulation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 29, 2733–2741. https://doi.org/10.1109/TNSRE.2021.3138297

Ho, E., Shmakov, A., & Palanker, D. (2020). Decoding network-mediated retinal response to electrical stimulation: Implications for fidelity of prosthetic vision. Journal of Neural Engineering, 17(6). https://doi.org/10.1088/1741-2552/abc535

Jepson, L. H., Hottowy, P., Mathieson, K., Gunning, D. E., Dabrowski, W., Litke, A. M., & Chichilnisky, E. J. (2013). Focal electrical stimulation of major ganglion cell types in the primate retina for the design of visual prostheses. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 33(17), 7194–7205. https://doi.org/10.1523/JNEUROSCI.4967-12.2013

Jepson, L. H., Hottowy, P., Mathieson, K., Gunning, D. E., Dąbrowski, W., Litke, A. M., & Chichilnisky, E. J. (2014). Spatially patterned electrical stimulation to enhance resolution of retinal prostheses. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 34(14), 4871–4881. https://doi.org/10.1523/JNEUROSCI.2882-13.2014

Keat, J., Reinagel, P., Reid, R. C., & Meister, M. (2001). Predicting Every Spike: A Model for the Responses of Visual Neurons. Neuron, 30(3), 803–817. https://doi.org/10.1016/S0896-6273(01)00322-1

Li, P. H., Gauthier, J. L., Schiff, M., Sher, A., Ahn, D., Field, G. D., Greschner, M., Callaway, E. M., Litke, A. M., & Chichilnisky, E. J. (2015). Anatomical identification of extracellularly recorded cells in large-scale multielectrode recordings. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 35(11), 4663–4675. https://doi.org/10.1523/JNEUROSCI.3675-14.2015

Lotlikar, A., Shah, N. P., Gogliettino, A. R., Vilkhu, R., Madugula, S., Grosberg, L., Hottowy, P., Sher, A., Litke, A., Chichilnisky, E. J., & Mitra, S. (2023). Partitioned Temporal Dithering for Efficient Epiretinal Electrical Stimulation. 2023 11th International IEEE/EMBS Conference on Neural Engineering (NER), 1–5. https://doi.org/10.1109/NER52421.2023.10123787

Madugula, S. S., Gogliettino, A. R., Zaidi, M., Aggarwal, G., Kling, A., Shah, N. P., Brown, J. B., Vilkhu, R., Hays, M. R., Nguyen, H., Fan, V., Wu, E. G., Hottowy, P., Sher, A., Litke, A. M., Silva, R. A., & Chichilnisky, E. J. (2022). Focal Electrical Stimulation of Human Retinal Ganglion Cells for Vision Restoration. Journal of Neural Engineering, 19(6). https://doi.org/10.1088/1741-2552/aca5b5

Meister, M. (1996). Multineuronal codes in retinal signaling. Proceedings of the National Academy of Sciences, 93(2), 609–614. https://doi.org/10.1073/pnas.93.2.609

Reich, D. S., Victor, J. D., Knight, B. W., Ozaki, T., & Kaplan, E. (1997). Response Variability and Timing Precision of Neuronal Spike Trains in vivo. Journal of Neurophysiology, 77(5), 2836–2841. https://doi.org/10.1152/jn.1997.77.5.2836

Samaha, J., & Postle, B. R. (2015). The speed of α-band oscillations predicts the temporal resolution of visual perception. Current Biology: CB, 25(22), 2985–2990. https://doi.org/10.1016/j.cub.2015.10.007

Sekhar, S., Ramesh, P., Bassetto, G., Zrenner, E., Macke, J. H., & Rathbun, D. L. (2020). Characterizing Retinal Ganglion Cell Responses to Electrical Stimulation Using Generalized Linear Models. Frontiers in Neuroscience, 14, 378. https://doi.org/10.3389/fnins.2020.00378

Sekirnjak, C., Hottowy, P., Sher, A., Dabrowski, W., Litke, A. M., & Chichilnisky, E. J. (2006). Electrical stimulation of mammalian retinal ganglion cells with multielectrode arrays. Journal of Neurophysiology, 95(6), 3311–3327. https://doi.org/10.1152/jn.01168.2005

Sekirnjak, C., Hottowy, P., Sher, A., Dabrowski, W., Litke, A. M., & Chichilnisky, E. J. (2008). High-resolution electrical stimulation of primate retina for epiretinal implant design. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 28(17), 4446–4456. https://doi.org/10.1523/JNEUROSCI.5138-07.2008

Shadlen, M. N., & Newsome, W. T. (1994). Noise, neural codes and cortical organization. Current Opinion in Neurobiology, 4(4), 569–579. https://doi.org/10.1016/0959-4388(94)90059-0

Tadin, D., Lappin, J. S., Blake, R., & Glasser, D. M. (2010). High temporal precision for perceiving event offsets. Vision Research, 50(19), 1966–1971. https://doi.org/10.1016/j.visres.2010.07.005

Tafazoli, S., MacDowell, C. J., Che, Z., Letai, K. C., Steinhardt, C. R., & Buschman, T. J. (2020). Learning to control the brain through adaptive closed-loop patterned stimulation. Journal of Neural Engineering, 17(5), 056007. https://doi.org/10.1088/1741-2552/abb860

Uzzell, V. J., & Chichilnisky, E. J. (2004). Precision of Spike Trains in Primate Retinal Ganglion Cells. Journal of Neurophysiology, 92(2), 780–789. https://doi.org/10.1152/jn.01171.2003

Vasireddy, P. K., Gogliettino, A. R., Brown, J. B., Vilkhu, R. S., Madugula, S. S., Phillips, A. J., Mitra, S., Hottowy, P., Sher, A., Litke, A., Shah, N. P., & Chichilnisky, E. J. (2023). Efficient Modeling and Calibration of Multi-Electrode Stimuli for Epiretinal Implants. 2023 11th International IEEE/EMBS Conference on Neural Engineering (NER), 1–4. https://doi.org/10.1109/NER52421.2023.10123907

Wu, E. G., Brackbill, N., Rhoades, C., Kling, A., Gogliettino, A. R., Shah, N. P., Sher, A., Litke, A. M., Simoncelli, E. P., & Chichilnisky, E. J. (2023). Fixational Eye Movements Enhance the Precision of Visual Information Transmitted by the Primate Retina (p. 2023.08.12.552902). bioRxiv. https://doi.org/10.1101/2023.08.12.552902

Wutz, A., Muschter, E., van Koningsbruggen, M. G., Weisz, N., & Melcher, D. (2016). Temporal Integration Windows in Neural Processing and Perception Aligned to Saccadic Eye Movements. Current Biology: CB, 26(13), 1659–1668. https://doi.org/10.1016/j.cub.2016.04.070

Yoon, Y. J., Lee, J.-I., Jang, Y. J., An, S., Kim, J. H., Fried, S. I., & Im, M. (2020). Retinal Degeneration Reduces Consistency of Network-mediated Responses Arising in Ganglion Cells to Electric Stimulation. IEEE Transactions on Neural Systems and Rehabilitation Engineering: A Publication of the IEEE Engineering in Medicine and Biology Society, 28(9), 1921–1930. https://doi.org/10.1109/TNSRE.2020.3003345

Zaidi, M., Aggarwal, G., Shah, N. P., Karniol-Tambour, O., Goetz, G., Madugula, S. S., Gogliettino, A. R., Wu, E. G., Kling, A., Brackbill, N., Sher, A., Litke, A. M., & Chichilnisky, E. J. (2023). Inferring light responses of primate retinal ganglion cells using intrinsic electrical signatures. Journal of Neural Engineering, 20(4), 045001. https://doi.org/10.1088/1741-2552/ace657

Associated Data

Data Citations

Shah N, Phillips AJ, Madugula S, Lotlikar A, Gogliettino A, Hays M, Grosberg L, Brown J, Dusi A, Tandon P, Hottowy P, Dabrowski W, Sher A, Litke A, Mitra S, Chichilnisky EJ. 2024. Data from: Precise control of neural activity using dynamically optimized electrical stimulation. Dryad Digital Repository. https://doi.org/10.5061/dryad.pk0p2ngrv

Supplementary Materials

MDAR checklist

elife-83424-mdarchecklist1.docx (99.6 KB, docx)

Data Availability Statement

Data and code are available on Dryad at https://doi.org/10.5061/dryad.pk0p2ngrv.

The following dataset was generated:

[bib1] Alonso JM, Usrey WM, Reid RC. Precisely correlated firing in cells of the lateral geniculate nucleus. Nature. 1996;383:815–819. doi: 10.1038/383815a0.

[bib2] Beauchamp MS, Oswalt D, Sun P, Foster BL, Magnotti JF, Niketeghad S, Pouratian N, Bosking WH, Yoshor D. Dynamic stimulation of visual cortex produces form vision in sighted and blind humans. Cell. 2020;181:774–783. doi: 10.1016/j.cell.2020.04.033.

[bib3] Berry MJ, Warland DK, Meister M. The structure and precision of retinal spike trains. PNAS. 1997;94:5411–5416. doi: 10.1073/pnas.94.10.5411.

[bib4] Berry MJ, Meister M. Refractoriness and neural precision. The Journal of Neuroscience. 1998;18:2200–2211. doi: 10.1523/JNEUROSCI.18-06-02200.1998.

[bib5] Beyeler M, Nanduri D, Weiland JD, Rokem A, Boynton GM, Fine I. A model of ganglion axon pathways accounts for percepts elicited by retinal implants. Scientific Reports. 2019;9:9199. doi: 10.1038/s41598-019-45416-4.

[bib6] Bloch E. The Argus II retinal prosthesis system. Prosthesis. 2020;01:e4947. doi: 10.5772/intechopen.84947.

[bib7] Borghuis BG, Tadin D, Lankheet MJM, Lappin JS, van de Grind WA. Temporal limits of visual motion processing: psychophysics and neurophysiology. Vision. 2019;3:5. doi: 10.3390/vision3010005.

[bib8] Brackbill N, Rhoades C, Kling A, Shah NP, Sher A, Litke AM, Chichilnisky EJ. Reconstruction of natural images from responses of primate retinal ganglion cells. eLife. 2020;9:e58516. doi: 10.7554/eLife.58516.

[bib9] Cha K, Horch KW, Normann RA. Mobility performance with a pixelized vision system. Vision Research. 1992;32:1367–1372. doi: 10.1016/0042-6989(92)90229-c.

[bib10] Chen X, Wang F, Fernandez E, Roelfsema PR. Shape perception via a high-channel-count neuroprosthesis in monkey visual cortex. Science. 2020;370:1191–1196. doi: 10.1126/science.abd7435.

[bib11] Chichilnisky EJ, Kalmar RS. Functional asymmetries in ON and OFF ganglion cells of primate retina. The Journal of Neuroscience. 2002;22:2737–2747. doi: 10.1523/JNEUROSCI.22-07-02737.2002.

[bib12] Choi JS, Brockmeier AJ, McNiel DB, von Kraus L, Príncipe JC, Francis JT. Eliciting naturalistic cortical responses with a sensory prosthesis via optimized microstimulation. Journal of Neural Engineering. 2016;13:056007. doi: 10.1088/1741-2560/13/5/056007.

[bib13] Cowan CS, Renner M, Gross-Scherf B, Goldblum D, Munz M, Krol J, Szikra T, Papasaikas P, Cuttat R, Waldt A, Diggelmann R, Patino-Alvarez CP, Gerber-Hollbach N, Schuierer S, Hou Y, Srdanovic A, Balogh M, Panero R, Hasler PW, Kusnyerik A, Szabo A, Stadler MB, Orgül S, Hierlemann A, Scholl HPN, Roma G, Nigsch F, Roska B. Cell types of the human retina and its organoids at single-cell resolution: developmental convergence, transcriptomic identity, and disease map. Cell. 2020;182:1623–1640. doi: 10.1016/j.cell.2020.08.013.

[bib14] Dacey DM. The mosaic of midget ganglion cells in the human retina. The Journal of Neuroscience. 1993;13:5334–5355. doi: 10.1523/JNEUROSCI.13-12-05334.1993.

[bib15] de Ruyter van Steveninck J, Güçlü U, van Wezel R, van Gerven M. End-to-end optimization of prosthetic vision. Journal of Vision. 2022;22:20. doi: 10.1167/jov.22.2.20.

[bib16] Devries SH, Baylor DA. Mosaic arrangement of ganglion cell receptive fields in rabbit retina. Journal of Neurophysiology. 1997;78:2048–2060. doi: 10.1152/jn.1997.78.4.2048.

[bib17] Downey JE, Schwed N, Chase SM, Schwartz AB, Collinger JL. Intracortical recording stability in human brain-computer interface users. Journal of Neural Engineering. 2018;15:046016. doi: 10.1088/1741-2552/aab7a0.

[bib18] Fan VH, Grosberg LE, Madugula SS, Hottowy P, Dabrowski W, Sher A, Litke AM, Chichilnisky EJ. Epiretinal stimulation with local returns enhances selectivity at cellular resolution. Journal of Neural Engineering. 2019;16:025001. doi: 10.1088/1741-2552/aaeef1.

[bib19] Field GD, Chichilnisky EJ. Information processing in the primate retina: circuitry and coding. Annual Review of Neuroscience. 2007;30:1–30. doi: 10.1146/annurev.neuro.30.051606.094252.

[bib20] Flesher SN, Collinger JL, Foldes ST, Weiss JM, Downey JE, Tyler-Kabara EC, Bensmaia SJ, Schwartz AB, Boninger ML, Gaunt RA. Intracortical microstimulation of human somatosensory cortex. Science Translational Medicine. 2016;8:361ra141. doi: 10.1126/scitranslmed.aaf8083.

[bib21] Frechette ES, Sher A, Grivich MI, Petrusca D, Litke AM, Chichilnisky EJ. Fidelity of the ensemble code for visual motion in primate retina. Journal of Neurophysiology. 2005;94:119–135. doi: 10.1152/jn.01175.2004.

[bib22] Gaylor JM, Raman G, Chung M, Lee J, Rao M, Lau J, Poe DS. Cochlear implantation in adults: a systematic review and meta-analysis. JAMA Otolaryngology-- Head & Neck Surgery. 2013;139:265–272. doi: 10.1001/jamaoto.2013.1744.

[bib23] Gogliettino AR, Madugula SS, Grosberg LE, Vilkhu RS, Brown J, Nguyen H, Kling A, Hottowy P, Dąbrowski W, Sher A, Litke AM, Chichilnisky EJ. High-fidelity reproduction of visual signals by electrical stimulation in the central primate retina. The Journal of Neuroscience. 2023;43:4625–4641. doi: 10.1523/JNEUROSCI.1091-22.2023.

[bib24] Goo YS, Park DJ, Ahn JR, Senok SS. Spontaneous oscillatory rhythms in the degenerating mouse retina modulate retinal ganglion cell responses to electrical stimulation. Frontiers in Cellular Neuroscience. 2015;9:512. doi: 10.3389/fncel.2015.00512.

[bib25] Granley J, Relic L, Beyeler M. A Hybrid Neural Autoencoder for Sensory Neuroprostheses and Its Applications in Bionic Vision. arXiv. 2022 http://arxiv.org/abs/2205.13623

[bib26] Grosberg LE, Ganesan K, Goetz GA, Madugula SS, Bhaskhar N, Fan V, Li P, Hottowy P, Dabrowski W, Sher A, Litke AM, Mitra S, Chichilnisky EJ. Activation of ganglion cells and axon bundles using epiretinal electrical stimulation. Journal of Neurophysiology. 2017;118:1457–1471. doi: 10.1152/jn.00750.2016.

[bib27] Haji Ghaffari D, Akwaboah AD, Mirzakhalili E, Weiland JD. Real-time optimization of retinal ganglion cell spatial activity in response to epiretinal stimulation. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2021;29:2733–2741. doi: 10.1109/TNSRE.2021.3138297.

[bib28] Ho E, Boffa J, Palanker D. Performance of complex visual tasks using simulated prosthetic vision via augmented-reality glasses. Journal of Vision. 2019;19:22. doi: 10.1167/19.13.22.

[bib29] Hottowy P, Dąbrowski W, Skoczeń A, Wiącek P. An integrated multichannel waveform generator for large-scale spatio-temporal stimulation of neural tissue. Analog Integrated Circuits and Signal Processing. 2008;55:239–248. doi: 10.1007/s10470-007-9125-x.

[bib30] Hottowy P, Skoczeń A, Gunning DE, Kachiguine S, Mathieson K, Sher A, Wiącek P, Litke AM, Dąbrowski W. Properties and application of a multichannel integrated circuit for low-artifact, patterned electrical stimulation of neural tissue. Journal of Neural Engineering. 2012;9:066005. doi: 10.1088/1741-2560/9/6/066005.

[bib31] Humayun MS, Dorn JD, da Cruz L, Dagnelie G, Sahel JA, Stanga PE, Cideciyan AV, Duncan JL, Eliott D, Filley E, Ho AC, Santos A, Safran AB, Arditi A, Del Priore LV, Greenberg RJ, Argus II Study Group. Interim results from the international trial of Second Sight's visual prosthesis. Ophthalmology. 2012;119:779–788. doi: 10.1016/j.ophtha.2011.09.028.

[bib32] Jepson LH, Hottowy P, Mathieson K, Gunning DE, Dabrowski W, Litke AM, Chichilnisky EJ. Focal electrical stimulation of major ganglion cell types in the primate retina for the design of visual prostheses. The Journal of Neuroscience. 2013;33:7194–7205. doi: 10.1523/JNEUROSCI.4967-12.2013.

[bib33] Jepson LH, Hottowy P, Mathieson K, Gunning DE, Dąbrowski W, Litke AM, Chichilnisky EJ. Spatially patterned electrical stimulation to enhance resolution of retinal prostheses. The Journal of Neuroscience. 2014;34:4871–4881. doi: 10.1523/JNEUROSCI.2882-13.2014.

[bib34] Johnson LA, Wander JD, Sarma D, Su DK, Fetz EE, Ojemann JG. Direct electrical stimulation of the somatosensory cortex in humans using electrocorticography electrodes: a qualitative and quantitative report. Journal of Neural Engineering. 2013;10:036021. doi: 10.1088/1741-2560/10/3/036021.

[bib35] Keat J, Reinagel P, Reid RC, Meister M. Predicting every spike: a model for the responses of visual neurons. Neuron. 2001;30:803–817. doi: 10.1016/s0896-6273(01)00322-1.

[bib36] Kim YJ, Brackbill N, Batty E, Lee J, Mitelut C, Tong W, Chichilnisky EJ, Paninski L. Nonlinear decoding of natural images from large-scale primate retinal ganglion recordings. Neural Computation. 2021;33:1719–1750. doi: 10.1162/neco_a_01395.

[bib37] Kling A, Gogliettino AR, Shah NP, Wu EG, Brackbill N, Sher A, Litke AM, Silva RA, Chichilnisky EJ. Functional organization of midget and parasol ganglion cells in the human retina. bioRxiv. 2020 doi: 10.1101/2020.08.07.240762.

[bib38] Li PH, Gauthier JL, Schiff M, Sher A, Ahn D, Field GD, Greschner M, Callaway EM, Litke AM, Chichilnisky EJ. Anatomical identification of extracellularly recorded cells in large-scale multielectrode recordings. The Journal of Neuroscience. 2015;35:4663–4675. doi: 10.1523/JNEUROSCI.3675-14.2015.

[bib39] Lieby P, Barnes N, McCarthy C, Dennett H, Walker JG, Botea V, Scott AF. Substituting Depth for Intensity and Real-Time Phosphene Rendering: Visual Navigation under Low Vision Conditions. Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2011. pp. 8017–8020.

[bib40] Litke AM, Bezayiff N, Chichilnisky EJ, Cunningham W, Dabrowski W, Grillo AA, Grivich M, Grybos P, Hottowy P, Kachiguine S, Kalmar RS, Mathieson K, Petrusca D, Rahman M, Sher A. What does the eye tell the brain?: Development of a system for the large-scale recording of retinal output activity. IEEE Transactions on Nuclear Science. 2004;51:1434–1440. doi: 10.1109/TNS.2004.832706.

[bib41] Lotlikar A, Shah NP, Gogliettino AR, Vilkhu R, Madugula S, Grosberg L, Hottowy P, Sher A, Litke A, Chichilnisky EJ, Mitra S. Partitioned Temporal Dithering for Efficient Epiretinal Electrical Stimulation. 11th International IEEE/EMBS Conference on Neural Engineering; 2023. pp. 1–5.

[bib42] Loudin JD, Simanovskii DM, Vijayraghavan K, Sramek CK, Butterwick AF, Huie P, McLean GY, Palanker DV. Optoelectronic retinal prosthesis: system design and performance. Journal of Neural Engineering. 2007;4:S72–S84. doi: 10.1088/1741-2560/4/1/S09.

[bib43] Madugula SS, Gogliettino AR, Zaidi M, Aggarwal G, Kling A, Shah NP, Brown JB, Vilkhu R, Hays MR, Nguyen H, Fan V, Wu EG, Hottowy P, Sher A, Litke AM, Silva RA, Chichilnisky EJ. Focal electrical stimulation of human retinal ganglion cells for vision restoration. Journal of Neural Engineering. 2022;19 doi: 10.1088/1741-2552/aca5b5.

[bib44] McCarthy C, Barnes N, Lieby P. 33rd Annual International Conference. 33rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2011. pp. 4457–4460.

[bib45] Mena GE, Grosberg LE, Madugula S, Hottowy P, Litke A, Cunningham J, Chichilnisky EJ, Paninski L. Electrical stimulus artifact cancellation and neural spike detection on large multi-electrode arrays. PLOS Computational Biology. 2017;13:e1005842. doi: 10.1371/journal.pcbi.1005842.

[bib46] Merabet LB, Pascual-Leone A. Neural reorganization following sensory loss: the opportunity of change. Nature Reviews. Neuroscience. 2010;11:44–52. doi: 10.1038/nrn2758.

[bib47] Palanker D, Le Mer Y, Mohand-Said S, Muqit M, Sahel JA. Photovoltaic restoration of central vision in atrophic age-related macular degeneration. Ophthalmology. 2020;127:1097–1104. doi: 10.1016/j.ophtha.2020.02.024.

[bib48] Parthasarathy N, Batty E, Falcon W, Rutten T, Rajpal M, Chichilnisky EJ, Paninski L. Neural networks for efficient Bayesian decoding of natural images from retinal neurons. bioRxiv. 2017 doi: 10.1101/153759.

[bib49] Perge JA, Homer ML, Malik WQ, Cash S, Eskandar E, Friehs G, Donoghue JP, Hochberg LR. Intra-day signal instabilities affect decoding performance in an intracortical neural interface system. Journal of Neural Engineering. 2013;10:036004. doi: 10.1088/1741-2560/10/3/036004.

[bib50] Ravi S, Ahn D, Greschner M, Chichilnisky EJ, Field GD. Pathway-specific asymmetries between on and off visual signals. The Journal of Neuroscience. 2018;38:9728–9740. doi: 10.1523/JNEUROSCI.2008-18.2018.

[bib51] Reich DS, Victor JD, Knight BW, Ozaki T, Kaplan E. Response variability and timing precision of neuronal spike trains in vivo. Journal of Neurophysiology. 1997;77:2836–2841. doi: 10.1152/jn.1997.77.5.2836.

[bib52] Relic L, Zhang B, Tuan YL, Beyeler M. Deep Learning–Based Perceptual Stimulus Encoder for Bionic Vision. arXiv. 2022 https://arxiv.org/abs/2203.05604

[bib53] Rhoades CE, Shah NP, Manookin MB, Brackbill N, Kling A, Goetz G, Sher A, Litke AM, Chichilnisky EJ. Unusual physiological properties of smooth monostratified ganglion cell types in primate retina. Neuron. 2019;103:658–672. doi: 10.1016/j.neuron.2019.05.036.

[bib54] Richard E, Goetz GA, Chichilnisky EJ. In Advances in Neural Information Processing Systems 28. Curran Associates, Inc; 2015. Recognizing Retinal Ganglion Cells in the Dark; pp. 2476–2484.

[bib55] Rodieck RW. The First Steps in Seeing. Sinauer Associates Incorporated; 1998.

[bib56] Rouger J, Lagleyre S, Fraysse B, Deneve S, Deguine O, Barone P. Evidence that cochlear-implanted deaf patients are better multisensory integrators. PNAS. 2007;104:7295–7300. doi: 10.1073/pnas.0609419104.

[bib57] Salas A, Michelle LB, Kellis S, Jafari M, Jo H, Kramer D, Shanfield K. Proprioceptive and cutaneous sensations in humans elicited by intracortical microstimulation. eLife. 2018;7:e32904. doi: 10.7554/eLife.32904.

[bib58] Samaha J, Postle BR. The speed of alpha-band oscillations predicts the temporal resolution of visual perception. Current Biology. 2015;25:2985–2990. doi: 10.1016/j.cub.2015.10.007.

[bib59] Sekirnjak C, Jepson LH, Hottowy P, Sher A, Dabrowski W, Litke AM, Chichilnisky EJ. Changes in physiological properties of rat ganglion cells during retinal degeneration. Journal of Neurophysiology. 2011;105:2560–2571. doi: 10.1152/jn.01061.2010.

[bib60] Soto F, Hsiang JC, Rajagopal R, Piggott K, Harocopos GJ, Couch SM, Custer P, Morgan JL, Kerschensteiner D. Efficient coding by midget and parasol ganglion cells in the human retina. Neuron. 2020;107:656–666. doi: 10.1016/j.neuron.2020.05.030.

[bib61] Stingl K, Bartz-Schmidt KU, Besch D, Braun A, Bruckmann A, Gekeler F, Greppmaier U, Hipp S, Hörtdörfer G, Kernstock C, Koitschev A, Kusnyerik A, Sachs H, Schatz A, Stingl KT, Peters T, Wilhelm B, Zrenner E. Artificial vision with wirelessly powered subretinal electronic implant alpha-IMS. Proceedings. Biological Sciences. 2013;280:20130077. doi: 10.1098/rspb.2013.0077. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib62] Tadin D, Lappin JS, Blake R, Glasser DM. High temporal precision for perceiving event offsets. Vision Research. 2010;50:1966–1971. doi: 10.1016/j.visres.2010.07.005. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib63] Tafazoli S, MacDowell CJ, Che Z, Letai KC, Steinhardt CR, Buschman TJ. Learning to control the brain through adaptive closed-loop patterned stimulation. Journal of Neural Engineering. 2020;17:056007. doi: 10.1088/1741-2552/abb860. [DOI] [PubMed] [Google Scholar]

[bib64] Talaminos-Barroso A, Reina-Tosina J, Roa-Romero LM. In Control Applications for Biomedical Engineering Systems. Elsevier; 2020. Models Based on Cellular Automata for the Analysis of Biomedical Systems; pp. 405–445. [DOI] [Google Scholar]

[bib65] Tandon P, Bhaskhar N, Shah N, Madugula S, Grosberg L, Fan VH, Hottowy P, Sher A, Litke AM, Chichilnisky EJ, Mitra S. Automatic identification of axon bundle activation for epiretinal prosthesis. IEEE Transactions on Neural Systems and Rehabilitation Engineering. 2021;29:2496–2502. doi: 10.1109/TNSRE.2021.3128486. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib66] Trenholm S, Awatramani GB. Origins of spontaneous activity in the degenerating retina. Frontiers in Cellular Neuroscience. 2015;9:277. doi: 10.3389/fncel.2015.00277. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib67] Uzzell VJ, Chichilnisky EJ. Precision of spike trains in primate retinal ganglion cells. Journal of Neurophysiology. 2004;92:780–789. doi: 10.1152/jn.01171.2003. [DOI] [PubMed] [Google Scholar]

[bib68] Vasireddy PK, Gogliettino AR, Brown JB, Vilkhu RS, Madugula SS, Phillips AJ, Mitral S, Hottowy P, Sher A, Litke A, Shah NP, Chichilnisky EJ. Efficient Modeling and Calibration of Multi-Electrode Stimuli for Epiretinal Implants. 2023 11th International IEEE/EMBS Conference on Neural Engineering (NER; Baltimore, MD, USA. 2023. [DOI] [Google Scholar]

[bib69] Vergnieux V, Macé MJM, Jouffrais C. Simplification of visual rendering in simulated prosthetic vision facilitates navigation. Artificial Organs. 2017;41:852–861. doi: 10.1111/aor.12868. [DOI] [PubMed] [Google Scholar]

[bib70] Vilkhu RS, Madugula SS, Grosberg LE, Gogliettino AR, Hottowy P, Dabrowski W, Sher A, Litke AM, Mitra S, Chichilnisky EJ. Spatially patterned bi-electrode epiretinal stimulation for axon avoidance at cellular resolution. Journal of Neural Engineering. 2021;18. doi: 10.1088/1741-2552/ac3450.

[bib71] Wandell BA. Foundations of Vision. Sinauer Associates, Incorporated; 1995.

[bib72] Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing. 2004;13:600–612. doi: 10.1109/tip.2003.819861.

[bib73] Warland DK, Reinagel P, Meister M. Decoding visual information from a population of retinal ganglion cells. Journal of Neurophysiology. 1997;78:2336–2350. doi: 10.1152/jn.1997.78.5.2336.

[bib74] Wässle H, Peichl L, Boycott BB. Dendritic territories of cat retinal ganglion cells. Nature. 1981;292:344–345. doi: 10.1038/292344a0.

[bib75] Wu EG, Brackbill N, Sher A, Litke AM, Simoncelli EP, Chichilnisky EJ. Maximum a posteriori natural scene reconstruction from retinal ganglion cells with deep denoiser priors. Neuroscience. 2022;01:e2737. doi: 10.1101/2022.05.19.492737.

[bib76] Wu EG, Brackbill N, Rhoades C, Kling A, Gogliettino AR, Shah NP, Sher A, Litke AM, Simoncelli EP, Chichilnisky EJ. Fixational Eye Movements Enhance the Precision of Visual Information Transmitted by the Primate Retina. bioRxiv. 2024. doi: 10.1101/2023.08.12.552902.

[bib77] Wutz A, Muschter E, van Koningsbruggen MG, Weisz N, Melcher D. Temporal Integration Windows in Neural Processing and Perception Aligned to Saccadic Eye Movements. Current Biology. 2016;26:1659–1668. doi: 10.1016/j.cub.2016.04.070.

[bib78] Yarbus AL. Eye Movements and Vision. Springer; 1967.

[bib79] Zaidi M, Aggarwal G, Shah NP, Karniol-Tambour O, Goetz G, Madugula S, Gogliettino AR, Wu EG, Kling A, Brackbill N, Sher A, Litke AM, Chichilnisky EJ. Inferring Light Responses of Primate Retinal Ganglion Cells Using Intrinsic Electrical Signatures. Journal of Neural Engineering. 2022;20:e3858. doi: 10.1101/2022.05.29.493858.
