Published in final edited form as: Adv Neural Inf Process Syst. 2022 Dec;35:22671–22685.

Hybrid Neural Autoencoders for Stimulus Encoding in Visual and Other Sensory Neuroprostheses

Jacob Granley 1, Lucas Relic 2, Michael Beyeler 3

Abstract

Sensory neuroprostheses are emerging as a promising technology to restore lost sensory function or augment human capabilities. However, sensations elicited by current devices often appear artificial and distorted. Although current models can predict the neural or perceptual response to an electrical stimulus, an optimal stimulation strategy solves the inverse problem: what is the required stimulus to produce a desired response? Here, we frame this as an end-to-end optimization problem, where a deep neural network stimulus encoder is trained to invert a known and fixed forward model that approximates the underlying biological system. As a proof of concept, we demonstrate the effectiveness of this hybrid neural autoencoder (HNA) in visual neuroprostheses. We find that HNA produces high-fidelity patient-specific stimuli representing handwritten digits and segmented images of everyday objects, and significantly outperforms conventional encoding strategies across all simulated patients. Overall this is an important step towards the long-standing challenge of restoring high-quality vision to people living with incurable blindness and may prove a promising solution for a variety of neuroprosthetic technologies.

1. Introduction

Sensory neuroprostheses are emerging as a promising technology to restore lost sensory function or augment human capacities [1, 2]. In such devices, diminished sensory modalities (e.g., hearing [3], vision [4, 5], cutaneous touch [6]) are re-enacted through streams of artificial input to the nervous system. For example, visual neuroprostheses electrically stimulate neurons in the early visual system to elicit neuronal responses that the brain interprets as visual percepts. Such devices have the potential to restore a rudimentary form of vision to millions of people living with incurable blindness.

However, all of these technologies face the challenge of interfacing with a highly nonlinear system of biological neurons whose role in perception is not fully understood. Due to the limited resolution of electrical stimulation, prostheses often create neural response patterns foreign to the brain. Consequently, sensations elicited by current sensory neuroprostheses often appear artificial and distorted [7, 8]. A major outstanding challenge is thus to identify a stimulus encoding that leads to perceptually intelligible sensations. Often there exists a forward model, f (Fig. 1A), constrained by empirical data, that can predict a neuronal or (ideally) perceptual response to the applied stimulus (see [9] for a recent review). To find the stimulus that will elicit a desired response, one essentially needs to find the inverse of the forward model, f⁻¹. However, realistic forward models are rarely analytically invertible, making this a challenging open problem for neuroprostheses.

Figure 1:

A) Sensory neuroprosthesis. A forward model (f) is used to approximate the neuronal or, ideally, perceptual response to electrical stimuli. B) Hybrid neural autoencoder (HNA). A deep neural encoder (f⁻¹) is trained to predict the patterns of electrical stimulation that elicit responses closest to the target. C) Visual neuroprostheses are one prominent application of HNA, where an encoder can be trained to predict the electrical stimulation needed to elicit a desired visual percept. D) The trained encoder is deployed on a vision processing unit (VPU), predicting stimuli in real-time that are decoded by the patient’s visual cortex.

Here we propose to approach this as an end-to-end optimization problem, where a deep neural network (DNN) (encoder) is trained to invert a known, fixed forward model (decoder, Fig. 1B). The encoder is trained to predict the patterns of electrical stimulation that elicit perception (e.g., vision, audition) or neural responses (e.g., firing rates) closest to the target. This hybrid neural autoencoder (HNA) could in theory be used to optimize stimuli for any open-loop neuroprosthesis with a known forward model that approximates the underlying biological system.

In order to optimize end-to-end, the forward model must be differentiable and computationally efficient. When this is not the case, an alternative approach is to train a surrogate neural network, f̂, to approximate the forward model [10-13]. However, even well-trained surrogates may introduce errors when included in our end-to-end framework, because the encoder can learn to exploit the surrogate model. We therefore also evaluate whether a surrogate approach to HNA is suitable.

To this end, we make the following contributions:

  • We propose a hybrid neural autoencoder (HNA) consisting of a deep neural encoder trained to invert a fixed, numerical or symbolic forward model (decoder) to optimize stimulus selection. Our framework is general and addresses an important challenge with sensory neuroprostheses.

  • As a proof of concept, we demonstrate the utility of HNA for visual neuroprostheses, where we predict electrode activation patterns required to produce a desired visual percept. We show that the HNA is able to produce high-fidelity, patient-specific stimuli representing handwritten digits and segmented images of everyday objects, drastically outperforming conventional approaches.

  • We evaluate replacing a computationally expensive or nondifferentiable forward model with a surrogate, highlighting benefits and potential dangers of popular surrogate techniques.

2. Background

Sensory Neuroprostheses

Recent advances in neural understanding, wearable electronics, and biocompatible materials have accelerated the development of sensory neuroprostheses to restore perceptual function to people with impaired sensation. Use of neuroprostheses typically requires invasive implants that elicit neural responses via electrical, magnetic, or optogenetic stimulation. Two of the most promising applications are cochlear implants, which stimulate the auditory nerve to elicit sounds [3], and visual implants (see next subsection) to restore vision to the blind. However, a variety of other devices are in development; for instance, to restore touch [6, 14] or motor function [15]. The latter differ from other sensory neuroprostheses in that they generate stimuli using motor feedback (closed loop) [16, 17]. In the absence of feedback (open loop), a proper stimulus encoding is paramount to the success of these devices.

Restoring Vision to the Blind

For millions of people living with incurable blindness, a visual prosthesis (bionic eye, Fig. 2, left) may be the only treatment option [18]. Analogous to cochlear implants, these devices electrically stimulate surviving cells in the visual pathway to evoke visual percepts (phosphenes), which can support simple behavioral tasks [5, 19, 20].

Figure 2:

Left: Visual prosthesis. Incoming target images are transmitted from a camera to an implant in the retina, which encodes the image as an electrical stimulus pattern. Center: Electrical stimulation (red disc) of a nerve fiber bundle (gray lines) leads to elongated tissue activation (gray shaded region) and phosphenes whose shape can be described by two parameters, λ (axonal spread) and ρ (radial spread). Right: Predicted percepts for an MNIST digit using varying ρ and λ values.

A common misconception is that each electrode in the array can be thought of as a pixel in an image; to generate a complex visual experience, one then simply needs to turn on the right combination of pixels [21]. However, recent evidence suggests that phosphenes often appear distorted (e.g., as lines, wedges, and blobs) and vary drastically across subjects and electrodes [4, 7].

Phosphene appearance has been best characterized in epiretinal implants, where inadvertent activation of nerve fiber bundles (NFBs) in the optic fiber layer of the retina leads to elongated phosphenes [22, 23] (Fig. 2, center). Building on this, Granley et al. [24] developed a forward model to predict phosphene shape as a function of both neuroanatomical parameters (i.e., location of the stimulating electrode) and stimulus parameters (i.e., pulse frequency, amplitude, and duration). With this model, phosphenes are primarily characterized by two main parameters, ρ and λ, which dictate the size and elongation of the elicited phosphene, respectively (Fig. 2, right). These parameters can be determined using psychophysical tasks (e.g., drawings, brightness ratings) [22, 24], and although they vary drastically across patients [22], they do not change much over time [25, 26]. Stimulation from multiple electrodes is nonlinearly integrated into a combined percept, and if two electrodes happen to activate the same NFB, they might not generate two distinct phosphenes.

3. Related Work

The conventional ‘naive’ encoding strategy sets the amplitude of each electrode to the brightness of the corresponding pixel in the target image [5, 27], making the stimulus a down-sampled version of the target. Although simple, this strategy only works with near-linear forward models, cannot account for real phosphene data, and often leads to unrecognizable percepts (Fig. 2, right) [7, 22].
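For concreteness, a minimal sketch of this baseline is shown below. The choice of image library, the 15 × 15 grid size, and the maximum amplitude value are assumptions for illustration, not details taken from the cited implementations.

```python
import numpy as np
import cv2  # image library; an arbitrary choice for this sketch


def naive_encode(target, grid=(15, 15), max_amp=2.0):
    """Naive baseline: each electrode's amplitude is set to the brightness of
    the corresponding pixel of the down-sampled target (max_amp is assumed)."""
    small = cv2.resize(target.astype(np.float32), grid,
                       interpolation=cv2.INTER_AREA)
    # Normalize brightness to [0, 1] and scale to the allowed amplitude range.
    return max_amp * small / max(float(small.max()), 1e-8)  # (15, 15) amplitudes
```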

Many alternative stimulation strategies have been proposed [28]. Shah et al. [29] used a greedy approach to iteratively select the stimuli that best recreate a desired neural activity pattern over a given temporal window, assuming that the brain would integrate them into a coherent visual percept. Ghaffari et al. [30] used a neural network surrogate model combined with an interior point algorithm to optimize for localized, circular neural activation patterns for individual electrodes. Fauvel et al. [31] used human-in-the-loop Bayesian optimization to achieve encodings perceptually favored by the simulated patient. Spencer et al. [32] proposed framing stimulus encoding as inversion of a forward model of neural activation patterns, but to approximate the inverse, their approach either requires simplification or is NP-hard [32].

Van Steveninck et al. [33] proposed an end-to-end optimization strategy with a fixed phosphene model, similar to HNA. However, their approach crucially differs from ours in its inclusion of a secondary DNN to post-process the predicted phosphenes. This is a critical limitation, because a low reconstruction loss does not reveal whether a high-fidelity encoder was learned or whether the secondary decoder network simply learned to correct for the encoder’s mistakes. In addition, they used an unrealistic phosphene model that simply upscales and smooths a binary stimulus pattern. It is therefore not clear whether their results would generalize to real visual prosthesis patients.

Relic et al. [10] also utilized the end-to-end approach, but without the secondary decoder network used in [33]. They used a more realistic phosphene model, which accounts for some spatial distortions (e.g., axonal streaks), but not for the effects of stimulus parameters. Since including a realistic phosphene model in the loop is not straightforward, they instead trained a surrogate neural network to approximate the forward model. We re-implemented this surrogate approach as a baseline method to compare against, as described in Section 4.

Taken together, we identified three main limitations of previous work that this study aims to address:

  1. Unrealistic forward models. Most previous approaches (e.g., [29, 32, 33]) use an overly simplified forward model that cannot account for empirical data [7, 22]. We overcome this limitation by developing (and inverting) a differentiable version of a neurophysiologically informed and psychophysically validated phosphene model [24] that can account for the effects of stimulus amplitude, frequency, and pulse duration on phosphene appearance.

  2. Optimization of neural responses. Most previous approaches (e.g., [29, 32]) focus on optimizing neural activation patterns in the retina in response to electrical stimulation (“bottom-up”). However, the visual system undergoes extensive remodeling during blinding diseases such as retinitis pigmentosa [34]. Thus the link between neural activity and visual perception is unclear. We overcome this limitation by inverting a phenomenological (“top-down”) model constrained by behavioral data that predicts visual perception directly from electrical stimuli [22, 24].

  3. Objective function. Most previous approaches rely on minimizing the mean squared error (MSE) between the target and reconstructed image. Although simple and efficient, MSE is known to be a poor measure of perceptual dissimilarity for images [35] and does not align well with human assessments of image quality [36]. We overcome this limitation by proposing a joint perceptual metric that combines mean absolute error (MAE), VGG, and Laplacian smoothing losses.

4. Methods

Problem Formulation

We consider a system where there is some known forward process f mapping stimuli to responses, f : 𝒮 → ℛ, s ↦ f(s). In the case of visual prostheses, f may map stimuli to visual percepts. f may optionally be parameterized by patient-specific parameters ϕ.

Finding the best stimulus for an arbitrary target response t is equivalent to finding the inverse of f. However, since not every response can be perfectly reproduced by a stimulus, the true inverse of f is not well defined. We therefore seek the pseudoinverse (still denoted as f⁻¹ for simplicity) instead, which outputs the stimulus that produces the closest response to t under some distance metric d:

f^{-1}(t, \phi) \equiv \arg\min_{s \in \mathcal{S}} d\big(f(s; \phi), t\big). \qquad (1)

This problem is straightforward to solve using an autoencoder approach, where a learned encoder f⁻¹ is trained to invert the fixed decoder f (i.e., the forward model).

Encoder

We approximate the pseudoinverse f⁻¹ with a DNN encoder f̂⁻¹(t, ϕ; θ) with weights θ, trained to minimize the distance d between the target image t and the predicted image t̂:

\min_{\theta} \; \mathbb{E}_{\phi \sim p(\phi)} \left[ d(t, \hat{t}\,) \right] \qquad (2)
\hat{t} = f\big(\hat{f}^{-1}(t, \phi; \theta); \phi\big), \qquad (3)

where ϕ is sampled from a uniform random distribution spanning the empirically observed range of patient-specific parameters [22, 24].

We use a simple architecture consisting solely of fully connected (FC) and batch normalization (BN) [37] layers (1.4M trainable parameters). First, the target t is flattened and input to a FC layer. In parallel, the patient parameters ϕ are input to a BN layer and two hidden FC layers. Then, the outputs of these two paths are concatenated, and the combined vector is fed through two FC layers, producing an intermediate representation x. Amplitudes are predicted from x with a FC layer. The amplitudes are then concatenated with x, put through a BN layer, and used to predict frequency and pulse duration, each with a FC layer. The outputs are merged into a stimulus matrix ŝ. All layers use leaky ReLU activation, except for the stimulus outputs, which use ReLU to enforce nonnegativity.
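The following is a minimal Keras sketch of this topology. Layer widths, the sizes of the patient-parameter path, and the resulting parameter count are illustrative assumptions; only the overall structure follows the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_encoder(n_electrodes=225, img_shape=(49, 49), n_phi=12, width=512):
    """Sketch of the fully connected encoder described above (widths assumed)."""
    lrelu = tf.nn.leaky_relu

    target = layers.Input(img_shape, name="target")
    phi = layers.Input((n_phi,), name="patient_params")

    # Target path: flatten, then a single FC layer.
    x_t = layers.Dense(width, activation=lrelu)(layers.Flatten()(target))

    # Patient-parameter path: BN followed by two hidden FC layers.
    x_p = layers.BatchNormalization()(phi)
    x_p = layers.Dense(64, activation=lrelu)(x_p)
    x_p = layers.Dense(64, activation=lrelu)(x_p)

    # Concatenate both paths and form the intermediate representation x.
    x = layers.Concatenate()([x_t, x_p])
    x = layers.Dense(width, activation=lrelu)(x)
    x = layers.Dense(width, activation=lrelu)(x)

    # Amplitudes first; frequency and pulse duration are conditioned on them.
    amp = layers.Dense(n_electrodes, activation="relu", name="amplitude")(x)
    y = layers.BatchNormalization()(layers.Concatenate()([x, amp]))
    freq = layers.Dense(n_electrodes, activation="relu", name="frequency")(y)
    pdur = layers.Dense(n_electrodes, activation="relu", name="pulse_duration")(y)

    # Merge into an (n_electrodes x 3) stimulus matrix.
    stim = layers.Lambda(lambda z: tf.stack(z, axis=-1))([amp, freq, pdur])
    return tf.keras.Model([target, phi], stim, name="hna_encoder")
```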

Decoder

The HNA decoder is a differentiable approximation of the underlying biological system, and describes the transform from stimulus to response. For our decoder f, we use a reformulated but equivalent version of the model described in [24]. This model is specific to epiretinal prostheses; analogous models exist for other neuroprostheses (e.g., auditory [38-43], tactile and somatosensory [44-48]), and could potentially be adapted for use with HNA. We use a square 15 × 15 array of 150μm electrodes, spaced 400μm apart and centered on the fovea. The size and scale of this device are motivated by similar designs in real epiretinal implants.

f takes as input a nonnegative stimulus matrix s ∈ ℝ^{nₑ×3}, where the stimulus on each electrode (sₑ) is a biphasic pulse train described by its frequency, amplitude, and pulse duration. A simulated map of retinal NFBs is used to predict phosphene shape. Following [22], each retinal ganglion cell’s activation is assumed to be the maximum stimulation intensity along its axon. Axon sensitivity is assumed to decay exponentially with i) the distance to the stimulating electrode (radial decay rate ρ) and ii) the distance to the soma along the curved axon (axonal decay rate λ). To account for stimulus properties [24], ρ, λ, and the per-electrode brightness are scaled by three simple equations Fsize(sₑ, ϕ), Fstreak(sₑ, ϕ), and Fbright(sₑ, ϕ), respectively.

The brightness of a pixel located at the point x ∈ ℝ² in the output image is given by

f(s; \phi) = \max_{a \in A} \sum_{e \in E} F_{\mathrm{bright}}(s_e, \phi)\, \exp\!\left( -\frac{\lVert x_e - a \rVert_2^2}{2\rho^2 F_{\mathrm{size}}(s_e, \phi)} - \frac{d_s(x, a)^2}{2\lambda^2 F_{\mathrm{streak}}(s_e, \phi)} \right) \qquad (4)

where A is the cell’s axon trajectory, E is the set of electrodes, xₑ is the location of electrode e, ϕ = {ρ, λ, …} is a set of 12 patient-specific parameters, and d_s is the path length along the axon trajectory [49] from a to x:

d_s(x, a) = \int_a^x \sqrt{A(\theta)^2 + \left(\frac{dA(\theta)}{d\theta}\right)^{2}}\; d\theta. \qquad (5)

This model (f) can be fit to individual patients; however, it is computationally slow and not differentiable (see [24] for more details on these equations). We therefore considered two approaches:

  • Differentiable Model: We reformulated Equations 4 and 5 as an equivalent set of parallelized matrix operations, implemented in TensorFlow [50]. Significant effort went into developing a model optimized for XLA engines on GPU, resulting in speedups of up to 5000× over the model as presented in [24] and enabling large-scale gradient descent. To make d_s differentiable, we approximated it numerically using |A| = 500 axon segments per axon. A minimal sketch of this vectorized computation is given after this list.

  • Surrogate Model: We also implemented the surrogate approach from [10] as a baseline method, where f is approximated with another DNN f̂(s; θ_f) with weights θ_f. To achieve this, we generated 50,000 percepts by passing randomly selected stimuli through f and fit a DNN to reproduce these images. As f is highly dependent on the patient-specific parameters ϕ, we generated new data and fit a separate surrogate model for each ϕ in our experimental set. Specific implementation details of the surrogate are presented in Appendix A. Our implementation improves upon [10] by using the more advanced phosphene model described above, which accounts for the effects of stimulus properties and allows for optimization of stimulus frequency in addition to amplitude.
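To illustrate the differentiable-model approach, the sketch below shows one way the core computation of Eq. 4 could be expressed as vectorized TensorFlow operations, assuming that the axon-segment-to-electrode distances and the along-axon path lengths d_s have been precomputed. The tensor layouts, argument names, and scaling callables (f_bright, f_size, f_streak) are all assumptions for illustration; the actual XLA-optimized implementation differs.

```python
import tensorflow as tf


def render_percept(stim, d_elec, d_soma, rho, lam, f_bright, f_size, f_streak):
    """Vectorized sketch of Eq. 4 for a single percept (tensor layouts assumed).

    stim:     (n_e, 3) per-electrode amplitude, frequency, pulse duration
    d_elec:   (n_px, n_ax, n_e) distance from each axon segment to each electrode
    d_soma:   (n_px, n_ax) path length d_s from each axon segment to its pixel/soma
    rho, lam: patient-specific decay constants (from phi)
    f_*:      callables for the brightness/size/streak scaling terms,
              each mapping (n_e, 3) stimuli to an (n_e,) vector
    """
    bright, size, streak = f_bright(stim), f_size(stim), f_streak(stim)  # (n_e,)

    # Exponent of Eq. 4 for every (pixel, axon segment, electrode) triple.
    radial = tf.square(d_elec) / (2.0 * rho ** 2 * size)               # (n_px, n_ax, n_e)
    axonal = tf.square(d_soma)[..., None] / (2.0 * lam ** 2 * streak)  # (n_px, n_ax, n_e)
    intensity = bright * tf.exp(-(radial + axonal))

    # Sum contributions over electrodes, then take the maximum along each axon.
    return tf.reduce_max(tf.reduce_sum(intensity, axis=-1), axis=-1)   # (n_px,)
```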

Metrics

To measure perceptual similarity, we use a joint perceptual objective consisting of a VGG [51] similarity term, a mean absolute error (MAE) term, and a smoothness regularization term. The MAE term is given by L_MAE = (1/|t|) ‖t − t̂‖₁.

The VGG term aims to capture higher-level differences between images [33, 52]. The target image and reconstructed phosphene are input to VGG-19 pretrained on ImageNet [53], and the MSE between the activations on a downstream convolutional layer is computed. Let V_l be a function that extracts the activations of the l-th convolutional layer for a given image. The VGG loss is then defined as L_VGG = (1/|t|) ‖V_l(t) − V_l(t̂)‖₂².

We also include a smoothing regularization term, which imposes a loss on the second spatial derivative of the predicted image. A low second derivative implies that where the predicted image does change, it changes slowly. We found this encouraged smoother, more connected phosphenes. To approximate the second derivative, we convolve the image with a Laplacian filter [54] of size k, denoted by Lap(·, k), and compute the mean. The smoothness loss is given by:

L_{\mathrm{Smooth}} = \frac{1}{|\hat{t}\,|} \sum_i \left( \frac{d^2}{dx^2}\hat{t}\, \right)_{i} = \frac{1}{|\hat{t}\,|} \sum_i \mathrm{Lap}(\hat{t}, k)_i. \qquad (6)

Our final objective is the weighted sum of the three individual losses, given by Eq. 7, where α and β are hyperparameters controlling the relative weighting of the three terms.

d = L_{\mathrm{MAE}} + \alpha L_{\mathrm{Smooth}} + \beta L_{\mathrm{VGG}}. \qquad (7)
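A minimal sketch of this joint objective is shown below. The specific VGG-19 layer name, the construction of the Laplacian kernel, and the use of the filter-response magnitude are assumptions made to keep the sketch self-contained.

```python
import tensorflow as tf

# VGG-19 feature extractor; "block5_conv1" stands in for "the first
# convolutional layer in the last block" and is an assumption.
_vgg = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
_vgg_features = tf.keras.Model(_vgg.input, _vgg.get_layer("block5_conv1").output)


def _laplacian_kernel(k=5):
    """k x k center-surround (Laplacian-like) filter; the paper's exact filter may differ."""
    kern = -tf.ones((k, k), tf.float32)
    kern = tf.tensor_scatter_nd_update(kern, [[k // 2, k // 2]], [float(k * k - 1)])
    return tf.reshape(kern, (k, k, 1, 1))


def joint_loss(t, t_hat, alpha, beta, k=5):
    """Sketch of Eq. 7 for batches of (49, 49, 1) grayscale images in [0, 1]."""
    l_mae = tf.reduce_mean(tf.abs(t - t_hat))  # L_MAE

    # L_Smooth: mean magnitude of the Laplacian response (magnitude is an assumption).
    lap = tf.nn.conv2d(t_hat, _laplacian_kernel(k), strides=1, padding="SAME")
    l_smooth = tf.reduce_mean(tf.abs(lap))

    def features(x):  # VGG expects 3-channel inputs; tile and resize the grayscale images
        x3 = tf.image.grayscale_to_rgb(tf.image.resize(x, (224, 224)))
        return _vgg_features(tf.keras.applications.vgg19.preprocess_input(255.0 * x3))

    l_vgg = tf.reduce_mean(tf.square(features(t) - features(t_hat)))  # L_VGG

    return l_mae + alpha * l_smooth + beta * l_vgg
```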

We also implement a secondary metric to quantify phosphene recognizability, applicable only to the MNIST reconstruction task. We first pre-train a classifier network on the MNIST targets until it reaches 99% test accuracy, and then fix its weights. The relative accuracy (RA) is then defined as the ratio of the classifier’s accuracy on the reconstructed images to its accuracy on the targets, RA = ACC(t̂) / ACC(t). A perfect encoder would result in RA = 100%. A similar process was not possible for the COCO task due to the possibility of having multiple objects in each target image.
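The RA computation itself is straightforward; a sketch, assuming a frozen Keras classifier and integer class labels, might look like:

```python
import tensorflow as tf


def relative_accuracy(classifier, targets, phosphenes, labels):
    """RA = classifier accuracy on reconstructed phosphenes divided by its
    accuracy on the original targets (classifier is the frozen MNIST network)."""
    def acc(images):
        preds = tf.argmax(classifier(images, training=False), axis=-1)
        return tf.reduce_mean(tf.cast(preds == tf.cast(labels, preds.dtype), tf.float32))

    return acc(phosphenes) / acc(targets)
```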

Training/Optimization

We trained using TensorFlow 2.7 [50] on a single NVIDIA RTX 3090 with 24 GB of memory. Stochastic gradient descent with Nesterov momentum was used to minimize the joint perceptual loss. We used a batch size of 16 due to memory limitations imposed by f. The amplitude and frequency predictions are scaled by 2 and 20, respectively, while the pulse duration predictions are shifted by 10⁻³ before being fed through the decoder; this encourages the initial predictions of the network to lie in a reasonable range. The Laplacian filter size k is set to 5. We choose l to be the first convolutional layer in the last block using cross-validation (see Appendix B). Similarly, we perform cross-validation to find the best values for α and β. Instead of using fixed values, we found that incrementally increasing the weight of the VGG loss (β) from 0 while simultaneously decreasing the initially high weight on the smoothing constraint (α) was crucial for performance, especially when the range of allowed ϕ values was large (see Appendix B).
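Putting the pieces together, one training step of the end-to-end optimization could be sketched as follows. The learning rate, momentum, and the fixed α/β values shown here are assumptions (the paper schedules α and β over training), and `encoder`, `forward_model`, `joint_loss`, and `sample_phi` are assumed to be defined as in the sketches above.

```python
import tensorflow as tf

# Learning rate and momentum are assumptions; they are not specified in the text.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
alpha, beta = 1.0, 0.1  # placeholder weights; the paper schedules these over training


@tf.function
def train_step(targets):
    """One end-to-end step: encoder -> fixed forward model f -> joint loss."""
    phi = sample_phi(tf.shape(targets)[0])  # patient params drawn from their empirical range
    with tf.GradientTape() as tape:
        stim = encoder([targets, phi], training=True)
        # Scale raw outputs into a reasonable range before the decoder
        # (amplitude x2, frequency x20, pulse duration shifted by 1e-3).
        amp, freq, pdur = tf.unstack(stim, axis=-1)
        stim = tf.stack([2.0 * amp, 20.0 * freq, pdur + 1e-3], axis=-1)
        percepts = forward_model(stim, phi)  # fixed decoder; gradients flow through it
        loss = joint_loss(targets, percepts, alpha, beta)
    grads = tape.gradient(loss, encoder.trainable_variables)
    optimizer.apply_gradients(zip(grads, encoder.trainable_variables))
    return loss
```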

Datasets

We first evaluated on handwritten digits from MNIST [55], enabling comparison to previous work [10]. Image preprocessing consisted of resizing the target images to the same shape as the output of f (49 × 49). We also evaluate on more realistic images of common objects from the MS-COCO [56] dataset. We selected a subset of 25 MS-COCO object categories deemed more likely to be encountered by blind individuals (e.g., people, household objects), and use only images that contain at least one instance of these objects. We further filter out images by various other criteria, such as being too cluttered or too dim. This process results in a total of approximately 47K training images and 12K test images. See Appendix C for a full description of the selection process.

Natural images often contain too much detail to be encoded with prosthetic vision. While scene simplification strategies exist [57], we focus here on the encoding algorithm, so we simply used the ground-truth segmentation masks to segment out the objects of interest. The images were then converted to grayscale and resized to 49 × 49 pixels.
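A minimal sketch of this target preparation, assuming an OpenCV-style image library, is:

```python
import numpy as np
import cv2  # image library; an arbitrary choice for this sketch


def make_target(image, mask, out_size=(49, 49)):
    """Prepare one COCO target as described above: keep only the segmented
    object(s), convert to grayscale, and resize to the decoder resolution.
    `image` is an RGB uint8 array, `mask` a binary object mask."""
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY).astype(np.float32) / 255.0
    segmented = gray * (mask > 0)  # zero out the background
    return cv2.resize(segmented, out_size, interpolation=cv2.INTER_AREA)
```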

5. Results

5.1. MNIST

The phosphenes produced by the HNA, surrogate, and naive encoders on the MNIST test set are shown in Fig. 3, and performance is summarized in Table 1. For each MNIST sample, the target image is input to the encoder, which predicts a stimulus. The stimulus is fed through the true forward model f, and the predicted phosphene is shown. Since the surrogate method must be retrained for each ϕ, results are only shown for 4 simulated patients. Our proposed approach outperformed the baselines across all metrics (see Appendix D for a comparison of stimuli).

Figure 3:

Reconstructed MNIST targets for the HNA, surrogate, and naive encoders across 4 simulated patients. Note that the brightness of the naive encoder is clipped for display.

Table 1:

MNIST performance

Encoding   | ρ=150, λ=100           | ρ=150, λ=1500          | ρ=800, λ=100           | ρ=800, λ=1500
           | Joint   MAE     RA (%) | Joint   MAE     RA (%) | Joint   MAE     RA (%) | Joint   MAE     RA (%)
Naive      | 1.161   0.1855  90.3   | 1.442   0.214   78.1   | 8.152   1.500   34.8   | 8.780   1.726   28.8
Surrogate  | 2.509   0.1351  53.8   | 3.118   0.2431  30.7   | 1.692   0.2135  19.9   | 1.694   0.2237  18.1
HNA        | 0.559   0.064   98.1   | 1.029   0.1412  89.3   | 0.913   0.113   95.9   | 0.957   0.126   94.8

("Joint" = joint perceptual loss; "RA" = relative accuracy.)

5.2. COCO

The phosphenes produced by HNA and the naive encoder for the segmented COCO dataset are shown in Fig. 4. We omit the surrogate results due to the surrogate’s poor perceptual performance on MNIST. Averaged across all ϕ, HNA had a joint loss of 0.713 on the test set and an MAE of 0.1408, while the naive encoder had a joint loss of 1.873 and an MAE of 0.2830.

Figure 4:

Original (top row), segmented (second row), and reconstructed targets for the COCO dataset, for both HNA (third row) and naive encoders (bottom row). Left to right within each block of 4 images, ρ takes values of 200, 400, 600, 800. Left to right across blocks, λ takes values of 250, 750, 1250, 2000. Note that the brightness of the naive method is clipped for display.

5.3. Modeling Patient-to-Patient Variations

MNIST encoder performance across simulated patients (ϕ) is shown in Fig. 5. Since the surrogate encoder has to be retrained for each patient, comparison is infeasible. To visualize the effects of changing ρ and λ on the produced phosphenes, Fig. 5A shows the result of encoding two example MNIST digits, both using the naive method and our encoder. As λ increases, the naive phosphenes appear increasingly elongated, and as ρ increases, the phosphenes become increasingly large and blurry. The phosphenes from HNA are slightly too dim and disconnected at low ρ, but are relatively stable across other values of ρ and λ.

Figure 5:

Encoder performance across simulated patients (varying ρ and λ) on the MNIST dataset. A: Target, HNA encoder, and naive encoder phosphenes for two example digits. B: Heatmaps showing the log joint loss across ρ and λ for the HNA and naive encoders. C: t-SNE clusterings of the original MNIST targets, HNA reconstructed phosphenes, and naive reconstructed phosphenes.

To compare performance across the entire dataset, we computed the average test set loss across the same range of ρ and λ (Fig. 5B). The encoder performs well across a wide range of simulated patients, with larger loss only at low ρ. The naive method performs well only on a limited set of ϕ, with small λ and ρ ≤ 200. The naive loss was higher than the learned encoder’s at every simulated point. Randomly sampling ρ and λ for each image results in a joint loss of 0.921, an MAE of 0.120, and an RA of 94.0% for HNA, while the naive encoder results in a joint loss of 3.17, an MAE of 0.596, and an RA of 63.6%. The same analysis yielded similar results on COCO (Appendix E). An analysis across other parameters is presented in Appendix F.

In order for prosthetic vision to be useful, different instances of the same objects would ideally produce similar phosphenes, allowing for consistent perception. To evaluate whether our model achieves this, we cluster the target images and resulting phosphenes using t-SNE [58] shown in Fig. 5C. The ground truth images form clusters corresponding to the digits 0-9. The phosphenes from our encoder roughly form similar, slightly less separated groupings, whereas the naive phosphenes do not. To ensure that this was not the result of bad t-SNE hyperparameters, we repeated the clustering across different perplexities and learning rates, obtaining similar or worse results.
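For reference, the embedding step can be sketched as follows; the perplexity and learning-rate defaults are assumptions that were swept as described above.

```python
import numpy as np
from sklearn.manifold import TSNE


def tsne_embed(images, perplexity=30, learning_rate=200.0, seed=0):
    """2-D t-SNE embedding of flattened images (targets or phosphenes).
    Perplexity and learning rate are assumed defaults, repeated over a range
    of values as noted above."""
    X = np.asarray(images, dtype=np.float32).reshape(len(images), -1)
    return TSNE(n_components=2, perplexity=perplexity,
                learning_rate=learning_rate, random_state=seed).fit_transform(X)
```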

5.4. Joint Perceptual Error Ablation Study

To show that the joint perceptual metric performs better than any of its individual components, we train models using just the VGG loss and just the MAE loss. Results are shown for ρ=150 and ρ=600. As mentioned previously, encoders trained using only the VGG loss fail to converge; we therefore pretrain the VGG encoder using the MAE and smoothing losses, then transition to using only the VGG loss. We do not consider ablating the smoothing term (Eq. 6) because it is simply a regularization term. Fig. 6 shows the phosphenes produced by HNA trained on the joint, VGG-only, and MAE-only losses.

Figure 6:

MNIST images for HNA encoders trained using the joint, VGG-only, and MAE-only loss.

The VGG encoder had a test VGG loss 4% lower than the joint model, but its phosphenes are oversmoothed and blurry. The MAE encoder had a final test MAE 9% lower than the joint model, but its phosphenes are disconnected and low-quality. The joint model had an RA of 99.0%, the VGG encoder had an RA of 95.9%, and the MAE encoder had an RA of 77.6%.

6. Discussion

Visual Prostheses

We found that HNA is able to produce high-fidelity stimuli from the MNIST and COCO datasets that outperform conventional encoding strategies across all tested conditions. Importantly, HNA produces phosphenes that are consistent across representations of the same object (Fig. 5C), which is critical to allowing prosthesis users to learn to associate certain visual patterns with specific objects. On the MNIST task, HNA produced high quality reconstructions, nearly matching the targets (Figure 3). On the harder COCO task, HNA significantly outperformed the naive encoder, but was still unable to capture all of the detail in the images. In Appendix G, we demonstrate that this is largely due to the implant’s limited spatial resolution and not a fundamental limitation of HNA.

Another advantage of the HNA is that it can be trained to predict stimuli across a wide range of patient-specific parameter values ϕ, whereas the conventional naive encoder works well only for small values of ρ and λ. This may be one reason why the naive encoding strategy has been shown to lead to substantial individual differences in visual outcomes [19, 59]. Our results suggest that stimuli produced with HNA may be able to reduce at least some amount of this patient-to-patient variability.

Furthermore, HNA also proved superior to a surrogate forward model. Surrogate models offer an alternative when the forward model is computationally expensive or not differentiable. Understandably, any inaccuracies in the surrogate model will propagate to the learned encoder during training. However, we observed that even for well-trained surrogates, the encoder may still learn to exploit the inexact surrogate instead of learning to invert the true model (see Appendix A). It is possible that this exploitation could be mitigated to some extent by adversarially robust training techniques [60]. We suspect that the surrogate method’s inferior performance here compared to [10] can be explained by our larger stimulus search space. Thus, we cannot currently recommend using HNA with surrogate forward models, unless the forward model is sufficiently simple or has a small stimulus space.

Deployment

HNA encoders must be lightweight enough to be deployed in resource-limited neuroprosthetic environments. Our encoder’s single-image inference time was 1.2 ms on GPU and 4 ms on CPU. Future work could reduce these numbers through network pruning, mixed precision, and architecture search. Low-power edge AI accelerators (e.g., Intel’s Neural Compute Stick) and dedicated neuromorphic hardware (e.g., BrainChip’s Akida SoC) may provide another solution.

Broader Impacts

While our work is presented in the context of visual prostheses, the HNA framework may apply to any sensory neuroprosthesis where stimulus selection can be informed by a numeric or symbolic forward model. For example, HNA could be used in cochlear implants [3] to choose stimuli that result in a desired sound, and in spinal cord implants [15] to find the best way to relay neural signals through a damaged section of the spinal cord. Conveniently, the forward models required by HNA have already been developed for a range of applications [38-48]. However, HNA might not apply to all neural interfaces, such as systems without a clear neural or perceptual target (e.g., deep brain stimulation for the treatment of Parkinson’s [61]) or closed-loop systems [16, 62].

Limitations

Despite HNA’s potential, the current implementation has a number of limitations. First, as presented, the HNA encoder applies only to static targets; dynamic targets must therefore be split into individual frames and encoded separately. One future approach might be to encode entire stimulus sequences (instead of individual frames) that are optimized to reconstruct the dynamic target sequence.

Second, HNA works best if there is an accurate forward model mapping from stimulus space to perception. However, Appendix H shows that HNA may still give benefits over a naive encoding even when patient-specific parameters are unknown or mis-specified. In general, if a prosthesis elicits similar results across patients, then a non-patient-specific model would suffice.

Third, the current work deals only with simulated patients. The use of a DNN for stimulus encoding in real patients may raise safety concerns. Since we cannot examine the process by which stimuli are chosen, it is possible that HNA might produce harmful stimuli that could lead to serious adverse events (e.g., seizures). However, this concern is mitigated by the fact that most neuroprostheses are equipped with firmware responsible for ensuring that stimuli stay within FDA-approved safety limits.

7. Conclusion

In summary, this paper proposes a hybrid autoencoder structure as a general framework for stimulus optimization in sensory neuroprostheses and, as a proof of concept, demonstrates its utility on the prominent example of visual neuroprostheses, drastically outperforming conventional encoding strategies. This constitutes an important step towards the long-standing challenge of restoring high-quality vision to people living with incurable blindness and may prove a promising solution for a variety of neuroprosthetic technologies.

Supplementary Material

Appendix

Acknowledgements

This work was supported by the National Institutes of Health (NIH R00 EY-029329 to MB).

Contributor Information

Jacob Granley, Department of Computer Science, University of California, Santa Barbara.

Lucas Relic, Department of Computer Science, University of California, Santa Barbara.

Michael Beyeler, Department of Computer Science, University of California, Santa Barbara; Department of Psychological & Brain Sciences, University of California, Santa Barbara.

References

  • [1].Cinel Caterina, Valeriani Davide, and Poli Riccardo. Neurotechnologies for Human Cognitive Augmentation: Current State of the Art and Future Prospects. Frontiers in Human Neuroscience, 13:13, January 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [2].Fernandez Eduardo. Development of visual Neuroprostheses: trends and challenges. Bioelectronic Medicine, 4(1):12, August 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [3].Wilson Blake S., Finley Charles C., Lawson Dewey T., Wolford Robert D., Eddington Donald K., and Rabinowitz William M.. Better speech recognition with cochlear implants. Nature, 352(6332):236–238, July 1991. Number: 6332 Publisher: Nature Publishing Group. [DOI] [PubMed] [Google Scholar]
  • [4].Fernández Eduardo, Alfaro Arantxa, Soto-Sánchez Cristina, Gonzalez-Lopez Pablo, Lozano Antonio M., Peña Sebastian, Grima Maria Dolores, Rodil Alfonso, Gómez Bernardeta, Chen Xing, Roelfsema Pieter R., Rolston John D., Davis Tyler S., and Normann Richard A.. Visual percepts evoked with an intracortical 96-channel microelectrode array inserted in human occipital cortex. Journal of Clinical Investigation, 131(23):e151331, December 2021. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [5].Luo Yvonne Hsu-Lin and da Cruz Lyndon. The Argus® II Retinal Prosthesis System. Progress in Retinal and Eye Research, 50:89–107, January 2016. [DOI] [PubMed] [Google Scholar]
  • [6].Tan Daniel W., Schiefer Matthew A., Keith Michael W., Anderson James Robert, Tyler Joyce, and Tyler Dustin J.. A neural interface provides long-term stable natural touch perception. Science Translational Medicine, 6(257):257ra138–257ra138, October 2014. Publisher: American Association for the Advancement of Science. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [7].Erickson-Davis Cordelia and Korzybska Helma. What do blind people “see” with retinal prostheses? Observations and qualitative reports of epiretinal implant users. PLOS ONE, 16(2):e0229189, February 2021. Publisher: Public Library of Science. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Murray Craig D.. Embodiment and Prosthetics. In Gallagher Pamela, Desmond Deirdre, and MacLachlan Malcolm, editors, Psychoprosthetics, pages 119–129. Springer, London, 2008. [Google Scholar]
  • [9].Brunton Bingni W. and Beyeler Michael. Data-driven models in human neuroscience and neuroengineering. Current Opinion in Neurobiology, 58:21–29, October 2019. [DOI] [PubMed] [Google Scholar]
  • [10].Relic Lucas, Zhang Bowen, Tuan Yi-Lin, and Beyeler Michael. Deep Learning–Based Perceptual Stimulus Encoder for Bionic Vision. In Augmented Humans 2022, AHs 2022, pages 323–325, New York, NY, USA, March 2022. Association for Computing Machinery. [Google Scholar]
  • [11].de Oca Zapiain David Montes, Stewart James A., and Dingreville Rémi. Accelerating phase-field-based microstructure evolution predictions via surrogate models trained by machine learning methods. npj Computational Materials, 7(1):1–11, January 2021. Number: 1 Publisher: Nature Publishing Group. [Google Scholar]
  • [12].Nabian Mohammad Amin, and Meidani Hadi. A Deep Neural Network Surrogate for High-Dimensional Random Partial Differential Equations. Probabilistic Engineering Mechanics, 57:14–25, July 2019. arXiv:1806.02957 [physics, stat]. [Google Scholar]
  • [13].Nikolopoulos Stefanos, Kalogeris Ioannis, and Papadopoulos Vissarion. Non-intrusive surrogate modeling for parametrized time-dependent partial differential equations using convolutional autoencoders. Engineering Applications of Artificial Intelligence, 109:104652, March 2022. [Google Scholar]
  • [14].Tabot Gregg A., Dammann John F., Berg Joshua A., Tenore Francesco V., Boback Jessica L., Vogelstein R. Jacob, and Bensmaia Sliman J.. Restoring the sense of touch with a prosthetic hand through a brain interface. Proceedings of the National Academy of Sciences, 110(45):18279–18284, November 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [15].Capogrosso Marco, Milekovic Tomislav, Borton David, Wagner Fabien, Moraud Eduardo Martin, Mignardot Jean-Baptiste, Buse Nicolas, Gandar Jerome, Barraud Quentin, Xing David, Rey Elodie, Duis Simone, Jianzhong Yang, Ko Wai Kin D., Li Qin, Detemple Peter, Denison Tim, Micera Silvestro, Bezard Erwan, Bloch Jocelyne, and Courtine Grégoire. A brain–spine interface alleviating gait deficits after spinal cord injury in primates. Nature, 539(7628):284–288, November 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [16].Chapman Christopher A. R., Goshi Noah, and Seker Erkin. Multifunctional Neural Interfaces for Closed-Loop Control of Neural Activity. Advanced Functional Materials, 28(12):1703523, 2018. _eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/adfm.201703523. [Google Scholar]
  • [17].Wagner Fabien B., Mignardot Jean-Baptiste, Le Goff-Mignardot Camille G., Demesmaeker Robin, Komi Salif, Capogrosso Marco, Rowald Andreas, Seáñez Ismael, Caban Miroslav, Pirondini Elvira, Vat Molywan, McCracken Laura A., Heimgartner Roman, Fodor Isabelle, Watrin Anne, Seguin Perrine, Paoles Edoardo, Van Den Keybus Katrien, Eberle Grégoire, Schurch Brigitte, Pralong Etienne, Becce Fabio, Prior John, Buse Nicholas, Buschman Rik, Neufeld Esra, Kuster Niels, Carda Stefano, von Zitzewitz Joachim, Delattre Vincent, Denison Tim, Lambert Hendrik, Minassian Karen, Bloch Jocelyne, and Courtine Grégoire. Targeted neurotechnology restores walking in humans with spinal cord injury. Nature, 563(7729):65–71, November 2018. [DOI] [PubMed] [Google Scholar]
  • [18].Ayton Lauren N., Barnes Nick, Dagnelie Gislin, Fujikado Takashi, Goetz Georges, Hornig Ralf, Jones Bryan W., Muqit Mahiul M. K., Rathbun Daniel L., Stingl Katarina, Weiland James D., and Petoe Matthew A.. An update on retinal prostheses. Clinical Neurophysiology, 131(6):1383–1398, June 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [19].Stingl Katarina, Schippert Ruth, Bartz-Schmidt Karl U., Besch Dorothea, Cottriall Charles L., Edwards Thomas L., Gekeler Florian, Greppmaier Udo, Kiel Katja, Koitschev Assen, Kühlewein Laura, MacLaren Robert E., Ramsden James D., Roider Johann, Rothermel Albrecht, Sachs Helmut, Schröder Greta S., Tode Jan, Troelenberg Nicole, and Zrenner Eberhart. Interim Results of a Multicenter Trial with the New Electronic Subretinal Implant Alpha AMS in 15 Patients Blind from Inherited Retinal Degenerations. Frontiers in Neuroscience, 11, 2017. Publisher: Frontiers. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [20].Karapanos Lewis, Abbott Carla J., Ayton Lauren N., Kolic Maria, McGuinness Myra B., Baglin Elizabeth K., Titchener Samuel A., Kvansakul Jessica, Johnson Dean, Kentler William G., Barnes Nick, Nayagam David A. X., Allen Penelope J., and Petoe Matthew A.. Functional Vision in the Real-World Environment With a Second-Generation (44-Channel) Suprachoroidal Retinal Prosthesis. Translational Vision Science & Technology, 10(10):7–7, August 2021. Publisher: The Association for Research in Vision and Ophthalmology. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [21].Dobelle Wm H.. Artificial Vision for the Blind by Connecting a Television Camera to the Visual Cortex. ASAIO Journal, 46(1):3–9, February 2000. [DOI] [PubMed] [Google Scholar]
  • [22].Beyeler Michael, Nanduri Devyani, Weiland James D., Rokem Ariel, Boynton Geoffrey M., and Fine Ione. A model of ganglion axon pathways accounts for percepts elicited by retinal implants. Scientific Reports, 9(1):1–16, June 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [23].Rizzo JF, Wyatt J, Loewenstein J, Kelly S, and Shire D. Perceptual efficacy of electrical stimulation of human retina with a microelectrode array during short-term surgical trials. Invest Ophthalmol Vis Sci, 44(12):5362–9, December 2003. [DOI] [PubMed] [Google Scholar]
  • [24].Granley Jacob and Beyeler Michael. A Computational Model of Phosphene Appearance for Epiretinal Prostheses. In 2021 43rd Annual International Conference of the IEEE Engineering in Medicine Biology Society (EMBC), pages 4477–4481, November 2021. ISSN: 2694-0604. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [25].Luo Yvonne H-L., Zhong Joe Jiangjian, Clemo Monica, and da Cruz Lyndon. Long-term Repeatability and Reproducibility of Phosphene Characteristics in Chronically Implanted Argus II Retinal Prosthesis Subjects. American Journal of Ophthalmology, 170:100–109, October 2016. [DOI] [PubMed] [Google Scholar]
  • [26].Beyeler M, Rokem A, Boynton GM, and Fine I. Learning to see again: biological constraints on cortical plasticity and the implications for sight restoration technologies. J Neural Eng, 14(5):051003, June 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [27].Chen SC, Suaning GJ, Morley JW, and Lovell NH. Simulating prosthetic vision: I. Visual models of phosphenes. Vision Research, 49(12):1493–506, June 2009. [DOI] [PubMed] [Google Scholar]
  • [28].Tong Wei, Meffin Hamish, Garrett David J, and Ibbotson Michael R. Stimulation strategies for improving the resolution of retinal prostheses. Frontiers in neuroscience, 14:262, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [29].Shah Nishal P., Madugula Sasidhar, Grosberg Lauren, Mena Gonzalo, Tandon Pulkit, Hottowy Pawel, Sher Alexander, Litke Alan, Mitra Subhasish, and Chichilnisky EJ. Optimization of Electrical Stimulation for a High-Fidelity Artificial Retina. In 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER), pages 714–718, March 2019. ISSN: 1948-3554. [Google Scholar]
  • [30].Ghaffari Dorsa Haji, Chang Yao-Chuan, Mirzakhalili Ehsan, and Weiland James D.. Closed-loop Optimization of Retinal Ganglion Cell Responses to Epiretinal Stimulation: A Computational Study. In 2021 10th International IEEE/EMBS Conference on Neural Engineering (NER), pages 597–600, May 2021. ISSN: 1948-3554. [Google Scholar]
  • [31].Fauvel Tristan and Chalk Matthew. Human-in-the-loop optimization of visual prosthetic stimulation. preprint, Neuroscience, November 2021. [DOI] [PubMed] [Google Scholar]
  • [32].Spencer Martin J., Kameneva Tatiana, Grayden David B., Meffin Hamish, and Burkitt Anthony N.. Global activity shaping strategies for a retinal implant. Journal of Neural Engineering, 16(2):026008, January 2019. Publisher: IOP Publishing. [DOI] [PubMed] [Google Scholar]
  • [33].de Ruyter van Steveninck Jaap, Güçlü Umut, van Wezel Richard, and van Gerven Marcel. End-to-end optimization of prosthetic vision. Journal of Vision, 22(2):20, February 2022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [34].Marc Robert E, Jones Bryan W, Watt Carl B, and Strettoi Enrica. Neural remodeling in retinal degeneration. Progress in Retinal and Eye Research, 22(5):607–655, 2003. [DOI] [PubMed] [Google Scholar]
  • [35].Wang Zhou and Bovik Alan C.. Mean squared error: Love it or leave it? A new look at Signal Fidelity Measures. IEEE Signal Processing Magazine, 26(1):98–117, January 2009. Conference Name: IEEE Signal Processing Magazine. [Google Scholar]
  • [36].Zhai Guangtao and Min Xiongkuo. Perceptual image quality assessment: a survey. Science China Information Sciences, 63(11):211301, November 2020. [Google Scholar]
  • [37].Ioffe Sergey and Szegedy Christian. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Technical Report arXiv:1502.03167, arXiv, March 2015. arXiv:1502.03167 [cs] type: article. [Google Scholar]
  • [38].Dorman Michael F, Spahr Anthony J, Loizou Philipos C, Dana Cindy J, and Schmidt Jennifer S. Acoustic simulations of combined electric and acoustic hearing (eas). Ear and Hearing, 26(4):371–380, 2005. [DOI] [PubMed] [Google Scholar]
  • [39].Svirsky Mario A, Ding Nai, Sagi Elad, Tan Chin-Tuan, Fitzgerald Matthew, Glassman E Katelyn, Seward Keena, and Neuman Arlene C. Validation of acoustic models of auditory neural prostheses. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 8629–8633. IEEE, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [40].Dorman MF, Loizou PC, Spahr A, and Dana CJ. Simulations of combined acoustic/electric hearing. In Proceedings of the 25th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE Cat. No. 03CH37439), volume 3, pages 1999–2001. IEEE, 2003. [Google Scholar]
  • [41].Dorman Michael F, Loizou Philipos C, and Rainey Dawne. Speech intelligibility as a function of the number of channels of stimulation for signal processors using sine-wave and noise-band outputs. The Journal of the Acoustical Society of America, 102(4):2403–2411, 1997. [DOI] [PubMed] [Google Scholar]
  • [42].Cooper William B, Tobey Emily, and Loizou Philipos C. Music perception by cochlear implant and normal hearing listeners as measured by the montreal battery for evaluation of amusia. Ear and hearing, 29(4):618, 2008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [43].Loizou Philipos C, Dorman Michael, Poroy Oguz, and Spahr Tony. Speech recognition by normal-hearing and cochlear implant listeners as a function of intensity resolution. The Journal of the Acoustical Society of America, 108(5):2377–2387, 2000. [DOI] [PubMed] [Google Scholar]
  • [44].Saal Hannes P, Delhaye Benoit P, Rayhaun Brandon C, and Bensmaia Sliman J. Simulating tactile signals from the whole hand with millisecond precision. Proceedings of the National Academy of Sciences, 114(28):E5693–E5702, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [45].Okorokova Elizaveta V, He Qinpu, and Bensmaia Sliman J. Biomimetic encoding model for restoring touch in bionic hands through a nerve interface. Journal of neural engineering, 15(6):066033, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [46].Weber Douglas J, Friesen Rebecca, and Miller Lee E. Interfacing the somatosensory system to restore touch and proprioception: essential considerations. Journal of motor behavior, 44(6):403–418, 2012. [DOI] [PubMed] [Google Scholar]
  • [47].Kim Sung Soo, Sripati Arun P, and Bensmaia Sliman J. Predicting the timing of spikes evoked by tactile stimulation of the hand. Journal of neurophysiology, 104(3):1484–1496, 2010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [48].Mileusnic Milana P, Brown Ian E, Lan Ning, and Loeb Gerald E. Mathematical models of proprioceptors. i. control and transduction in the muscle spindle. Journal of neurophysiology, 96(4):1772–1788, 2006. [DOI] [PubMed] [Google Scholar]
  • [49].Jansonius NM, Nevalainen J, Selig B, Zangwill LM, Sample PA, Budde WM, Jonas JB, Lagrèze WA, Airaksinen PJ, Vonthein R, Levin LA, Paetzold J, and Schiefer U. A mathematical description of nerve fiber bundle trajectories and their variability in the human retina. Vision Research, 49(17):2157–2163, August 2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [50].Abadi Martín, Agarwal Ashish, Barham Paul, Brevdo Eugene, Chen Zhifeng, Citro Craig, Corrado Greg S., Davis Andy, Dean Jeffrey, Devin Matthieu, Ghemawat Sanjay, Goodfellow Ian, Harp Andrew, Irving Geoffrey, Isard Michael, Jia Yangqing, Jozefowicz Rafal, Kaiser Lukasz, Kudlur Manjunath, Levenberg Josh, Mané Dandelion, Monga Rajat, Moore Sherry, Murray Derek, Olah Chris, Schuster Mike, Shlens Jonathon, Steiner Benoit, Sutskever Ilya, Talwar Kunal, Tucker Paul, Vanhoucke Vincent, Vasudevan Vijay, Viégas Fernanda, Vinyals Oriol, Warden Pete, Wattenberg Martin, Wicke Martin, Yu Yuan, and Zheng Xiaoqiang. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org. [Google Scholar]
  • [51].Simonyan Karen and Zisserman Andrew. Very Deep Convolutional Networks for Large-Scale Image Recognition. Technical Report arXiv:1409.1556, arXiv, April 2015. arXiv:1409.1556 [cs] type: article. [Google Scholar]
  • [52].Li Yijun, Fang Chen, Yang Jimei, Wang Zhaowen, Lu Xin, and Yang Ming-Hsuan. Universal Style Transfer via Feature Transforms. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. [Google Scholar]
  • [53].Deng Jia, Dong Wei, Socher Richard, Li Li-Jia, Li Kai, and Fei-Fei Li. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, June 2009. ISSN: 1063-6919. [Google Scholar]
  • [54].Paris Sylvain, Hasinoff Samuel W, and Kautz Jan. Local Laplacian Filters: Edge-aware Image Processing with a Laplacian Pyramid. Communications of the ACM, 58:11, 2015. [Google Scholar]
  • [55].Deng Li. The mnist database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6):141–142, 2012. [Google Scholar]
  • [56].Lin Tsung-Yi, Maire Michael, Belongie Serge, Bourdev Lubomir, Girshick Ross, Hays James, Perona Pietro, Ramanan Deva, Zitnick C. Lawrence, and Dollár Piotr. Microsoft COCO: Common Objects in Context. Technical Report arXiv:1405.0312, arXiv, February 2015. arXiv:1405.0312 [cs] type: article. [Google Scholar]
  • [57].Han Nicole, Srivastava Sudhanshu, Xu Aiwen, Klein Devi, and Beyeler Michael. Deep Learning–Based Scene Simplification for Bionic Vision. In Augmented Humans Conference 2021, AHs’21, pages 45–54, New York, NY, USA, February 2021. Association for Computing Machinery. [Google Scholar]
  • [58].Van der Maaten Laurens and Hinton Geoffrey. Visualizing data using t-sne. Journal of machine learning research, 9(11), 2008. [Google Scholar]
  • [59].Peli Eli. Testing Vision Is Not Testing For Vision. Translational Vision Science & Technology, 9(13):32–32, December 2020. Publisher: The Association for Research in Vision and Ophthalmology. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [60].Tramèr Florian, Kurakin Alexey, Papernot Nicolas, Goodfellow Ian, Boneh Dan, and McDaniel Patrick. Ensemble adversarial training: Attacks and defenses. arXiv preprint arXiv:1705.07204, 2017. [Google Scholar]
  • [61].Benabid Alim Louis. Deep brain stimulation for Parkinson’s disease. Current Opinion in Neurobiology, 13(6):696–706, December 2003. [DOI] [PubMed] [Google Scholar]
  • [62].Bozorgzadeh Bardia, Schuweiler Douglas R., Bobak Martin J., Garris Paul A., and Mohseni Pedram. Neurochemostat: A Neural Interface SoC With Integrated Chemometrics for Closed-Loop Regulation of Brain Dopamine. IEEE Transactions on Biomedical Circuits and Systems, 10(3):654–667, June 2016. Conference Name: IEEE Transactions on Biomedical Circuits and Systems. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [63].Beyeler M, Boynton GM, Fine I, and Rokem A. pulse2percept: A Python-based simulation framework for bionic vision. In Huff K, Lippa D, Niederhut D, and Pacer M, editors, Proceedings of the 16th Science in Python Conference, pages 81–88, 2017. [Google Scholar]
  • [64].Loshchilov Ilya and Hutter Frank. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017. [Google Scholar]
