

Published in final edited form as:

IEEE J Solid-State Circuits. 2015 January 1; 50(1): 214–229. doi:10.1109/JSSC.2014.2355822.

# A Fully-Implantable Cochlear Implant SoC with Piezoelectric Middle-Ear Sensor and Arbitrary Waveform Neural Stimulation

#### Marcus Yip,

Microsystems Technology Laboratories, Massachusetts Institute of Technology, Cambridge, MA 02139 USA

#### Rui Jin [Student Member, IEEE],

Microsystems Technology Laboratories, Massachusetts Institute of Technology, Cambridge, MA 02139 USA

#### Hideko Heidi Nakajima,

Harvard Medical School, Boston, MA 02115 USA, and Massachusetts Eye and Ear Infirmary, Boston, MA 02114 USA

#### Konstantina M. Stankovic, and

Harvard Medical School, Boston, MA 02115 USA, and Massachusetts Eye and Ear Infirmary, Boston, MA 02114 USA

#### Anantha P. Chandrakasan [Fellow, IEEE]

Microsystems Technology Laboratories, Massachusetts Institute of Technology, Cambridge, MA 02139 USA

Anantha P. Chandrakasan: anantha@mtl.mit.edu

## **Abstract**

A system-on-chip for an invisible, fully-implantable cochlear implant is presented. Implantable acoustic sensing is achieved by interfacing the SoC to a piezoelectric sensor that detects the sound-induced motion of the middle ear. Measurements from human cadaveric ears demonstrate that the sensor can detect sounds between 40 and 90 dB SPL over the speech bandwidth. A highly-reconfigurable digital sound processor enables system power scalability by reconfiguring the number of channels, and provides programmable features to enable a patient-specific fit. A mixed-signal arbitrary waveform neural stimulator enables energy-optimal stimulation pulses to be delivered to the auditory nerve. The energy-optimal waveform is validated with *in-vivo* measurements from four human subjects which show a 15% to 35% energy saving over the conventional rectangular waveform. Prototyped in a 0.18  $\mu$ m high-voltage CMOS technology, the SoC in 8-channel mode consumes 572  $\mu$ W of power including stimulation. The SoC integrates implantable acoustic sensing, sound processing, and neural stimulation on one chip to minimize the implant size, and proof-of-concept is demonstrated with measurements from a human cadaver ear.

#### **Index Terms**

Arbitrary waveform; cochlear implant; energy-efficient; hearing loss; implantable; low-voltage; microphone; middle ear; piezoelectric; reconfigurable; SoC; stimulation; ultra-low-power

# I. Introduction

As of 2010, over 30 million people in the United States suffer from sensorineural hearing loss [1] which arises from disease in the inner ear or auditory nerve. For mild cases of hearing loss, a hearing aid may provide adequate compensation. However, for profound cases (i.e., greater than 90 dB of loss), a cochlear implant (CI) is necessary to restore hearing.

CIs use electronics to directly stimulate the auditory nerve fibers, thus bypassing the damaged hair cells in the cochlea. Today's state-of-the-art CIs consist of an external and internal unit as shown in Fig. 1. The external unit comprises a microphone to pick up sound, a sound processor to digitize and compress the sound into coded signals, and a transmitter to send data wirelessly to the internal unit via a coil. The external unit also houses the battery which supplies power wirelessly to the implanted unit via the same coil that is used for data. Without a continuous transfer of power (e.g., if the external unit is removed), the implanted unit is unpowered and does not function. The implanted unit comprises a receiver and stimulator unit embedded in the skull, and an electrode array implanted in the cochlea. Pulses of electrical current are modulated by the received codes and delivered to the electrode array, triggering action potentials in the auditory nerve which are interpreted by the brain as sound.

Although today's CIs are very successful in restoring hearing for many of the profoundly deaf, the external component results in several limitations. The device is cumbersome to wear, and it cannot be worn in the shower or while participating in water sports. It also raises concerns with aesthetics and social stigma. These reasons motivate the development of a fully-implantable cochlear implant (FICI) that is internalized and invisible.

A FICI has three main requirements that are distinct from a conventional CI. First, a FICI that is completely untethered (i.e., without a coil that continuously provides power) requires an implanted battery that is rechargeable because of the volume and power consumption constraints of the FICI, as well as the need to avoid future surgeries for battery replacement. The number of charges per day must also be limited to once or twice a day to minimize user impact. As a result, a FICI requires ultra-low-power sound processing and energy-efficient neural stimulation. Secondly, recent state-of-the-art ICs are typically designed for external microphone-based CIs and do not require the neural stimulator to be on the same chip [2], [3]. In contrast, the size of a FICI could benefit from monolithic integration of the signal processing and stimulation circuits. Thirdly, an implantable acoustic sensor with adequate sensitivity and bandwidth is required to replace the external microphone.

This paper presents a system-on-chip (SoC) for a FICI that addresses the above issues. First, low-power implantable acoustic sensing is achieved by interfacing the SoC to a piezoelectric sensor that is mounted at the umbo of the malleus within the middle ear, and this is demonstrated with measurements from human cadaveric temporal bones. Second, a highly-reconfigurable sound processor enables system power scalability by scaling the number of spectral channels. Third, simulations with a computational auditory nerve fiber model from [4] are used to determine energy-optimal biphasic stimulation waveforms [5], which are delivered to the nerves in the cochlea with an arbitrary waveform neural stimulator. The energy savings from alternate stimulation waveforms are validated *in-vivo* in four human CI subjects with loudness perception tasks. The resulting stimulation power savings over the conventional rectangular pulse transfer directly to overall system power savings because stimulation power typically dominates [5]. The SoC integrates an implantable acoustic sensor front-end, sound processor, and neural stimulator on one chip to minimize the implant size and demonstrate proof-of-concept for a FICI. It should be noted that the wireless charger and power management unit are not included in the scope of this work.

Section II presents the requirements and architecture of the FICI. Section III describes the implantable acoustic sensor and the front-end circuit implementation, and Section IV discusses the reconfigurable sound processor design. Section V discusses energy-efficient neural stimulation waveforms, and presents details of the stimulator architecture. Section VI presents prototype measurement results, Section VII discusses potential use cases and future work, and Section VIII concludes the paper.

# II. System Requirements and Architecture Description

This section highlights the main high-level requirements and specifications of the system, and describes the system architecture of the FICI.

#### A. System Requirements

- **1) Power Consumption:** Ultra-low power consumption eases energy storage requirements, which can translate to a smaller implant or a longer time between recharges. Lithium-ion batteries possess high energy density but suffer from a limited number of recharge cycles (on the order of thousands), which would eventually necessitate replacement via surgery. Ultra-capacitors, on the other hand, can be cycled on the order of $10^6$ times and may be more suitable for an implantable system provided that the limited energy density still permits a one-charge-per-day usage model. For a 5 gram ultra-capacitor with an energy density of 5 W·h/kg, 12 hours of continuous usage would require the FICI to consume just 1 mW of power assuming 50% conversion efficiency in the power management unit. The majority of the 1 mW power budget will be consumed by the electrode impedance and electrode drivers [6] during the process of electrically stimulating the auditory nerve. Assuming typical stimulation power of approximately $750 \,\mu$W [2], [3], this leaves approximately $250 \,\mu$W for the implantable sensor front-end circuits and sound processor.
- **2) Implantable Acoustic Sensor:** A key enabler for a FICI system is an implantable acoustic sensor that is able to sense external sound pressure waves from within the body. Recently, totally invisible middle ear implants (MEIs) have been developed to treat conductive hearing loss<sup>1</sup> [7], [8]. MEIs typically use an implantable sensor to detect the mechanical motion of the ossicles, using the ear as a natural microphone. The sensor readout can be amplified and fed to an output transducer which drives the stapes (the stirrup-shaped bone of the ossicles) with increased vibration to compensate for hearing loss [7]. In this work, we apply the sensors found in MEIs to a FICI system; instead of driving the output transducer of a MEI, we use the sensor readout as an input to the sound processor of a FICI which stimulates the cochlea directly.

Prior work on implantable sensors has looked at MEMS accelerometers, but they have limited sensitivity and require milli-Watt power consumption [1]. Alternative approaches include magnetic sensors [9] and subcutaneous microphones [10], but they suffer from incompatibility with magnetic resonance imaging (MRI) and unwanted body noise respectively. This work leverages a piezoelectric sensor [8] because of its small size, low-power operation, and superior sensitivity.

Considering that the loudness of normal conversational speech is around 60 to 70 dB SPL<sup>2</sup> [1], [11] and that the dynamic range of speech can be up to 50 dB [12], the sensor should be able to detect sounds from 40 dB SPL (quiet library) to 90 dB SPL (busy roadside). Furthermore, the sensor should also support a bandwidth that spans the frequencies specific to cochlear implant hearing from a few hundred Hz up to 5 kHz [13], [14].

- **3) Number of Spectral Channels:** The choice of the number of spectral channels should strike a balance between the speech recognition performance of CI users and the hardware complexity and power consumption in a FICI. In [15], Shannon et al. used acoustic simulations with normal hearing listeners in quiet and showed that as few as 3 to 4 channels of spectral information can result in good speech recognition performance. A similar study with CI users in quiet was presented in [16] where the authors found that the average performance improved as the number of stimulation electrodes was increased from 1 to 4, but no differences were observed between 7-, 10-, or 20-electrode processors. For CI users in noise, the best CI listeners improved their performance only up to 7 electrodes, while CI users with low levels of speech recognition did not benefit from more than 4 electrodes [17]. Although user performance depends also on the speech processing strategy, the effective number of spectral channels possible with electrically-evoked hearing is usually limited by the current spreading between electrodes [18]. In this work, the target number of channels is 8 which provides adequate spectral resolution while minimizing the hardware complexity for a proof-of-concept system.
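The energy budget in requirement 1) can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and the inputs are the figures quoted in the text (5 g, 5 W·h/kg, 12 hours, 50% conversion efficiency, roughly 750 μW for stimulation).

```python
def sustainable_power_mw(mass_g, density_wh_per_kg, hours, efficiency):
    """Average load power (mW) that a storage element can sustain for `hours`."""
    stored_mwh = mass_g * density_wh_per_kg   # g * W.h/kg equals mW.h
    usable_mwh = stored_mwh * efficiency      # after power-management losses
    return usable_mwh / hours

budget_mw = sustainable_power_mw(mass_g=5.0, density_wh_per_kg=5.0,
                                 hours=12.0, efficiency=0.5)

rounded_budget_mw = 1.0   # the text rounds the result to 1 mW
stimulation_mw = 0.750    # typical stimulation power quoted from [2], [3]
remaining_uw = (rounded_budget_mw - stimulation_mw) * 1000.0

print(f"sustainable power: {budget_mw:.2f} mW")
print(f"left for front-end and processor: {remaining_uw:.0f} uW")
```

The unrounded result is slightly above 1 mW, so the 250 μW allocation for the sensor front-end and sound processor is a mildly conservative figure.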

#### B. Architecture Description

A block diagram of the proposed SoC is shown in Fig. 2 [19]. The system is separated into three main subsystems: 1) a piezoelectric sensor front-end (PZFE), 2) a low-voltage reconfigurable sound processor, and 3) an energy-efficient arbitrary waveform neural stimulator and high-voltage electrode switch matrix.

<sup>1</sup>Conductive hearing loss occurs when there is damage to the middle ear which blocks the conduction of sound waves toward the inner ear (cochlea). In contrast, sensorineural hearing loss occurs when there is damage to the cochlea. MEIs treat conductive hearing loss only and require the cochlea to be intact and functional, whereas CIs are used to treat sensorineural hearing loss.

<sup>2</sup>Sound pressure level (SPL) in units of dB SPL is a logarithmic measure of sound pressure with respect to a reference of $P_{ref} = 20\ \mu Pa_{rms}$.

The PZFE conditions the signal from the sensor which is a measure of the sound-induced motion of the umbo. The PZFE operates from a 1.5 V analog supply and comprises three stages: a charge amplifier (CA) to interface to the piezoelectric sensor, a programmable-gain amplifier (PGA), and a single-ended to differential ADC driver. A mid-rail reference voltage  $V_{REF}$  biases the sensor and sets the DC operating point of the PZFE. The ADC driver stage also provides DC level shifting from  $V_{REF} = 750$  mV down to  $V_{CM} = 300$  mV which is the input common-mode of the ADC. The signal is then digitized by a fully-differential low-power 16 kS/s 9-bit successive approximation register (SAR) ADC operating from a digital supply voltage of 0.6 V [20].

The ADC output is processed by a 0.6 V reconfigurable sound processor that implements the well-known Continuous Interleaved Sampling (CIS) sound processing strategy [21] in which each electrode is stimulated synchronously in an interleaved manner with an amplitude determined by the output of each channel of the processor. The interleaved stimulation drastically reduces interaction between electrodes which allows for a high rate of stimulation that is important for preserving temporal information in speech [15].

The processor in this work is designed to be extremely reconfigurable in order to enable system power scalability as well as patient-specific fitting capability. First, the number of channels can be configured to 8, 6, or 4 to enable a power-performance tradeoff. The filter bank has reconfigurable coefficients to adjust the filter bandwidths for the three modes of operation, and multi-rate signal processing is leveraged to reduce power and area. The bandwidth of the processor covers 300 Hz to 5.5 kHz, and the filter cut-off frequencies of the logarithmically-spaced channels are based on [13], [14] to emulate the tonotopic structure of hearing. Furthermore, processor settings like global channel gain, type of rectification, and amount of compression are all programmable. The processor also has the capability to adjust the volume level for each channel individually to provide additional patient fitting capability. The processor outputs 6-bit data at an analysis rate of 1 kHz, which represents the logarithmically compressed energy in each frequency band. This value is used to modulate a train of electrical current pulses (1,000 pulses/sec per electrode) that is delivered to the corresponding electrode.

The interleaved operation of the CIS strategy conveniently allows for a single current source to be interleaved between all electrodes. This is accomplished by using a high-voltage electrode switch matrix to select the active electrode and control the direction of current flow. The SoC is designed to be used with monopolar electrode arrays, where a common return electrode is used to provide a return path for all electrodes. Furthermore, a 0.6 V digital controller provides the control signals for the current DAC and switch matrix which allows the waveform of the stimulation pulses to be programmed to any arbitrary shape.
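The time-multiplexing that lets one current source serve every electrode can be sketched as a simple slot schedule. This is a minimal illustration, not the SoC's controller: it assumes equal-length slots, sequential electrode order, no inter-phase gap, and a 25 μs phase width (the value used later in Section V); `cis_slots` is a hypothetical helper.

```python
def cis_slots(n_channels, rate_hz=1000.0):
    """Start time (s) of each electrode's slot within one stimulation frame."""
    slot_s = 1.0 / (rate_hz * n_channels)   # frame period divided among electrodes
    starts = [ch * slot_s for ch in range(n_channels)]
    return starts, slot_s

PHASE_WIDTH_S = 25e-6            # assumed phase width
pulse_s = 2 * PHASE_WIDTH_S      # cathodic phase followed by anodic phase

for n in (8, 6, 4):
    starts, slot_s = cis_slots(n)
    fits = pulse_s <= slot_s     # a pulse must finish before the next slot begins
    print(f"{n} channels: slot = {slot_s * 1e6:.1f} us, biphasic pulse fits: {fits}")

starts8, slot8 = cis_slots(8)
```

Under these assumptions each electrode in 8-channel mode owns a 125 μs slot per 1 ms frame, so only one electrode ever draws current at a time.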

The stimulator has high voltage compliance in order to accommodate up to 1 mA of stimulation current through the electrode-tissue interface which may have a few kilo-Ohms of impedance. The stimulation current is drawn from a high voltage supply ( $V_{MID} = 5 \text{ V}$  to 10 V), and the switch matrix is driven with high-voltage logic operating from  $V_{DDG} = 7 \text{ V}$  to 12 V. Level shifters are used to interface between the 0.6 V and  $V_{DDG}$  domains. Lastly, the current DAC circuits operate from a supply voltage of 3.3 V.

# III. Implantable Piezoelectric Acoustic Sensor

This section first describes the characterization of the piezoelectric sensor on the middle ear of a human cadaveric temporal bone using a discrete prototype, followed by the design and analysis of the sensor front-end of the SoC.

#### A. Sensor Characterization

In order to investigate the performance of the middle-ear mounted piezoelectric sensor, human cadaveric temporal bones were provided by the Eaton-Peabody Laboratory at the Massachusetts Eye & Ear Infirmary (Boston, MA), and PZT-5A piezoceramic material (Piezo Systems Inc., Woburn, MA) was used. A block diagram of the prototype and measurement setup is shown in Fig. 3(a). Swept-sine measurements were made using an audio amplifier and speaker connected to a probe tube that funnels sound into the ear canal of the temporal bone. Ear canal pressure ( $P_{EC}$ ) and umbo velocity ( $v_{UMBO}$ ) were monitored with a probe microphone and laser Doppler vibrometer (Polytec) respectively. A discrete prototype was used to record the output voltage of the sensor ( $V_{PZ}$ ). All three outputs were recorded by LabVIEW and the transfer characteristics from  $P_{EC}$  to  $v_{UMBO}$  to  $V_{PZ}$  were calculated.

Fig. 3(b) shows a photograph of the measurement setup. The temporal bone is held by a specimen holder and the sensor is positioned by a micro-manipulator external to the bone. The sensor is clamped at one end like a cantilever, while the other end is placed at the umbo of the malleus. As the umbo vibrates back and forth, it exerts a force that bends the sensor, generating a charge across its terminals that a charge amplifier converts to an output voltage.

Figs. 4(a) and (b) show that the measured umbo velocity and sensor readout are very linear with the sound pressure level in the ear canal. Fig. 4(c) shows the measured output spectrum of the charge amplifier from 200 Hz to 10 kHz and it can be seen that the sensor is able to detect sounds over a 50 dB dynamic range from 40 to 90 dB SPL.
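To give a sense of scale for the 40 to 90 dB SPL range, the levels can be converted to ear-canal pressure using the reference pressure of 20 μPa rms from the SPL definition. The helper name below is hypothetical and the snippet is only a unit conversion, not a model of the sensor.

```python
P_REF_PA = 20e-6   # reference pressure for dB SPL, 20 uPa rms

def spl_to_pa(spl_db):
    """Convert a sound pressure level in dB SPL to rms pressure in pascals."""
    return P_REF_PA * 10.0 ** (spl_db / 20.0)

low_pa = spl_to_pa(40.0)
high_pa = spl_to_pa(90.0)

print(f"40 dB SPL = {low_pa * 1e3:.1f} mPa rms")
print(f"90 dB SPL = {high_pa:.3f} Pa rms")
print(f"pressure ratio across the 50 dB range = {high_pa / low_pa:.0f}")
```

A 50 dB range corresponds to a pressure ratio of about 316, which is the span the front-end has to accommodate.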

#### B. Piezoelectric Sensor Front-End

The details of the 3-stage PZFE of the SoC are shown at the top of Fig. 2, where the piezoelectric sensor is modeled as a Thevenin voltage source $V_P$ and series capacitance $C_P$. Since the charge amplifier of the first stage dominates the noise performance of the front-end, its signal transfer function and noise analysis are provided next.

**1) Charge Amplifier Transfer Function:** Fig. 5(a) shows the charge amplifier (stage 1) of the PZFE with noise sources, and Fig. 5(b) shows the equivalent block diagram, where $v_{n,i}$, $v_{n,f}$, and $v_{n,a}$ are the noise from $R_i$, $R_f$, and the op-amp respectively, and $A(s)$ is the open-loop transfer function of the op-amp. In order to determine the transfer function from the piezoelectric sensor voltage $V_P$ to the charge amplifier output $v_o$, the charge amplifier input voltage $v_i$ can be referred back to $V_P$ through the transfer function $H_P(s) = v_i/V_P = s/(s + \omega_i)$, where $\omega_i = \frac{1}{R_i C_P}$. For the frequencies of interest where the loop gain is large, the closed-loop transfer function is given by $H_{CL}(s) \approx Y_f^{-1} = [C_f(s+\omega_f)]^{-1}$ where $\omega_f = \frac{1}{R_f C_f}$. Therefore, the transfer function of stage 1 from $V_P$ to $v_o$ is given by

$$H_{STG1}(s) = G_i H_{CL}(s) H_{P}(s) = \frac{C_P}{C_f} \frac{s \omega_i}{(s + \omega_f)(s + \omega_i)}, \quad \text{(1)}$$

which gives the desired band-pass characteristic necessary to pass only the frequencies relevant to speech for a CI (300 Hz to 6 kHz). The high-pass and low-pass corners are set by  $\omega_f$  and  $\omega_i$  respectively, and the mid-band gain is simply  $C_P/C_f$  for  $\omega_i \gg \omega_f$ . Note that the negative polarity of the charge amplifier has been ignored here for simplicity. To accommodate a range of typical sensor sizes ( $C_P = 0.2$  to 3 nF),  $C_f$  is a 3-bit switched-capacitor tunable from 6 pF to 66 pF which provides programmable gain in 3 dB steps, and  $R_f$  is set to 88.4 M $\Omega$  such that  $\omega_f < 2\pi(300 \text{ Hz})$  for all values of  $C_f$ . Finally,  $R_i$  is a 4-bit switched-resistor with logarithmically-spaced values from 1 k $\Omega$  to 100 k $\Omega$  to ensure  $\omega_i > 2\pi(6 \text{ kHz})$  for typical values of  $C_P$ .
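The component values quoted above can be checked numerically. The sketch below uses only the stated values ($R_f$ = 88.4 MΩ, $C_f$ from 6 pF to 66 pF, $C_P$ from 0.2 nF to 3 nF); the assumption that the eight $C_f$ settings are geometrically spaced, and all function names, are ours.

```python
import math

R_F = 88.4e6                        # feedback resistor (ohm)
C_F_MIN, C_F_MAX = 6e-12, 66e-12    # feedback capacitor tuning range (F)

def corner_hz(r_ohm, c_farad):
    """Corner frequency 1 / (2*pi*R*C) in Hz."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

def max_r_i(c_p, f_lp_hz=6e3):
    """Largest R_i (ohm) that keeps the low-pass corner at or above f_lp_hz."""
    return 1.0 / (2.0 * math.pi * f_lp_hz * c_p)

hp_max_hz = corner_hz(R_F, C_F_MIN)   # highest high-pass corner (smallest C_f)
hp_min_hz = corner_hz(R_F, C_F_MAX)   # lowest high-pass corner (largest C_f)

# Assumption: the 3-bit C_f settings are geometrically spaced, giving 7 equal gain steps.
step_db = 20.0 * math.log10(C_F_MAX / C_F_MIN) / (2 ** 3 - 1)

print(f"high-pass corner: {hp_min_hz:.1f} Hz to {hp_max_hz:.1f} Hz")
print(f"gain step: {step_db:.2f} dB")
for c_p in (0.2e-9, 3e-9):
    gain_db = 20.0 * math.log10(c_p / C_F_MAX)   # mid-band gain C_P / C_f at largest C_f
    print(f"C_P = {c_p * 1e9:.1f} nF: R_i <= {max_r_i(c_p) / 1e3:.1f} kohm, "
          f"minimum mid-band gain = {gain_db:.1f} dB")
```

With the smallest $C_f$ the high-pass corner lands at about 300 Hz, and the 11:1 capacitor range divides into steps of just under 3 dB, in line with the description.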

**2) Charge Amplifier Noise Analysis:** The noise transfer functions referred to $V_P$ can be determined by calculating the noise transfer function to the output $v_o$ and then dividing by $H_{STG1}(s)$. For $R_i$, the noise transfer function referred to $V_P$ is $H_{np,i}(s) = \frac{V_P}{v_{n,i}} = \frac{v_o}{v_{n,i}} \frac{V_P}{v_o}$ which evaluates to

$$H_{np,i}(s) = G_i H_{CL}(s) \cdot \frac{1}{H_{STG1}(s)} = \frac{s + \omega_i}{s} \approx \frac{\omega_i}{s} \quad \text{(2)}$$

for  $\omega \ll \omega_i$ . As a result, the noise spectral density of  $R_i$  referred to  $V_P$  is

$$V_{np,i}^{2}(f) = |H_{np,i}(f)|^{2} (4kTR_{i}) = \frac{4kT}{R_{i}} \left(\frac{1}{2\pi f C_{P}}\right)^{2}. \quad \text{(3)}$$

Following similar analysis, the noise spectral density of $R_f$ referred to $V_P$ is $V_{np,f}^2(f) = \frac{4kT}{R_f} \left( \frac{1}{2\pi f C_P} \right)^2$. Finally, the noise transfer function of the op-amp referred to $V_P$ is

$$H_{np,a}(s) = Y_{eff} H_{CL}(s) \cdot \frac{1}{H_{STG1}(s)} = \frac{R_i C_f(s + \omega_{eff})(s + \omega_i)}{s} \approx \frac{\omega_i}{s} = \frac{1}{s R_i C_P} \quad \text{(4)}$$

for $\omega \ll \omega_i \ll \omega_{eff}$, where $\omega_{eff} = \frac{1}{R_{eff}C_f} \approx \frac{1}{R_iC_f}$ since $R_f \gg R_i$. Therefore, the noise of the op-amp is magnified by $H_{np,a}(s)$, whose magnitude grows toward lower frequencies.

From the above analysis, it can be seen that the noise spectral density of $R_i$, $R_f$, and the op-amp thermal noise all have a $1/f^2$ characteristic, and that the resistor noise is reduced for larger values of $C_P$. However, the op-amp noise is independent of $C_P$ because it depends on $\omega_i = \frac{1}{R_i C_P}$ which is generally chosen to be fixed. For typical values of $C_P$, the noise from $R_f$ is negligible because of its large value, and the relative contributions of noise from $R_i$ and the op-amp vary depending on $C_P$. The total integrated noise from simulation for $C_P = 0.5$ nF and 3 nF is 2.5 $\mu$V<sub>rms</sub> and 1.7 $\mu$V<sub>rms</sub> respectively, both lower than the minimum expected signal of approximately 3 $\mu$V<sub>rms</sub> at 40 dB SPL as determined by the discrete prototype.
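The "similar analysis" for $R_f$ can be written out as a short sketch. It assumes that the series noise voltage $v_{n,f}$ of $R_f$ injects a current $G_f v_{n,f}$ into the virtual ground, with $G_f = 1/R_f$ (a symbol introduced here by analogy with $G_i = 1/R_i$, which is the value implied by (1)), and it uses the same approximation $\omega \ll \omega_i$:

$$H_{np,f}(s) = G_f H_{CL}(s) \cdot \frac{1}{H_{STG1}(s)} = \frac{G_f}{G_i} \cdot \frac{s + \omega_i}{s} \approx \frac{R_i}{R_f} \cdot \frac{\omega_i}{s} = \frac{1}{s R_f C_P}$$

$$V_{np,f}^{2}(f) = |H_{np,f}(f)|^{2} (4kTR_f) = \frac{4kT}{R_f} \left(\frac{1}{2\pi f C_P}\right)^{2}$$

Under these assumptions the result matches the expression quoted in the text. Dividing it by (3) gives $V_{np,f}^{2}/V_{np,i}^{2} = R_i/R_f$, which is on the order of $10^{-3}$ or smaller for $R_i \le 100$ k$\Omega$ and $R_f = 88.4$ M$\Omega$, consistent with the statement that the noise from $R_f$ is negligible.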

# IV. Reconfigurable Sound Processing

The sound processor in this work implements the CIS sound processing strategy [21] because it is the most ubiquitous strategy among CI manufacturers. The two main objectives for the design of the sound processor are 1) ultra-low-power operation and 2) highly-reconfigurable features to enable system power scalability and patient-specific fitting capability. The first goal is accomplished by leveraging ultra-low-voltage digital processing at 0.6 V to maximize energy-efficiency. The second goal is addressed with a flexible digital architecture featuring a multi-rate reconfigurable filter bank and highly-programmable processor parameters.

The block diagram of the reconfigurable CIS sound processor is shown in Fig. 6. The processor spectrally decomposes the signal with a logarithmically-spaced multi-rate filter bank to emulate natural hearing. The envelope of each filter output is extracted, downsampled, and logarithmically-compressed to fit the patient's electric hearing dynamic range, and each channel has patient-specific volume settings. By clock-gating the unused channels, both the processor power and stimulator power (which dominates the SoC power) scale linearly with the number of channels.

#### A. Programmable Features

Aside from being able to select between 8-, 6-, or 4-channel modes, other programmable processor parameters can affect the user performance and power consumption of the CI and they are highlighted next.

- Rectification: The type of rectification used in the envelope detector can affect
  speech recognition scores and sound quality [14], and therefore both full-wave
  (default) and half-wave rectification are possible. Half-wave rectification has the
  potential to reduce stimulation power since the average envelope is smaller, but at
  the cost of potentially lower sound quality [14].
- Channel gain and compression factor: Following envelope detection, global gain and dynamic range compression are applied. The global gain can be set in octaves from $2^{-4}$ to $2^3$ with 3 bits, and the signal is compressed according to $Y = \ln(1 + CX) / \ln(1 + C)$, where C is the compression factor. Logarithmic compression is needed because of the well-known loudness growth function of electrical hearing which shows a linear relationship between sound intensity in dB SPL and electrical stimulation intensity in Amperes. Evidence shows that different amounts of compression can be beneficial [12], and therefore three settings are available: C = 1024 (default), 128, and 16. For low compression (C = 16), the stimulation power is roughly linear with gain, while for high compression (C = 1024), the stimulation power scales linearly with the logarithm of the gain.
- Volume settings: Each channel has individual threshold (THR) and most-comfortable-level (MCL) settings that can be used to fit the dynamic range of the stimulus current for each electrode based on the user. Each channel has 3-bit programmability in both THR and MCL, and stimulation power increases with higher THR and MCL settings.
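The gain and compression settings described in this list can be illustrated with a small floating-point sketch. It assumes the envelope $X$ is normalized to the range 0 to 1, that the 3-bit gain code maps linearly onto the exponents -4 to 3, and that the compressed value is rounded onto the 6-bit output range; none of these encodings is specified in the text, and the function names are hypothetical.

```python
import math

def apply_gain(x, gain_code):
    """Global gain in octaves; codes 0..7 map to 2**-4 .. 2**3 (assumed encoding)."""
    return x * 2.0 ** (gain_code - 4)

def compress(x, c=1024):
    """Y = ln(1 + C*X) / ln(1 + C) for a normalized envelope X clipped to [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    return math.log1p(c * x) / math.log1p(c)

def to_6bit(y):
    """Map a compressed value in [0, 1] onto a 6-bit code (assumed rounding)."""
    return int(round(y * 63))

for c in (1024, 128, 16):
    codes = [to_6bit(compress(x, c)) for x in (0.001, 0.01, 0.1, 1.0)]
    print(f"C = {c:4d}: 6-bit codes for X = 0.001, 0.01, 0.1, 1.0 -> {codes}")
```

A larger compression factor lifts small envelope values toward the middle of the output range, which is why the choice of $C$ changes how strongly stimulation power depends on gain.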

#### B. Reconfigurable Filter Bank Architecture

A logarithmically-spaced filter bank requires higher frequency channels to be wider in bandwidth, while low frequency channels need to be narrow and more selective (i.e., higher filter order). In this work, high order filters are avoided by using multi-rate signal processing to achieve the narrow low frequency filters in a power- and area-efficient manner. As shown in Fig. 6, the ADC data is decimated in 3 stages using efficient 19-tap half-band FIR filters, resulting in data rates of 2, 4, 8, and 16 kHz which are used for channels A/B, C/D, E/F, and G/H respectively.
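The three-stage decimation can be sketched in plain Python. The taps below come from a Hamming-windowed sinc, used here only as a stand-in because the actual half-band coefficients are not given; the 500 Hz test tone and all function names are likewise illustrative.

```python
import math

def halfband_taps(n_taps=19):
    """Windowed-sinc half-band low-pass (cutoff at fs/4); a stand-in design."""
    mid = n_taps // 2
    taps = []
    for n in range(n_taps):
        k = n - mid
        if k == 0:
            ideal = 0.5
        elif k % 2 == 0:
            ideal = 0.0                      # half-band property: even offsets are zero
        else:
            ideal = math.sin(math.pi * k / 2.0) / (math.pi * k)
        window = 0.54 + 0.46 * math.cos(math.pi * k / mid)   # Hamming window
        taps.append(ideal * window)
    return taps

def decimate_by_2(x, taps):
    """FIR-filter x and keep every second output sample."""
    y = []
    for n in range(0, len(x), 2):
        acc = 0.0
        for k, h in enumerate(taps):
            if h != 0.0 and n - k >= 0:      # zero taps cost no multiplication
                acc += h * x[n - k]
        y.append(acc)
    return y

FS_HZ = 16000
taps = halfband_taps()
x = [math.sin(2 * math.pi * 500 * n / FS_HZ) for n in range(1600)]

rates_hz, streams = [FS_HZ], [x]
for _ in range(3):
    streams.append(decimate_by_2(streams[-1], taps))
    rates_hz.append(rates_hz[-1] // 2)

# 16 kHz feeds channels G/H, 8 kHz E/F, 4 kHz C/D, and 2 kHz A/B.
for rate, s in zip(rates_hz, streams):
    print(f"{rate} Hz: {len(s)} samples")
```

Nearly half of the taps in a half-band filter are zero, and each stage runs at half the rate of the previous one, which is the source of the power and area savings claimed for the multi-rate approach.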

The filter bank is implemented with FIR filters for their linear phase which can have a positive effect on sound quality and speech intelligibility. When the sound processor is configured in 8-channel mode, all channels are active. In 6- and 4-channel modes, the subsets of channels (D, H) and channels (B, D, F, H) are clock-gated respectively. The cut-off frequencies of the individual filters vary with the channel mode and they are based on [13], [14].

In order to achieve reconfigurability in the filter responses, the filter bank leverages three types of FIR filters with different levels of reconfigurability. Fig. 7 shows the most reconfigurable filter (Type 3) which can be programmed to three different filter lengths: 14, 16, or 20 taps used in the 4-, 6-, or 8-channel mode (using control signals *mode8*, *mode6*, and *mode4*) to adjust the frequency selectivity. Since the filters are symmetric, the filter is folded to reduce the number of multiplications by half. The coefficients are quantized to 8-bit precision which is the minimum possible without significantly affecting the desired frequency response, and word lengths are optimized for the given coefficients.
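The folding trick for symmetric filters can be shown with a small sketch: samples that share a coefficient are added first, so a 20-tap filter needs 10 multiplications per output instead of 20. The integer coefficients below are hypothetical placeholders within an 8-bit range, not the coefficients used on the chip.

```python
def fir_direct(window, taps):
    """Direct form: one multiplication per tap."""
    return sum(h * x for h, x in zip(taps, window))

def fir_folded(window, half_taps):
    """Folded form for symmetric taps: pre-add mirrored samples, then multiply once."""
    n = len(window)
    return sum(h * (window[i] + window[n - 1 - i]) for i, h in enumerate(half_taps))

# Hypothetical 8-bit coefficients for one half of a symmetric 20-tap filter.
half_taps = [1, -2, 3, -5, 8, -13, 21, 34, 55, 89]
taps = half_taps + half_taps[::-1]           # full symmetric impulse response

window = [((7 * n) % 31) - 15 for n in range(20)]   # arbitrary integer test samples

direct = fir_direct(window, taps)
folded = fir_folded(window, half_taps)
print(f"direct: {direct} using {len(taps)} multiplications")
print(f"folded: {folded} using {len(half_taps)} multiplications")
```

Both forms produce identical outputs; only the number of multiplications differs.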

The Type 3 FIR filter is used for channels A, C, E, and G which are active for all three modes. The Type 2 FIR filter used for channels B and F has two levels of reconfigurability and can be programmed to have either 16 or 20 taps used in the 6- or 8-channel modes only. Finally, the Type 1 FIR filter used for channels D and H is a fixed 20-tap filter because it is used in the 8-channel mode only.

Finally, taking channel A for example, it can be shown that its effective filter response at  $f_S$  = 16 kHz is given by

$$G_{A,eff}(z) = H_{HB1}(z)H_{HB2}(z^2)H_{HB3}(z^4)G_A(z^8), \quad \text{(5)}$$

where  $H_{HBi}(z)$  and  $G_A(z)$  represent the half-band and channel A FIR filters at their respective downsampled data rates. Although  $G_A(z)$  is only a 20-tap filter (in 8-channel mode), the effective filter order is much higher because of multi-rate signal processing. The effective frequency responses of the filter bank (at  $f_S = 16$  kHz) reconfigured in 4-, 6-, and 8-channel modes are shown in Fig. 8.
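Equation (5) can be evaluated directly by zero-stuffing each impulse response to the 16 kHz rate and convolving the results. The sketch below uses flat placeholder coefficients, since only the filter lengths (19-tap half-bands, 20-tap channel filter) are taken from the text; it is meant to show how long the effective response becomes, not its shape.

```python
def upsample_taps(taps, factor):
    """Impulse response of H(z**factor): insert factor-1 zeros between taps."""
    out = []
    for i, t in enumerate(taps):
        out.append(t)
        if i < len(taps) - 1:
            out.extend([0.0] * (factor - 1))
    return out

def convolve(a, b):
    """Full linear convolution of two coefficient lists."""
    y = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            y[i + j] += ai * bj
    return y

hb = [1.0 / 19] * 19     # placeholder for each 19-tap half-band filter
g_a = [1.0 / 20] * 20    # placeholder for the 20-tap channel A filter

g_eff = upsample_taps(hb, 1)
for taps, factor in ((hb, 2), (hb, 4), (g_a, 8)):
    g_eff = convolve(g_eff, upsample_taps(taps, factor))

print(f"effective length at 16 kHz: {len(g_eff)} taps")
```

With these lengths the cascade is equivalent to a single 279-tap filter at 16 kHz, which illustrates why the effective order is much higher than that of the 20-tap channel filter alone.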

# V. Energy-Efficient Arbitrary Waveform Neural Stimulator

In this section, we consider the design of the neural stimulator which delivers electrical current to the nerve fibers of the cochlea. The typical power consumption of the stimulator can often be a few milli-Watts [3], [22] which can represent greater than 90% of the total SoC power given that the PZFE and sound processor have micro-Watt power consumption. Therefore, any power savings in the stimulator translate directly to overall system power savings.

#### A. Computational Nerve Fiber Simulations

Most neural stimulators today deliver charge-balanced biphasic rectangular current pulses as shown by the green curve in Fig. 9(a), where the first (cathodic) phase excites the nerve fiber and the second (anodic) phase provides charge balancing. The rectangular waveform has been widely adopted for its simplicity and ease of generation with a simple current source. However, studies have shown that alternate waveforms have the potential to excite nerves with reduced energy [5], [23]. Based on [5], a heuristic search was applied to a computational model of an auditory nerve [4] to seek out an energy-optimal waveform with CI-specific parameters<sup>3</sup>. The waveform was constrained to be charge-balanced with 10 time steps/phase, but unlike [5], no constraint was placed on the shape of either phase of the pulse to allow both phases to be co-optimized. Fig. 9(a) (blue curve) shows the energy-optimal waveform after 10,000 search iterations at a phase width of 25  $\mu$ s, and it is 28% more energy-efficient than the conventional rectangular waveform at threshold. The energy-optimal waveform in this work is somewhat different from the truncated Gaussian shapes in [5] which may be attributed to a different nerve fiber model as well as the lack of constraint on the shape of the anodic phase.

#### B. Validation with In-vivo Measurements

In order to validate the modeling results and determine the impact of alternate waveforms on auditory perception in humans, the rectangular waveform was compared against the exponential waveform shown in Fig. 9(a) by conducting a psychophysical loudness perception test on four subjects with Advanced Bionics CIs<sup>4</sup>. The exponential waveform was used because it mimics the optimal waveform closely in the cathodic phase, but the anodic phase was constrained to be symmetric to the cathodic phase due to test limitations. The waveforms were alternated in pseudo-random order, and the amplitude was swept from threshold to just beyond the maximum comfortable level (CL) in $50 \,\mu\text{A}$ steps on a middle electrode. The subjects were asked to rate the loudness on a scale of 0 to 25, with 8 and 22 being the minimum and maximum CL respectively. Fig. 9(b) shows the average perceived loudness of both waveforms versus the energy delivered (per Ohm of the electrode resistance) for each of the 4 subjects. Overall, to achieve the same loudness within the comfortable range between 8 and 22, the alternate exponential waveform requires approximately 15% to 35% less energy than the rectangular waveform.

<sup>3</sup>Typical CIs use biphasic waveforms with a phase width between 25 and 50 $\mu$s.

<sup>4</sup>Tests were conducted under the Massachusetts Eye & Ear Infirmary IRB protocol #94-01-003.
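The quantities compared in this section, net charge and energy delivered per Ohm, can be computed for any sampled current waveform as shown in the sketch below. The amplitudes, the decaying-exponential shape, and its time constant are hypothetical, and the anodic phase is assumed to be a sign-inverted copy of the cathodic phase; the snippet shows how the metrics are evaluated and does not reproduce the measured savings, which were obtained at equal perceived loudness.

```python
import math

PHASE_WIDTH_S = 25e-6
STEPS_PER_PHASE = 10
DT = PHASE_WIDTH_S / STEPS_PER_PHASE

def biphasic(cathodic_amps):
    """Cathodic phase (negative current) followed by a sign-inverted anodic copy."""
    return [-a for a in cathodic_amps] + [a for a in cathodic_amps]

def charge(i_wave, dt=DT):
    """Net charge in coulombs; zero for a charge-balanced pulse."""
    return sum(i_wave) * dt

def energy_per_ohm(i_wave, dt=DT):
    """Integral of i(t)**2 dt, i.e. delivered energy divided by the resistance."""
    return sum(i * i for i in i_wave) * dt

PEAK_A = 100e-6    # hypothetical peak amplitude
TAU_STEPS = 4.0    # hypothetical decay constant, in time steps

rect = biphasic([PEAK_A] * STEPS_PER_PHASE)
expo = biphasic([PEAK_A * math.exp(-k / TAU_STEPS) for k in range(STEPS_PER_PHASE)])

rect_energy = energy_per_ohm(rect)
expo_energy = energy_per_ohm(expo)
print(f"rectangular: net charge {charge(rect):.1e} C, energy/ohm {rect_energy:.2e} J/ohm")
print(f"exponential: net charge {charge(expo):.1e} C, energy/ohm {expo_energy:.2e} J/ohm")
```

Because the two example pulses share a peak amplitude rather than a loudness, their energies are not directly comparable; a fair comparison scales each waveform until it evokes the same percept, as in the loudness test described above.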

### C. Arbitrary Waveform Stimulator Architecture

Based on the nerve fiber simulations and *in-vivo* measurement validation of alternate waveforms, this section describes the architecture of the arbitrary waveform stimulator in the SoC.

**1) High-Voltage Electrode Switch Matrix**—The CIS sound processing strategy permits a single current DAC to be interleaved among all electrodes using the high-voltage switch matrix shown in Fig. 10(a). The terminals of the intracochlear electrodes are designated by $E_i$ (for i=1 to 8), and $E_{com}$ is the common return electrode of a monopolar electrode array. A high-frequency $R_sC_d$ electrode model between $E_i$ and $E_{com}$ models the impedance of the electrode-tissue-electrode interface. Electrode $E_i$ is active when $S_i$ is asserted, and the switches $S_C$, $S_A$, $S_{iC}$, and $S_{iA}$ are used to control the direction of current flow during the cathodic and anodic phases for each electrode. During each phase, the value of the current is determined by a 6-bit current DAC ($I_{DAC}[5:0]$) which is driven by a digital waveform controller to realize any arbitrary waveform. Since the switch matrix works like an H-bridge, current always flows from $V_{MID}$ to ground. In between the cathodic and anodic phases, an optional switch ($S_{IPG}$) can be used to insert an inter-phase gap. Upon completion of each pulse, $S_i$ shorts the electrode to remove any residual charge. Although not pictured, a DC blocking capacitor (220 nF) is placed in series with the electrodes to ensure that no DC current flows into the tissue for safety reasons.

Fig. 11 shows the timing diagram of the switch matrix control over a complete stimulation cycle. Each cycle begins on the rising edge of $\varphi_{LO}$ (1 kHz), which generates a $stim\_start$ pulse and asserts en33 to enable the 3.3 V supply from which the current DAC circuits operate. Stimulation is enabled when $stim\_en$ is asserted, and the electrode selection signals $S_i$ are generated by a state machine that is clocked by $\varphi_{HI}$. Note that en33 rises half a cycle before $stim\_en$ in order to provide adequate time for the current DAC circuits to power up. Stimulation is complete on the positive edge of $stim\_done$, which de-asserts $stim\_en$ and en33.

**2) Current Steering DAC**—The current DAC shown in Fig. 10(b) is based on the voltage-controlled resistor (VCR) topology [24] which is chosen for its high output impedance and large voltage compliance. It provides 6 bits of resolution at a full-scale of 1 mA which is typically sufficient for CIs. Feedback ensures that the DAC current is simply  $I_{DAC} = V_{DSREF}/R_{VCR}$ . Detailed analysis of the VCR in [24] shows that  $I_{DAC}$  is linear with  $V_G[0]$  which drives the gate of the main triode device  $M_0$ .  $M_1$  to  $M_3$  are auxiliary devices controlled by  $V_G[1:3]$  and are used to linearize  $M_0$ . All devices are 3.3 V transistors except for  $M_{HV}$  which is a high-voltage device that connects to the high-voltage switch matrix.

In this work, current-steering is used to achieve the settling time required to generate arbitrary waveforms at a 25  $\mu$ s phase duration with 10 steps/phase. The input code D[5:0] from the digital waveform controller steers binary-weighted currents to  $I_P$  which are then mirrored to the resistor string on the output branch. The generated control voltages  $V_G[3:0]$  are linear with D[5:0] as desired. Finally, the current DAC is power-gated from 3.3 V after all electrodes have been stimulated so that its power scales linearly with the number of channels.

**3) Low-Voltage Digital Waveform Controller**—In order to minimize the power overhead of arbitrary waveform generation, the digital waveform controller operates at 0.6 V, and level shifters are used to interface to the high-voltage logic which drives the switch matrix. Fig. 12(a) shows the electrode selection state machine that generates $S_1$ to $S_8$ and other signals that govern the stimulation cycle. Control signals mode6or8 and mode8 are used to reconfigure the state machine between channel modes. The state machine is triggered on $\varphi_{LO}$, and the signals $S_i$ are shifted out serially with $\varphi_{HI,D}$ which is a delayed version of $\varphi_{HI}$. This ensures that $S_i$ transitions just after the negative edges of $S_A$ and $S_{iA}$ to avoid glitching in the switches. The control signals for the switch matrix are generated by gating $S_i$ and $stim\_en$ with non-overlapping clocks $\varphi_R$, $\varphi_P$, and $\varphi_M$ as shown in Fig. 12(b).

Finally, Fig. 12(c) shows the digital arbitrary waveform interface that controls the shape of the pulses delivered to the electrodes. The channel select block selects the appropriate sound processor output (dstimX for X = A to H) based on $S_i$. While each $S_i$ is asserted, a step counter (cnt) running at $\varphi_W$ (a high-frequency waveform clock) keeps track of the time step within the pulse. The value of cnt determines which waveform weight (w00 to w15) is multiplied with dstimX to generate D[5:0] which drives the current DAC. The shapes of the cathodic and anodic phases are determined by w00 to w07 and w08 to w15 respectively, for a maximum of 8 steps/phase.
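The weight lookup and multiply described above can be sketched in software as follows. The text does not give the bit width or number format of the weights, so the 4-bit fractional scaling (`WEIGHT_BITS`), the saturation to 6 bits, and the function name are assumptions made only for illustration. The codes are magnitudes, since the direction of current flow is set by the switch matrix.

```python
WEIGHT_BITS = 4        # assumed: a weight w represents w / 2**WEIGHT_BITS
STEPS_PER_PHASE = 8    # w00..w07 cathodic, w08..w15 anodic
DAC_MAX = 63           # 6-bit DAC code range

def dac_codes(dstim, weights):
    """Return (cathodic_codes, anodic_codes) for one biphasic pulse.

    dstim   -- 6-bit sound processor output of the selected channel
    weights -- 16 unsigned integer weights, w00..w15
    """
    if not 0 <= dstim <= DAC_MAX:
        raise ValueError("dstim must fit in 6 bits")
    if len(weights) != 2 * STEPS_PER_PHASE:
        raise ValueError("expected 16 weights")
    codes = []
    for cnt in range(2 * STEPS_PER_PHASE):          # step counter
        product = (dstim * weights[cnt]) >> WEIGHT_BITS
        codes.append(min(product, DAC_MAX))         # saturate to 6 bits
    return codes[:STEPS_PER_PHASE], codes[STEPS_PER_PHASE:]

# Unity weights reproduce a rectangular pulse at the channel amplitude.
rect_cath, rect_anod = dac_codes(40, [16] * 16)

# A hypothetical decaying shape, mirrored in the anodic phase.
shape = [16, 12, 9, 7, 5, 4, 3, 2]
shaped_cath, shaped_anod = dac_codes(32, shape + shape)
print(rect_cath, shaped_cath)
```

With this scaling, weights above 16 give a gain greater than one, which is one way a shaped pulse could reach a higher peak current than the rectangular pulse for the same channel amplitude.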

## VI. Measurement Results

A prototype SoC was fabricated in a 0.18  $\mu$ m high-voltage CMOS process, and the die micrograph is shown in Fig. 13. The chip including pads measures 3.6 mm  $\times$  3.6 mm, while the active area is 3.36 mm<sup>2</sup>. This section presents the measured results, and a summary of the performance is provided in Table I.

### A. Piezoelectric Sensor Front-End

Figs. 14(a) and (b) show the measured gain response of the charge amplifier for  $C_P = 3.2$  nF and 0.56 nF which span the expected values for reasonable sizes of the sensor. Fig. 14(c) shows the combined response of the charge amplifier and PGA with  $C_P = 0.56$  nF and  $C_{1f} = 12$  pF for various PGA settings. Simulation results are shown with dotted lines and show good agreement with measured results. Furthermore, the integrated noise over the sound processor bandwidth is  $2.81~\mu V_{rms}$  and  $1.93~\mu V_{rms}$  for  $C_P = 0.56$  nF and 3.2 nF respectively, which is less than the minimum expected signal at 40 dB SPL.

### B. Reconfigurable Sound Processor

To demonstrate the reconfigurability in the number of channels of the processor, a logarithmic chirp signal was applied at the input of the ADC. Fig. 15(a) shows the measured spectrogram at the output of the ADC, and Figs. 15(b), (c), and (d) show the measured spectrogram of the processor configured in 4-, 6-, and 8-channel modes. Since the processor features a logarithmically-spaced filter bank, its spectrogram looks linear as expected. A Matlab simulation of the 8-channel processor is shown in Fig. 15(e), showing good agreement with the measured results.
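The reason a logarithmic chirp produces a straight line on a logarithmically-spaced filter bank can be checked with a short calculation: the chirp reaches geometrically spaced frequencies at equally spaced times. The sketch below assumes a 1 s chirp that sweeps the 300 Hz to 5.5 kHz processor bandwidth and assumes purely geometric band centers. The actual band edges and chirp parameters are not given in this section, so these values are placeholders.

```python
import math

def log_chirp_time(f_hz, f_start, f_stop, duration_s):
    """Time at which a logarithmic chirp f(t) = f_start * (f_stop/f_start)**(t/T)
    passes through frequency f_hz."""
    return duration_s * math.log(f_hz / f_start) / math.log(f_stop / f_start)

def geometric_centers(f_low, f_high, n_channels):
    """Geometric center of each of n_channels log-spaced bands in [f_low, f_high]."""
    ratio = (f_high / f_low) ** (1.0 / n_channels)
    return [f_low * ratio ** (k + 0.5) for k in range(n_channels)]

F_LOW, F_HIGH, T_CHIRP = 300.0, 5500.0, 1.0   # assumed sweep parameters

centers = geometric_centers(F_LOW, F_HIGH, 8)
times = [log_chirp_time(f, F_LOW, F_HIGH, T_CHIRP) for f in centers]
gaps = [b - a for a, b in zip(times, times[1:])]
print([round(t, 4) for t in times])
```

The crossing times are separated by a constant interval, so the channel index of the active band increases linearly with time.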

### C. Energy-Efficient Arbitrary Waveform Stimulator

The measured INL and DNL are -0.21/+1.25 LSB and -0.14/+0.16 LSB respectively, which is adequate for neural stimulation applications. Fig. 16 shows the measured current and voltage from a model electrode ($R_s = 3 \text{ k}\Omega$, $C_d = 10 \text{ nF}$) for (a) a rectangular waveform and (b) the energy-optimal waveform for phase widths of 25 $\mu$s and 50 $\mu$s with 8 steps/phase. Even though the energy-optimal waveform requires a higher peak current than the rectangular waveform, it consumes less energy and generates a smaller electrode voltage.
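For a series $R_sC_d$ electrode model, the electrode voltage is the resistive drop plus the voltage accumulated on $C_d$. A minimal sketch of this relation for a stepped current waveform is given below, using the model values quoted above. The 0.5 mA amplitude is an assumed example rather than the amplitude used in Fig. 16, and the 220 nF DC blocking capacitor is ignored.

```python
import math

R_S = 3e3      # series resistance of the model electrode, ohms
C_D = 10e-9    # double-layer capacitance of the model electrode, farads

def electrode_voltage(samples_a, dt_s, r_s=R_S, c_d=C_D):
    """Voltage across a series R_s-C_d model at the end of each time step,
    for a piecewise-constant current waveform."""
    charge = 0.0
    voltages = []
    for i in samples_a:
        charge += i * dt_s                    # charge stored on C_d so far
        voltages.append(r_s * i + charge / c_d)
    return voltages

STEPS = 8
DT = 25e-6 / STEPS                            # 25 us phase width
AMP = 0.5e-3                                  # assumed amplitude, amperes

pulse = [-AMP] * STEPS + [AMP] * STEPS        # cathodic phase, then anodic
v = electrode_voltage(pulse, DT)
print(min(v), max(v))
```

For this rectangular example the largest excursion occurs at the end of the cathodic phase, where the 1.5 V resistive drop adds to the 1.25 V built up on $C_d$.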

Fig. 16(c) shows a measurement of the interleaved current pulse trains at 1,000 pulses/sec per electrode through all 8 electrodes. In this measurement, the pulses are programmed to be rectangular with a phase width of 31.25  $\mu$ s such that the current DAC is active for 50% of the 1 ms period, and power-gated for the remainder of the period.
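The 50% figure follows from the interleaving: each of the 8 electrodes receives one two-phase pulse per 1 ms period. A small sketch of that arithmetic is shown here. The function name is an illustrative choice, and the calculation assumes no inter-phase gap and ignores the power-up lead time of the DAC supply.

```python
import math

def dac_duty_cycle(n_channels, phase_width_s, pulse_rate_hz=1000.0):
    """Fraction of each stimulation period spent delivering biphasic pulses
    when one current DAC is interleaved across n_channels electrodes."""
    active_s = n_channels * 2 * phase_width_s   # two phases per electrode
    return active_s * pulse_rate_hz

for n in (8, 6, 4):
    print(n, dac_duty_cycle(n, 31.25e-6))
```

Under these assumptions the duty cycle is about 50%, 37.5%, and 25% in 8-, 6-, and 4-channel modes at 31.25 $\mu$s/phase, which is consistent with the stimulator power scaling with the number of channels.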

### D. Power Consumption

**1) Stimulator Power**—Fig. 17 summarizes the measured stimulator power in 8-, 6-, and 4-channel modes while processing a clip of speech with $V_{MID} = 7$ V. The measurement also includes the power consumption of the low-voltage digital waveform controller, level shifters, current DAC, and high-voltage switch matrix control circuits. The power increases with the duty cycle of the current DAC which scales with the phase width and number of channels. Finally, the energy-optimal waveform provides power savings of approximately 22% and 29% at phase widths of 25 $\mu$s and 50 $\mu$s respectively.

**2) Overall SoC Power**—Table II summarizes the total SoC power consumption with typical speech input using the energy-optimal waveform at 31.25 $\mu$s/phase and nominal processor settings. Reconfigurability in the number of channels allows for a power-performance tradeoff, and the SoC consumes 572, 425, and 281 $\mu$W in 8-, 6-, and 4-channel modes respectively, meeting the 1 mW requirement. In 8-channel mode, the PZFE, SAR ADC, and sound processor consume only 2% of the total power, while 98% of the power is consumed by the stimulator circuits. Therefore, the stimulation power savings from the energy-optimal waveform transfer directly to the overall system.
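The 8-channel totals can be cross-checked from the supply voltages and currents listed in Table II. The sketch below simply multiplies and sums those entries. The dictionary layout and names are illustrative, and the numbers are the 8-channel values copied from the table.

```python
# (supply voltage in V, supply current in uA) for 8-channel mode, from Table II
FRONT_END = {
    "Piezoelectric sensor front-end": (1.5, 6.83),
    "SAR ADC": (0.6, 0.51),
    "Digital CIS sound processor": (0.6, 2.65),
}
STIMULATOR = {
    "Digital waveform interface": (0.6, 0.58),
    "Level shifters": (1.8, 0.21),
    "Current DAC circuits": (3.3, 22.7),
    "Stimulator supply": (7.0, 68.4),
    "Switch matrix control": (9.0, 0.60),
}

def block_power_uw(blocks):
    """Sum of V * I over a group of blocks, in microwatts."""
    return sum(v * i_ua for v, i_ua in blocks.values())

front_end_uw = block_power_uw(FRONT_END)
stimulator_uw = block_power_uw(STIMULATOR)
total_uw = front_end_uw + stimulator_uw
stimulator_share = stimulator_uw / total_uw

print(round(total_uw), round(100 * stimulator_share))
```

The sum rounds to the reported 572 $\mu$W, with roughly 98% attributable to the stimulator and switch matrix.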

### E. System Demonstration with a Human Cadaveric Temporal Bone

The FICI SoC was tested with a piezoelectric sensor mounted at the umbo of a human cadaveric temporal bone. A function generator and audio amplifier were used to generate a clip of speech ("her husband brought some flowers") which was played into the ear canal of the temporal bone with a speaker at 70 dB SPL. The signal from the umbo-mounted sensor was detected by the PZFE and processed by the sound processor. Fig. 18(a) shows the spectrogram and time-domain waveform of the input speech signal in the ear canal, and Fig. 18(b) shows the measured spectrogram and reconstructed sound (the reconstruction process is based on sound synthesis in vocoder applications) from the processor in 8-channel mode. The output from the umbo-mounted sensor and SoC preserves the temporal envelope information of the speech signal with the exception of some high frequency content, demonstrating hearing with a human cadaveric ear.

## VII. Discussion and Future Work

The eventual goal of a completely invisible CI will require the development of other subsystems. Most notably, the user will need an external device (used a few times per day) capable of two key functions. First, a wireless charging system is needed to recharge the implanted power management unit (PMU). The charging system should be able to bring the implanted unit from empty to fully charged in just a few minutes to minimize the inconvenience to the user. In order to increase the time between recharges, the implanted PMU must be as efficient as possible, which will require clever design of DC-DC converters and regulators to generate the required supply voltages for the SoC. If an ultra-capacitor is used as the energy storage element, the PMU would have the added requirement of being able to accommodate a terminal voltage that will decrease with use.

The second key function of the external device would be to provide wireless data transfer to the implanted unit to allow the user to select from a range of programs as determined by their audiologist. Each program could have a different combination of front-end and sound processor settings. Furthermore, a wireless data link would also allow the user to manually select high-performance or low-power modes corresponding to 8-channel or 6/4-channel modes respectively. Alternatively, a smart power management system could monitor the charge of a battery or terminal voltage of an ultra-capacitor and adjust the number of channels in the processor to extend the time before a recharge is required.

## VIII. Conclusion

This paper presents a SoC with an invisible middle-ear sensor for a fully-implantable CI. First, a piezoelectric sensor detects sound in the ear canal by converting the mechanical motion of the middle ear into an electrical signal that is captured by the PZFE. Measurements with human cadaveric ears show that the sensor is capable of detecting sounds from 40 to 90 dB SPL over the bandwidth of interest. Second, a highly-reconfigurable sound processor leverages digital processing to provide greater programmability over analog approaches, and enable voltage scaling down to 0.6 V to maximize energy-efficiency. The number of channels can be reconfigured between 8, 6, or 4 to enable a power-performance tradeoff, and all processor settings are programmable to ensure a patient-specific fit. Third, an auditory nerve fiber model is used to determine an energy-optimal biphasic stimulation waveform which was validated *in-vivo* with four human CI users and shown to provide 15% to 35% energy savings. A mixed-signal arbitrary waveform stimulator is used to deliver energy-efficient current pulses to the auditory nerve. In 8-channel mode, the SoC consumes just 572 $\mu$W of power, 98% of which is attributed to the stimulator. Therefore, the energy savings of the energy-optimal stimulation waveform transfer directly to the overall system. The SoC integrates implantable acoustic sensing, sound processing, and neural stimulation on one chip to minimize the implant size, and proof-of-concept is demonstrated by using the SoC and umbo-mounted sensor to detect a clip of speech played into the ear canal of a human cadaver ear.

## Acknowledgments

This work was supported by NSERC and the Bertarelli Foundation.

The authors would like to acknowledge Don Eddington and Victor Noel for help with the stimulation waveform measurements, Dave Perreault for helpful discussion, and the TSMC University Shuttle Program for chip fabrication.

## References

- 1. Young DJ, Zurcher MA, Semaan M, Megerian CA, Ko WH. MEMS capacitive accelerometer-based middle ear microphone. IEEE Trans Biomed Eng. Dec; 2012 59(12):3283–3292. [PubMed: 22542650]
- 2. Germanovix W, Toumazou C. Design of a micropower current-mode log-domain analog cochlear implant. IEEE Trans Circuits Syst II. Oct; 2000 47(10):1023–1046.
- 3. Sarpeshkar, R.; Baker, MW.; Salthouse, CD.; Sit, JJ.; Turicchia, L.; Zhak, SM. An analog bionic ear processor with zero-crossing detection. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers; San Francisco, CA. Feb. 2005; p. 78-79.
- 4. Whiten, DM. PhD dissertation. Massachusetts Institute of Technology; Cambridge, MA: Feb. 2007 Electro-anatomical models of the cochlear implant.
- 5. Wongsarnpigoon A, Grill WM. Energy-efficient waveform shapes for neural stimulation revealed with a genetic algorithm. J Neural Eng. Aug. 2010 7(4)
- 6. Kelly SK, Wyatt JL Jr. A power-efficient neural tissue stimulator with energy recovery. IEEE Trans Biomed Circuits Syst. Feb; 2011 5(1):20–29. [PubMed: 23850975]
- 7. Barbara M, Manni V, Monini S. Totally implantable middle ear device for rehabilitation of sensorineural hearing loss: preliminary experience with the esteem envoy. Acta Oto-Laryngologica. Apr; 2009 129(4):429–432. [PubMed: 19117172]
- 8. Kroll K, Grant I, Javel E. The Envoy totally implantable hearing system, St. Croix Medical. Trends in Amplification. 2002; 6(2):73–80. [PubMed: 25425915]
- 9. Maniglia A, Abbass H, Azar T, Kane M, Amantia P, Garverick S, Ko WH, Frenz W, Falk T. The middle ear bioelectronic microphone for a totally implantable cochlear hearing device for profound and total hearing loss. The American Journal of Otology. 1999; 20(5):602–611. [PubMed: 10503582]
- 10. Jenkins A, Atkins J, Horlbeck D, Hoffer M, Balough B, Arigo JV, Alexiades G, Garvis W. U.S. phase I preliminary results of use of the Otologics MET fully-implantable ossicular stimulator. Otolaryng Head Neck Surgery. Aug; 2007 137(2):206–232.
- 11. James CJ, Skinner MW, Martin LFA, Holden LK, Galvin KL, Holden TA, Whitford L. An investigation of input level range for the nucleus 24 cochlear implant system: Speech perception performance, program preference, and loudness comfort ratings. Ear & Hearing. Apr; 2003 24(2):157–174. [PubMed: 12677112]
- 12. Zeng FG, Grant G, Niparko J, Galvin J, Shannon R, Opie J, Segel P. Speech dynamic range and its effect on cochlear implant performance. J Acoust Soc Am. Jan; 2002 111(1):377–386. [PubMed: 11831811]
- 13. Loizou PC, Dorman M, Tu Z. On the number of channels needed to understand speech. J Acoust Soc Am. Oct; 1999 106(4):2097–2103. [PubMed: 10530032]
- 14. Nie K, Barco A, Zeng FG. Spectral and temporal cues in cochlear implant speech perception. Ear & Hearing. Apr; 2006 27(2):208–217. [PubMed: 16518146]
- 15. Shannon RV, Zeng FG, Kamath V, Wygonski J, Ekelid M. Speech recognition with primarily temporal cues. Science. Oct; 1995 270(5234):303–304. [PubMed: 7569981]
- 16. Fishman KE, Shannon RV, Slattery WH. Speech recognition as a function of the number of electrodes used in the SPEAK cochlear implant speech processor. J Speech Lang Hear Res. Oct; 1997 40(5):1201–1215. [PubMed: 9328890]
- 17. Fu QJ, Shannon RV, Wang X. Effects of noise and spectral resolution on vowel and consonant recognition: Acoustic and electric hearing. J Acoust Soc Am. Dec; 1998 104(6):3586–3596. [PubMed: 9857517]
- 18. Faulkner A, Rosen S, Wilkinson L. Effects of the number of channels and speech-to-noise ratio on rate of connected discourse tracking through a simulated cochlear implant speech processor. Ear & Hearing. Oct; 2001 22(5):431–438. [PubMed: 11605950]
- 19. Yip, M.; Jin, R.; Nakajima, HH.; Stankovic, KM.; Chandrakasan, AP. A fully-implantable cochlear implant SoC with piezoelectric middle ear sensor and energy-efficient stimulation in 0.18 μm HVCMOS. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers; San Francisco, CA. Feb. 2014; p. 312-313.
- 20. Yip M, Chandrakasan AP. A resolution-reconfigurable 5-to-10-bit 0.4-to-1 V power scalable SAR ADC for sensor applications. IEEE J Solid-State Circuits. Jun; 2013 48(6):1453–1464.
- 21. Wilson BS, Finley CC, Lawson DT, Wolford RD, Eddington DK, Rabinowitz WM. Better speech recognition with cochlear implants. Letters to Nature. Jul.1991 352:236–238.
- 22. Georgiou J, Toumazou C. A 126-μW cochlear chip for a totally implantable system. IEEE J Solid-State Circuits. Feb; 2005 40(2):430–443.
- 23. Sahin M, Tie Y. Non-rectangular waveforms for neural stimulation with practical electrodes. J Neural Eng. Sep; 2007 4(3):227–233. [PubMed: 17873425]
- 24. Ghovanloo M, Najafi K. A compact large voltage-compliance high output-impedance programmable current source for implantable microstimulators. IEEE Trans Biomed Eng. Jan; 2005 52(1):97–105. [PubMed: 15651568]



**Fig. 1.** Block diagram of a conventional cochlear implant.

**Fig. 2.** Block diagram of the fully-implantable cochlear implant SoC.

**Fig. 3.** (a) Block diagram and (b) photograph of the measurement setup and discrete prototype used to characterize the piezoelectric sensor mounted on the middle ear of a human cadaveric temporal bone.

**Fig. 4.** (a) Umbo velocity and (b) charge amplifier output voltage versus ear canal sound pressure at 0.5, 1, 2, and 4.7 kHz. (c) Spectrum of the charge amplifier output for sound pressure levels from 40 to 90 dB SPL.

**Fig. 5.** (a) Equivalent circuit for the charge amplifier including noise sources, and (b) the corresponding block diagram.

**Fig. 6.** Block diagram of the 0.6 V digital reconfigurable multi-rate CIS sound processor.

**Fig. 7.** Structure of the reconfigurable FIR filter (Type 3) used for channels A, C, E, G that can be reconfigured into 3 modes: 14-, 16-, and 20-tap used in 4-, 6-, or 8-channel modes.

**Fig. 8.** Effective frequency response of the multi-rate filter bank at 16 kHz reconfigured in (a) 4-channel, (b) 6-channel, and (c) 8-channel modes.

**Fig. 9.** (a) Energy-optimal stimulation waveform at 25 $\mu$s/phase from the heuristic search using the computational nerve fiber model. Rectangular and exponential waveforms are included for comparison. (b) Perceived loudness versus energy delivered per phase from four human subjects.

**Fig. 10.** (a) Schematic of the high-voltage electrode switch matrix during the cathodic phase of electrode 2. (b) Schematic of the fast-settling 6-bit current steering DAC.

**Fig. 11.** Timing diagram for the digital control of the electrode switch matrix.

**Fig. 12.** Ultra-low-voltage digital control of the stimulator. (a) Electrode selection state machine. (b) High-voltage electrode switch matrix control generation. (c) Digital arbitrary waveform interface.

**Fig. 13.** Die micrograph of the prototype SoC.

**Fig. 14.** Measured gain response of the charge amplifier (stage 1) of the PZFE with (a) $C_P = 3.2$ nF and (b) $C_P = 0.56$ nF. Panel (c) shows the combined response of the charge amplifier and PGA (stage 1 and 2). Simulation results are shown with dotted lines.

**Fig. 15.** Measured spectrograms at the output of the (a) ADC, (b) 4-channel processor, (c) 6-channel processor, and (d) 8-channel processor when a logarithmic chirp signal is applied at the input. (e) Ideal Matlab simulation to compare against the measured results shown in (d).

**Fig. 16.** Measured current and voltage of a model electrode ($R_s = 3 \text{ k}\Omega$, $C_d = 10 \text{ nF}$) with (a) a rectangular waveform, and (b) the energy-optimal waveform. (c) Measured current pulse trains at 1,000 pulses/sec through all electrodes in 8-channel mode.

**Fig. 17.** Measured total stimulator power across 8-, 6-, and 4-channel modes for phase widths of (a) $25 \mu s$ and (b) $50 \mu s$.

**Fig. 18.** (a) Spectrogram and time-domain waveform of the input speech signal ("her husband brought some flowers") to the audio amplifier driving the speaker placed in the ear canal of the temporal bone. (b) Measured spectrogram and reconstructed sound from the SoC with the piezoelectric sensor mounted on a cadaver temporal bone.


**TABLE I** Measured performance summary for each sub-system of the FICI SoC.

| Sub-system | Parameter | Measured value |
|---|---|---|
| Piezoelectric sensor front-end | Sensor size ($C_P$) | 0.56 nF and 3.2 nF |
| | Stage 1 gain | 21 to 41 dB ($C_P = 0.56$ nF); 34 to 53 dB ($C_P = 3.2$ nF) |
| | Stage 2 gain | 0.8, 6.8, 12.4, 18, 23.8, 29.4 dB |
| | Stage 3 gain | 12 dB (ADC buffer) |
| | Bandwidth | 300 Hz to 6 kHz |
| | Dynamic range | 62 dB ($C_P = 0.56$ nF); 59 dB ($C_P = 3.2$ nF) |
| | Input-referred noise | $2.81~\mu V_{rms}$ ($C_P = 0.56$ nF); $1.93~\mu V_{rms}$ ($C_P = 3.2$ nF) |
| 9-bit SAR ADC | $f_S$ | 16 kS/s |
| | SNDR | 52 dB (8.35 ENOB) |
| | INL/DNL | 0.69/0.64 LSB |
| Reconfigurable CIS sound processor | Bandwidth | 300 Hz to 5.5 kHz |
| | Filter bank | Multi-rate FIR with -30 dB stop band |
| | Output | 6 bits @ 1 kHz |
| | Number of channels | 8, 6, or 4 |
| | Rectification | Half-wave or full-wave |
| | Compression | $Y = \frac{\ln{(1+CX)}}{\ln{(1+C)}}$ for $C = 1024, 128, 16$ |
| | Channel gain | 1/16, 1/8, 1/4, 1/2, 1, 2, 4, 8 |
| | Patient THR | 3 bits/channel |
| | Patient MCL | 3 bits/channel |
| Arbitrary waveform neural stimulator | Pulse rate/channel | 1000 pulses/sec |
| | Electrodes | Monopolar array, up to 8 electrodes |
| | Current DAC ($V_{MID} = 7$ V) | 6 bits, 1 mA full-scale<br>INL/DNL: 1.25/0.16 LSB<br>Voltage compliance: 6.78 V<br>Output impedance: $> 20~\text{M}\Omega$ |
| | Waveform | Biphasic, 25–50 μs/phase<br>Arbitrary shape (8 time steps/phase) |
| | DC block capacitors | Yes, required |
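The compression law listed in Table I can be written directly as a function. The sketch below assumes that the input $X$ is normalized to the range 0 to 1, which the table does not state, and the function name is illustrative.

```python
import math

def log_compress(x, c):
    """Logarithmic compression Y = ln(1 + C*X) / ln(1 + C).

    x -- envelope value, assumed to be normalized to [0, 1]
    c -- compression factor (Table I lists 1024, 128, and 16)
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x is assumed to lie in [0, 1]")
    return math.log1p(c * x) / math.log1p(c)

for c in (1024, 128, 16):
    print(c, [round(log_compress(x, c), 3) for x in (0.0, 0.01, 0.1, 1.0)])
```

A larger $C$ boosts small inputs more strongly, while the endpoints $X = 0$ and $X = 1$ always map to 0 and 1.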

**TABLE II** Summary table of the measured SoC power consumption.

| System component                               | Supply voltage [V] | Supply current [µA]            | Power [µW]                     | % of total power in 8-channel mode     |
|------------------------------------------------|--------------------|--------------------------------|--------------------------------|----------------------------------------|
| Piezoelectric sensor front-end                 | 1.5                | 6.83                           | 10.25                          | 1.8%                                   |
| SAR ADC (9 bits, 16 kS/s)                      | 0.6                | 0.51                           | 0.31                           | 0.05%                                  |
| Digital CIS sound processor                    | 0.6                | (8/6/4 chan)<br>2.65/2.10/1.87 | (8/6/4 chan)<br>1.59/1.26/1.12 | 0.28%                                  |
| Stimulator and switch matrix:                  |                    | (8/6/4 chan)                   | (8/6/4 chan)                   |                                        |
| Digital waveform interface                     | 0.6                | 0.58/0.54/0.49                 | 0.35/0.32/0.30                 | 0.06%                                  |
| Level shifters                                 | 1.8                | 0.21/0.16/0.11                 | 0.38/0.29/0.20                 | 0.07%                                  |
| Current DAC circuits                           | 3.3                | 22.7/17.4/12.1                 | 74.9/57.4/39.9                 | 13%                                    |
| Stimulator supply ( $V_{MID} = 5$ to 10 V)     | 7                  | 68.4/50.2/32.3                 | 479/351/226                    | 84%                                    |
| Switch matrix control ( $V_{DDG} = 7$ to 12 V) | 9                  | 0.60/0.45/0.30                 | 5.40/4.05/2.70                 | 0.94%                                  |
| Total (8-channel mode)                         |                    |                                | 572                            | 100%                                   |
| Total (6-channel mode)                         |                    |                                | 425                            |                                        |
| Total (4-channel mode)                         |                    |                                | 281                            |                                        |