Author manuscript; available in PMC: 2017 Jul 1.
Published in final edited form as: IEEE Trans Circuits Syst I Regul Pap. 2016 Jun 29;63(7):972–981. doi: 10.1109/TCSI.2016.2556122

Micropower Mixed-signal VLSI Independent Component Analysis for Gradient Flow Acoustic Source Separation

Milutin Stanaćević 1, Shuo Li 2, Gert Cauwenberghs 3
PMCID: PMC5287422  NIHMSID: NIHMS805679  PMID: 28163663

Abstract

A parallel micropower mixed-signal VLSI implementation of independent component analysis (ICA) with reconfigurable outer-product learning rules is presented. With gradient sensing of the acoustic field over a miniature microphone array as a pre-processing step, the proposed ICA implementation can separate and localize up to 3 sources in a mildly reverberant environment. The ICA processor is implemented in 0.5 µm CMOS technology and occupies an area of 3 mm × 3 mm. At a 16 kHz sampling rate, the ASIC consumes 195 µW from a 3 V supply. The outer-product implementation of the natural gradient and Herault-Jutten ICA update rules demonstrates performance comparable to the benchmark FastICA algorithm in ideal conditions and more robust performance in noisy and reverberant environments. Experiments demonstrate perceptually clear separation and precise localization, over a wide range of separation angles, of two speech sources presented through speakers positioned 1.5 m from the array on a conference room table. The presented ASIC leads to a microsystem with an extremely small form factor and low power consumption for source separation and localization, as required in applications such as intelligent hearing aids and wireless distributed acoustic sensor arrays.

Index Terms: Blind source separation, Independent component analysis, Micropower techniques

I. Introduction

Blind source separation (BSS) has long been considered a hard signal processing problem, and different algorithms exist for a wide range of applications in speech processing [1], wireless communications [2] and biomedical signal processing [3]. Independent component analysis (ICA) is a signal processing technique for solving the instantaneous BSS problem; it can be formulated as a linear transformation that minimizes the statistical dependence between the components of a random data vector [4]. Few analog VLSI implementations of ICA exist in the literature [5], [6]. In the digital domain, high-power FPGA implementations are common practice [7], [8], and a few digital ASICs [9], [10] have been reported.

For blind separation of acoustic sources, new opportunities arise with the miniaturization of microphones, especially with the design of MEMS microphones that can capture the field distribution of the impinging sound source [11], [12]. While the human auditory system performs remarkably well in segregating multiple streams of acoustic sources, the performance of modern hearing aids deteriorates significantly in the presence of multiple signal and noise sources in the acoustic scene [13]. To resolve the signal of interest effectively, both localization and separation of multiple acoustic sources are required. We have demonstrated that the direction of acoustic wave propagation can be estimated by differential spatial sensing of the field on a sub-wavelength scale, using the gradient flow technique [14], [15]. A mixed-signal VLSI implementation of the method [16] has demonstrated improved power dissipation and bearing resolution over conventional bearing estimation localizers. Besides its use in bearing estimation, gradient flow provides an efficient signal representation as a front-end for blind source separation of traveling wave signals. In the presence of multiple signal sources, gradient flow converts the problem of separating unknown delayed mixtures of independent signal sources into the simpler problem of separating instantaneous mixtures of the time-differentiated source signals [14]. This formulation is equivalent to the problem statement in independent component analysis. An implementation of ICA in mixed-signal VLSI, combined with the spatial gradient sensing ASIC, yields a real-time low-power system for separation and localization of multiple independent signal sources impinging on a miniature microphone array. We have previously reported an implementation of the ICA update rule formulated in outer-product form with fixed diagonal terms [17], [18]. The ICA architecture presented here implements the natural gradient update rule with a back-propagation path, providing improved stability and faster convergence.

The paper is organized as follows. Section II reviews algorithms for static linear ICA and describes how gradient flow leads to joint localization and separation in the case of multiple traveling source waves. A general mixed-signal parallel architecture that can be configured to implement various ICA update rules is presented in Section III. Experimental characterization of a fabricated prototype and demonstration of its separation performance are presented in Section IV, followed by concluding comments in Section V.

II. Blind Source Separation

The blind source separation problem can be formulated as follows. N unknown sources s(t) = [s1(t), s2(t), …, sN(t)]^T propagate through an unknown medium and are observed by an array of M sensors x(t) = [x1(t), x2(t), …, xM(t)]^T. The task is to obtain estimates of the source signals, y(t) = [y1(t), y2(t), …, yN(t)]^T, from the observed sensor signals. The common assumption used in solving the separation problem is that the source signals are statistically independent [4]. The model of the mixing process is determined by the setting of the problem. In the case of linear instantaneous mixing, the observed signals x are linear combinations of the unknown source signals

x(t)=As(t)+n(t), (1)

where A is the M×N mixing matrix and n is the additive noise. When the number of sources N is greater than the number of observations M (N > M), some prior information on the source signals is necessary to solve the problem.

A. Gradient Flow Signal Representation

The linear instantaneous mixing model is not applicable when the source signals are traveling waves, such as acoustic waves. We devised a signal conditioning technique, gradient flow [14], for unmixing the signals observed on a miniature microphone array in which the distance between microphones is much smaller than the wavelength of the source signals. Gradient flow separates and localizes sources by relating spatial and temporal derivatives of the impinging acoustic field, as illustrated in Figure 1(a). In the case of a single source, observation of the first-order spatial gradients of the acoustic field, ξ10(t) and ξ01(t), in perpendicular directions p and q in the plane enables indirect estimation of the time delays τp and τq, as illustrated in Figure 1(b),

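$$\begin{aligned} \xi_{10}(t) &\approx \tau_p\,\dot{\xi}_{00}(t)\\ \xi_{01}(t) &\approx \tau_q\,\dot{\xi}_{00}(t), \end{aligned} \tag{2}$$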

where ξ̇00(t) is the time-differentiated spatial common mode of the acoustic field ξ00(t). The propagation delays τp and τq directly yield the azimuth θ and the elevation ϕ angle of the source.
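As a rough illustration of (2), each delay can be estimated by a least-squares fit of a spatial gradient onto the time-differentiated common mode. The sketch below is not the bearing estimator of [16]; the function names, the toy data and the azimuth convention θ = atan2(τq, τp) are assumptions made only for illustration.

```python
import math

def estimate_delay(gradient, common_dot):
    """Least-squares fit of gradient[n] ~ tau * common_dot[n]; returns tau."""
    num = sum(g * c for g, c in zip(gradient, common_dot))
    den = sum(c * c for c in common_dot)
    return num / den

def azimuth_deg(tau_p, tau_q):
    """Azimuth in degrees, ASSUMING tau_p ~ cos(theta) and tau_q ~ sin(theta)."""
    return math.degrees(math.atan2(tau_q, tau_p))

# Toy data (hypothetical): an arbitrary differentiated common-mode signal
# and a single source placed at an azimuth of 30 degrees.
xi00_dot = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1]
tau = 1.0  # delay magnitude in arbitrary units
tau_p_true = tau * math.cos(math.radians(30.0))
tau_q_true = tau * math.sin(math.radians(30.0))
xi10 = [tau_p_true * v for v in xi00_dot]
xi01 = [tau_q_true * v for v in xi00_dot]

tau_p = estimate_delay(xi10, xi00_dot)
tau_q = estimate_delay(xi01, xi00_dot)
theta = azimuth_deg(tau_p, tau_q)
```

With noise-free data the fit recovers the delays exactly; with sensor noise it would return the regression coefficient that minimizes the squared residual.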

Fig. 1.

(a) Gradient flow principle. At low aperture, interaural level differences (ILD) and interaural time differences (ITD) are directly related, scaled by the temporal derivative of the signal [18]. (b) Propagation delays τp and τq scale the spatial gradients in perpendicular directions. (c) Joint separation and localization of up to three sources s1, s2 and s3 impinging on the array.

When multiple sources are impinging on the miniature array, as illustrated in Figure 1(c) for three sources, the observed first order spatial gradients yield linearly mixed observations of the time-differentiated source signals each scaled by propagation delays τp and τq along the gradient directions:

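$$\begin{aligned} \xi_{10}(t) &\approx \sum_{\ell} \tau_p^{\ell}\,\dot{s}^{\ell}(t) + \nu_{10}(t)\\ \xi_{01}(t) &\approx \sum_{\ell} \tau_q^{\ell}\,\dot{s}^{\ell}(t) + \nu_{01}(t), \end{aligned} \tag{3}$$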

where ν10 and ν01 represent spatial derivative components of additive noise in the sensor observations.

Likewise, the time-differentiated spatial common mode ξ̇00(t) yields a further linearly mixed observation of the time-differentiated source signals:

$$\dot{\xi}_{00}(t) \approx \sum_{\ell} \dot{s}^{\ell}(t) + \dot{\nu}_{00}, \tag{4}$$

where ν̇00 represents the time-derivative of the common mode component of additive noise in the sensor observations. Looking at gradient signals ξ̇00(t), ξ10(t) and ξ01(t), equations (3) and (4) can be identified as a linear instantaneous mixture of time-differentiated source signals, which is in the form of classic linear static ICA (1) as

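$$\begin{bmatrix}\dot{\xi}_{00}\\ \xi_{10}\\ \xi_{01}\end{bmatrix} \approx \begin{bmatrix}1 & \cdots & 1\\ \tau_p^{1} & \cdots & \tau_p^{N}\\ \tau_q^{1} & \cdots & \tau_q^{N}\end{bmatrix} \begin{bmatrix}\dot{s}^{1}\\ \vdots\\ \dot{s}^{N}\end{bmatrix} + \begin{bmatrix}\dot{\nu}_{00}\\ \nu_{10}\\ \nu_{01}\end{bmatrix} \tag{5}$$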

The gradients ξ̇00(t), ξ10(t) and ξ01(t) in (5) are estimated by finite differences of the acoustic field on a sensor grid of four microphones in the configuration illustrated in Figure 1 [14], and are computed in a micropower mixed-signal VLSI architecture [16]. The mixing coefficients in (5) represent the corresponding 3-D direction cosines in terms of inter-temporal differences and uniquely determine the direction of source ℓ. Therefore, in this representation, a system that integrates gradient flow with linear instantaneous ICA can be used to achieve separation and localization of multiple traveling source signals.
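A minimal sketch of how such finite-difference estimates could be formed from four microphone signals is given below. The stencil (microphones at unit offsets ±1 along p and q, central differences in space, a backward difference in time) and all names are assumptions for illustration; the text states only that finite differences on a four-microphone grid are used, and the on-chip computation in [16] may differ.

```python
def gradient_flow_signals(x_p_plus, x_p_minus, x_q_plus, x_q_minus, dt):
    """Return (xi00_dot, xi10, xi01) from four equally long sample lists.

    ASSUMED stencil: microphones at unit offsets +/-1 along directions p and q.
    """
    n = len(x_p_plus)
    # Spatial common mode: average of the four microphones.
    xi00 = [(a + b + c + d) / 4.0
            for a, b, c, d in zip(x_p_plus, x_p_minus, x_q_plus, x_q_minus)]
    # First-order spatial gradients: central differences along p and q.
    xi10 = [(a - b) / 2.0 for a, b in zip(x_p_plus, x_p_minus)]
    xi01 = [(c - d) / 2.0 for c, d in zip(x_q_plus, x_q_minus)]
    # Backward difference in time; the first sample has no predecessor and is dropped.
    xi00_dot = [(xi00[k] - xi00[k - 1]) / dt for k in range(1, n)]
    return xi00_dot, xi10[1:], xi01[1:]

# Toy check with a ramp signal s(t) = 2 t arriving with delays tau_p and tau_q
# (hypothetical values): the ratio gradient / common-mode derivative gives the delay.
dt, tau_p, tau_q = 0.01, 0.003, -0.001
s = lambda t: 2.0 * t
t_axis = [k * dt for k in range(5)]
xd, g10, g01 = gradient_flow_signals(
    [s(t + tau_p) for t in t_axis], [s(t - tau_p) for t in t_axis],
    [s(t + tau_q) for t in t_axis], [s(t - tau_q) for t in t_axis], dt)
ratios_p = [g / d for g, d in zip(g10, xd)]
ratios_q = [g / d for g, d in zip(g01, xd)]
```

For the linear ramp the finite differences are exact, so the ratios reproduce the delays; for a band-limited signal the relation holds only approximately, as in (2).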

B. Independent Component Analysis

For the mixing model (1), the ICA solution is formulated as a linear transformation that minimizes the statistical dependence between the components of the output signals y

y(t)=Wx(t), (6)

where W is the N×M unmixing matrix. The unmixing matrix W is not uniquely defined; it is ambiguous in scaling and permutation. The energy of the source signals cannot be determined, since both s and A are unknown and any scalar multiplier in one of the sources can be canceled by dividing the corresponding column of A by the same scalar. As the order of the independent sources is not predefined, any permutation of the rows of the matrix W is a valid solution of the separation problem.

1) ICA Learning Rules

Most of the ICA learning algorithms proposed in the literature are based on optimizing a cost function defined as the measure of independence between the components of the output signals [19]. Different approaches, like maximization of entropy [20], minimization of mutual information of the output signals [19], [21] and the maximization of likelihood function [22], lead to the same form of the cost function

$$L(W) = -N \log(|\det(W)|) - \sum_{i=1}^{N}\sum_{k=1}^{K} \log(p_i(y_i(k))), \tag{7}$$

where pi(yi) are the marginal probability density functions (pdfs) of the output signals. The term det(W) represents the volume-conserving property of the linear transformation [23]. The ICA learning rule is derived by applying stochastic gradient descent to (7)

$$\Delta W = \mu\left([W^T]^{-1} - f(y)\,x^T\right), \tag{8}$$

where f(y) is the element-wise score function, the negative logarithmic derivative of the marginal pdf,

$$f(y_i) = -\frac{dp_i(y_i)/dy_i}{p_i(y_i)}. \tag{9}$$

This update rule was first derived as the InfoMax learning rule in [20] by maximizing the entropy of the transformed output signals. Selecting f(y) as a non-linearity that only approximately matches the input cdfs does not affect the performance of the algorithm. Uniform and robust convergence is obtained by using Amari’s natural gradient [24], which takes the simple form W^T W in the space of matrices. Multiplying (8) by W^T W leads to a learning rule without matrix inversion

$$\Delta W = \mu\left[I - f(y)\,y^T\right] W. \tag{10}$$

The convergence of (10) implies E{fi (yi)yi}=1 as a constraint on the reconstructed signals. To avoid numerical instability due to non-stationarity in the sources, the Cichocki-Unbehauen (C-U) algorithm [25] introduces a non-holonomic constraint in the natural gradient learning rule (10), by fixing the diagonal terms of the unmixing matrix W:

$$\Delta W = \mu\left[\Lambda - f(y)\,y^T\right] W, \tag{11}$$

where Λ is a diagonal scaling matrix. Convergence of the C-U algorithm implies Λii = E[f (yi)yi].
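The following sketch performs a single stochastic update of the natural gradient rule (10), with an optional diagonal Λ as in (11). It is a floating-point illustration for a square unmixing matrix, not the on-chip procedure; the choice f(y) = sign(y), the function names and the example values are assumptions.

```python
def sign(v):
    """Sign nonlinearity: -1, 0 or +1 (assumed choice of f)."""
    return (v > 0) - (v < 0)

def natural_gradient_step(W, x, mu, lam=None):
    """One update W <- W + mu * (Lambda - f(y) y^T) W with y = W x.

    lam is the diagonal of Lambda; None means the identity, i.e. rule (10).
    """
    n = len(W)
    y = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    fy = [sign(v) for v in y]
    if lam is None:
        lam = [1.0] * n
    # G = Lambda - f(y) y^T
    G = [[(lam[i] if i == j else 0.0) - fy[i] * y[j] for j in range(n)]
         for i in range(n)]
    # GW = G W (plain matrix product)
    GW = [[sum(G[i][k] * W[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    W_new = [[W[i][j] + mu * GW[i][j] for j in range(n)] for i in range(n)]
    return W_new, y

# Example: start from the identity and apply one sample x = [2, -1].
W0 = [[1.0, 0.0], [0.0, 1.0]]
W1, y = natural_gradient_step(W0, [2.0, -1.0], mu=0.1)
```

In this example G = [[-1, 1], [2, 0]], so the update moves W to approximately [[0.9, 0.1], [0.2, 1.0]]. No matrix inversion is needed at any point, which is the practical benefit of (10) over (8).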

A similar algorithm can be derived from non-linear decorrelation, which introduces higher-order statistics into the solution method. This first formulation of ICA, inspired by biomimetic principles, was derived by Herault and Jutten (H-J) [26] and was based on a feedback network topology

$$y = -Wy + x, \tag{12}$$

with zero diagonal terms (wii ≡ 0, ∀i). An independence criterion with nonlinear correlation between output signals yields the on-line learning rule for the off-diagonal terms

$$\Delta w_{ij} = \mu f(y_i)\,g(y_j), \quad i \neq j \tag{13}$$

where f(·) and g(·) are appropriately chosen odd-symmetric functions. A good choice is a function f that matches the cumulative distribution function of the source signals, together with the linear function g(y) ≡ y.

2) General Outer-Product Learning Rule

Efficient implementation of ICA in a feedforward parallel architecture requires casting the learning rules (10) and (13) in the form of an outer-product update rule. The H-J learning rule (13) is already in outer-product form; however, it is defined for the recurrent architecture (12). To map the recurrent architecture onto a feedforward form, we apply the following approximation:

$$y = (I + W)^{-1} x \approx (I - W)\,x \tag{14}$$

In other words, we choose to implement the H-J rule with a linear feedforward network of the type y = Wx with fixed diagonal terms wii ≡ 1, and with off-diagonal terms adapting according to (13) [17], [18]. Equivalently, the implemented update rule can be seen as the gradient of InfoMax (8) multiplied by W^T, rather than by the natural gradient multiplication factor W^T W. Interestingly, in the special case of a 2 × 2 network (2 sources and 2 observations), the update rule (13) reduces to the non-holonomic (zero-diagonal) form of the C-U rule (11).
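To give a feel for the accuracy of the feedforward approximation, the sketch below compares the exact recurrent solution y = (I + W)^-1 x with the first-order form (I − W)x for a 2 × 2 network with zero diagonal. The numerical values and function names are hypothetical. For this particular case the neglected terms scale with the product of the two off-diagonal weights, so the approximation is good when the cross-coupling is weak.

```python
def recurrent_output(W, x):
    """Exact solution of y = -W y + x for a 2x2 W, i.e. y = (I + W)^-1 x."""
    a, b = 1.0 + W[0][0], W[0][1]
    c, d = W[1][0], 1.0 + W[1][1]
    det = a * d - b * c
    # Inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return [(d * x[0] - b * x[1]) / det, (-c * x[0] + a * x[1]) / det]

def feedforward_output(W, x):
    """First-order approximation y ~ (I - W) x."""
    return [x[0] - (W[0][0] * x[0] + W[0][1] * x[1]),
            x[1] - (W[1][0] * x[0] + W[1][1] * x[1])]

# Hypothetical weak cross-coupling: off-diagonal weights 0.1 and 0.2.
W = [[0.0, 0.1], [0.2, 0.0]]
x = [1.0, 2.0]
y_exact = recurrent_output(W, x)
y_approx = feedforward_output(W, x)
rel_err = [abs(e - a) / abs(e) for e, a in zip(y_exact, y_approx)]
```

Here the relative error in both outputs equals 0.1 × 0.2 = 0.02, i.e. 2%.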

To obtain the natural gradient update rule (10) in outer-product form, it is necessary to include a back-propagation path in the network architecture to implement the vector contribution z = W^T y. The learning rule can then be represented as the sum of a decay term and an outer-product term

$$\Delta w_{ij} = \mu\,w_{ij} - \mu f(y_i)\,z_j \tag{15}$$

Through quantization of the vector terms in the outer-product rules (13) and (15), as well as quantization of the decay term in (15), the update rules in the proposed implementation are simplified to discrete counting operations. For speech signals, which are approximately Laplacian distributed, the optimal nonlinear scalar function f(yi) can be approximated by sign(yi), which requires single-bit quantization. The vectors y in rule (13) and z in rule (15) are in turn approximated by a 3-level staircase function (−1, 0, +1) using 2-bit quantization.
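A small sketch of how the quantized outer-product update reduces to counting is shown below. The feedforward sign convention (a counter is decremented by sgn(yi)·quant(yj)), the dead-zone threshold of 0.18 and all names are assumptions chosen for illustration; diagonal terms are held fixed, as in the feedforward H-J form.

```python
def sgn(v):
    """1-bit quantizer: +1 for v >= 0, -1 otherwise (a comparator has no zero output)."""
    return 1 if v >= 0 else -1

def quant(v, threshold):
    """3-level staircase: -1, 0 or +1, with a dead zone of +/- threshold."""
    if v > threshold:
        return 1
    if v < -threshold:
        return -1
    return 0

def quantized_hj_update(counts, y, threshold):
    """Apply counts[i][j] -= sgn(y_i) * quant(y_j) to the off-diagonal counters only."""
    n = len(y)
    return [[counts[i][j] - (sgn(y[i]) * quant(y[j], threshold) if i != j else 0)
             for j in range(n)] for i in range(n)]

# Example with three outputs; the third falls inside the dead zone.
counts = [[0, 0, 0] for _ in range(3)]
y = [0.5, -0.3, 0.05]
updated = quantized_hj_update(counts, y, threshold=0.18)
```

Each counter changes by at most one count per sample, so the update needs only an increment, decrement or hold decision per cell.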

III. Chip Architecture

The functional block diagram of a 3 × 3 outer-product mixed-signal ICA architecture is shown in Figure 2. An analog datapath directly interfaces with the input signals and provides analog output signals, without the need for analog-to-digital conversion at the input or digital-to-analog conversion at the output. The digital adaptation offers flexibility in the selection of the learning rules. The mixed-signal architecture is implemented using fully differential switched-capacitor (SC) sampled-data circuits. The coefficients of the unmixing matrix are stored digitally in each cell of the architecture. The update is performed locally by incrementing, decrementing or holding the current value of the counter, once or repeatedly, according to the on-chip learning rules or to a range of learning rules supplied by external logic. Correlated double sampling performs common-mode offset rejection and 1/f noise reduction.

Fig. 2.

(a) Reconfigurable parallel mixed-signal ICA architecture implementing general outer-product form of ICA update rules with one unmixing coefficient cell shown in (b).

The differential analog input channels directly interface the gradient output signals from a previously developed gradient flow processor for acoustic localization [16], to extend the system functionality to joint separation and localization of up to three acoustic sources. The digital values of the unmixing coefficients correspond to the directional angles of the source signals in the gradient flow representation.

A. Vector Matrix Multiplication

ICA is a linear transformation, and the main function of the proposed implementation is a (3×1)–(3×3) vector-matrix multiplication. In addition to the vector-matrix multiplication that computes the output signals y = Wx, a vector-matrix multiplication z = W^T y is required to implement the natural gradient learning rule in the outer-product form (15). Since the same matrix W is used in both operations, the vector z is computed with the same circuit that computes the vector y, with the input and output signals of the vector-matrix multiplication circuitry time-multiplexed. That is, the product z = W^T y in the natural gradient learning rule is obtained after the output signal y has been computed.

The switched-capacitor implementation of the vector-matrix multiplication is shown in Figure 3. The input vector is an analog signal, while the coefficients of the unmixing matrix are adaptive digital values. For the product y = Wx, one component of the output signal, yi, is decomposed in differential form as a linear sum of the weighted differential input contributions

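$$\begin{aligned} y_i^{+} &= \sum_{j=1}^{3}\left(w_{ij}^{+}x_j^{-} + w_{ij}^{-}x_j^{+}\right)\\ y_i^{-} &= \sum_{j=1}^{3}\left(w_{ij}^{+}x_j^{+} + w_{ij}^{-}x_j^{-}\right). \end{aligned} \tag{16}$$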

The multiplication is performed by multiplying D/A capacitor arrays, and the output of each DAC represents one of the weighted input terms in (16). The output signal is computed by accumulating the outputs of all the DAC cells in the ith row. The clocks ϕ1 and ϕ2 are non-overlapping, and ϕ1e replicates ϕ1 with its falling edge slightly preceding the falling edge of ϕ1. All the switches are complementary transmission-gate FETs, except the switches controlled by ϕ1e, which are n-channel FETs. In the precharging phase ϕ1, all the unit capacitors in the array, as well as the feedback capacitor C2, are precharged to the zero-level reference voltage Vmid (set to Vdd/2), and the inverters are reset. In the computation phase ϕ2, the input signals are sampled on the unit capacitors and the accumulation is performed on C2 by the high-gain amplifier, yielding valid output signals during the ϕ2 phase. The implementation and layout of the multiplying capacitor arrays are the same as in the gradient flow acoustic localizer chip [16]. The value of the unit capacitor in the array is 5.1 fF. The value of the feedback capacitor C2 is set externally to accommodate variable gain in the vector-matrix multiplication and can take one of four values: 0.5 pF, 1 pF, 2 pF and 4 pF. The vector-matrix multiplier is followed by a sample-and-hold circuit that provides the analog output signal.

Fig. 3.

Switched-capacitor implementation of the vector-matrix multiplication for computation of the ith component of the output signal y.

A cascoded pseudo-nMOS inverter biased in weak inversion (the upper range of the subthreshold regime) is used as the high-gain amplifier [16]. The choice of a telescopic operational amplifier without a tail transistor is driven by smaller area and reduced noise and power dissipation [27], [28], while weak inversion is chosen as the region of operation for its extended output dynamic range and its maximum transconductance-to-current ratio, which gives maximum energy efficiency at the highest possible speed [29]. The performance of the designed cascoded amplifier was simulated at a biasing current of 200 nA with a load capacitance of 1 pF. At a 3 V supply, simulations indicate an open-loop dc gain of 91 dB and a gain-bandwidth product of 844.3 kHz. The biasing current of each amplifier in the proposed architecture was set by considerations of sampling frequency, slew rate, and power dissipation.

B. Comparator Design

The discrete values of the functions sgn(yi), quant(yj) and quant(zj) in the quantized forms of (13) and (15) are obtained through level comparisons. The comparator, which can compare a signal against a variable level, consists of a preamplifier, shown in Figure 4, followed by a latched regenerative comparator [16]. The clocks ϕ1c and ϕ2c are non-overlapping; their timing relative to the clocks ϕ1 and ϕ2 is shown in the inset of Figure 4, together with the changes in the threshold voltage VTH for 3-level comparison. ϕ1ec replicates ϕ1c with its falling edge slightly preceding the falling edge of ϕ1c. The capacitors C1 and Cc each have a value of 300 fF.

Fig. 4.

Design of the comparator for sequential comparison of the signals y and z with variable voltage levels defined by the threshold voltage VTH.

While the output signals are valid, yi^+ is sampled in phase ϕ1c on capacitor C1. The sign of the comparison of yi with the variable-level threshold VTH is computed in the evaluate phase ϕ2c through capacitive coupling into the amplifier input node. Changing the voltage VTH during phase ϕ2c yields multiple level comparisons in a single clock cycle. The offset of the comparator was measured to be 10 mV and is consistent across the different level comparisons.

C. Learning Rule

The parallel architecture with the representation of the ICA learning rules in the form of outer-product enables the local updates of coefficients of the unmixing matrix W. The coefficients are represented as 14-bit values in two’s complement and are stored in a counter. In the gradient flow framework, the unmixing coefficients yield the 3-D direction cosines in terms of inter-temporal differences τ1 and τ2. The update is performed by incrementing, decrementing or holding the current value of the counter.

The H-J learning rule (13) is implemented as [17]

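$$\begin{aligned} w_{ij}^{+}[n+1] &= w_{ij}^{+}[n] - \mathrm{sgn}(y_i[n])\,\mathrm{quant}(y_j[n])\\ w_{ij}^{-}[n+1] &= 2^{14} - 1 - w_{ij}^{+}[n+1], \end{aligned} \tag{17}$$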

where quant(yj) is a 3-level staircase function (−1, 0, +1) that is coded using 2 bits, sgn and mag, as illustrated in Figure 5. The natural gradient learning update (15) comprises the outer-product update

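$$\begin{aligned} w_{ij}^{+}[n+1] &= w_{ij}^{+}[n] - \mathrm{sgn}(y_i[n])\,\mathrm{quant}(z_j[n])\\ w_{ij}^{-}[n+1] &= 2^{14} - 1 - w_{ij}^{+}[n+1], \end{aligned} \tag{18}$$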

and a periodic update that is proportional to the value of the two most significant bits of wij. Additional levels could be introduced in the staircase function quant(yj) by increasing the number of levels in the comparator signal VTH, as described in Section III-B, at the cost of increased power consumption due to additional clock transitions. The 8 most significant bits of the weights are presented to the multiplying D/A capacitor array in thermometer code to construct the output signals. The remaining 6 bits in the coefficient registers provide flexibility in programming the update rate to tailor convergence.
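The relation between the 14-bit counter and the 8-bit DAC code can be sketched as follows. This is an illustrative model only: it treats w+ as an unsigned count with w− = 2^14 − 1 − w+, clamps the counter at its limits, and builds a thermometer code of 2^8 − 1 unit elements. The clamping, the unsigned interpretation and all names are assumptions, not details confirmed by the text.

```python
BITS, DAC_BITS = 14, 8
FULL_SCALE = (1 << BITS) - 1          # 2**14 - 1 = 16383

def update_counter(w_plus, step):
    """Increment, decrement or hold (step in {-1, 0, +1}); clamping is an ASSUMPTION."""
    return max(0, min(FULL_SCALE, w_plus + step))

def complementary(w_plus):
    """Complementary count w_minus = 2**14 - 1 - w_plus."""
    return FULL_SCALE - w_plus

def dac_code(w):
    """The 8 most significant bits of the 14-bit count."""
    return w >> (BITS - DAC_BITS)

def thermometer(code):
    """Thermometer code: `code` ones followed by zeros, 2**8 - 1 elements in total."""
    n_units = (1 << DAC_BITS) - 1
    return [1] * code + [0] * (n_units - code)

# Starting at a code boundary, 64 unit increments are needed to move the DAC by one LSB.
w = 127 * 64                          # DAC code 127
for _ in range(63):
    w = update_counter(w, +1)
code_after_63 = dac_code(w)
w = update_counter(w, +1)
code_after_64 = dac_code(w)
```

Because the 6 least significant bits are not presented to the DAC, many small counter steps accumulate before the analog weight changes, which acts as a programmable learning rate.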

Fig. 5.

The unmixing coefficient wij is stored in a 14-bit counter and presented for vector-matrix multiplication through 8-bit multiplying D/A capacitor array [16].

IV. Experimental Results

A prototype 3 × 3 mixed-signal ICA processor was designed, fabricated, and tested. The architecture is integrated on a single 3 mm × 3 mm chip fabricated in 0.5 µm 3M2P CMOS technology. Figure 6 shows the micrograph of the chip.

Fig. 6.

Micrograph of 3 mm × 3 mm chip in 0.5 µm CMOS technology.

All experiments with the chip were conducted at a 3 V supply voltage with the zero-level reference Vmid set to 1.5 V. The signal swing at the chip inputs is 2.4 V peak-to-peak. The reference voltage Vmid and the three biasing voltages for the cascoded inverter were generated on-board for testing purposes; they could instead be generated internally at the expense of a small increase in power consumption.

The integral nonlinearity (INL) of the multiplying DAC was measured at 0.54 LSB. The effective resolution of the output signal resulting from the vector-matrix multiplication is 8.83 bits. The measured power consumption of the chip is 195 µW at a 16 kHz sampling rate.

The digital estimates of the coefficients wij of the unmixing matrix are obtained directly from the counters at convergence and are output 4 bits at a time. The chip also outputs the estimated source signals yi(t). The output signals are presented in complementary analog format through sample-and-hold buffers.

A. Synthetic Speech Experiments

For full characterization of the separation performance of the ICA processor, synthetic mixtures of speech signals were generated under diverse conditions in a controlled environment. The observed signals were artificially generated as received on the four-microphone array in the configuration illustrated in Figure 1. The distance between microphones was set to 1 cm and the sampling frequency to 16 kHz. The source signals were speech signals from the TIMIT database. The microphone signals were first processed by the acoustic localizer [16], which computes the spatial gradient signals (5) using an on-grid approximation. The spatial gradients are the input signals to the ICA processor. The ICA processor was configured to implement the outer-product update algorithms in (13) and (15). The quantization level in the 3-level approximation of quant(y) was set to a 180 mV amplitude change in the voltage VTH, while the level change for quant(z) was 150 mV, throughout all the experiments.

In the on-chip implementation of the ICA learning rules, we introduced quantization of the vector terms in the learning rules (13) and (15), as well as quantization of the decay term in (15). To assess the effect of quantization on the separation performance, ICA algorithms implementing the H-J learning rule (13) and the natural gradient learning rule (15) without quantization were implemented in MATLAB. As an additional benchmark for the separation performance, the efficient FastICA (EFICA) algorithm [30] was used. As a measure of separation performance, we used the signal-to-interference ratio (SIR). The SIR in a single output yi is computed as

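$$\mathrm{SIR}_i = 10 \log_{10} \frac{\max_j \langle y_{ij}^2 \rangle}{\sum_j \langle y_{ij}^2 \rangle - \max_j \langle y_{ij}^2 \rangle}, \tag{19}$$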

where yij is the contribution of signal j to output signal i. The reported SIR for the separation of multiple sources is the minimum SIR over all output signals

$$\mathrm{SIR} = \min_i \mathrm{SIR}_i. \tag{20}$$
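A minimal sketch of this metric is given below, reading SIRi as the ratio of the strongest source contribution in output i to the sum of all other contributions, expressed in dB. It assumes that the individual contributions yij are available separately, as they are when sources are presented one at a time; the function names and sample values are hypothetical.

```python
import math

def sir_db(contributions):
    """SIR of one output in dB; contributions[j] holds the samples of y_ij."""
    powers = [sum(v * v for v in c) / len(c) for c in contributions]
    target = max(powers)
    interference = sum(powers) - target   # zero interference would need special handling
    return 10.0 * math.log10(target / interference)

def overall_sir_db(outputs):
    """Reported SIR: the minimum over all outputs."""
    return min(sir_db(c) for c in outputs)

# Two outputs, two sources. Output 1 is dominated by source 1 (power ratio 100),
# output 2 by source 2 (power ratio 10000).
outputs = [
    [[10.0, -10.0], [1.0, -1.0]],
    [[1.0, -1.0], [100.0, -100.0]],
]
per_output = [sir_db(c) for c in outputs]
reported = overall_sir_db(outputs)
```

The two outputs give 20 dB and 40 dB respectively, so the reported figure is the worse of the two, 20 dB.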

In the first experiment, the effect of the angular distance between two sources on the separation performance was examined. The same elevation angle was assumed for both sources, and the azimuth angle of the first source was set to θ1=30°. The azimuth angle of the second source, θ2, was varied from −15° to 135° in increments of 15°. We have omitted the locations of the second source at 0° and 90°, where the separation is trivial. The separation performance of the on-line on-chip ICA rules (13) (on-chip HJ) and (15) (on-chip NG), as well as the performance obtained in MATLAB with the H-J, natural gradient (NG) and EFICA algorithms, is shown in Figure 7. The results demonstrate uniform separation performance for EFICA, while the performance of the on-line learning rules depends slightly on the directional separation. The H-J on-line learning rule demonstrated a lower convergence rate than the natural gradient on-line rule, as well as higher sensitivity to the initial conditions. The experiment was repeated with three sources present in the environment, with incidence angles of 30°, 70° and 135°. The simulated SIRs in the three channels for on-chip NG were 28.5 dB, 25.6 dB and 18.2 dB. We also assessed, through MATLAB simulations, the effect of the finite resolution of the vector-matrix multiplication on the separation performance of the quantized on-chip learning rules. Increasing the effective resolution of the vector-matrix multiplication beyond 9 bits did not result in any significant improvement in separation performance.

Fig. 7.

SIR for separation of two sources incident on the microphone array. The azimuth angle of the first source is fixed at θ1=30° and the azimuth angle of the second source θ2 is varied from −15° to 135°.

In the second experiment, the effect of acquisition noise on the separation performance was investigated. We assumed that white, spatially uncorrelated Gaussian noise is added at each sensor. The results for different signal-to-noise ratios (SNR) are presented in Figure 8. In the noisy environment, the on-line quantized learning rules outperform EFICA.

Fig. 8.

Separation performance as a function of spatially uncorrelated sensor noise when the source signals impinge on the array at 30° and 70°.

The mixing model adopted in the gradient flow representation (3) assumes an anechoic environment. In a real room, however, reverberation makes the observed microphone signal a sum of multi-path replicas of the source signal, that is, a sum of time-delayed and attenuated source signals, where the delays and attenuations depend on the room geometry and the reflection coefficient of the walls. To measure the influence of reverberation on the separation performance, we modeled a real room environment using the image model [31]. The room dimensions were set to 6 m × 4 m × 3 m, and the reflection coefficient was the same for all room surfaces. The locations of the speakers and the microphone array are illustrated in Figure 9. The reflection coefficient was varied to determine the effect of reverberation on the separation performance. The separation performance as a function of the reflection coefficient is shown in Figure 10. It degrades as the reflection coefficient increases, but satisfactory performance is demonstrated in mildly reverberant conditions. The separation is sustained as long as the direct-path signal is stronger than the multi-path signals. A reflection coefficient of 0.52 in the presented room corresponds to a reverberation time of 300 ms.

Fig. 9.

Location of the sensor array and speakers in the simulated room environment.

Fig. 10.

Separation performance in reverberant environment as a function of reflection coefficient when the two sources are located at 30° and 70°.

B. Room Speech Separation Experiments

The separation performance of the ICA processor was demonstrated and characterized in a real room environment. The recordings were made in a typical conference room whose size corresponds to that of the simulated room in the synthetic experiments. The input signals to the ICA processor were the spatial gradient signals output by the gradient flow acoustic localizer [16]. The microphone array comprises four omnidirectional miniature microphones (Knowles FG-3629) in the configuration shown in Figure 1. The microphones were arranged in a circular array with a radius of 0.5 cm. The sensitivity of the microphones is −53 dB and the typical noise level is 27 dB. The microphone signals were passed through a second-order bandpass filter with the low-frequency cutoff set at 130 Hz and the high-frequency cutoff set at 4.3 kHz. The signals were also amplified by a factor of 20. The system sampling frequency was set to 16 kHz. The speech signals, the same as those used in the synthetic experiments, were presented through loudspeakers positioned 1.5 m from the array.

To provide the quantifiable separation performance, speech segments were presented individually through either loudspeaker at different time instances. The first source was kept at the azimuth angle of 30°, while the second was moved from 45° to 165° in increments of 15°. The two recorded datasets were then added, and presented to the gradient flow localizer ASIC [16]. The gradient signals obtained from the chip were then presented to the ICA processor, configured to implement the outer-product update algorithms in (13) and (15). The obtained SIRs for both on-chip learning rules, as well as for software H-J, natural gradient and EFICA ICA algorithms, are presented in Figure 11. The real room SIRs are in accordance with the simulation results and verify the separation performance in a mild reverberant environment. The experiments were repeated with joint presentation of both sources and 14-bit digital weights that represent the coefficients of the unmixing matrix were recorded. The angles of incidence of the sources relative to the array were derived and compared to the localization results obtained by acoustic localizer ASIC when a single source was presented. The angles obtained through LMS bearing estimation under individual source presentation are within 2° to the angles produced by both on-line ICA learning rules. In Figure 12, the input and the output signals of the ICA processor, that is the spatial gradients (3) and the reconstructed source signals, are presented for azimuth angles of θ1=30° and θ2=105°. We verified that the time-waveforms of the residue signals in each of the reconstructed outputs, y12 and y21, are free of direct path signal components and are dominated by multipath contributions due to room reverberation. The performance of the proposed ICA mixed-signal implementation has been compared to the implementations in the digital-domain in Table I. 
It is important to note that the proposed IC has analog input and output signals, which eliminates the need for data conversion. This results in significantly lower power consumption and a smaller form factor for the corresponding source-separation sensory system compared to a system that contains a digital ASIC. Additionally, the operational frequency of the proposed IC matches the bandwidth of the input signals, which is not the case for the digital ASICs.
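The localization check described in this section, in which angles of incidence are derived from the recorded unmixing coefficients, can be illustrated with a short Python sketch. It rests on an assumed form of the gradient flow mixing model: each source contributes to the three ICA inputs with weights proportional to (1, τ1, τ2), where τ1 ∝ cos θ cos ϕ and τ2 ∝ sin θ cos ϕ. The axis orientation, the common delay scale (set to 1 here), the third source, and all matrix values are hypothetical. Under that assumption the mixing matrix is estimated as the inverse of the unmixing matrix, each column is normalized by its first entry to remove the ICA scaling ambiguity, and the azimuth follows from the ratio of the two remaining entries.

```python
import math


def invert(m):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def azimuths_from_unmixing(W):
    """Estimate source azimuths (degrees) from a 3x3 unmixing matrix.

    Assumed model: column l of the mixing matrix is proportional to
    (1, tau1_l, tau2_l) with azimuth = atan2(tau2_l, tau1_l).
    """
    A_hat = invert(W)
    angles = []
    for col in range(len(A_hat)):
        scale = A_hat[0][col]          # removes the arbitrary ICA scaling
        tau1 = A_hat[1][col] / scale
        tau2 = A_hat[2][col] / scale
        angles.append(math.degrees(math.atan2(tau2, tau1)) % 360.0)
    return angles


# Hypothetical example: three sources at 8 degrees elevation.
true_az = [30.0, 105.0, 240.0]
cos_el = math.cos(math.radians(8.0))
A = [[1.0, 1.0, 1.0],
     [math.cos(math.radians(t)) * cos_el for t in true_az],
     [math.sin(math.radians(t)) * cos_el for t in true_az]]

# ICA recovers the unmixing matrix only up to row order and row scaling.
W_ideal = invert(A)
W = [[2.0 * v for v in W_ideal[2]],
     [-0.5 * v for v in W_ideal[0]],
     [3.0 * v for v in W_ideal[1]]]

estimated = azimuths_from_unmixing(W)
print(sorted(round(a, 3) for a in estimated))
```

Because the elevation factor cos ϕ multiplies both τ1 and τ2, it cancels in the ratio, so the azimuth estimate in this model does not require knowledge of the elevation. The permutation ambiguity remains: the angles are recovered as a set, not in source order.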

Fig. 11.

SIR for two reconstructed speech sources played through loudspeakers to a miniature microphone array in a conference room. The azimuth angle of the first source was fixed at θ1 = 30° and the azimuth angle of the second source θ2 was varied from 45° to 135°. The elevation angles ϕ1 and ϕ2 of both sources were 8°.

Fig. 12.

Time waveforms and spectrograms of the source signals s1 and s2, the spatial gradients (3) that form the input signals to the ICA processor, and the reconstructed source signals at the output of the ICA processor, for two source signals with azimuth angles of θ1=30° and θ2=105°.

TABLE I.

Comparison Results of Different ICA Implementations

                  [7]           [8]       [9]          [10]         This work
Technology        FPGA          FPGA      90 nm CMOS   90 nm CMOS   0.5 µm CMOS
Algorithm         Infomax ICA   FastICA   FastICA      FastICA      NG ICA
No. of channels   2             2         8            8            3
Speed (MHz)       12.288        50        100          11           0.016
Power             98.8 mW       NA        16.35 mW     86 µW        192 µW

V. Conclusions

The proposed parallel low-power ICA architecture with outer-product learning rules, operating on the gradient flow representation, demonstrated state-of-the-art separation performance in noisy and mildly reverberant conditions. The architecture avoids the need for input or output data conversion, as the inputs are the analog observed signals and the outputs are the analog reconstructed source signals, with the coefficients of the unmixing matrix stored digitally. The measured characteristics of the fabricated chip are summarized in Table II. These results suggest application of the gradient flow system, which integrates spatial sensing and static ICA with miniature microphone arrays, to intelligent hearing aids with adaptive suppression of interfering signals and nonstationary noise. The separation performance of the proposed system can be extended to moderately reverberant environments by using subband decomposition of the spatial gradient signals and applying static ICA in each frequency band.

TABLE II.

ICA Processor Characteristics

Technology          0.5 µm 2P3M CMOS
Size                3 mm × 3 mm
Supply              3 V
Room separation     6 dB – 12 dB
Power dissipation   192 µW at 16 kHz

Acknowledgments

This work was supported by the National Science Foundation CAREER Award 0846265. The chip was fabricated through the MOSIS foundry service.

Biographies

Milutin Stanaćević (S’00-M’05) received the B.S. degree in Electrical Engineering from the University of Belgrade, Serbia in 1999. He received the M.S. and Ph.D. degrees in Electrical and Computer Engineering from Johns Hopkins University, Baltimore, MD, in 2001 and 2005, respectively.

In 2005, he joined the faculty of the Department of Electrical and Computer Engineering at Stony Brook University, Stony Brook, NY, where he is currently an Associate Professor. His research interests include mixed-signal VLSI circuits, systems, and algorithms for parallel multi-channel sensory information processing with emphasis on acoustic source separation and breath analysis, micropower biomedical instrumentation and readout ICs for radiation detection.

Dr. Stanaćević is a recipient of the National Science Foundation CAREER award and IEEE Region 1 Technological Innovation Award. He is an Associate Editor of the IEEE Transactions on Biomedical Circuits and Systems and serves on several technical committees of the IEEE Circuits and Systems Society.

Shuo Li (S’07) received the B.Eng. degree from Zhejiang University, Hangzhou, China, in 2006, the M.Phil. degree from The Hong Kong University of Science and Technology, Hong Kong, in 2008, and the Ph.D. degree from Stony Brook University, Stony Brook, NY, in 2015, all in electrical engineering. His research is on acoustic localization and separation and VLSI microsystem design.

Currently, he is an electrical engineer at Second Sight Medical Products, Inc. He works on the research and development of retinal and cortical stimulation medical devices.

Gert Cauwenberghs (S’89-M’94-SM’04-F’11) received the Ph.D. degree in electrical engineering from California Institute of Technology, Pasadena, in 1994.

He is Professor of Bioengineering and Co-Director of the Institute for Neural Computation at the University of California-San Diego, La Jolla, CA, USA. He was previously Professor of Electrical and Computer Engineering at Johns Hopkins University, Baltimore, MD, USA, and Visiting Professor of Brain and Cognitive Science at Massachusetts Institute of Technology, Cambridge, MA, USA. He co-founded Cognionics Inc., and chairs its Scientific Advisory Board. His research focuses on micropower biomedical instrumentation, neuron-silicon and brain-machine interfaces, neuromorphic engineering, and adaptive intelligent systems.

Dr. Cauwenberghs received the NSF Career Award in 1997, ONR Young Investigator Award in 1999, and Presidential Early Career Award for Scientists and Engineers in 2000. He serves IEEE in a variety of roles including as General Chair of the IEEE Biomedical Circuits and Systems Conference (BioCAS 2011, San Diego, CA, USA), as Program Chair of the IEEE Engineering in Medicine and Biology Conference (EMBC 2012, San Diego, CA, USA), and as Editor-in-Chief of the Transactions on Biomedical Circuits and Systems.

Contributor Information

Milutin Stanaćević, Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY 11794-2350.

Shuo Li, Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY 11794-2350.

Gert Cauwenberghs, Department of Bioengineering, University of California San Diego, La Jolla, CA 92093.

References

  • 1. Pedersen MS, Larsen J, Kjems U, Parra LC. A Survey of Convolutive Blind Source Separation Methods. Springer Multichannel Speech Processing Handbook. 2007:1065–1084.
  • 2. Ferreol A, Albera L, Chevalier P. Fourth order blind identification of underdetermined mixtures of sources (FOBIUM). IEEE Trans. on Signal Processing. 2005;53(5):1640–1653.
  • 3. Kachenoura A, Albera L, Senhadji L, Comon P. ICA: A potential tool for BCI systems. IEEE Signal Process. Mag. 2008;25(1):57–68.
  • 4. Cichocki A, Amari S. Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications. New York: John Wiley; 2002.
  • 5. Cohen MH, Andreou AG. Analog CMOS Integration and Experimentation with an Autoadaptive Independent Component Analyzer. IEEE Trans. Circuits and Systems II. 1995;42(2):65–77.
  • 6. Gharbi ABA, Salam FMA. Implementation and Test Results of a Chip for the Separation of Mixed Signals. Proc. Int. Symp. Circuits and Systems (ISCAS’95); 1995.
  • 7. Kim CM, Park HM, Kim T, Choi YK, Lee SY. FPGA implementation of ICA algorithm for blind signal separation and adaptive noise canceling. IEEE Trans. on Neural Networks. 2003;14(5):1038–1046. doi: 10.1109/TNN.2003.818381.
  • 8. Shyu KK, Lee MH, Wu YT, Lee PL. Implementation of pipelined fastICA on FPGA for real-time blind source separation. IEEE Trans. on Neural Networks. 2008;19(6):958–970. doi: 10.1109/TNN.2007.915115.
  • 9. Van L-D, Wu D-Y, Chen C-S. Energy-Efficient FastICA Implementation for Biomedical Signal Separation. IEEE Trans. on Neural Networks. 2011;22(11):1809–1822. doi: 10.1109/TNN.2011.2166979.
  • 10. Yang C-H, Shih Y-H, Chiueh H. An 81.6 µW FastICA Processor for Epileptic Seizure Detection. IEEE Trans. on Biomedical Circuits and Systems. 2015;9(1):60–71. doi: 10.1109/TBCAS.2014.2318592.
  • 11. Miles RN, Su Q, Cui W, Shetye M, Degertekin FL, Bicen B, Garcia C, Jones S, Hall N. A low-noise differential microphone inspired by the ears of the parasitoid fly Ormia ochracea. J. Acoust. Soc. Am. 2009;125(5):2013–2026. doi: 10.1121/1.3082118.
  • 12. Ando S, Kurihara T, Watanabe K, Yamanishi Y, Ooasa T. Novel theoretical design and fabrication test of biomimicry directional microphone. International Solid-State Sensors, Actuators and Microsystems Conference TRANSDUCERS 2009; 2009. pp. 1932–1935.
  • 13. Hamacher V, Chalupper J, Eggers J, Fisher E, Kornagel U, Puder H, Rass U. Signal Processing in High-End Hearing Aids: State of the Art, Challenges, and Future Trends. EURASIP Journal on Applied Signal Processing. 2005;18:2915–2929.
  • 14. Cauwenberghs G, Stanaćević M, Zweig G. Blind Broadband Source Localization and Separation in Miniature Sensor Arrays. Proc. IEEE Int. Symp. Circuits and Systems (ISCAS’2001); Sydney, Australia; May 6–9, 2001.
  • 15. Stanaćević M, Cauwenberghs G, Zweig G. Gradient Flow Adaptive Beamforming and Signal Separation in a Miniature Microphone Array. Proc. IEEE Int. Conf. Acoustics Speech and Signal Processing (ICASSP’2002). 2002;4:4016–4019.
  • 16. Stanaćević M, Cauwenberghs G. Micropower Gradient Flow VLSI Acoustic Localizer. IEEE Transactions on Circuits and Systems I: Regular Papers. 2005;52(10):2148–2157.
  • 17. Celik A, Stanaćević M, Cauwenberghs G. Mixed-signal real-time adaptive blind source separation. Proc. IEEE Int. Symp. Circuits Syst; 2004. pp. 760–763.
  • 18. Celik A, Stanaćević M, Cauwenberghs G. Gradient Flow Independent Component Analysis in Micropower VLSI. Adv. Neural Information Processing Systems (NIPS’2005). Vol. 18. Cambridge: MIT Press; 2006.
  • 19. Comon P. Independent component analysis, a new concept? Signal Processing. 1994;36:287–314.
  • 20. Bell A, Sejnowski T. An information maximization approach to blind separation and blind deconvolution. Neural Computation. 1995;7:1129–1159. doi: 10.1162/neco.1995.7.6.1129.
  • 21. Amari S, Cichocki A, Yang H. A new learning algorithm for blind signal separation. Adv. Neural Information Processing Systems. Vol. 8. Cambridge, MA: MIT Press; 1996. pp. 757–763.
  • 22. Nadal J, Parga N. Non linear neurons in the low noise limit: a factorial code maximizes information transfer. Network. 1994;5:565–581.
  • 23. Obradovic D, Deco G. Information Maximization and Independent Component Analysis: Is There a Difference? Neural Computation. 1998;10(8):2085–2101. doi: 10.1162/089976698300016972.
  • 24. Cichocki A, Unbehauen R, Moszcnski L, Rummert E. A New On-Line Adaptive Learning Algorithm for Blind Separation of Sources. Int. Symp. Artificial Neural Networks (ISANN-94); Taiwan; Dec. 1994. pp. 406–411.
  • 25. Cichocki A, Unbehauen R, Rummert E. Robust Learning Algorithm for Blind Separation of Signals. Electronics Letters. 30(17):1386–1387.
  • 26. Jutten C, Herault J. Blind Separation of Sources, part I: An Adaptive Algorithm Based on Neuromimetic Architecture. Signal Proc. 1991;24(1):1–10.
  • 27. Nicollini G, Moretti F, Conti M. High-Frequency Fully Differential Filter Using Operational Amplifier Without Common-Mode Feedback. IEEE Journal of Solid-State Circuits. 1989;24(3):803–813.
  • 28. Li J, Moon U. A 1.8-V 67-mW 10-bit 100-MS/s pipelined ADC using time-shifted CDS technique. IEEE Journal of Solid-State Circuits. 2004 Sep;39(9):1468–1476.
  • 29. Vittoz E. Micropower Techniques. In: Franca, Tsividis, editors. Design of Analog-Digital VLSI Circuits for Telecommunications and Signal Processing. 2nd ed. Englewood Cliffs, NJ: Prentice-Hall; 1994. pp. 53–96.
  • 30. Koldovsky Z, Tichavsky P, Oja E. Efficient Variant of Algorithm FastICA for Independent Component Analysis Attaining the Cramer-Rao Lower Bound. IEEE Trans. on Neural Networks. 2006;17(5):1265–1277. doi: 10.1109/TNN.2006.875991.
  • 31. Allen JB, Berkley DA. Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Amer. 1979 Apr;65:943–950.
