Fast and Stable Signal Deconvolution via Compressible State-Space Models

Abbas Kazemipour; Ji Liu; Krystyna Solarana; Daniel A Nagode; Patrick O Kanold; Min Wu; Behtash Babadi

doi:10.1109/TBME.2017.2694339

. Author manuscript; available in PMC: 2018 Jan 1.

Published in final edited form as: IEEE Trans Biomed Eng. 2017 Apr 13;65(1):74–86. doi: 10.1109/TBME.2017.2694339

Fast and Stable Signal Deconvolution via Compressible State-Space Models

Abbas Kazemipour ¹, Ji Liu ², Krystyna Solarana ³, Daniel A Nagode ⁴, Patrick O Kanold ⁵, Min Wu ⁶, Behtash Babadi ⁷

PMCID: PMC5683949 NIHMSID: NIHMS914825 PMID: 28422648

Abstract

Objective

Common biological measurements are in the form of noisy convolutions of signals of interest with possibly unknown and transient blurring kernels. Examples include EEG and calcium imaging data. Thus, signal deconvolution of these measurements is crucial in understanding the underlying biological processes. The objective of this paper is to develop fast and stable solutions for signal deconvolution from noisy, blurred and undersampled data, where the signals are in the form of discrete events distributed in time and space.

Methods

We introduce compressible state-space models as a framework to model and estimate such discrete events. These state-space models admit abrupt changes in the states and have a convergent transition matrix, and are coupled with compressive linear measurements. We consider a dynamic compressive sensing optimization problem and develop a fast solution, using two nested Expectation Maximization algorithms, to jointly estimate the states as well as their transition matrices. Under suitable sparsity assumptions on the dynamics, we prove optimal stability guarantees for the recovery of the states and present a method for the identification of the underlying discrete events with precise confidence bounds.

Results

We present simulation studies as well as application to calcium deconvolution and sleep spindle detection, which verify our theoretical results and show significant improvement over existing techniques.

Conclusion

Our results show that by explicitly modeling the dynamics of the underlying signals, it is possible to construct signal deconvolution solutions that are scalable, statistically robust, and achieve high temporal resolution.

Significance

Our proposed methodology provides a framework for modeling and deconvolution of noisy, blurred, and undersampled measurements in a fast and stable fashion, with potential application to a wide range of biological data.

Index Terms: state-space models, compressive sensing, signal deconvolution, calcium imaging, sleep spindles

I. Introduction

In many signal processing applications such as estimation of brain activity from magnetoencephalography (MEG) time-series [2], estimation of time-varying networks [3], electroencephalogram (EEG) analysis [4], calcium imaging [5], functional magnetic resonance imaging (fMRI) [6], and video compression [7], the signals often exhibit abrupt changes that are blurred through convolution with unknown kernels due to intrinsic measurement constraints. Extracting the underlying signals from blurred and noisy measurements is often referred to as signal deconvolution. Traditionally, state-space models have been used for such signal deconvolution problems, where the states correspond to the unobservable signals. Gaussian state-space models in particular are widely used to model smooth state transitions. Under normality assumptions, posterior mean filters and smoothers are optimal estimators, where the analytical solution is given respectively by the Kalman filter and the fixed interval smoother [8], [9].

When applied to observations from abruptly changing states, Gaussian state-space models exhibit poor performance in recovering sharp transitions of the states due to their underlying smoothing property. Although filtering and smoothing recursions can be obtained in principle for non-Gaussian state-space models, exact calculations are no longer possible. Apart from crude approximations like the extended Kalman filter, several methods have been proposed including numerical methods for low-dimensional states [10], Monte Carlo filters [10], [11], posterior mode estimation [12], [13], and fully Bayesian smoothing using Markov chain Monte Carlo simulation [14], [15]. In order to exploit sparsity, several dynamic compressed sensing (CS) techniques, such as the Kalman filtered CS algorithm, have been proposed that typically assume partial information about the sparse support or estimate it in a greedy and online fashion [16], [17], [18], [19], [20]. However, little is known about the theoretical performance guarantees of these algorithms.

In this paper, we consider the problem of estimating state dynamics from noisy and undersampled observations, where the state transitions are governed by autoregressive models with compressible innovations. Motivated by the theory of CS, we employ an objective function formed by the ℓ₁-norm of the state innovations [21]. Unlike the traditional compressed sensing setting, the sparsity is associated with the dynamics and not the states themselves. In the absence of observation noise, the CS recovery guarantees are shown to extend to this problem [21]. However, in a realistic setting in the presence of observation noise, it is unclear how the CS recovery guarantees generalize to this estimation problem.

We will present stability guarantees for this estimator under a convergent state transition matrix, which confirm that the CS recovery guarantees can be extended to this problem. The corresponding optimization problem in its Lagrangian form is akin to the MAP estimator of the states in a linear state-space model where the innovations are Laplace distributed. This allows us to integrate methods from Expectation-Maximization (EM) theory and Gaussian state-space estimation to derive efficient algorithms for the estimation of states as well as the state transition matrix, which is usually unknown in practice. To this end, we construct two nested EM algorithms in order to jointly estimate the states and the transition matrix. The outer EM algorithm for state estimation is akin to the fixed interval smoother, and the inner EM algorithm uses the state estimates to update the state transition matrix [22]. The resulting EM algorithm is recursive in time, which makes the computational complexity of our method scale linearly with temporal dimension of the problem. This provides an advantage over existing methods based on convex optimization, which typically scale super-linearly with the temporal dimension.

Our results are related to parallel applications in spectral estimation, source localization, and channel equalization [23], where the measurements are of the form Y = AX + N, with Y is the observation matrix, X denotes the unknown parameters, A is the measurement matrix, and N is the additive noise. These problems are referred to as Multiple Measurement Vectors (MMV) [24] and Multivariate Regression [25]. In these applications, solutions with row sparsity in X are desired. Recovery of sparse signals with Gaussian innovations is studied in [26]. Several recovery algorithms including the ℓ₁ − ℓ_q minimization methods, subspace methods and greedy pursuit algorithms [27] have been proposed for support union recovery in this setup. Our contributions are distinct in that we directly model the state innovations as a compressible sequence, for recovery of which we present both sharp theoretical guarantees as well as fast algorithms from state-space estimation.

Finally, we provide simulation results as well as applications to two experimentally-acquired data sets: calcium imaging recordings of neuronal activity, and EEG data during sleep. In the former, the deconvolution problem concerns estimating the location of spikes given the temporally blurred calcium fluorescence, and in the latter, the objective is to detect the occurrence and onset of sleep spindles. Our simulation studies confirm our theoretical predictions on the performance gain obtained by compressible state-space estimation over those obtained by traditional estimators such as the basis pursuit denoising. Our real data analyses reveal that our compressible state-space modeling and estimation framework outperforms two of the commonly-used methods for calcium deconvolution and sleep spindle detection. In the spirit of easing reproducibility, we have made MATLAB implementations of our codes publicly available [28].

II. Methods

In this section, we establish our problem formulation and notational conventions and present our main theoretical analysis and algorithm development. In the interest of space, the description of the experimental procedures is given in Section I of the Supplementary Material.

A. Problem Formulation and Theoretical Analysis

Throughout the paper we use bold lower and upper case letters for denoting vectors and matrices, respectively. We denote the support of a vector x_t ∈ ℝ^p by supp(x_t) and its jth element by (x_t)_j. We consider the linear compressible state-space model given by

x_{t} = Θ x_{t - 1} + w_{t}, y_{t} = A_{t} x_{t} + v_{t},

(1)

where ${(x_{t})}_{t = 1}^{T} \in ℝ^{p}$ denote the sequence of unobservable states, Θ is the state transition matrix satisfying ||Θ||< 1, w_t ∈ ℝ^p is the state innovation sequence, ${(y_{t})}_{t = 1}^{T} \in ℝ^{n_{t}}$ are the linear observations, A_t ∈ ℝ^n_t×p denotes the measurement matrix, and v_t ∈ ℝ^n_t denotes the measurement noise. The main problem is to estimate the unobserved sequence ${(x_{t})}_{t = 1}^{T}$ (and possibly Θ), given the sequence of observations ${(y_{t})}_{t = 1}^{T}$ . This problem is in general ill-posed, when n_t < p, for some t. We therefore need to make additional assumptions in order to seek a stable solution.

Given a sparsity level s and a vector x, we denote the set of its s largest magnitude entries by S, and its best s-term approximation error by σ_s(x) = ||x − x_S||₁. When σ_s(x) ~ 𝒪^{(1/2 −} ^ξ⁾ for some ξ ≥ 0, we refer to x as (s, ξ)– compressible. We assume that the state innovations are sparse (resp. compressible), i.e. x_t − Θx_t₋₁ is s_t-sparse (resp. (s_t, ξ)- compressible) with s₁ ≫ s_t for t ∈ [T]\{1}. Our theoretical analysis pertain to the compressed sensing regime where 1 ≪ s_t < n_t ≪ p. We refer to the regime of n_t > p as the denoising regime.

For simplicity of notation, we define x₀ to be the all-zero vector in ℝ^p. For a matrix A, we denote restriction of A to its first n rows by (A)_n. We say that the matrix A ∈ ℝ^n×p satisfies the restricted isometry property (RIP) [29] of order s, if for all s-sparse x ∈ ℝ^p, we have

(1 - δ_{s}) {‖ x ‖}_{2}^{2} \leq {‖ Ax ‖}_{2}^{2} \leq (1 + δ_{s}) {‖ x ‖}_{2}^{2},

(2)

where δ_s ∈ (0, 1) is the smallest constant for which Eq. (2) holds [30]. We assume that the rows of A_t are a subset of the rows of A₁, i.e. A_t = (A₁)_{n_t}, and define ${\tilde{A}}_{t} = \sqrt{\frac{n_{1}}{n_{t}}} A_{t}$ . Other than its technical usefulness, the latter assumption helps avoid prohibitive storage of all the measurement matrices. In order to promote sparsity of the state innovations, we consider the dynamic ℓ₁-regularization (dynamic CS from now on) problem defined as

\underset{{(x_{t})}_{t = 1}^{T}, Θ}{minimize} \sum_{t = 1}^{T} \frac{{‖ x_{t} - Θ x_{t - 1} ‖}_{1}}{\sqrt{s_{t}}} s.t. {‖ y_{t} - A_{t} x_{t} ‖}_{2} \leq \sqrt{\frac{n_{t}}{n_{1}}} ε .

(3)

where ε is an upper bound on the observation noise, i.e., ||v_t||₂ ≤ ε for all t. Note that this problem is a variant of the dynamic CS problem introduced in [21]. We also consider the modified Lagrangian form of (3) given by

\underset{{(x_{t})}_{t = 1}^{T}, Θ}{minimize} λ \sum_{t = 1}^{T} \frac{{‖ x_{t} - Θ x_{t - 1} ‖}_{1}}{\sqrt{s_{t}}} + \frac{1}{n_{t}} \frac{{‖ y_{t} - A_{t} x_{t} ‖}_{2}^{2}}{2 σ^{2}} .

(4)

for some constants $σ_{2}^{2}$ and λ ≥ 0. Note that if v_t ~ 𝒩(0, n_tσ²I), then Eq. (4) is algebraically similar to the maximum a posteriori (MAP) estimator of the states in (1), assuming that the state innovations were independent Laplace random variables with respective parameters $λ / \sqrt{s_{t}}$ . We will later use this analogy to derive fast solutions to the optimization problem in (4).

Uniqueness and exact recovery of the sequence ${(x_{t})}_{t = 1}^{T}$ in the absence of noise was proved in [21] for Θ = I, by an inductive construction of dual certificates. The special case Θ = I can be considered as a generalization of the total variation (TV) minimization problem [31]. Our main result on stability of the solution of (3) is the following:

Theorem 1 (Stable Recovery of Activity in the Presence of Noise)

Let ${(x_{t})}_{t = 1}^{T} \in ℝ^{p}$ be a sequence of states with a known transition matrix Θ = θI, where |θ|< 1 and Ã_t, t ≥ 1 satisfies RIP of order 4s with δ₄_s < 1/3. Suppose that n₁ > n₂ = n₃ = ⋯ = n_T. Then, the solution ${({\hat{x}}_{t})}_{t = 1}^{T}$ to the dynamic CS problem (3) satisfies

\frac{1}{T} \sum_{t = 1}^{T} {‖ x_{t} - {\hat{x}}_{t} ‖}_{2} \leq \frac{1 - θ^{T}}{1 - θ} (12.6 (1 + \frac{1}{T} \sqrt{\frac{n_{1}}{n_{2}}} - \frac{1}{T}) ε + \frac{3}{T} \sum_{t = 1}^{T} \frac{σ_{s_{t}} (x_{t} - Θ x_{t - 1})}{\sqrt{s_{t}}}) .

(5)

Proof Sketch

The proof of Theorem 1 is based on establishing a modified cone and tube constraint for the dynamic CS problem (3) and using the boundedness of the Frobenius norm of the inverse first-order differencing operator. Details of the proof are given in Appendix VII.

Remark 1

The first term on the right hand side of Theorem 1 implies that the average reconstruction error of the sequence ${(x_{t})}_{t = 1}^{T}$ is upper bounded proportional to the noise level ε, which implies the stability of the estimate. The second term is a measure of compressibility of the innovation sequence and vanishes when the sparsity condition is exactly met.

Remark 2

Similar to the general compressive sensing setup, the result of Theorem 1 holds even if the compressibility assumption on the innovation sequence is violated. In this case, the second term on the right hand side of Eq. (5) will be at most of the order of a constant. In addition, in the absence of any assumptions on the sparsity, i.e., s_t ~ 𝒪(p), the result of Theorem 1 provides a valid performance bound in the denoising regime. However, the result of Theorem 1 is the sharpest when the underlying innovation sequence is compressible, i.e., the second term in the bound is negligible compared to the first term. Even though we expect that the compressibility assumption holds for the innovation sequences pertaining to the biological signals of our interest (e.g., two-photon calcium traces of neuronal spiking and EEG sleep spindles), in order to take into account the possible violation of this assumption in our forthcoming algorithm development, we will work with the modified Lagrangian formulation of Eq. 4. The Lagrangian formulation allows us to adaptively select the sparsity level in a data-driven fashion: by choosing the trade-off parameter λ via cross-validation, the sparsity level of the reconstructed innovation sequence is adapted to the observed data.

B. Fast Iterative Solution via the EM Algorithm

Due to the high dimensional nature of the state estimation problem, algorithms with polynomial complexity exhibit poor scalability. Moreover, when the state transition matrix is not known, the dynamic CS optimization problem (4) is not convex in ( ${(x_{t})}_{t = 1}^{T}$ , Θ). Therefore standard convex optimization solvers cannot be directly applied. This problem can be addressed by employing the Expectation-Maximization algorithm [22]. A related existing result considers weighted ℓ₁-regularization to adaptively capture the state dynamics [32]. Our approach is distinct in that we derive a fast solution to (4) via two nested EM algorithms, in order to jointly estimate the states and their transition matrix. The outer EM algorithm converts the estimation problem to a form suitable for the usage of the traditional Fixed Interval Smoothing (FIS) by invoking the EM interpretation of the Iterative Re-weighted Least Squares (IRLS) algorithms [33]. The inner EM algorithm performs state and parameter estimation efficiently using the FIS. We refer to our estimated as the Fast Compressible State- Space (FCSS) estimator.

In order to employ the EM theory, we first note that the problem of Eq. (4) can be interpreted as a MAP problem: the first term corresponds to the state-space prior −log p(x_t|x_t₋₁, Θ) = −log p_{s_t} (x_t − Θx_t₋₁), where $p_{s_{t}} (x) ~ exp (- λ {‖ x ‖}_{1} / \sqrt{s_{t}})$ denoting the Laplace distribution; the second term is the negative log-likelihood of the data given the state, assuming a zero-mean Gaussian observation noise with covariance σ²I.

The choice of the ℓ₁-norm in our formulation is motivated by the superior performance of ℓ₁-regularized least-squares solutions in the compressive sensing tradition, as highlighted by Theorem 1. However, in the foregoing analogy to a MAP estimation problem, caution should be taken in interpreting the compressibility assumption in a statistical sense: we do not make the assumption that the innovation sequence ${w_{t}}_{t = 1}^{T}$ is Laplace distributed, but will exploit the algebraic similarity of the ℓ₁-regularized least-squares problem to the MAP estimator of a Laplace distributed linear state-space model in order to develop fast algorithms. In fact, it is well-known that samples from the Laplace distribution are not compressible [34]. There exist several other choices of regularization schemes for promoting compressibility, including the weighted ℓ₁-norm, the ℓ₀-quasi norm, and the ℓ_ν-norm for 0 < ν < 1. Theorem 1 suggests that the ℓ₁-norm provides a suitable regularization scheme in our setting, as the first term in the error bound is within a constant factor of that of the best sparse estimator, i.e., an estimator that has access to the location of the non-zero components of the innovation sequence. Nevertheless, the forthcoming algorithm development can be carried out using the weighted ℓ₁-norm or ℓ_ν-norm regularization, with no assumption on the statistics of the innovation sequence.

The outer EM loop of FCSS

Consider the ε-perturbed ℓ₁-norm as

{‖ x ‖}_{1, ε} = \sqrt{x_{1}^{2} + ε^{2}} + \sqrt{x_{2}^{2} + ε^{2}} + \dots + \sqrt{x_{p}^{2} + ε^{2}} .

(6)

Note that for ε = 0, ||x||₁_,ε coincides with the usual ℓ₁-norm. We define the ε-perturbed version of the dual problem (4) by

\underset{{(x_{t})}_{t = 1}^{T}, Θ}{minimize} λ \sum_{t = 1}^{T} \frac{{‖ x_{t} - Θ x_{t - 1} ‖}_{1, ε}}{\sqrt{s_{t}}} + \frac{1}{n_{t}} \frac{{‖ y_{t} - A_{t} x_{t} ‖}_{2}^{2}}{2 σ^{2}} .

(7)

As it will become evident shortly, this slight modification is carried out for the sake of numerical stability. The ε-perturbation only adds a term of the order 𝒪(εp) to the estimation error bound of Theorem 1, which is negligible for small enough ε [33].

If instead of the ℓ₁_,ε-norm, we had the square ℓ₂ norm, then the above problem could be efficiently solved using the FIS. The outer EM algorithm indeed transforms the problem of Eq. (7) into this form, by invoking the equivalence of the IRLS algorithm as an instance of the EM algorithm for solving ℓ₁_,ε-minimization problems via the Normal/Independent (N/I) characterization of the ε-perturbed Laplace distribution [33].

Suppose that at the end of the l^th iteration, the estimates ${({\hat{x}}_{t}^{(l)})}_{t = 1}^{T}$ , Θ̂⁽^l⁾ are obtained, given the observations ${(y_{t})}_{t = 1}^{T}$ . As it is shown in Section II of the Supplementary Material, the outer EM algorithm transforms the optimization problem to:

\underset{{(x_{t})}_{t = 1}^{T}, Θ}{minimize} \frac{λ}{2} \sum_{j = 1}^{p} \sum_{t = 1}^{T} \frac{{(x_{t} - Θ x_{t - 1})}_{j}^{2} + ε^{2}}{\sqrt{s_{t}} \sqrt{{({\hat{x}}_{t}^{(l)} - {\hat{Θ}}^{(l)} {\hat{x}}_{t - 1}^{(l)})}_{j}^{2} + ε^{2}}} + \sum_{t = 1}^{T} \frac{1}{n_{t}} \frac{{‖ y_{t} - A_{t} x_{t} ‖}_{2}^{2}}{2 σ^{2}} .

(8)

in order to find ${({\hat{x}}_{t}^{(l + 1)})}_{t = 1}^{T}$ , Θ̂⁽^l⁺¹⁾. Under mild conditions, convergence of the solution of (8) to that of (4) was established in [33]. The objective function of (8) is still not jointly convex in ( ${(x_{t})}_{t = 1}^{T}$ , Θ). Therefore, to carry out the optimization, i.e. the outer M step, we will employ another instance of the EM algorithm, which we call the inner EM algorithm, to alternate between estimating of ${(x_{t})}_{t = 1}^{T}$ and Θ.

The inner EM loop of FCSS

Let $W_{t}^{(l)}$ be a diagonal matrix such that

{(W_{t}^{(l)})}_{j, j} = {s_{t}}^{- 1 / 2} {{({\hat{x}}_{t}^{(l)} - {\hat{Θ}}^{(l)} {\hat{x}}_{t - 1}^{(l)})}_{j}^{2} + ε^{2}}^{- 1 / 2} .

Consider an estimate Θ̂⁽^l,m⁾, corresponding to the m^th iteration of the inner EM algorithm within the l^th M-step of the outer EM. In this case, Eq. (8) can be thought of the MAP estimate of the Gaussian state-space model given by:

x_{t} = {\hat{Θ}}^{(l, m)} x_{t - 1} + w_{t}, w_{t} ~ N (0, \frac{1}{λ} W_{t}^{{(l)}^{- 1}}) y_{t} = A_{t} x_{t} + v_{t}, v_{t} ~ N (0, n_{t} σ^{2} I),

(9)

In order to obtain the inner E step, one needs to find the density of ${(x_{t})}_{t = 1}^{T}$ given ${(y_{t})}_{t = 1}^{T}$ and Θ̂⁽^l,m⁾}. Given the Gaussian nature of the state-space in Eq. (9), this density is a multivariate Gaussian density, whose means and covariances can be efficiently computed using the FIS. For all t ∈ [T], the FIS performs a forward Kalman filter and a backward smoother to generate [35], [8]:

x_{t ∣ T}^{(l, m + 1)} : = E {x_{t} ∣ {(y_{t})}_{t = 1}^{T}, {\hat{Θ}}^{(l, m)}}, \sum_{t ∣ T}^{(l, m + 1)} : = E {x_{t} x_{t}^{'} ∣ {(y_{t})}_{t = 1}^{T}, {\hat{Θ}}^{(l, m)}}, and \sum_{t - 1, t ∣ T}^{(l, m + 1)} = \sum_{t, t - 1 ∣ T}^{(l, m + 1)} = E {x_{t - 1} x_{t}^{'} ∣ {(y_{t})}_{t = 1}^{T}, {\hat{Θ}}^{(l, m)}} .

Note that due to the quadratic nature of all the terms involving ${(x_{t})}_{t = 1}^{T}$ , the outputs of the FIS suffice to compute the expectation of the objective function in Eq. (8), i.e., the inner E step, which results in:

\underset{Θ}{maximize} - \frac{λ}{2} (Θ (\sum_{t = 1}^{T} W_{t}^{(l)} (x_{t - 1 ∣ T}^{(l, m + 1)} {x^{'}}_{t - 1 ∣ T}^{(l, m + 1)} + \sum_{t - 1 ∣ T}^{(l, m + 1)})) Θ^{T}) + \frac{λ}{2} Tr (Θ (\sum_{t = 1}^{T} W_{t}^{(l)} (x_{t - 1 ∣ T}^{(l, m + 1)} {x^{'}}_{t ∣ T}^{(l, m + 1)} + x_{t ∣ T}^{(l, m + 1)} {x^{'}}_{t - 1 ∣ T}^{(l, m + 1)} + 2 \sum_{t - 1, t ∣ T}^{(l, m + 1)}))),

(10)

to obtain Θ̂⁽^l,m⁾. The solution has a closed-form given by:

{\hat{Θ}}^{(l, m + 1)} = {(\sum_{t = 1}^{T} 2 W_{t}^{(l)} (x_{t - 1 ∣ T}^{(l, m + 1)} {x^{'}}_{t - 1 ∣ T}^{(l, m + 1)} + \sum_{t - 1 ∣ T}^{(l, m + 1)}))}^{- 1} (\sum_{t = 1}^{T} W_{t}^{(l)} (x_{t - 1 ∣ T}^{(l, m + 1)} {x^{'}}_{t ∣ T}^{(l, m + 1)} + x_{t ∣ T}^{(l, m + 1)} {x^{'}}_{t - 1 ∣ T}^{(l, m + 1)} + 2 \sum_{t - 1, t ∣ T}^{(l, m + 1)})) .

(11)

This process is repeated for M iterations for the inner EM and L iterations for the outer EM, until a convergence criterion is met. Algorithm 1 summarizes the FCSS algorithm.

Algorithm 1.

The Fast Compressible State-Space (FCSS) Estimator

procedure FCSS

Initialize: Θ̂⁽⁰⁾ = 0,

{({\hat{x}}_{t}^{(0)} = 0)}_{t = 1}^{T}, {(W_{t}^{(0)} = 0)}_{t = 1}^{T}

repeat

l = 0.

Outer E-step:

W_{t}^{(l)} = diag {\frac{1}{\sqrt{s_{t}} \sqrt{{({\hat{x}}_{t}^{(l)} - {\hat{Θ}}^{(l)} {\hat{x}}_{t - 1}^{(l)})}_{j}^{2} + ε^{2}}}}_{j = 1}^{p}

Outer M-step:

repeat

m = 0.

10:

Inner E-Step: Find the smoothed estimates

x_{t ∣ T}^{(l, m + 1)}, \sum_{t ∣ T}^{(l, m + 1)}

and

\sum_{t - 1, t ∣ T}^{(l, m + 1)}

using a Fixed Interval Smoother for

{\begin{cases} x_{t} = {\hat{Θ}}^{(l, m)} x_{t - 1} + w_{t}, w_{t} ~ N (0, \frac{1}{λ} W_{t}^{{(l)}^{- 1}}) \\ y_{t} = A_{t} x_{t} + v_{t}, v_{t} ~ N (0, n_{t} σ^{2} I) \end{cases} .

11:

Inner M-Step: Update Θ̂⁽^l,m⁺¹⁾ via Eq. (11).

12:

m ← m + 1.

13:

until convergence criteria met

14:

Update the estimates:

{\hat{Θ}}^{(l + 1)} \leftarrow {\hat{Θ}}^{(l, m)}, {({\hat{x}}_{t}^{(l + 1)})}_{t = 1}^{T} \leftarrow {(x_{t ∣ T}^{(l, m)})}_{t = 1}^{T}

15:

l ← l + 1.

16:

until convergence criteria met

\hat{Θ} \leftarrow {\hat{Θ}}^{(l)} . {({\hat{x}}_{t})}_{t = 1}^{T} \leftarrow {({\hat{x}}_{t}^{(l)})}_{t = 1}^{T}

17:

end procedure

Open in a new tab

Remark 1

In order to quantify the complexity of the FCSS algorithm, we first consider the complexity of EM iterations. By virtue of the FIS procedure, the complexity of the FCSS algorithm is linear in T, i.e., the observation duration. Each iteration of the FIS algorithm requires two matrix inversions, each with a complexity of 𝒪(p³). Similarly, updating Θ requires an additional inversion which is of complexity 𝒪(p³). Altogether, the complexity of the FCSS algorithm amounts to 𝒪(p³T). As we will show in Section III, the linear dependence of the computational complexity on T makes the FCSS algorithm scale favorably when applied to long data sets. For the applications of interest in this paper, we make additional assumptions on the structure of Θ which mitigate the cubic dependence on p. In particular, we assume a diagonal structure for the calcium deconvolution problem and a two-parameter matrix form for the sleep spindle detection. Hence, the complexity reduces to O(p²T), where the quadratic dependence on p arises from matrix multiplication. Finally, the EM algorithm is known to converge linearly in the number of iterations [36]. Our numerical experiment show that in the applications of interest in this paper, very few EM iterations suffice for achieving stable estimates.

Remark 2

In order to update Θ in the inner M-step given by E. (11), we have not specifically enforced the condition ||Θ||< 1 in the maximization step. This condition is required to obtain a convergent state transition matrix which results in the stability of the state dynamics. It is easy to verify that the set of matrices Θ satisfying ||Θ||< 1 − η, is a closed convex set for small positive η, and hence one can perform the maximization in (11) by projection onto this closed convex set. Alternatively, matrix optimization methods with operator norm constraints can be used [37]. We have avoided this technicality by first finding the global minimum and examining the largest eigenvalue. In the applications of interest in this paper which follow next, the largest eigenvalue has always been found to be less than 1.

III. Results

In this section, we study the performance of the FCSS estimator on simulated data as well real data from two-photon calcium imaging recordings of neuronal activity and sleep spindle detection from EEG.

A. Application to Simulated Data

We first apply the FCSS algorithm to simulated data and compare its performance with the Basis Pursuit Denoising (BPDN) algorithm. The parameters are chosen as p = 200, T = 200, s₁ = 8, s₂ = 4, ε = 10⁻¹⁰, and Θ = 0.95I. We define the quantity 1 − n/p as the compression ratio. We refer to the case of n_t = p, i.e., no compression, as the denoising setting. The measurement matrix A is an n_t×p i.i.d. Gaussian random matrix, where n_t is chosen such that $\frac{s_{t}}{n_{t}}$ is a fixed ratio. An initial choice of $λ \geq \frac{2 \sqrt{2}}{σ} \sqrt{\frac{log p}{n_{t}}}$ is made inspired by the theory of LASSO [38], which is further tuned using two-fold cross-validation.

Figures 1–(a) and 1–(b) show the estimated states as well as the innovations for different compression ratios for one sample component. In the denoising regime, all the innovations (including the two closely-spaced components) are exactly recovered. As we take fewer measurements, the performance of the algorithm degrades as expected. However, the overall structure of the innovation sequence is captured even for highly compressed measurements.

Fig. 1 — Reconstruction results of FCSS on simulated data vs. compression levels. (a) reconstructed states, (b) reconstructed spikes. The FCSS estimates degrade gracefully as the compression level increases.

Figure 2 shows the MSE comparison of the FCSS vs. BPDN, where the MSE is defined as $10 {log}_{10} \frac{1}{T} \sum_{t = 1}^{T} {‖ {\hat{x}}_{t} - x_{t} ‖}_{2}^{2}$ . The FCSS algorithm significantly outperforms BPDN, especially at high SNR values. Figure 3 compares the performance of FCSS and BPDN on a sample component at a compression level of n/p = 2/3, in order to visualize the performance gain implied by Figure 3.

Fig. 2 — MSE vs. SNR comparison betwee FCSS and BPDN. The FCSS significantly outperforms the BPDN, even for moderate SNR levels.

Fig. 3 — Example of state reconstruction results for FCSS and BPDN. Top: true states, Middle: FCSS state estimates, Bottom: BPDN estimates. The FCSS reconstruction closely follows the true state evolution, while the BPDN fails to capture the state dynamics.

Finally, Figure 4 shows the comparison of the estimated states for the entire simulated data in the denoising regime. As can be observed from the figure, the sparsity pattern of the states and innovations are captured while significantly denoising the observed states.

Fig. 4 — Raster plots of the observed and estimated states via FCSS. Left: noisy observations, Right: FCSS estimates. The FCSS significantly denoises the observed states.

B. Application to Calcium Signal Deconvolution

Calcium imaging takes advantage of intracellular calcium flux to directly visualize calcium signaling in living neurons. This is done by using calcium indicators, which are fluorescent molecules that can respond to the binding of calcium ions by changing their fluorescence properties and using a fluorescence or two-photon microscope and a CCD camera to capture the visual patterns [39], [40]. Since spikes are believed to be the units of neuronal computation, inferring spiking activity from calcium recordings, referred to as calcium deconvolution, is an important problem in neural data analysis. Several approaches to calcium deconvolution have been proposed in the neuroscience literature, including model-free approaches such as sequential Monte Carlo methods [41] and model-based approaches such as non-negative deconvolution methods [5], [42]. These approaches require solving convex optimization problems, which do not scale well with the temporal dimension of the data. In addition, they lack theoretical performance guarantees and do not provide clear measures for assessing the statistical significance of the detected spikes.

In order to construct confidence bounds for our estimates, we employ recent results from high-dimensional statistics [43]. We first compute the confidence intervals around the outputs of the FCSS estimates using the node-wise regression procedure of [43], at a confidence level of $1 - \frac{α}{2}$ . We perform the node-wise regression separately for each time t. For an estimate x̂_t, we obtain ${\hat{x}}_{t}^{u}$ and ${\hat{x}}_{t}^{l}$ as the upper and lower confidence bounds, respectively. Next, we partition the estimates into small segments, starting with a local minimum (trough) and ending in a local maximum (peak). For the i^th component of the estimate, let t_min and t_max denote the time index corresponding to two such consecutive troughs and peaks. If the difference ${({\hat{x}}_{t_{max}}^{l})}_{i} - {({\hat{x}}_{t_{min}}^{u})}_{i}$ is positive, the detected innovation component is declared significant (i.e., spike) at a confidence level of 1 − α, otherwise it is discarded (i.e., no spike). We refer to this procedure as Pruned-FCSS (PFCSS).

We first apply the FCSS algorithm for calcium deconvolution in a scenario where the ground-truth spiking is recorded in vitro through simultaneous electrophysiology (cell-attached patch clamp) and two-photon calcium imaging. The calcium trace as well as the ground-truth spikes are shown for a sample neuron in Figure 5–(a). The FCSS denoised estimate of the states (black) and the detected spikes (blue) using 95% confidence intervals (orange hulls) and the corresponding quantities for the constrained f-oopsi algorithm [42] are shown in Figures 5–(b) and –(c), respectively. Both algorithms detect the large dynamic changes in the data, corresponding to the spikes, which can also be visually captured in this case. However, in doing so, the f-oopsi algorithm incurs a high rate of false positive errors, manifested as clustered spikes around the ground truth events. Similar to f-oopsi, most state-of-the-art model-based methods suffer from high false positive rate, which makes the inferred spike estimates unreliable. Thanks to the aforementioned pruning process based on the confidence bounds, the PFCSS is capable of rejecting the insignificant innovations, and hence achieve a lower false positive rate. One factor responsible for this performance gap can be attributed to the underestimation of the calcium decay rate in the transition matrix estimation step of f-oopsi. However, we believe the performance gain achieved by FCSS is mainly due to the explicit modeling of the sparse nature of the spiking activity by going beyond the Gaussian state-space modeling paradigm.

Fig. 5 — Ground-truth performance comparison between PFCSS and constrained f-oopsi. Top: the observed calcium traces (black) and ground-truth electrophysiology data (blue), Middle: PFCSS state estimates (black) with 95% confidence intervals (orange) and the detected spikes (blue), Bottom: the constrained f-oopsi state estimates (black) and the detected spikes (blue). The FCSS spike estimates closely match the ground-truth spikes with only a few false detections, while the constrained f-oopsi estimates contain significant clustered false detections.

Next, we apply the FCSS algorithm to large-scale in vivo calcium imaging recordings, for which the ground-truth is not available due to measurement constraints. The data used in our analysis was recorded from 219 spontaneously active neurons in mouse auditory cortex. The two-photon microscope operates at a rate of 30 frames per second. We chose T = 2000 samples corresponding to 1 minute for the analysis. We chose p = 108 well-separated neurons visually. We estimate the measurement noise variance by appropriate re-scaling of the power spectral density in the high frequency bands where the signal is absent. We chose a value of ε = 10^{− 10}. It is important to note that estimation of the measurement noise variance is critical, since it affects the width of the confidence intervals and hence the detected spikes. Moreover, we estimate the baseline fluorescence by averaging the signal over values within a factor of 3 standard deviations of the noise. By inspecting Eq. (4), one can see a trade-off between the choice of λ and the estimate of the observation noise variance σ². We have done our analysis in both the compression regime, with a compression ratio of 1/3 (n/p = 2/3), and the denoising regime. The measurements in the compressive regime were obtained from applying i.i.d. Gaussian random matrices to the observed calcium traces. The latter is done to motivate the use of compressive imaging, as opposed to full sampling of the field of view.

Figure 6–(a) shows the observed traces for four selected neurons. The reconstructed states using FCSS in the compressive and denoising regimes are shown in Figures 6–(b) and –(c), respectively. The 90% confidence bounds are shown as orange hulls. The FCSS state estimates are significantly denoised while preserving the calcium dynamics. Figure 7 shows the detected spikes using constrained f-oopsi and PFCSS in both the compressive and denoising regimes. Finally, Figure 8 shows the corresponding raster plots of the reconstructed spikes for the entire ensemble of neurons. Similar to the preceding application on ground-truth date, the f-oopsi algorithm detects clusters of spikes, whereas the PFCSS procedure results in sparser spike detection. This results in the detection of seemingly more active neurons in the raster plot. However, motivated by the foregoing ground-truth analysis, we believe that a large fraction of these detected spikes may be due to false positive errors. Strikingly, even with a compression ratio of 1/3 the performance of the PFCSS is similar to the denoising case. The latter observation corroborates the feasibility of compressed two-photon imaging, in which only a random fraction of the field of view is imaged, which in turn can result in higher acquisition rates.

Fig. 6 — Performance of FCSS on large-scale calcium imaging data. Top, middle and bottom rows correspond to three selected neurons labeled as Neuron 1, 2, and 3, respectively. (a) raw calcium traces, (b) FCSS reconstruction with no compression, (c) FCSS reconstruction with 2/3 compression ratio. Orange hulls show 95% confidence intervals. The FCSS significantly denoises the observed traces in both the uncompressed and compressed settings.

Fig. 7 — Reconstructed spikes of PFCSS and constrained f-oopsi from large-scale calcium imaging data. Top, middle and bottom rows correspond to three selected neurons labeled as Neuron 1, 2, and 3, respectively. (a) constrained f-oopsi spike estimates, (b) PFCSS spike estimates with no compression, (c) PFCSS spike estimates with 2/3 compression ratio. The PFCSS estimates in both the uncompressed and compressed settings are sparse in time, whereas the constrained f-oopsi estimates are in the form of clustered spikes.

Fig. 8 — Raster plot of the estimated spikes from large-scale calcium imaging data. Left: the constrained f-oopsi estimates, Middle: PFCSS estimates with no compression, Right: PFCSS estimates with a $\frac{1}{3}$ compression ratio. The PFCSS estimates are spatiotemporally sparse, whereas the constrained f-oopsi outputs temporally clustered spike estimates.

In addition to the foregoing discussion on the comparisons in Figures 5, 6, 7, and 8, two remarks are in order. First, the iterative solution at the core of FCSS is linear in the observation length and hence significantly faster than the batch-mode optimization procedure used for constrained f-oopsi. Our comparisons suggest that the FCSS reconstruction is at least 3 times faster than f-oopsi for moderate data sizes of the order of tens of minutes. Moreover, the vector formulation of FCSS allows for easy parallelization (without the need for GPU implementations), which allows simultaneous processing of ROI’s without losing speed. As a numerical example, the results of Figure 6–(b) took an average of 60–70 seconds for all the p = 108 ROI’s and T = 2000 frames on an Apple Macintosh desktop computer. Second, using only about two-thirds of the measurements achieves similar results by FCSS as using the full measurements.

C. Application to Sleep Spindle Detection

In this section we use compressible state-space models in order to model and detect sleep spindles. A sleep spindle is a burst of oscillatory brain activity manifested in the EEG that occurs during stage 2 non-rapid eye movement (NREM) sleep. It consists of stereotypical 12–14 Hz wave packets that last for at least 0.5 seconds [44]. The spindles occur with a rate of 2–5% in time, which makes their generation an appropriate candidate for compressible dynamics. Therefore, we hypothesize that the spindles can be modeled using a combination of few echoes of the response of a second order compressible state-space model. As a result, the spindles can be decomposed as sums of modulated sine waves.

In order to model the oscillatory nature of the spindles, we consider a second order autoregressive (AR) model where the pole locations are given by $a e^{- j 2 π \frac{f}{f_{s}}}$ and $a e^{+ j 2 π \frac{f}{f_{s}}}$ , where 0 < a < 1 is a positive constant controlling the effective duration of the impulse response, f_s is the sampling frequency and f is a putative frequency accounting for the dominant spindle frequency. The equivalent state-space model for which the MAP estimation admits the FCSS solution is therefore:

x_{t} = 2 a cos (2 π \frac{f}{f_{s}}) x_{t - 1} - a^{2} x_{t - 2} + w_{t}, y_{t} = A_{t} x_{t} + v_{t} .

(12)

Note that impulse response of the state-space dynamics is given by $h_{n} = a^{n} cos (2 π \frac{f}{f_{s}} n) u_{n}$ , which is a stereotypical decaying sinusoid. By defining the augmented state ${\tilde{x}}_{t} = {[x_{t}^{'}, x_{t - 1}^{'}]}^{'}$ , Eq. (12) can be expressed in the canonical form:

{\tilde{x}}_{t} = \tilde{Θ} {\tilde{x}}_{t - 1} + {\tilde{w}}_{t}, y_{t} = {\tilde{A}}_{t} {\tilde{x}}_{t} + v_{t},

(13)

where ${\tilde{w}}_{t} : = {[w_{t}^{'}, 0^{'}]}^{'}, {\tilde{A}}_{t} = [\begin{array}{l} A_{t} & 0 \end{array}]$ and

\tilde{Θ} = [\begin{array}{c} 2 a cos (2 π \frac{f}{f_{s}}) I & - a^{2} I \\ I & 0 \end{array}] .

Eq. (10) can be used to update Θ̃ in the M step of the FCSS algorithm. However, Θ̃ has a specific structure in this case, determined by a and f, which needs to be taken into account in the optimization step. Let $ϕ = : 2 a cos (2 π \frac{f}{f_{s}})$ and ψ = a², and let

\sum_{t = 1}^{T} {\tilde{W}}_{t}^{(l)} (x_{t - 1 ∣ T}^{(l, m + 1)} {\tilde{x}}^{'}_{t - 1 ∣ T}^{(l, m + 1)} + {\sum^{\sim}}_{t - 1 ∣ T}^{(l, m + 1)}) = : [\begin{array}{c} A & B \\ C & D \end{array}], \sum_{t = 1}^{T} {\tilde{W}}_{t}^{(l)} (x_{t - 1 ∣ T}^{(l, m + 1)} {\tilde{x}}^{'}_{t ∣ T}^{(l, m + 1)} + {\tilde{x}}_{t ∣ T}^{(l, m + 1)} {\tilde{x}}^{'}_{t - 1 ∣ T}^{(l, m + 1)} + 2 {\sum^{\sim}}_{t - 1, t ∣ T}^{(l, m + 1)}) = : [\begin{array}{c} E & F \\ G & H \end{array}] .

Then, Eq. (10) is equivalent to maximizing

\underset{ϕ, ψ}{maximize} \frac{λ}{2} (ϕ^{2} Tr (A) - ϕ ψ Tr (B + C) + ψ^{2} Tr (D) + Tr (A)) - \frac{λ}{2} (ϕ Tr (E) - ψ Tr (G) + Tr (F)) .

(14)

subject to 0 ≤ ψ ≤ 1 and ϕ² ≤ 4ψ, which can be solved using interior point methods. In our implementation, we have imposed additional priors of f ~ Uniform(12, 14) Hz and a ~ Uniform(0.95, 0.99), which simplifies the constraints on ϕ and ψ to

{0.95}^{2} \leq ψ \leq {0.99}^{2}, and 4 ψ {cos}^{2} (2 π \frac{14}{f_{s}}) \leq ϕ^{2} \leq 4 ψ {cos}^{2} (2 π \frac{12}{f_{s}}) .

Given the convexity of the cost function in (14), one can conclude from the KKT conditions that if the global minimum is not achieved inside the region of interest it must be achieved on the boundaries.

Figure 9, top panel, shows two instances of simulated spindles (black traces) with parameters f_s = 200 Hz, f = 13 Hz and a = 0.95, with the ground truth events generating the wave packets shown in red. The middle panel shows the noisy version of the data with an SNR of − 7.5 dB. The noise was chosen as white Gaussian noise plus slowly varying (2 Hz) oscillations to resemble the slow oscillations in real EEG data. As can be observed the simulated signal exhibits visual resemblance to real spindles, which verifies the hypothesis that spindles can be decomposed into few combinations of wave packets generated by a second order AR model. The third panel, shows the denoised data using FCSS, which not only is successful in detecting the the ground-truth dynamics (red bars), but also significantly denoises the data.

Fig. 9 — Performance of FCSS on simulated spindles. Top: simulated clean data (black) and ground-truth spindle events (red), Middle: simulated noisy data, Bottom: the denoised signal (black) and deconvolved spindle events (red). The FCSS estimates are significantly denoised and closely match the ground-truth data shown in the top panel.

We next apply FCSS to real EEG recordings from stage-2 NREM sleep. Manual scoring of sleep spindles can be very time-consuming, and achieving accurate manual scoring on a long-term recording is a highly demanding task with the associated risk of decreased diagnosis. Although automatic spindle detection would be attractive, most available algorithms sensitive to variations in spindle amplitude and frequency that occur between both subjects and derivations, reducing their effectiveness [45], [46]. Moreover most of these algorithms require significant pre- and post-processing and manual tuning. Examples include algorithms based on Empirical Mode Decomposition (EMD) [47], [48], [49], data-driven Bayesian methods [50], and machine learning approaches [51], [52]. Unlike our approach, none of the existing methods consider modeling the generative dynamics of spindles, as transient sparse events in time, in the detection procedure.

The data used in our analysis is part of the recordings in the DREAMS project [53], recorded using a 32-channel polysomnograpgh. We have used the EEG channels in our analysis. The data was recorded at a rate of f_s = 200 Hz for 30 minutes. The data was scored for sleep spindles independently by two experts. We have used expert annotations to separate regions which include spindles for visualization purposes. For comparison purposes, we use a bandpass filtered version of the data within 12–14 Hz, which is the basis of several spindle detection algorithms [46], [54], [55], hallmarked by the widely-used bandpass filtered root-mean-square (RMS) method [46].

Figure 10 shows the detection results along with the bandpass filtered version of the data for two of the EEG channels. The red bars show the expert markings of the onset and offset of the spindle events. The FCSS simultaneously captures the spindle events and suppressed the activity elsewhere, whereas the bandpass filtered data produces significant activity in the 12–14 Hz throughout the observation window, resulting in high false positives. To quantify this observation, we have computed the ROC curves of the FCSS and bandpass filtering followed by root mean square (RMS) computation in Figure 11, which confirms the superior performance of the FCSS algorithm over the data set. The annotations of one of the experts has been used for as the ground truth benchmark.

Fig. 10 — Performance comparison between FCSS and band-pass filtered EEG data. Left and right panels correspond to two selected electrodes labeled as 1 and 2, respectively. The orange blocks show the extent of the detected spindles by the expert. Top: raw EEG data, Middle: band-pass filtered EEG data in the 12–14 Hz band, Bottom: FCSS spindle estimates. The FCSS estimates closely match the expert annotations, while the band-pass filtered data contains significant signal components outside of the orange blocks.

Fig. 11 — The ROC curves for FCSS (solid red) and bandpass-filtered RMS (dashed black). The FCSS outperforms the widely-used band-pass filtered RMS method as indicated by the ROC curves.

IV. Discussion

In this section, we discuss the implication of our techniques in regard to the application domains as well as existing methods.

A. Connection to existing literature in sparse estimation

Contrary to the traditional compressive sensing, our linear measurement operator does not satisfy the RIP [21], despite the fact that A_t’s satisfy the RIP. Nevertheless, we have extended the near-optimal recovery guarantees of CS to our compressible state-space estimation problem via Theorem 1. Closely related problems to our setup are the super-resolution and sparse spike deconvolution problems [56], [57], in which abrupt changes with minimum separation in time are resolved in fine scales using coarse (lowpass filtered) frequency information, which is akin to working in the compressive regime.

Theoretical guarantees of CS require the number of measurements to be roughly proportional to the sparsity level for stable recovery [38]. These results do not readily generalize to the cases where the sparsity lies in the dynamics, not the states per se. Most of the dynamic compressive sensing techniques such as Kalman filtered compressed sensing, assume partial information about the support or estimate them in a greedy and often ad-hoc fashion [16], [17], [18], [19], [20]. As another example, the problem of recovering discrete signals which are approximately sparse in their gradients using compressive measurements, has been studied in the literature using Total Variation (TV) minimization techniques [58], [31]. For one-dimensional signals, since the gradient operator is not orthonormal, the Frobenius operator norm of its inverse grows linearly with the discretization level [58]. Therefore, stability results of TV minimization scale poorly with respect to discretization level. In higher dimensions, however, the fast decay of Haar coefficients allow for near-optimal theoretical guarantees [59]. A major difference of our setup with those of CS for TV-minimization is the structured and causal measurements, which unlike the non-causal measurements in [58], do not result in an overall measurement matrix satisfying RIP. We have considered dynamics with convergent transition matrices, in order to generalize the TV minimization approach. To this end, we showed that using the state-space dynamics one can infer temporally global information from local and causal measurements. Another closely related problem is the fused lasso [60] in which sparsity is promoted both on the covariates and their differences.

B. Application to calcium deconvolution

In addition to scalability and the ability to detect abrupt transitions in the states governed by discrete events in time (i.e., spikes), our method provides several other benefits compared to other spike deconvolution methods based on state-space models, such as the constrained f-oopsi algorithm. First, our sampling-complexity trade-offs are known to be optimal from the theory of compressive sensing, whereas no performance guarantee exists for constrained f-oopsi. Second, we are able to construct precise confidence intervals on the estimated states, whereas constrained f-oopsi does not produce confidence intervals over the detected spikes. A direct consequence of these confidence intervals is estimation of spikes with high fidelity and low false alarm. Third, our comparisons suggest that the FCSS reconstruction is at least 3 times faster than f-oopsi for moderate data sizes of the order of tens of minutes. Finally, our results corroborate the possibility of using compressive measurement for reconstruction and denoising of calcium traces. From a practical point of view, a compressive calcium imaging setup can lead to higher scanning rate as well as better reconstructions, which allows monitoring of larger neuronal populations [61]. Due to the structured nature of our sampling and reconstruction schemes, we can avoid prohibitive storage problems and benefit from parallel implementations.

C. Application to sleep spindle detection

Another novel application of our modeling and estimating framework is to case sleep spindle generation as a second-order dynamical system governed by compressive innovations, for which FCSS can be efficiently used to denoise and detect the spindle events. Our modeling framework suggest that spectrotemporal spindle dynamics cannot be fully captured by just pure sinusoids via bandpass filtering, as the data consistently contains significant 12–14 Hz oscillations almost everywhere (See Figure 10). Therefore, using the bandpass filtered data for further analysis purposes clearly degrades the performance of the resulting spindle detection and scoring algorithms. The FCSS provides a robust alternative to bandpass filtering in the form of model-based denoising.

In contrast to state-of-the-art methods for spindle detection, our spindle detection procedure requires minimal pre- and post-processing steps. We expect similar properties for higher order AR dynamics, which form a useful generalization of our methods for deconvolution of other transient neural signals. In particular, K-complexes during the stage 2 NREM sleep form another class of transient signals with high variability. A potential generalization of our method using higher order models can be developed for simultaneous detection of K-complexes and spindles.

V. Conclusion

In this paper, we considered estimation of compressible state-space models, where the state innovations consist of compressible discrete events. For dynamics with convergent state transition dynamics, using theory of compressed sensing we provided an optimal error bound and stability guarantees for the dynamic ℓ₁-regularization algorithm which is akin to the MAP estimator for a Laplace state-space model. We also developed a fast and low-complexity iterative algorithm, namely FCSS, for estimation of the states as well as their transition matrix. We further verified the validity of our theoretical results through simulation studies as well as application to spike deconvolution from calcium traces and detection of sleep spindles from EEG data. Our methodology has two unique major advantages: first, we have proven theoretically why our algorithm performs well, and characterized its error performance. Second, we have developed a fast algorithm, with guaranteed convergence to a solution of the deconvolution problem, which for instance, is ~ 3 times faster than the widely-used f-oopsi algorithm in calcium deconvolution applications.

While we focused on two specific application domains, our modeling and estimation techniques can be generalized to apply to broader classes of signal deconvolution problems: we have provided a framework to model transient phenomena which are driven by sparse generators in time domain, and whose event onsets are of importance. Examples include heart beat dynamics and rapid changes in the covariance structure of neural data (e.g., epileptic seizures). In the spirit of easing reproducibility, we have made a MATLAB implementation of our algorithm publicly available [28].

Supplementary Material

NIHMS914825-supplement-1.pdf^{(182.2KB, pdf)}

Acknowledgments

This work was supported in part by the National Science Foundation Award No. 1552946 and the National Institutes of Health Awards No. R01-DC009607 and U01-NS090569.

VII. Appendix: Proof of Theorem 1

The main idea behind the proof is establishing appropriate cone and tube constraints [30]. In order to avoid unnecessary complications we assume s₁ ≫ s₂ = ⋯ = s_T and n₁ ≫ n₂ = ⋯ = n_T. Let x̂_t = x_t + g_t, t ∈ [T] be an arbitrary solution to the primal form (3). We define z_t = x_t − θx_t₋₁ and h_t = z_t − ẑ_t for t ∈ [T]. For a positive integer p, let [p] := {1, 2, ⋯, p}. For an arbitrary set V ⊂ [p], x_V denotes the vector x restricted to the indices in V, i.e. all the components outside of V set to zero. We can decompose ${(h_{t})}_{S_{t}^{c}} = {(h_{t})}_{I_{1}} + {(h_{t})}_{I_{2}} + \dots + {(h_{t})}_{I_{r_{t}}}$ , where r_t = ⌊p/4s_t⌋, and (h_t)_I₁ is the 4s_t-sparse vector corresponding to 4s_t largest-magnitude entries remaining in ${(h_{t})}_{S_{t}^{c}}$ , (h_t)_I₂ is the 4s_t-sparse vector corresponding to 4s largest-magnitude entries remaining in ${(h_{t})}_{S_{t}^{c}} - {(h_{t})}_{I_{1}}$ and so on. By the optimality of ${({\hat{z}}_{t})}_{t = 1}^{T}$ we have

\sum_{t = 1}^{T} \frac{{‖ z_{t} + h_{t} ‖}_{1}}{\sqrt{s_{t}}} \leq \sum_{t = 1}^{T} \frac{{‖ z_{t} ‖}_{1}}{\sqrt{s_{t}}} .

Using several instances of triangle inequality we have

\sum_{t = 1}^{T} \frac{- σ_{s_{t}} (z_{t}) + {‖ {(h_{t})}_{S_{t}^{c}} ‖}_{1} - {‖ {(h_{t})}_{S_{t}} ‖}_{1} + {‖ {(z_{t})}_{S_{t}} ‖}_{1}}{\sqrt{s_{t}}} \leq \sum_{t = 1}^{T} \frac{{‖ z_{t} + h_{t} ‖}_{1}}{\sqrt{s_{t}}} \leq \sum_{t = 1}^{T} \frac{{‖ z_{t} ‖}_{1}}{\sqrt{s_{t}}} = \sum_{t = 1}^{T} \frac{{‖ {(z_{t})}_{S_{t}} ‖}_{1} + σ_{s_{t}} (z_{t})}{\sqrt{s_{t}}},

which after re-arrangement yields the cone condition given by

\sum_{t = 1}^{T} \frac{{‖ {(h_{t})}_{S_{t}^{c}} ‖}_{1}}{\sqrt{s_{t}}} \leq \sum_{t = 1}^{T} \frac{{‖ {(h_{t})}_{S_{t}} ‖}_{1} + {‖ {(z_{t})}_{S_{t}^{c}} ‖}_{1}}{\sqrt{s_{t}}} .

(15)

Also, by the definition of partitions ${(I_{j})}_{j = 1}^{r_{t}}$ we have

\sum_{t = 1}^{T} \sum_{j = 2}^{r_{t}} {‖ {(h_{t})}_{I_{j}} ‖}_{2} \leq \sum_{t = 1}^{T} \sum_{j = 2}^{r_{t}} 2 \sqrt{s_{t}} {‖ {(h_{t})}_{I_{j}} ‖}_{\infty} \leq \sum_{t = 1}^{T} \sum_{j = 2}^{r_{t}} \frac{{‖ {(h_{t})}_{I_{j - 1}} ‖}_{1}}{2 \sqrt{s_{t}}} = \sum_{t = 1}^{T} \frac{{‖ {(h_{t})}_{S_{t}^{c}} ‖}_{1}}{2 \sqrt{s_{t}}} \leq \sum_{t = 1}^{T} \frac{{‖ {(h_{t})}_{S_{t}} ‖}_{1} + σ_{s_{t}} (z_{t})}{2 \sqrt{s_{t}}} \leq \sum_{t = 1}^{T} \frac{{‖ {(h_{t})}_{S_{t}} ‖}_{2}}{2} + \frac{σ_{s_{t}} (z_{t})}{2 \sqrt{s_{t}}} .

(16)

Moreover, using the feasibility of both x_t and x̂_t we have the tube constraints

{‖ y_{1} - A_{1} x_{1} ‖}_{2} \leq ε \Rightarrow {‖ θ {(y_{1})}_{[n_{2}]} - θ A_{2} x_{1} ‖}_{2} \leq θ ε, {‖ y_{2} - A_{2} x_{2} ‖}_{2} \leq \sqrt{\frac{n_{2}}{n_{1}}} ε,

from which we conclude ||y₂ − θ (y₁)_[n₂] − A₂z₂||₂≤ (1+θ)ε. Similarly ||y₂ − θ (y₁)_[n₂] − A₂ẑ₂||₂≤ (1+θ)ε. Therefore the triangle inequality yields ||A₂h₂||₂≤ 2(1 + θ)ε. Similarly for all t ∈ [T]\{2}, we have the tighter bound

{‖ A_{t} h_{t} ‖}_{2} \leq 2 (1 + θ) \sqrt{\frac{n_{t}}{n_{1}}} ε,

(17)

which is a consequence of having fewer measurements for t ∈ [T]\{2}. In conjunction, (15), (16), and (17) yield

2 (1 + θ) (T + \sqrt{\frac{n_{1}}{n_{2}}} - 1) ε \geq {‖ A_{1} h_{1} ‖}_{2} + \sum_{t = 2}^{T} \sqrt{\frac{n_{1}}{n_{t}}} {‖ A_{t} h_{t} ‖}_{2} \geq \sum_{t = 1}^{T} {‖ {\tilde{A}}_{t} {(h_{t})}_{S_{t} \cup I_{1}} ‖}_{2} - \sum_{t = 1}^{T} \sum_{j = 2}^{r_{t}} {‖ {\tilde{A}}_{t} {(h_{t})}_{I_{j}} ‖}_{2} \geq \sqrt{1 - δ_{4 s}} \sum_{t = 1}^{T} {‖ {(h_{t})}_{S_{t} \cup I_{1}} ‖}_{2} - \frac{\sqrt{1 + δ_{4 s}}}{2} \sum_{t = 1}^{T} \sum_{j = 2}^{r_{t}} {‖ (h_{I_{j}, t}) ‖}_{2} \geq \sqrt{1 - δ_{4 s}} \sum_{t = 1}^{T} {‖ {(h_{t})}_{S_{t} \cup I_{1}} ‖}_{2} - \frac{\sqrt{1 + δ_{4 s}}}{2} \sum_{t = 1}^{T} {‖ {(h_{t})}_{S_{t} \cup I_{1}} ‖}_{2} + \frac{σ_{s_{t}} (z_{t})}{\sqrt{s_{t}}} \geq (\sqrt{1 - δ_{4 s}} - \sqrt{\frac{1 + δ_{4 s}}{2}}) \sum_{t = 1}^{T} {‖ {(h_{t})}_{S_{t} \cup I_{1}} ‖}_{2} - \frac{\sqrt{1 + δ_{4 s}}}{2} \sum_{t = 1}^{T} \frac{σ_{s_{t}} (z_{t})}{\sqrt{s_{t}}} .

Therefore after rearrangement for δ₄_s < 1/3

\sum_{t = 1}^{T} {‖ {(h_{t})}_{S_{t} \cup I_{1}} ‖}_{2} \leq 8.37 (1 + θ) (T + \sqrt{\frac{n_{1}}{n_{2}}} - 1) ε + \frac{5}{2} \sum_{t = 1}^{T} \frac{σ_{s_{t}} (z_{t})}{\sqrt{s_{t}}} .

Next, using (16) yields

\sum_{t = 1}^{T} {‖ h_{t} ‖}_{2} \leq \sum_{t = 1}^{T} \sum_{j = 2}^{r_{t}} {‖ {(h_{t})}_{I_{j}} ‖}_{2} + {‖ {(h_{t})}_{S_{t} \cup I_{1}} ‖}_{2} \leq 12.55 (T + \sqrt{\frac{n_{1}}{n_{2}}} - 1) ε + 3 \sum_{t = 1}^{T} \frac{σ_{s_{t}} (z_{t})}{\sqrt{s_{t}}} .

(18)

By definition we have h_t = g_t − θg_t₋₁ for t ∈ [T] with g₀ = 0. Therefore by induction we have $g_{t} = \sum_{j = 1}^{t} θ^{t - j} h_{j}$ or in matrix form

G : = [\begin{array}{l} g_{1} \\ g_{2} \\ ⋮ \\ g_{t} \end{array}] = \underset{A}{\underset{︸}{[\begin{array}{l} I & 0 & \dots & 0 \\ θ I & I & \dots & 0 \\ θ^{2} I & θ I & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ θ^{T - 1} I & θ^{T - 2} I & \dots & I \end{array}]}} [\begin{array}{l} h_{1} \\ h_{2} \\ ⋮ \\ h_{t} \end{array}] = : A H .

Using several instances of the triangle inequality we get:

\sum_{t = 1}^{T} {‖ g_{t} ‖}_{2} = \sum_{t = 1}^{T} {‖ \sum_{j = 1}^{t} θ^{t - j} h_{j} ‖}_{2} \leq \sum_{t = 1}^{T} \sum_{j = 1}^{t} θ^{t - j} {‖ h_{j} ‖}_{2} \leq \sum_{j = 1}^{T - 1} θ^{j} \sum_{t = 1}^{T} {‖ h_{t} ‖}_{2} = \frac{1 - θ^{T}}{1 - θ} \sum_{t = 1}^{T} {‖ h_{t} ‖}_{2},

which in conjunction with (18) completes the proof.

Footnotes

This work has been presented in part at the 2016 IEEE Global Conference on Signal and Information Processing [1].

Contributor Information

Abbas Kazemipour, Department of Electrical & Computer Engineering, University of Maryland (UMD), College Park, MD USA.

Ji Liu, Department of Biology, UMD, College Park, MD USA.

Krystyna Solarana, Department of Biology, UMD, College Park, MD USA.

Daniel A. Nagode, Department of Biology, UMD, College Park, MD USA

Patrick O. Kanold, Department of Biology, UMD, College Park, MD USA

Min Wu, Department of Electrical & Computer Engineering, University of Maryland (UMD), College Park, MD USA.

Behtash Babadi, Department of Electrical & Computer Engineering, University of Maryland (UMD), College Park, MD USA.

References

1.Kazemipour A, Liu J, Kanold P, Wu M, Babadi B. Efficient estimation of compressible state-space models with application to calcium signal deconvolution. IEEE Global Conference on Signal and Information Processing (GlobalSIP); 2016. available online: arXiv preprint arXiv:1610.06461. [Google Scholar]
2.Phillips JW, Leahy RM, Mosher JC. Meg-based imaging of focal neuronal current sources. Medical Imaging, IEEE Transactions on. 1997;16(3):338–348. doi: 10.1109/42.585768. [DOI] [PubMed] [Google Scholar]
3.Kolar M, Song L, Ahmed A, Xing EP. Estimating time-varying networks. The Annals of Applied Statistics. 2010:94–123. [Google Scholar]
4.Nunez PL, Cutillo BA. Neocortical dynamics and human EEG rhythms. Oxford University Press; USA: 1995. [Google Scholar]
5.Vogelstein JT, Packer AM, Machado TA, Sippy T, Babadi B, Yuste R, Paninski L. Fast nonnegative deconvolution for spike train inference from population calcium imaging. Journal of neurophysiology. 2010;104(6):3691–3704. doi: 10.1152/jn.01073.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Chang C, Glover GH. Time–frequency dynamics of resting-state brain connectivity measured with fmri. Neuroimage. 2010;50(1):81–98. doi: 10.1016/j.neuroimage.2009.12.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Jung H, Ye JC. Motion estimated and compensated compressed sensing dynamic magnetic resonance imaging: What we can learn from video compression techniques. International Journal of Imaging Systems and Technology. 2010;20(2):81–98. [Google Scholar]
8.Anderson B, Moore JB. Prentice-Hall Information and System Sciences Series. Vol. 1979. Englewood Cliffs: Prentice-Hall; 1979. Optimal filtering. [Google Scholar]
9.Haykin SS. Adaptive filter theory. Pearson Education India; 2008. [Google Scholar]
10.Kitagawa G. A self-organizing state-space model. Journal of the American Statistical Association. 1998:1203–1215. [Google Scholar]
11.Hürzeler M, Künsch HR. Monte carlo approximations for general state-space models. Journal of Computational and graphical Statistics. 1998;7(2):175–193. [Google Scholar]
12.Frühwirth-Schnatter S. Applied state space modelling of non-gaussian time series using integration-based kalman filtering. Statistics and Computing. 1994;4(4):259–269. [Google Scholar]
13.Frühwirth-Schnatter S. Data augmentation and dynamic linear models. Journal of time series analysis. 1994;15(2):183–202. [Google Scholar]
14.Knorr-Held L. Conditional prior proposals in dynamic models. Scandinavian Journal of Statistics. 1999;26(1):129–144. [Google Scholar]
15.Shephard N, Pitt MK. Likelihood analysis of non-gaussian measurement time series. Biometrika. 1997;84(3):653–667. [Google Scholar]
16.Vaswani N. Ls-cs-residual (ls-cs): compressive sensing on least squares residual. IEEE Transactions on Signal Processing. 2010;58(8):4108–4120. [Google Scholar]
17.Vaswani N. Kalman filtered compressed sensing. 2008 15th IEEE International Conference on Image Processing; IEEE; 2008. pp. 893–896. [Google Scholar]
18.Carmi A, Gurfil P, Kanevsky D. Methods for sparse signal recovery using kalman filtering with embedded pseudo-measurement norms and quasi-norms. IEEE Transactions on Signal Processing. 2010;58(4):2405–2409. [Google Scholar]
19.Ziniel J, Schniter P. Dynamic compressive sensing of time-varying signals via approximate message passing. IEEE transactions on signal processing. 2013;61(21):5270–5284. [Google Scholar]
20.Zhan J, Vaswani N. Time invariant error bounds for modified-cs-based sparse signal sequence recovery. IEEE Transactions on Information Theory. 2015;61(3):1389–1409. [Google Scholar]
21.Ba D, Babadi B, Purdon P, Brown E. Exact and stable recovery of sequences of signals with sparse increments via differential ℓ1- minimization. 2012 [Google Scholar]
22.Shumway RH, Stoffer DS. An approach to time series smoothing and forecasting using the em algorithm. Journal of time series analysis. 1982;3(4):253–264. [Google Scholar]
23.Fevrier IJ, Gelfand SB, Fitz MP. Reduced complexity decision feedback equalization for multipath channels with large delay spreads. Communications, IEEE Transactions on. 1999;47(6):927–937. [Google Scholar]
24.Cotter SF, Rao BD, Engan K, Kreutz-Delgado K. Sparse solutions to linear inverse problems with multiple measurement vectors. Signal Processing, IEEE Transactions on. 2005;53(7):2477–2488. [Google Scholar]
25.Obozinski G, Wainwright MJ, Jordan MI. Support union recovery in high-dimensional multivariate regression. The Annals of Statistics. 2011:1–47. [Google Scholar]
26.Zhang Z, Rao BD. Sparse signal recovery with temporally correlated source vectors using sparse bayesian learning. Selected Topics in Signal Processing, IEEE Journal of. 2011;5(5):912–926. [Google Scholar]
27.Davies ME, Eldar YC. Rank awareness in joint sparse recovery. Information Theory, IEEE Transactions on. 2012;58(2):1135–1146. [Google Scholar]
28.MATLAB implementation of the FCSS algorithm. 2016 Available on GitHub Repository: https://github.com/kaazemi/FCSS.
29.Candès EJ. Compressive sampling. Proceedings of the International Congress of Mathematicians Madrid; August 22–30, 2006; pp. 1433–1452. [Google Scholar]
30.Candès EJ, Wakin MB. An introduction to compressive sampling. IEEE Signal Processing Magazine. 2008;25(2):21–30. [Google Scholar]
31.Poon C. On the role of total variation in compressed sensing. SIAM Journal on Imaging Sciences. 2015;8(1):682–720. [Google Scholar]
32.Charles AS, Rozell CJ. Dynamic filtering of sparse signals using reweighted ℓ1. Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on; IEEE; 2013. pp. 6451–6455. [Google Scholar]
33.Ba D, Babadi B, Purdon PL, Brown EN. Convergence and stability of iteratively re-weighted least squares algorithms. IEEE Transactions on Signal Processing. 2014;62(1):183–195. doi: 10.1109/TSP.2013.2287685. [DOI] [PMC free article] [PubMed] [Google Scholar]
34.Gribonval R, Cevher V, Davies ME. Compressible distributions for high-dimensional statistics. IEEE Transactions on Information Theory. 2012;58(8):5016–5034. [Google Scholar]
35.Rauch HE, Striebel C, Tung F. Maximum likelihood estimates of linear dynamic systems. AIAA journal. 1965;3(8):1445–1450. [Google Scholar]
36.McLachlan G, Krishnan T. The EM algorithm and extensions. Vol. 382 John Wiley & Sons; 2007. [Google Scholar]
37.Mishra B, Meyer G, Bach F, Sepulchre R. Low-rank optimization with trace norm penalty. SIAM Journal on Optimization. 2013;23(4):2124–2149. [Google Scholar]
38.Negahban SN, Ravikumar P, Wainwright MJ, Yu B. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Statistical Science. 2012;27(4):538– 557. [Google Scholar]
39.Smetters D, Majewska A, Yuste R. Detecting action potentials in neuronal populations with calcium imaging. Methods. 1999;18(2):215–221. doi: 10.1006/meth.1999.0774. [DOI] [PubMed] [Google Scholar]
40.Stosiek C, Garaschuk O, Holthoff K, Konnerth A. In vivo two-photon calcium imaging of neuronal networks. Proceedings of the National Academy of Sciences. 2003;100(12):7319–7324. doi: 10.1073/pnas.1232232100. [DOI] [PMC free article] [PubMed] [Google Scholar]
41.Vogelstein JT, Watson BO, Packer AM, Yuste R, Jedynak B, Paninski L. Spike inference from calcium imaging using sequential monte carlo methods. Biophysical journal. 2009;97(2):636–655. doi: 10.1016/j.bpj.2008.08.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
42.Pnevmatikakis EA, Soudry D, Gao Y, Machado TA, Merel J, Pfau D, Reardon T, Mu Y, Lacefield C, Yang W, et al. Simultaneous denoising, deconvolution, and demixing of calcium imaging data. Neuron. 2016;89(2):285–299. doi: 10.1016/j.neuron.2015.11.037. [DOI] [PMC free article] [PubMed] [Google Scholar]
43.Van de Geer S, Bühlmann P, Ritov Y, Dezeure R, et al. On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics. 2014;42(3):1166–1202. [Google Scholar]
44.De Gennaro L, Ferrara M. Sleep spindles: an overview. Sleep medicine reviews. 2003;7(5):423–440. doi: 10.1053/smrv.2002.0252. [DOI] [PubMed] [Google Scholar]
45.Nonclercq A, Urbain C, Verheulpen D, Decaestecker C, Van Bogaert P, Peigneux P. Sleep spindle detection through amplitude–frequency normal modelling. Journal of neuroscience methods. 2013;214(2):192–203. doi: 10.1016/j.jneumeth.2013.01.015. [DOI] [PubMed] [Google Scholar]
46.Schimicek P, Zeitlhofer J, Anderer P, Saletu B. Automatic sleep-spindle detection procedure: aspects of reliability and validity. Clinical EEG and neuroscience. 1994;25(1):26–29. doi: 10.1177/155005949402500108. [DOI] [PubMed] [Google Scholar]
47.Yang Z, Yang L, Qi D. Wavelet Analysis and Applications. Springer; 2006. Detection of spindles in sleep eegs using a novel algorithm based on the hilbert-huang transform; pp. 543–559. [Google Scholar]
48.Fraiwan L, Lweesy K, Khasawneh N, Wenz H, Dickhaus H. Automated sleep stage identification system based on time–frequency analysis of a single eeg channel and random forest classifier. Computer methods and programs in biomedicine. 2012;108(1):10–19. doi: 10.1016/j.cmpb.2011.11.005. [DOI] [PubMed] [Google Scholar]
49.Causa L, Held CM, Causa J, Estévez PA, Perez CA, Chamorro R, Garrido M, Algarín C, Peirano P. Automated sleep-spindle detection in healthy children polysomnograms. IEEE Transactions on Biomedical Engineering. 2010;57(9):2135–2146. doi: 10.1109/TBME.2010.2052924. [DOI] [PubMed] [Google Scholar]
50.Babadi B, McKinney SM, Tarokh V, Ellenbogen JM. Diba: a data-driven bayesian algorithm for sleep spindle detection. IEEE Transactions on Biomedical Engineering. 2012;59(2):483–493. doi: 10.1109/TBME.2011.2175225. [DOI] [PubMed] [Google Scholar]
51.Duman F, Erdamar A, Erogul O, Telatar Z, Yetkin S. Efficient sleep spindle detection algorithm with decision tree. Expert Systems with Applications. 2009;36(6):9980–9985. [Google Scholar]
52.Ventouras EM, Monoyiou EA, Ktonas PY, Paparrigopoulos T, Dikeos DG, Uzunoglu NK, Soldatos CR. Sleep spindle detection using artificial neural networks trained with filtered time-domain eeg: a feasibility study. Computer methods and programs in biomedicine. 2005;78(3):191–207. doi: 10.1016/j.cmpb.2005.02.006. [DOI] [PubMed] [Google Scholar]
53.Devuyst S, Dutoit T, Stenuit P, Kerkhofs M. Automatic sleep spindles detection-overview and development of a standard proposal assessment method. 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society; IEEE; 2011. pp. 1713–1716. [DOI] [PubMed] [Google Scholar]
54.Clemens Z, Fabo D, Halasz P. Overnight verbal memory retention correlates with the number of sleep spindles. Neuroscience. 2005;132(2):529–535. doi: 10.1016/j.neuroscience.2005.01.011. [DOI] [PubMed] [Google Scholar]
55.Huupponen E, Gómez-Herrero G, Saastamoinen A, Värri A, Hasan J, Himanen SL. Development and comparison of four sleep spindle detection methods. Artificial intelligence in medicine. 2007;40(3):157–170. doi: 10.1016/j.artmed.2007.04.003. [DOI] [PubMed] [Google Scholar]
56.Candès EJ, Fernandez-Granda C. Towards a mathematical theory of super-resolution. Communications on Pure and Applied Mathematics. 2014;67(6):906–956. [Google Scholar]
57.Duval V, Peyré G. Exact support recovery for sparse spikes deconvolution. Foundations of Computational Mathematics. 2015;15(5):1315–1355. [Google Scholar]
58.Needell D, Ward R. Stable image reconstruction using total variation minimization. SIAM Journal on Imaging Sciences. 2013;6(2):1035–1058. [Google Scholar]
59.Cohen A, DeVore R, Petrushev P, Xu H. Nonlinear approximation and the space bv(ℝ2) American Journal of Mathematics. 1999:587–628. [Google Scholar]
60.Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2005;67(1):91–108. [Google Scholar]
61.Pnevmatikakis EA, Paninski L. Sparse nonnegative deconvolution for compressive calcium imaging: algorithms and phase transitions. Advances in Neural Information Processing Systems. 2013:1250–1258. [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

NIHMS914825-supplement-1.pdf^{(182.2KB, pdf)}

[R1] 1.Kazemipour A, Liu J, Kanold P, Wu M, Babadi B. Efficient estimation of compressible state-space models with application to calcium signal deconvolution. IEEE Global Conference on Signal and Information Processing (GlobalSIP); 2016. available online: arXiv preprint arXiv:1610.06461. [Google Scholar]

[R2] 2.Phillips JW, Leahy RM, Mosher JC. Meg-based imaging of focal neuronal current sources. Medical Imaging, IEEE Transactions on. 1997;16(3):338–348. doi: 10.1109/42.585768. [DOI] [PubMed] [Google Scholar]

[R3] 3.Kolar M, Song L, Ahmed A, Xing EP. Estimating time-varying networks. The Annals of Applied Statistics. 2010:94–123. [Google Scholar]

[R4] 4.Nunez PL, Cutillo BA. Neocortical dynamics and human EEG rhythms. Oxford University Press; USA: 1995. [Google Scholar]

[R5] 5.Vogelstein JT, Packer AM, Machado TA, Sippy T, Babadi B, Yuste R, Paninski L. Fast nonnegative deconvolution for spike train inference from population calcium imaging. Journal of neurophysiology. 2010;104(6):3691–3704. doi: 10.1152/jn.01073.2009. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] 6.Chang C, Glover GH. Time–frequency dynamics of resting-state brain connectivity measured with fmri. Neuroimage. 2010;50(1):81–98. doi: 10.1016/j.neuroimage.2009.12.011. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] 7.Jung H, Ye JC. Motion estimated and compensated compressed sensing dynamic magnetic resonance imaging: What we can learn from video compression techniques. International Journal of Imaging Systems and Technology. 2010;20(2):81–98. [Google Scholar]

[R8] 8.Anderson B, Moore JB. Prentice-Hall Information and System Sciences Series. Vol. 1979. Englewood Cliffs: Prentice-Hall; 1979. Optimal filtering. [Google Scholar]

[R9] 9.Haykin SS. Adaptive filter theory. Pearson Education India; 2008. [Google Scholar]

[R10] 10.Kitagawa G. A self-organizing state-space model. Journal of the American Statistical Association. 1998:1203–1215. [Google Scholar]

[R11] 11.Hürzeler M, Künsch HR. Monte carlo approximations for general state-space models. Journal of Computational and graphical Statistics. 1998;7(2):175–193. [Google Scholar]

[R12] 12.Frühwirth-Schnatter S. Applied state space modelling of non-gaussian time series using integration-based kalman filtering. Statistics and Computing. 1994;4(4):259–269. [Google Scholar]

[R13] 13.Frühwirth-Schnatter S. Data augmentation and dynamic linear models. Journal of time series analysis. 1994;15(2):183–202. [Google Scholar]

[R14] 14.Knorr-Held L. Conditional prior proposals in dynamic models. Scandinavian Journal of Statistics. 1999;26(1):129–144. [Google Scholar]

[R15] 15.Shephard N, Pitt MK. Likelihood analysis of non-gaussian measurement time series. Biometrika. 1997;84(3):653–667. [Google Scholar]

[R16] 16.Vaswani N. Ls-cs-residual (ls-cs): compressive sensing on least squares residual. IEEE Transactions on Signal Processing. 2010;58(8):4108–4120. [Google Scholar]

[R17] 17.Vaswani N. Kalman filtered compressed sensing. 2008 15th IEEE International Conference on Image Processing; IEEE; 2008. pp. 893–896. [Google Scholar]

[R18] 18.Carmi A, Gurfil P, Kanevsky D. Methods for sparse signal recovery using kalman filtering with embedded pseudo-measurement norms and quasi-norms. IEEE Transactions on Signal Processing. 2010;58(4):2405–2409. [Google Scholar]

[R19] 19.Ziniel J, Schniter P. Dynamic compressive sensing of time-varying signals via approximate message passing. IEEE transactions on signal processing. 2013;61(21):5270–5284. [Google Scholar]

[R20] 20.Zhan J, Vaswani N. Time invariant error bounds for modified-cs-based sparse signal sequence recovery. IEEE Transactions on Information Theory. 2015;61(3):1389–1409. [Google Scholar]

[R21] 21.Ba D, Babadi B, Purdon P, Brown E. Exact and stable recovery of sequences of signals with sparse increments via differential ℓ1- minimization. 2012 [Google Scholar]

[R22] 22.Shumway RH, Stoffer DS. An approach to time series smoothing and forecasting using the em algorithm. Journal of time series analysis. 1982;3(4):253–264. [Google Scholar]

[R23] 23.Fevrier IJ, Gelfand SB, Fitz MP. Reduced complexity decision feedback equalization for multipath channels with large delay spreads. Communications, IEEE Transactions on. 1999;47(6):927–937. [Google Scholar]

[R24] 24.Cotter SF, Rao BD, Engan K, Kreutz-Delgado K. Sparse solutions to linear inverse problems with multiple measurement vectors. Signal Processing, IEEE Transactions on. 2005;53(7):2477–2488. [Google Scholar]

[R25] 25.Obozinski G, Wainwright MJ, Jordan MI. Support union recovery in high-dimensional multivariate regression. The Annals of Statistics. 2011:1–47. [Google Scholar]

[R26] 26.Zhang Z, Rao BD. Sparse signal recovery with temporally correlated source vectors using sparse bayesian learning. Selected Topics in Signal Processing, IEEE Journal of. 2011;5(5):912–926. [Google Scholar]

[R27] 27.Davies ME, Eldar YC. Rank awareness in joint sparse recovery. Information Theory, IEEE Transactions on. 2012;58(2):1135–1146. [Google Scholar]

[R28] 28.MATLAB implementation of the FCSS algorithm. 2016 Available on GitHub Repository: https://github.com/kaazemi/FCSS.

[R29] 29.Candès EJ. Compressive sampling. Proceedings of the International Congress of Mathematicians Madrid; August 22–30, 2006; pp. 1433–1452. [Google Scholar]

[R30] 30.Candès EJ, Wakin MB. An introduction to compressive sampling. IEEE Signal Processing Magazine. 2008;25(2):21–30. [Google Scholar]

[R31] 31.Poon C. On the role of total variation in compressed sensing. SIAM Journal on Imaging Sciences. 2015;8(1):682–720. [Google Scholar]

[R32] 32.Charles AS, Rozell CJ. Dynamic filtering of sparse signals using reweighted ℓ1. Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on; IEEE; 2013. pp. 6451–6455. [Google Scholar]

[R33] 33.Ba D, Babadi B, Purdon PL, Brown EN. Convergence and stability of iteratively re-weighted least squares algorithms. IEEE Transactions on Signal Processing. 2014;62(1):183–195. doi: 10.1109/TSP.2013.2287685. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R34] 34.Gribonval R, Cevher V, Davies ME. Compressible distributions for high-dimensional statistics. IEEE Transactions on Information Theory. 2012;58(8):5016–5034. [Google Scholar]

[R35] 35.Rauch HE, Striebel C, Tung F. Maximum likelihood estimates of linear dynamic systems. AIAA journal. 1965;3(8):1445–1450. [Google Scholar]

[R36] 36.McLachlan G, Krishnan T. The EM algorithm and extensions. Vol. 382 John Wiley & Sons; 2007. [Google Scholar]

[R37] 37.Mishra B, Meyer G, Bach F, Sepulchre R. Low-rank optimization with trace norm penalty. SIAM Journal on Optimization. 2013;23(4):2124–2149. [Google Scholar]

[R38] 38.Negahban SN, Ravikumar P, Wainwright MJ, Yu B. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Statistical Science. 2012;27(4):538– 557. [Google Scholar]

[R39] 39.Smetters D, Majewska A, Yuste R. Detecting action potentials in neuronal populations with calcium imaging. Methods. 1999;18(2):215–221. doi: 10.1006/meth.1999.0774. [DOI] [PubMed] [Google Scholar]

[R40] 40.Stosiek C, Garaschuk O, Holthoff K, Konnerth A. In vivo two-photon calcium imaging of neuronal networks. Proceedings of the National Academy of Sciences. 2003;100(12):7319–7324. doi: 10.1073/pnas.1232232100. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R41] 41.Vogelstein JT, Watson BO, Packer AM, Yuste R, Jedynak B, Paninski L. Spike inference from calcium imaging using sequential monte carlo methods. Biophysical journal. 2009;97(2):636–655. doi: 10.1016/j.bpj.2008.08.005. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R42] 42.Pnevmatikakis EA, Soudry D, Gao Y, Machado TA, Merel J, Pfau D, Reardon T, Mu Y, Lacefield C, Yang W, et al. Simultaneous denoising, deconvolution, and demixing of calcium imaging data. Neuron. 2016;89(2):285–299. doi: 10.1016/j.neuron.2015.11.037. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R43] 43.Van de Geer S, Bühlmann P, Ritov Y, Dezeure R, et al. On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics. 2014;42(3):1166–1202. [Google Scholar]

[R44] 44.De Gennaro L, Ferrara M. Sleep spindles: an overview. Sleep medicine reviews. 2003;7(5):423–440. doi: 10.1053/smrv.2002.0252. [DOI] [PubMed] [Google Scholar]

[R45] 45.Nonclercq A, Urbain C, Verheulpen D, Decaestecker C, Van Bogaert P, Peigneux P. Sleep spindle detection through amplitude–frequency normal modelling. Journal of neuroscience methods. 2013;214(2):192–203. doi: 10.1016/j.jneumeth.2013.01.015. [DOI] [PubMed] [Google Scholar]

[R46] 46.Schimicek P, Zeitlhofer J, Anderer P, Saletu B. Automatic sleep-spindle detection procedure: aspects of reliability and validity. Clinical EEG and neuroscience. 1994;25(1):26–29. doi: 10.1177/155005949402500108. [DOI] [PubMed] [Google Scholar]

[R47] 47.Yang Z, Yang L, Qi D. Wavelet Analysis and Applications. Springer; 2006. Detection of spindles in sleep eegs using a novel algorithm based on the hilbert-huang transform; pp. 543–559. [Google Scholar]

[R48] 48.Fraiwan L, Lweesy K, Khasawneh N, Wenz H, Dickhaus H. Automated sleep stage identification system based on time–frequency analysis of a single eeg channel and random forest classifier. Computer methods and programs in biomedicine. 2012;108(1):10–19. doi: 10.1016/j.cmpb.2011.11.005. [DOI] [PubMed] [Google Scholar]

[R49] 49.Causa L, Held CM, Causa J, Estévez PA, Perez CA, Chamorro R, Garrido M, Algarín C, Peirano P. Automated sleep-spindle detection in healthy children polysomnograms. IEEE Transactions on Biomedical Engineering. 2010;57(9):2135–2146. doi: 10.1109/TBME.2010.2052924. [DOI] [PubMed] [Google Scholar]

[R50] 50.Babadi B, McKinney SM, Tarokh V, Ellenbogen JM. Diba: a data-driven bayesian algorithm for sleep spindle detection. IEEE Transactions on Biomedical Engineering. 2012;59(2):483–493. doi: 10.1109/TBME.2011.2175225. [DOI] [PubMed] [Google Scholar]

[R51] 51.Duman F, Erdamar A, Erogul O, Telatar Z, Yetkin S. Efficient sleep spindle detection algorithm with decision tree. Expert Systems with Applications. 2009;36(6):9980–9985. [Google Scholar]

[R52] 52.Ventouras EM, Monoyiou EA, Ktonas PY, Paparrigopoulos T, Dikeos DG, Uzunoglu NK, Soldatos CR. Sleep spindle detection using artificial neural networks trained with filtered time-domain eeg: a feasibility study. Computer methods and programs in biomedicine. 2005;78(3):191–207. doi: 10.1016/j.cmpb.2005.02.006. [DOI] [PubMed] [Google Scholar]

[R53] 53.Devuyst S, Dutoit T, Stenuit P, Kerkhofs M. Automatic sleep spindles detection-overview and development of a standard proposal assessment method. 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society; IEEE; 2011. pp. 1713–1716. [DOI] [PubMed] [Google Scholar]

[R54] 54.Clemens Z, Fabo D, Halasz P. Overnight verbal memory retention correlates with the number of sleep spindles. Neuroscience. 2005;132(2):529–535. doi: 10.1016/j.neuroscience.2005.01.011. [DOI] [PubMed] [Google Scholar]

[R55] 55.Huupponen E, Gómez-Herrero G, Saastamoinen A, Värri A, Hasan J, Himanen SL. Development and comparison of four sleep spindle detection methods. Artificial intelligence in medicine. 2007;40(3):157–170. doi: 10.1016/j.artmed.2007.04.003. [DOI] [PubMed] [Google Scholar]

[R56] 56.Candès EJ, Fernandez-Granda C. Towards a mathematical theory of super-resolution. Communications on Pure and Applied Mathematics. 2014;67(6):906–956. [Google Scholar]

[R57] 57.Duval V, Peyré G. Exact support recovery for sparse spikes deconvolution. Foundations of Computational Mathematics. 2015;15(5):1315–1355. [Google Scholar]

[R58] 58.Needell D, Ward R. Stable image reconstruction using total variation minimization. SIAM Journal on Imaging Sciences. 2013;6(2):1035–1058. [Google Scholar]

[R59] 59.Cohen A, DeVore R, Petrushev P, Xu H. Nonlinear approximation and the space bv(ℝ2) American Journal of Mathematics. 1999:587–628. [Google Scholar]

[R60] 60.Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2005;67(1):91–108. [Google Scholar]

[R61] 61.Pnevmatikakis EA, Paninski L. Sparse nonnegative deconvolution for compressive calcium imaging: algorithms and phase transitions. Advances in Neural Information Processing Systems. 2013:1250–1258. [Google Scholar]

PERMALINK

Fast and Stable Signal Deconvolution via Compressible State-Space Models

Abbas Kazemipour

Ji Liu

Krystyna Solarana

Daniel A Nagode

Patrick O Kanold

Min Wu

Behtash Babadi

Abstract

Objective

Methods

Results

Conclusion

Significance

I. Introduction

II. Methods

A. Problem Formulation and Theoretical Analysis

Theorem 1 (Stable Recovery of Activity in the Presence of Noise)

Proof Sketch

Remark 1

Remark 2

B. Fast Iterative Solution via the EM Algorithm

The outer EM loop of FCSS

The inner EM loop of FCSS

Algorithm 1.

Remark 1

Remark 2

III. Results

A. Application to Simulated Data

Fig. 1.

Fig. 2.

Fig. 3.

Fig. 4.

B. Application to Calcium Signal Deconvolution

Fig. 5.

Fig. 6.

Fig. 7.

Fig. 8.

C. Application to Sleep Spindle Detection

Fig. 9.

Fig. 10.

Fig. 11.

IV. Discussion

A. Connection to existing literature in sparse estimation

B. Application to calcium deconvolution

C. Application to sleep spindle detection

V. Conclusion

Supplementary Material

Acknowledgments

VII. Appendix: Proof of Theorem 1

Footnotes

Contributor Information

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases