Author manuscript; available in PMC: 2014 Jul 2.
Published in final edited form as: IEEE Trans Med Imaging. 2014 Mar 21;33(7):1422–1433. doi: 10.1109/TMI.2014.2313000

Synthetic Generation of Myocardial Blood-Oxygen-Level-Dependent MRI Time Series via Structural Sparse Decomposition Modeling

Cristian Rusu 1, Rita Morisi 2, Davide Boschetto 3, Rohan Dharmakumar 4, Sotirios A Tsaftaris 5
PMCID: PMC4079741  NIHMSID: NIHMS586005  PMID: 24691119

Abstract

This paper aims to identify approaches that generate appropriate synthetic data (computer generated) for Cardiac Phase-resolved Blood-Oxygen-Level-Dependent (CP–BOLD) MRI. CP–BOLD MRI is a new contrast agent- and stress-free approach for examining changes in myocardial oxygenation in response to coronary artery disease. However, since signal intensity changes are subtle, rapid visualization is not possible with the naked eye. Quantifying and visualizing the extent of disease relies on myocardial segmentation and registration, to isolate the myocardium and establish temporal correspondences, and on ischemia detection algorithms, to identify temporal differences in BOLD signal intensity patterns. If transmurality of the defect is of interest, pixel-level analysis is necessary and a higher precision in registration is required. Such precision is currently not available, affecting the design and performance of ischemia detection algorithms. In this work, to enable algorithmic developments of ischemia detection irrespective of registration accuracy, we propose an approach that generates synthetic pixel-level myocardial time series. We do this by (a) modeling the temporal changes in BOLD signal intensity based on sparse multi-component dictionary learning, whereby segmentally derived myocardial time series are extracted from canine experimental data to learn the model; and (b) demonstrating the resemblance between real and synthetic time series for validation purposes. We envision that the proposed approach has the capacity to accelerate development of tools for ischemia detection while markedly reducing experimental costs so that cardiac BOLD MRI can be rapidly translated into the clinical arena for the noninvasive assessment of ischemic heart disease.

Index Terms: Synthetic generators, Shift-invariance, Sparse Decomposition, Blood-oxygen-level-dependent, MRI, Heart

I. Introduction

Synthetic data (i.e., computer generated data of good resemblance to real data) are used extensively in medical imaging for the development, testing and validation of critical image analysis tasks. In cardiac imaging for example, synthetic data have been used to develop and validate 4-D segmentation [1] and nonrigid registration methods [2], [3]. In functional Magnetic Resonance Imaging (fMRI) of the brain, testing with synthetic data is common and has helped the broad adoption of certain analysis approaches [4]. The goal of this paper is to identify mechanisms that permit the generation of synthetic data for cardiac phase-resolved Blood-Oxygen-Level-Dependent (CP–BOLD) MRI. Using this mechanism, we further our understanding of CP–BOLD and we can generate synthetic data sets to build, test and validate algorithmic developments of ischemia detection.

CP–BOLD MRI is a new contrast agent- and stress-free imaging technique for the assessment of myocardial ischemia at rest [5], capturing simultaneously BOLD changes (≈10–20%) and wall motion in a single acquisition that can be viewed as a movie (cine BOLD) [6], [7]. Identifying ischemic territories at rest is important, since provocative stress can be contraindicated [8], carrying rare but serious risks [9]. This identification is possible due to differential variations in BOLD signal intensities between systole and diastole, as Fig. 1 illustrates, in a manner which is dependent on the patency of the coronary artery supplying the myocardial territories [5].

Fig. 1. Time series extracted from rest CP–BOLD MRI, with a single coronary stenosis, from several remote territories (left) and within the ischemic territory (right) of a single subject.

With the naked eye such differential variations are not visible thus requiring post-processing to visualize and quantify the extent of disease. When transmurality of the defect is of interest, pixel-level, rather than segmental-level, analysis can ensure spatial specificity [10]. This analysis entails the following sequence of steps: i) myocardial segmentation [11]–[16] and subsequent ii) registration [5], [17]–[20] to establish myocardial correspondence at the pixel-level across the cardiac cycle and extract time series of the BOLD variation for each myocardial pixel at each cardiac phase; and iii) an ischemia detection algorithm that is able to discern differences in time series originating from pixels belonging to affected or remote territories, and thus inform the clinician on the part of the myocardium affected by stenosis.

Precise myocardial segmentation and registration still remain a challenge, particularly in CP–BOLD imaging, due to the temporal variations of the BOLD contrast. The extracted time series can be noisy and affect our understanding of myocardial CP–BOLD effects. The remedy of segmental averaging in turn reduces the number of available time series. Unfortunately, the quality and number of available time series directly affect the design of ischemia detection algorithms. For example, the approach in [5] used only 6 myocardial segments and two cardiac phases to define a hypothesis-based biomarker. Because of these dependencies, developments in ischemia detection algorithms are limited, and automated pixel-level detection of ischemia at rest with CP–BOLD imaging is currently lacking.

Synthetic data, when used in lieu of experimental data, can relax such dependencies and allow for the direct development of ischemia detection tools. However, methods to generate such data for CP–BOLD MRI are lacking and current approaches for generating synthetic data in cardiac MRI cannot be readily deployed. For example, image-based methods [3], [21], [22] do not account for cardiac phase-dependent myocardial signal distribution which is necessary for CP–BOLD MRI. Even physics-based approaches [23]–[26] will also need to account for the cardiac phase-dependent BOLD effect. Currently, two general states (systole and diastole) [5] with two-pool exchange models [27] have been used but multi-state models (one for each cardiac phase of the cine acquisition in CP–BOLD) are not available. Thus, any of these avenues will indeed require the prior investigation of real CP–BOLD time series to understand how the BOLD effect evolves as a function of cardiac phase and in the presence of disease.

In this paper, to further our understanding of BOLD effects, we follow a different approach: we learn statistical models that describe CP–BOLD myocardial time series, on the basis of real experimental data from an animal model of ischemia. To alleviate issues with the accuracy of myocardial registration we use segmental-based time series to learn two models: one for baseline myocardial status and one to describe time series originating from ischemic territories. We then use these models to generate synthetic datasets whose characteristics we can alter by design. We can examine how different acquisition scenarios (acceleration factors and sampling resolution) and the effects of inaccuracies originating from imprecise registration affect algorithm performance, i.e., the correct identification of the disease status of each pixel. Thus, we bypass the steps of segmentation and registration and we can develop and validate ischemia detection algorithms directly in an unbiased fashion. This is important since in the context of BOLD only indirect approaches have been used thus far, relying either on invasive gold standards (such as microsphere-derived segmental perfusion) or on other imaging modalities [5], [10], [28], [29]. By using synthetic data we avoid unnecessary and costly experimentation (involving real subjects) without compromising the validity of the studies and we can accelerate the clinical translation of CP–BOLD.

Specifically, this paper is the first to take a rigorous approach to further our understanding of myocardial CP–BOLD effects using image processing and pattern recognition methods. A new data-driven learning framework builds a statistical model where the CP–BOLD effect is extracted and isolated, relying only on a few input time series. In turn, this allows us to build a synthetic time series generator. Furthermore, we account, using shift-invariant learning, for cases when time series obtained from various myocardial territories under baseline conditions appear slightly shifted (in time) [5], likely because of physiological differences between territories supplied by different arteries [30]. Central to our approach is a sparse multi-component dictionary based method that decomposes time series as linear combinations of template time series (which are also learned from the data) and their shifts. The method can be viewed as a combination of subspace identification and statistical modeling using Gaussian Mixture Models (GMMs) for each of the (low-dimensional) subspaces found. Since all learning models exhibit a trade-off between performance and complexity, we introduce a fitness measure that balances the internal dimensions of the proposed model with the representation performance. Since the final goal is to generate synthetic time series with good resemblance to real data we validate this via supervised classification, an approach not previously considered for this task. To show the benefit of having such generators, as proof of concept, we assess the efficacy of a method previously used in fMRI analysis [4] for unsupervised myocardial ischemia detection and compare it with the biomarker introduced in [5]. We evaluate their performance with regards to the number of time series used and their quality (less or more noisy). Unsupervised ischemia detection outperforms the approach in [5], opening the road to significant advances in algorithmic design.

The paper is organized as follows: Sec. II presents the methodology for learning, modeling, model selection and validation. Sec. III presents our experimental findings and Sec. IV discusses our findings and conclusions.

II. Materials and Methods

We propose a model which i) learns with few input data, ii) offers low representation error when approximating input time series (good representability) without iii) overfitting (i.e., low model complexity) while iv) extracting components and relative parameters that permit the interpretation of the CP–BOLD effect (interpretability) and v) the controllable generation of time series (controllability).

Given a set of MRI datasets, after necessary pre-processing (Fig. 2, block A) detailed in Sec. II-A, we obtain an input matrix Y containing a concatenation of time series (examples of which are shown in Fig. 1). Subsequently, in Sec. II-B, we propose a dictionary learning method to find a sparse approximation of the data based on the dictionary D and the sparse representations X, i.e., each time series (a column of Y) is decomposed as a linear combination of columns of D, weighted by the coefficients in a row of X. This sparse approximation can decompose the input space as a collection of subspaces of much lower dimension (Fig. 2, block B). Based on this approximation, we fit statistical sub-models (Gaussian Mixture Models) on the distribution of the coefficients of each subspace, i.e., the coefficient blocks of X (Fig. 2, block C). The outcome of this process, called Dictionary Based Modeling Algorithm (DBMA), is a model composed of the dictionary D, the statistical sub-models extracted from X, and its control parameters. Depending on the input data (baseline or ischemic), models are learned for each conditional state.

Fig. 2. Workflow of the proposed learning procedure.

A learned model (the dictionary D and the statistical sub-models) and a set of control parameters are given to the proposed generator, which produces synthetic CP–BOLD time series on request (Fig. 3).

Fig. 3. Proposed generator structure based on DBMA.

A. Experimental data and preprocessing

We use imaging data obtained from a canine experimental model [5] of early ischemia and reperfusion injury briefly described below. Canines (K = 10) had a surgically implanted hydraulically-activated occluder affecting the Left Anterior Descending (LAD) artery. While anesthetized and mechanically ventilated, the subjects were imaged using a clinical 1.5T MRI system during baseline conditions (i.e., without any occluder activation) and severe stenosis (> 90% LAD occlusion). As detailed in [5], CP–BOLD data were acquired mid-ventricle at rest with breath holding twice (pre and 20 minutes post occlusion) using a flow compensated steady-state free precession (SSFP) BOLD cine sequence [7] with parameters: field of view, 240×145 mm2; spatial resolution, 1.2×1.2×8 mm3; readout bandwidth, 930 Hz/pixel; flip angle, 70°; TR/echo time (TE), 6.2/3.1 ms; temporal resolution 37.2 ms; and 24 time frames. Late Gadolinium Enhancement (LGE) images were acquired within 20 minutes post occlusion (to rule out early infarction) and after 3 hours of occlusion and during reperfusion (to identify myocardial regions that succumbed to ischemic tissue damage), using the sequence described in [31] with parameters: spatial resolution, 1.3×1.3×8 mm3; TE/TR, 3.9/8.2 ms; TI, 200 to 220 ms; flip angle, 25°; and readout bandwidth, 140 Hz/pixel.

We pre-process BOLD stacks to obtain CP–BOLD time series for training and testing the models presented herein (see also Fig. 2, block A). For each subject k out of the total K, endo- and epicardial boundaries are manually traced in one image (in systole) and propagated automatically to all images of the cardiac cycle [32]. Subsequently, the myocardium is further segmented automatically into j = 1, …, J segments per image. For each segment the average intensity is recorded, resulting in average segmental BOLD SSFP signal intensity as a function of cardiac phase, i.e., a time series. All time series are spline interpolated and sampled at n equally spaced time points to create series of the same length across the study population.

Each jth myocardial segment of the CP–BOLD acquisition at 20 minutes post occlusion is identified as “ischemic” if tissue damage is observed on the LGE images acquired after 3 hours of occlusion and during reperfusion and as “baseline” otherwise. All segments of the CP–BOLD acquisition pre-occlusion are identified as “baseline”. Overall, N time series each of length n and of the same status are concatenated to form the input matrix Y ∈ ℝn×N.
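As an illustration of this pre-processing step, the following sketch (in Python, with hypothetical variable names; not the code used in the study) resamples each segmental time series to a common length n with cubic splines and stacks the results into the input matrix Y:

import numpy as np
from scipy.interpolate import CubicSpline

def build_input_matrix(series_list, n=28):
    """Spline-interpolate each segmental BOLD time series to n samples; return Y (n x N)."""
    columns = []
    for s in series_list:                        # s: 1-D array, one cardiac cycle
        t = np.linspace(0.0, 1.0, len(s))        # nominal cardiac phase of the original samples
        spline = CubicSpline(t, s)
        columns.append(spline(np.linspace(0.0, 1.0, n)))
    return np.column_stack(columns)              # Y in R^{n x N}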

B. MC–DLA: Multi-Component Dictionary Learning

Finding multivariate models (at the extreme, one variable for each of the n cardiac phases) when relying on few training data is difficult due to the known “curse of dimensionality” [33]. Furthermore, as Fig. 1 shows, CP–BOLD time series contain a structure which appears to shift through cardiac phase locations (i.e., the curves appear to be shifted versions of each other). Thus, the learned model should reveal (and retain) such structure to aid in the interpretation of the findings. Various ensemble methods, such as GMMs or random density forests [34], even when combined with dimensionality reduction techniques, are agnostic to the shift-invariant structure of the time series and as such do not permit the interpretation of the cardiac-phase dependence of the myocardial BOLD effect. On the other hand, due to the sparsity constraint and the robustness to noise [35], sparse decompositions have been shown to be crucial for the development of robust learning algorithms in high dimensions [36], [37], such as the time series modeling considered herein. Furthermore, sparse decompositions generally offer useful interpretations of the data [38] and, based on recent developments, can also handle shifts [39], [40].

Dictionary learning algorithms (DLAs) find a dictionary D and sparse representations X such that Y ≈ DX. Solving this problem involves the ℓ0 pseudo-norm ‖ · ‖0, the number of non-zero elements. The goal is to minimize, under sparsity constraints, the representation error ‖E‖F^2 = Σ_{i,j} e_{ij}^2, i.e., the squared Frobenius norm of E = Y − DX. However, general DLAs cannot account explicitly for a specific structure [41], as opposed to the recent shift-invariant DLA [40]. Combining the advantages of both, here we propose a multi-component dictionary learning method (Fig. 2).

Algorithm 1. MC–DLA: Multi-Component Dictionary Learning Algorithm.

Given the time series Y and dimensions p, m, sc and sd, design the dictionary D and the representations X such that ‖Y − DX‖F is reduced.

  • Initialization: Compute [C, X(C)] = C–DLA(Y, p, sc).

  • Iterations: For iter = 1, …, 100

    1. Compute [Dg, X(D)] = K–SVD(Y − CX(C), m, sd).

    2. Compute [C, X(C)] = C–DLA(Y − DgX(D), p, sc).

We formulate the multi-component dictionary learning as:

minimize over D = [C, Dg] and X:   ‖Y − C X(C) − Dg X(D)‖F
subject to   ‖xi(C)‖0 ≤ sc,   ‖xi(D)‖0 ≤ sd,   ‖dj‖2 = 1,   1 ≤ j ≤ p + m.    (1)

At the core of the proposed formulation is the circulant dictionary C ∈ ℝ^{n×p} with p allowed shifts (0 ≤ p ≤ n), where each column is a circularly down-shifted version of the first column c ∈ ℝ^n, referred to as the kernel. Using this kernel we account efficiently for the shift structure in the data. The kernel in baseline conditions, as we will subsequently see, corresponds to the characteristic CP–BOLD curve and overall should follow the cardiac phase-dependent variability illustrated in Fig. 1. Its data-driven recovery is a significant innovation of this work.

The general dictionary component Dg ∈ ℝ^{n×m} has m columns dj called atoms. Overall we denote D = [C, Dg] ∈ ℝ^{n×(p+m)} as the multi-component dictionary and X = [X(C); X(D)] ∈ ℝ^{(p+m)×N}, with each column i partitioned into xi(C) ∈ ℝ^p and xi(D) ∈ ℝ^m, 1 ≤ i ≤ N. Variables sc and sd impose the target sparsity for the circulant and the general dictionary coefficients, respectively. Restrictions on these parameters are naturally set according to the dimensions of each component: 0 ≤ sc ≤ p and 0 ≤ sd ≤ m. The total target sparsity is s = sc + sd. Dictionaries for which n < m are called overcomplete. Since we want to avoid overfitting our model, in this paper we consider dictionaries with m < n. Using the split proposed in (1), the two components are separated and efficient solvers are used to find them. This improves representation performance and interpretability.
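To make the structure of the circulant component concrete, here is a minimal sketch (Python; names are illustrative and the kernel c is assumed already learned and unit-norm) that builds the partial circulant dictionary C from a kernel:

import numpy as np

def partial_circulant(c, p):
    """Return C in R^{n x p}; column j is the kernel c circularly shifted down by j positions."""
    return np.column_stack([np.roll(c, j) for j in range(p)])

Each column of C is therefore the same waveform appearing at a different cardiac-phase offset, which is how the model captures shifted copies of the characteristic CP–BOLD curve.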

To solve (1) we alternately update X, Dg and C. Although alternate updating is a standard procedure in dictionary learning, where only Dg and X(D) are available [41], here we alternate over the three unknowns and use efficient solvers for each [40], [42]. We split the overall target sparsity s between the two components, namely sc = 1 and sd = s − 1.

To recover the general dictionary, the K–SVD algorithm [41] uses an alternating optimization procedure to update the unknowns (Dg, X(D)): keep Dg fixed and find X(D) using the Orthogonal Matching Pursuit (OMP) algorithm [43]; keep X(D) fixed and update Dg one atom at a time. The power of K–SVD lies in its atom-by-atom update rule, which simultaneously updates the corresponding coefficients in the sparse representations. The whole update procedure is based on a rank–1 approximation of an error term. We refer to the K–SVD algorithm as [Dg, X(D)] = K–SVD(Y, m, sd).

In order to account for the shift-invariant structure we use a recently developed state-of-the-art shift-invariant DLA, the Circulant Dictionary Learning Algorithm (C–DLA) [40]. This approach relies on the observation that circulant dictionaries explicitly construct atoms in all their possible shifts. The solution is based on several developments: the eigenvalue factorization of circulant matrices, the internal symmetries of the Fourier transform and the observation that a circulant dictionary C when used together with the OMP algorithm behaves as a shift-invariant system. The final result is obtained by solving a sequence of ⌈n/2⌉ complex least squares problems of dimension 1 × N involving the Fourier transforms of the rows of Y and X(C). Here we employ an extension of the C–DLA. Instead of designing a full circulant dictionary C we can create a partial circulant dictionary C ∈ ℝn×p that allows for only p consecutive shifts explicitly, out of the n total. We refer to C–DLA as [C, X(C)] = C–DLA(Y, p, sc).
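The kernel update at the heart of C–DLA can be sketched as follows for the full-circulant case p = n (a simplification of [40], not its exact implementation): with the coefficients X(C) held fixed, the circulant model becomes one independent complex least-squares fit per frequency after a Fourier transform along the time dimension.

import numpy as np

def update_kernel(Y, Xc):
    """Y: n x N data, Xc: n x N circulant coefficients (p = n). Returns a unit-norm kernel c."""
    Yf = np.fft.fft(Y, axis=0)        # FFT along the time dimension (each column is a series)
    Xf = np.fft.fft(Xc, axis=0)
    # per-frequency least squares: c_hat[f] minimizes ||Yf[f, :] - c_hat[f] * Xf[f, :]||
    c_hat = np.sum(Yf * np.conj(Xf), axis=1) / (np.sum(np.abs(Xf) ** 2, axis=1) + 1e-12)
    c = np.real(np.fft.ifft(c_hat))
    return c / np.linalg.norm(c)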

When put together, K–SVD and C–DLA constitute the proposed MC–DLA, outlined in Algorithm 1. We use a fixed number of iterations, primarily for clarity and because we observed rapid convergence.

We denote a call to the overall procedure as [D, X] = MC–DLA(Y, p, m, sc, sd). Note that MC–DLA(Y, 0, m, 0, sd) ≡ K–SVD(Y, m, sd) and MC–DLA(Y, p, 0, sc, 0) ≡ C–DLA(Y, p, sc). For simplicity, Y is implied and omitted when allowed.
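The alternation of Algorithm 1 can be summarized by the following hedged sketch; c_dla and ksvd stand for the C–DLA [40] and K–SVD [41] routines referenced above and are placeholders, not library functions.

import numpy as np

def mc_dla(Y, p, m, sc, sd, iters=100):
    """Return the multi-component dictionary D = [C, Dg] and representations X = [Xc; Xd]."""
    C, Xc = c_dla(Y, p, sc)                  # initialization: fit the circulant part to Y
    for _ in range(iters):
        Dg, Xd = ksvd(Y - C @ Xc, m, sd)     # general atoms on the circulant residual
        C, Xc = c_dla(Y - Dg @ Xd, p, sc)    # circulant part on the general residual
    return np.hstack([C, Dg]), np.vstack([Xc, Xd])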

C. DBMA: The Dictionary Based Modeling Algorithm

Following the previously defined MC–DLA and the general framework presented in Fig. 2, this section describes an efficient sparse generative model of the data. The goal is to create models that allow for a very good representability of the training data, while they retain interpretability and are easy to integrate into a synthetic data generator. To this end, we use the previously defined dictionary and sparse decomposition methods to separate and allow easy access to the feature of interest (i.e., the shift-invariant component), while constructing a natural method to randomly sample from the union of subspaces created by the model (Fig. 2, block B). To achieve this, we statistically model the coefficients used for each subspace and create sub-models that capture a particular behavior in the data (Fig. 2, block C).

The atoms and the subspaces created by their combination (at most (m choose s) of them, with m ≥ s) offer deep insight into the geometric properties of the data, while providing robustness to noise, outliers and overfitting due to the sparsity prior. The method is especially effective in dealing with high dimensional data by projecting results onto lower dimensional subspaces that are easier to manage statistically and algorithmically.

Algorithm 2. DBMA: Dictionary Based Modeling Algorithm.

Given time series Y, create model ℳ = {ℳ1, …, ℳL} that accurately describes the input.

  1. Learn a Gaussian 𝒩(μm, σm) for the time series’ mean.

  2. Normalize time series and remove their mean in Y.

  3. Invoke [D, X] = MC–DLA (Y, p, m, 1, s − 1).

  4. Find all L subspaces composed of unique combinations of atoms from D using X.

  5. For each subspace ℓ = 1, …, L:

    1. Find the time series Yℓ ∈ ℝ^{n×Nℓ} that use Dℓ.

    2. Separate the coefficients Xℓ ∈ ℝ^{s×Nℓ} used by Dℓ.

    3. Using Xℓ, construct the GMM

      𝒢ℓ = Σ_{j=1}^{J} πj 𝒩(μj, Σj), where J is the number of mixtures in the GMM (chosen during training) while μj ∈ ℝ^s and Σj ∈ ℝ^{s×s} represent the mean vector and covariance matrix of each Gaussian, all created for sub-model ℓ.

    4. Estimate the sub-model ℳℓ importance Pℓ = Nℓ/N.

  6. Construct the error matrix E and model its contribution as additive white Gaussian noise 𝒩(με, σε).

Each sub-model is fully determined by its s atoms and an associated Gaussian Mixture Model (GMM). GMMs have become particularly useful in the modeling of data densities and have produced satisfactory results in a number of applications [44]. However, in large dimensions they are met with significant difficulty since the limited number of samples results in singular or poorly conditioned covariance matrix estimates, and thus some form of regularization is required [45]. This motivates our solution: separate the input space into a union of low-dimensional subspaces and statistically model each one.

Algorithm 2 shows the steps of the proposed modeling approach, which we call the Dictionary Based Modeling Algorithm (DBMA), with a visual description provided in Fig. 2. At the core of DBMA lies MC–DLA. Based on the inputs p, m and s we create the dictionary D and its sparse representation coefficients X. During the modeling steps the coefficient matrix X is used exclusively to group the input time series into clusters that lend themselves to easier statistical modeling. Based on X, the next step calls for the search of all subspaces (all unique combinations of s atoms actually used) obtained after the dictionary learning. Although the total number of subspaces can be very high, at most (m choose s−1)·p assuming m ≥ s − 1, we expect only a fraction of this number to be used because real-world data are highly correlated and structured. To create an effective learning mechanism we use the coefficients as training data to fit GMMs.

For each subspace Dℓ ∈ ℝ^{n×s} we use the reduced (without zeros) coefficient matrix Xℓ ∈ ℝ^{s×Nℓ}, where Nℓ denotes the number of time series attributed to sub-model ℓ, to create efficient and stable statistical models in the low-dimensional space of size s (note that s ≪ n ≪ N). Since for each Dℓ we know the corresponding atoms from the dictionary, after sampling the GMM we can generate synthetic output data; thus, each sub-model ℳℓ = (Dℓ, 𝒢ℓ) contains the subspace information and the statistical GMM associated with it. To estimate the importance of each sub-model, Pℓ, we compute the percentage of time series that belong to the subspace. Pℓ can be seen as a probability measure since Σ_{ℓ=1}^{L} Pℓ = 1. Finally, we account for the residual error E of the general model as additive Gaussian noise 𝒩(με, σε).
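A hedged sketch of steps 4–5 of Algorithm 2 (Python, using scikit-learn's GaussianMixture; the grouping-by-support logic and variable names are ours, chosen for illustration):

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_submodels(X, n_mixtures=2):
    """X: (p+m) x N sparse coefficients. Returns a list of (support, gmm, importance)."""
    N = X.shape[1]
    groups = {}
    for i in range(N):
        support = tuple(np.flatnonzero(X[:, i]))      # atoms used by time series i
        if support:                                   # skip all-zero columns
            groups.setdefault(support, []).append(i)
    submodels = []
    for support, idx in groups.items():
        coeffs = X[np.array(support)][:, idx].T       # N_l x s reduced coefficient matrix
        J = min(n_mixtures, len(idx))                 # cannot fit more mixtures than samples
        gmm = GaussianMixture(n_components=J, covariance_type='full').fit(coeffs)
        submodels.append((support, gmm, len(idx) / N))
    return submodels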

D. Model selection

We mentioned previously that the parameters of DBMA must be identified to balance crucial and competing properties: controllability and interpretability against model complexity and representation capabilities. More degrees of freedom (i.e., more atoms and larger target sparsity) lead to lower representation error, but they increase model complexity, the number of subspaces and statistical sub-models, and risk overfitting. Thus, we should avoid overly complex solutions where each sub-model may reflect idiosyncrasies of the data and not valuable key characteristics. Since we are dealing with different types of components (the m generic and the p circulant atoms), in order to provide a fair comparison we consider as the total degrees of freedom the maximum number of subspaces the model can reach: (m choose s−1)·p in this particular case of MC–DLA.

We propose a measure that can evaluate dimension selection, deviating from traditional performance bounds in sparse decomposition and overcomplete dictionaries. When evaluating the performance of overcomplete dictionaries, in general, two obvious indicators are widely used: the representation performance [41] and the mutual coherence of the dictionary μ(D) that directly affects the successful recovery rate of sparse approximation algorithms [46]. Both approaches focus on the computation of incoherent dictionary atoms during the training phase [42], [47], while in our case the kernel and its shifted versions may be correlated. Other statistical model selection methods, such as the Akaike or Bayesian information criteria or Minimum description length [48], which aim to balance model likelihood and dimensions, do not fit the approach proposed in this paper since many subspaces are created, each with varying dimension and importance.

Given a model ℳ, based on MC–DLA(p, m, 1, s − 1) with L sub-models, we introduce the model fitness measure:

F(ℳ) = ε · L · (m choose s−1) · p,    (2)

that not only balances the relative representation capability:

ε = ‖Y − DX‖F / ‖Y‖F,    (3)

but takes into account also the full available search space (the combinatorial term). The number of sub-models L, relative to the total number of possible sub-models, affects directly the computational complexity of the model.

We are interested in finding the model corresponding to a minimum value of the fitness, i.e., low relative representation error and low model dimension. Notice that F(ℳ) is dimensionless and nonnegative, with a minimum value of zero achieved when the representation error is exactly zero. When m = N a trivial solution exists (Dg = Y, X(D) = IN) and perfect representation is achieved with target sparsity one. Of course, here we focus on solutions where m ≪ N.
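For concreteness, the fitness measure amounts to the following computation (a sketch following our reading of (2)–(3); function and variable names are ours):

from math import comb
import numpy as np

def fitness(Y, D, X, L, m, s, p):
    """F = eps * L * (m choose s-1) * p, with eps the relative representation error of (3)."""
    eps = np.linalg.norm(Y - D @ X, 'fro') / np.linalg.norm(Y, 'fro')
    return eps * L * comb(m, s - 1) * p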

We will show that given certain conditions, there exist local minima of (2) with respect to the dictionary dimension m. These minima reflect the ideal balance between representation error and complexity. The choice of p will be discussed in Sec. III-B. Also, for simplicity, but without sacrificing generality, we assume the target sparsity s fixed.

Let us consider, for a given dataset, fixed sparsity s and shifts p, three models ℳ(i), i = −1, 0, 1, where each utilizes m + i general dictionary atoms, with representation errors ε(i) and L(i) sub-models. Assuming ℳ(0) is a local minimum:

F(ℳ(−1)) ≥ F(ℳ(0)) ≤ F(ℳ(1)),    (4)

using (2) and expanding the factorial terms, we reach:

ε(0) L(0) ≤ min{ ε(1) L(1) (m+1)/(m−s+2),  ε(−1) L(−1) (m−s+1)/m },    (5)

which shows how representation error and complexity relate for consecutive m. Recall that the problem dimensions obey 1 ≤ s ≤ m ≤ N while ε(i) ∈ [0, 1] and L(i) ∈ {1, …, N}.

Inequality (5) conveys information on the relative importance of representation error and complexity. For example, assuming for simplicity that L(0) ≈ L(1), we find that ε(0)·(m−s+2)/(m+1) ≤ ε(1) must be true in order for F(ℳ(0)) ≤ F(ℳ(1)) to hold. As m increases, this bounds the representation error improvement that has to occur for the new model with more atoms to be considered better according to (2).

Generally, we are interested in conditions under which the existence of minima, even local ones, is possible for m ≪ N. Result (5) is useful but involves knowing the values of the error and complexity. Although both clearly depend on m, good, generally proven formulations or estimates are not available. Motivated by findings of others [46] and by results in Sec. III, we study the dynamics of the proposed fitness measure assuming an exponential evolution of the representation error:

ε(i) = a·e^{b(m+i)},    (6)

with a, b ∈ ℝ, b < 0 and a > 0. Using these assumptions, writing the complexities L(1) = γ(1)L(0), L(−1) = γ(−1)L(0) and including (5), minimum points exist if:

γ(1)·e^{b}·(m+1)/(m−s+2) ≥ 1,   γ(−1)·e^{−b}·(m−s+1)/m ≥ 1.    (7)

In the general case, we can assume that the dynamics of the complexity follow γ(1) ≥ 1 and γ(−1) ≤ 1, i.e., the number of used subspaces increases with the increase in the total search space. These bounds on γ(1), γ(−1) are met when m ≤ ⌊(s − 1)(1 − e^b)^{−1} − 1⌋. The assumptions and bounds provided in this section fall into the region of interest, m ≪ N, and are not particularly restrictive. In Sec. III we support the choice made in (6) and show experimental results that validate F(ℳ).
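For completeness, our reading of how the bound on m follows from the first condition in (7): when γ(1) ≥ 1, that condition is implied by e^b(m+1)/(m−s+2) ≥ 1, and since b < 0 (so 1 − e^b > 0),

\[
e^{b}(m+1) \ge m-s+2
\;\Longleftrightarrow\;
m\,(1-e^{b}) \le s-2+e^{b}
\;\Longleftrightarrow\;
m \le \frac{s-1}{1-e^{b}} - 1 .
\]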

E. Sampling from the sparse generative model

As Fig. 3 illustrates, given a learned model ℳ, the dictionary D and a set of control parameters, generating synthetic time series can be done efficiently on request, since sampling from GMM distributions is well known. First, a sub-model (Dℓ, 𝒢ℓ) is selected and then synthetic time series are generated by drawing projection coefficients following their GMM distribution and using them as weights to linearly combine the atoms of the corresponding subspace. Concretely, to generate a synthetic dataset Ŷ ∈ ℝ^{n×N̂} the process is:

ŷi = Dℓ zi + ni + mi·1,   i = 1, …, N̂,    (8)

where zi ∈ ℝ^s, zi ~ 𝒢ℓ, ℓ is chosen randomly with probability Pℓ, ni ∈ ℝ^n, ni ~ 𝒩(με, σε), 1 ∈ ℝ^n is the all-ones vector and mi ~ 𝒩(μm, σm). The parameter set includes:

  1. N̂ –the number of output series to generate.

  2. The sampling method. To generate time series use: ALL –all of the available sub-models, each selected with probability Pℓ, or SINGLE –a single sub-model.

  3. Systolic to Diastolic Ratio (SDR) –the maximum BOLD contrast, as defined in [5]: the ratio between the maximal (end-systole) and last (diastole) values.

  4. 𝒩(με, σε) –the Gaussian model of the residual. Obtain a vector ni randomly sampled from this Gaussian.

  5. 𝒩(μm, σm) –the Gaussian model of the mean. Obtain a random sample mi and add it to each time series.

Varying these parameters can reflect many effects such as: different flip angles and magnet strengths (e.g., 1.5T or 3T) [6], accuracy of registration, noise level, resolution and others. While some parameters are modeled in a simple fashion (e.g., as additive Gaussian processes), others stem from the power of the proposed decomposition method to separate effects of interest (BOLD phase-dependent variability and shifts) from noise, adding the desired interpretability and controllability. This further permits the definition of a sampling methodology and the extraction of CP–BOLD related information (e.g., the SDR and the characteristic CP–BOLD curve itself).
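A hedged sketch of the sampling process in (8), reusing the sub-model list from the DBMA sketch above (variable names are illustrative):

import numpy as np

def generate(D, submodels, n_series, mu_eps, sigma_eps, mu_m, sigma_m, rng=None):
    """Draw n_series synthetic time series following (8); D is the learned dictionary."""
    rng = rng or np.random.default_rng()
    n = D.shape[0]
    probs = np.array([imp for _, _, imp in submodels])     # sub-model importances P_l
    out = np.empty((n, n_series))
    for i in range(n_series):
        support, gmm, _ = submodels[rng.choice(len(submodels), p=probs)]
        z, _ = gmm.sample(1)                               # coefficients z_i ~ G_l
        noise = rng.normal(mu_eps, sigma_eps, size=n)      # residual term n_i
        mean = rng.normal(mu_m, sigma_m)                   # mean offset m_i
        out[:, i] = D[:, np.array(support)] @ z.ravel() + noise + mean
    return out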

F. Validation of synthetic CP–BOLD myocardial time series

For the important task of validation, here we indirectly validate the resemblance between synthetic and real data following an application-driven approach: we show that synthetic and real data behave the same when used in a supervised learning context. We learn support vector machines (SVMs) [49] on the real data (binary classification into baseline or ischemia) and test them on several instances of synthetic data, and vice versa.

We utilize the dataset of real K subjects with 2KJ available time series, for which ground truth is available on the basis of LGE imaging (Sec. II-A). Keeping the baseline to ischemic time series overall ratio fixed, we use the synthetic generators to create γ sets of K “synthetic subjects”, for which ground truth is known by design. We choose an SVM with a radial basis function kernel with scaling factor σ = 1 and a soft margin ν = 1. We train it using each time series yi directly as an input vector without any feature extraction.

We consider the following training/testing combinations:

Cross validation accuracy on real data

We split the real subjects using a leave-one-subject-out strategy, training with K − 1 and testing on the remaining one. We measure accuracy as the percentage of correctly identified time series. The average (and standard deviation) of this cross-validation experiment serves as the benchmark performance on real data.

Cross validation accuracy on synthetic data

Similar to the above, for each of the γ sets we train the classifier on K − 1 synthetic subjects and test on the remaining one.

Classification accuracy of synthetic with classifier trained on real data

The SVM is trained with all real K subjects and is tested on each of the γ synthetic sets.

Classification accuracy of real data with classifier trained on synthetic data

We train the SVM picking one of the γ sets of K synthetic subjects. We test accuracy on the real data.

Accuracy is always the average over all γ sets. An accuracy similar in value across the four combinations, tested with two sample t–tests, would indicate that the classifier performs the same irrespective of training data origin.
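A sketch of the leave-one-subject-out evaluation used in the first two combinations, with scikit-learn's SVC standing in for the ν-SVM with RBF kernel described above (an assumption; the exact solver and parameters are not reproduced here):

import numpy as np
from sklearn.svm import SVC

def loso_accuracy(series, labels, subject_ids):
    """series: N x n matrix of time series, labels: 0/1 status, subject_ids: length-N array."""
    accuracies = []
    for k in np.unique(subject_ids):
        train, test = subject_ids != k, subject_ids == k
        clf = SVC(kernel='rbf', gamma=1.0).fit(series[train], labels[train])
        accuracies.append(clf.score(series[test], labels[test]))
    return np.mean(accuracies), np.std(accuracies)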

III. Results

We utilize imaging data from the same K = 10 canine subjects in controlled rest experiments. Overall, 20 CP–BOLD MRI stacks (2 per each subject) are available: 10 without and 10 with critical LAD stenosis. We use J = 24 segments, and interpolate to length n = 28. In total N = 480 segmental time series are available. Based on LGE imaging 3 hours post occlusion and during reperfusion, Ni = 80 are considered ischemic (i.e., the segments were affected by the stenosis) and Nb = 400 are baseline (i.e., we group together time series from baseline and from remote territories in rest-stenosis studies since they were found to behave the same).

We show, in sequence, results on representation accuracy, model selection, DBMA and its validation. Then, we use DBMA in a proof of concept experiment (Sec. III-D). We conclude by showing statistically the existence of shifts (Sec. III-E).

A. Model accuracy and representation capabilities

To evaluate the representation capabilities of the proposed method we use the metric defined in equation (3). As reference we include the Principal Component Analysis (PCA).

Referring to Table I, with respect to PCA, the simplest model (the circulant only) performs quite well, indicating a strong circulant structure in the data. By comparison, the general dictionary with unitary sparsity and 5 atoms offers worse performance. As the number of atoms increases, the general dictionary component outperforms the circulant component, indicating that there is a circulant structure but that it is localized. In fact, observing for MC–DLA(28, 0, 1, 0) which circulant atom (i.e., which shift of the kernel) is mostly used in reconstructing each time series, we can see the importance (energy content) of each shift, shown in Fig. 4. The energy content for each shift is computed as the sum of the squares of the representation coefficients where that shift has the largest correlation (in an OMP fashion, with target sparsity 1). The values are scaled to the total energy content, the squared Frobenius norm of the coefficient matrix. The grouped peaks and their wide spread allude to the presence of localized shift structure attributed to variability in end-systole (ES) location among subjects.

TABLE I.

Relative representation error ε (%) of various settings of MC–DLA(p, m, sc, sd), s = sc + sd.

MC–DLA setting        Target sparsity (s) / number of components
                       1      2      3      4      5      6
Baseline data
  Circulant only
  (4, 0, s, 0)        62.9   57.2   44.1   38.8    –      –
  (28, 0, s, 0)       50.6   38.3   36.2   25.5   20.1   17.0
  General only
  (0, 5, 0, s)        53.4   40.9   34.3   30.6   29.9    –
  (0, 10, 0, s)       47.3   35.0   27.7   22.3   18.6   16.6
  (0, 15, 0, s)       46.1   30.5   24.3   19.8   17.5   15.3
  Multi-component
  (4, 6, 1, s−1)        –    36.7   29.0   25.1   22.3   21.5
  (4, 7, 1, s−1)        –    35.7   27.9   22.8   20.1   18.5
  (4, 10, 1, s−1)       –    33.7   25.8   20.8   18.1   14.9
  (28, 15, 1, s−1)      –    28.7   21.3   17.7   14.7   12.5
  PCA                 69.9   56.1   43.9   36.2   29.9   25.0
Ischemic data
  General only
  (0, 6, 0, s)        50.9   38.0   29.6   25.0   24.1   23.8
  (0, 7, 0, s)        48.6   35.2   26.8   21.8   19.5   18.7
  (0, 8, 0, s)        46.7   29.9   24.5   20.1   17.6   16.0
  PCA                 78.6   57.5   43.6   35.5   29.1   23.8

Fig. 4. Global importance of each kernel shift found via the circulant-only MC–DLA(28, 0, 1, 0) using all baseline data.

While for baseline conditions we identify a strong underlying structure, we do not expect or impose such underlying structure in ischemic time series. Thus, we use only the general dictionary component of MC–DLA, to identify possible sub-models. Table I depicts representation errors and compares various MC–DLA settings with the PCA. Naturally, the representation error decreases as model complexity increases (model dimensions, p or m, and target sparsity s).

Our findings, and the atoms shown in Fig. 5, lend themselves to the following interpretation of the data: under baseline conditions time series contain a strong structure, the characteristic CP–BOLD curve, represented here by the circulant kernel and its shifts, plus a collection of additional components (attributed to subject variability, or to registration-related inaccuracies) modeled by the general dictionary atoms, which when linearly combined can alter this curve. The kernel, shown in Fig. 5, relates the intensity of the CP–BOLD signal to cardiac phase: the highest value is present at ≈33% of the nominal cardiac cycle. In healthy conditions, this is a physiologically expected location for end-systole [5], [7] and is in agreement with theoretical findings in [5] that the BOLD effect should be highest at end-systole.

Fig. 5. Dictionary components (atoms) estimated with MC–DLA(4, 6, 1, 3) from baseline time series: (left) the circulant kernel and its three shifts; (middle) the general dictionary atoms; (right) their relative importance.

Throughout this paper we pay attention to extracting the kernel and its shifts, since both are crucial to the interpretability and controllability of the proposed model. We can also use them to recover ES location variability between subjects. In our dataset, using the myocardial segmentations (Sec. II-A) and the standard identification of ES as the point of minimum blood volume in the ventricle, we find the location of ES at 37 ± 6%. Observing which shift of the kernel is used the most for each subject, combined with the peak’s location in the kernel, we estimate an average ES position at 37 ± 11%. A paired t–test comparing the two showed no significant difference (p–value = 1).

B. Model selection

Representation performance, albeit crucial, is not the only indicator we are interested in. For example, the model MC–DLA(4, 10, 1, 5) achieves a representation error of 14.9% while the general only model MC–DLA(0, 15, 0, 6) achieves a higher error of 15.3%. Recall that for the multi-component method one component from the total s is restricted to one of the four circulant atoms –compared to the general model that can access 15 independent atoms in any combination.

Since we are not only interested in good representations, we use the proposed fitness measure (2) to search for a more suitable model. For a fixed s = 4, increasing values of m and several values of p, we show in Fig. 6 the fitness of some models. Clearly, a local minimum is present at m = 6 for p = 4. As we hypothesized, the representation error decays exponentially with respect to m, as Fig. 7 shows. Following the error model in (6), we estimate ε(m) ≈ 0.21·e^{−0.009m}.

Fig. 6. Model fitness for various values of m and p (s = 4).

Fig. 7. Representation error achieved by the general dictionary component as m increases (shown on a logarithmic scale), after removal of the circulant contribution.

While in the methods section we focused on the selection of m, with respect to p several observations can be made. Fig. 6 for example shows how with varying p, the fitness value differs, and we could repeat the analysis of Sec. II-D with respect to s. Interestingly, we could identify a suitable value of p via the importance plot in Fig. 5. As p increases the importance for the newly added shift decreases. After a certain cut-off value the importance of the atoms in the general dictionary will outweigh the benefit of further considering shifts of the kernel. This heuristic provides very good results. For example, the p = 4 allowed shifts account for approximately 60% of the energy, showing the strong structure existing in the data. The first non-circulant atom is less important than any circulant one. These experimental verifications motivate the selection of MC–DLA(4, 6, 1, 3) as the model of the baseline time series.

Having demonstrated the representation capabilities of the proposed formulation, we are now interested in its modeling and generating capacity. We consider two DBMA models: one for baseline time series (using the composite learning method), found with MC–DLA(4, 6, 1, 3), and one for ischemic time series, which due to the lack of structure uses only general components found via MC–DLA(0, 7, 0, 6).

Fig. 8 shows sub-models of both learned models. For the 6 baseline sub-models, it is evident that each sub-model represents a different shift of the kernel and alterations to its shape. The percentage of utilization of each sub-model (Pℓ) is also shown. As the utilization decreases, the deviation from the characteristic CP–BOLD curve increases. The ischemic sub-models lack a consistent characteristic behavior.

Fig. 8. Representative sub-models found via MC–DLA(4, 6, 1, 3) for baseline (in blue) and MC–DLA(0, 7, 0, 6) for ischemia (in red). We show real time series belonging to each sub-model and their principal component found via the SVD (thicker line), with the utilization percent of each.

C. Validation and comparison with a PCA-derived model

Since DBMA is designed specifically for BOLD time series analysis and no reference method exists, direct comparisons with other approaches are difficult. General statistical modeling approaches, while they can still provide acceptable representation error (see the performance of PCA in Table I), do not necessarily provide insight into the structure of the data or extract the key expected feature: the characteristic CP–BOLD curve (see Fig. 5). Any model of sufficiently high complexity is expected to be capable of representing the input data with low representation error. Thus, one must take into account not only error but also parsimony (model size), interpretability and controllability in terms of the CP–BOLD effect.

We compare with a PCA-derived decomposition, which we refer to as PCA–GMM, and which uses a single model rather than several sub-models as in DBMA. With 6 principal components obtained from Y, using the truncated singular value decomposition Y ≈ U6 Σ6 V6^T, Σ6 ∈ ℝ^{6×6}, we learn a mixture of 10 Gaussian components for the coefficients Σ6 V6^T in the basis U6. Since the EM algorithm deployed to fit the mixture encounters convergence problems, the results presented are obtained after several runs with different initializations.
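A minimal sketch of this PCA–GMM reference (our reconstruction from the description above, not the original code):

import numpy as np
from sklearn.mixture import GaussianMixture

def pca_gmm_generator(Y, n_components=6, n_mixtures=10, n_out=24):
    """Fit a 6-dimensional PCA basis plus a single 10-component GMM; return n x n_out samples."""
    U, S, Vt = np.linalg.svd(Y, full_matrices=False)
    U6 = U[:, :n_components]
    coeffs = (np.diag(S[:n_components]) @ Vt[:n_components]).T   # N x 6 coefficients
    gmm = GaussianMixture(n_components=n_mixtures, covariance_type='full').fit(coeffs)
    samples, _ = gmm.sample(n_out)
    return U6 @ samples.T                                        # synthetic series, one per column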

The baseline DBMA generator uses the options: N̂b = 18, sampling method SINGLE, while the SDR and the Gaussian models of the residual and the mean are estimated by DBMA. The ischemic generator uses the options: N̂i = 6, sampling method ALL; the other parameters are estimated by DBMA. We create γ = 100 synthetic sets reflecting K = 10 synthetic subjects, with N̂ = 24 total time series. Thus, the created synthetic subjects have a similar composition to our real subjects. Fig. 9 shows, for visual comparison, baseline time series used for training (Y) and synthetic ones (Ŷ), equally numbered.

Fig. 9. Examples of real (left) and synthetic (right) time series.

As presented in Sec. II-F, we use a supervised approach for validating time series generated by DBMA, with four combinations of training and testing sets (results in Table II). In all cases an accuracy greater than 80% is observed. (Although this accuracy could increase with a better choice of features, this is beyond the scope of this paper.) The goal is to provide strong evidence that the synthetic data closely resemble the real ones. We find no statistically significant difference when comparing average accuracy irrespective of the origin of the training and testing data (t–tests, p–values > 0.5). This indicates that the classifiers behave comparably when trained on the real data and tested on the synthetic data, and vice versa. Thus, we show indirectly the equivalence between the real and synthetic data (i.e., they lie on the same manifold). The synthetic data generated by PCA–GMM perform similarly to the results presented in Table II, although the modeling approaches are different. This behavior is due to the significant variability between baseline and ischemic time series. However, when trained and tested on synthetic data, PCA–GMM achieves an undesirably higher accuracy (85 ± 2%), indicating that PCA–GMM produces synthetic sets with less variability than the real input data. Note that we are not in pursuit of high accuracy, but of replicating the results obtained with the real dataset.

TABLE II.

Accuracy (mean ± standard deviation) for the four setups of SVM-based validation of the synthetic generators.

Training set      Testing set
                  Real           Synthetic
Real              83 ± 6%        84 ± 0.7%
Synthetic         85 ± 0.5%      83 ± 3%

Next we evaluate the capability of the synthetic data in retaining the shift structure (and its importance function) of the input data. Given the input dataset and its importance function (see Fig. 4), using both methods (MC–DLA and PCA–GMM) we learn and generate several synthetic datasets containing N̂ = N time series. For each synthetic dataset we learn a kernel, extract its importance function, as described in Sec. III-A, and correlate it with that of the input dataset (i.e., Fig. 4). Over 200 realizations, MC–DLA obtains an average correlation of 92%, and PCA–GMM of 88%. The difference is statistically significant (p–value < 10^{−6}; t–test). Synthetic experiments, not shown for brevity, involving larger datasets with greater shift variability (in real data this would occur due to larger variance in ES position, possibly due to arrhythmia) and noise show that this performance gap increases in favor of MC–DLA, since it accounts explicitly for shift structure.

D. Synthetic data to evaluate unsupervised ischemia detection

As a proof of concept, we use synthetic time series obtained from the two generators as inputs, in lieu of real data, to compare the performance of Independent Component Analysis (ICA), an unsupervised pattern recognition method, with the S/D approach in [5] for ischemia detection. ICA is used extensively in fMRI for the detection of activation areas in a subject’s brain [4]: it identifies activated areas (i.e., variations in time series) by decomposing the observed fMRI time series in a linear combination of different spatial components. Drawing a parallel between fMRI and CP–BOLD, we explore ICA as an unsupervised method for separating ischemic from baseline CP–BOLD time series extracted from a “synthetic subject’s” data. Using synthetic data we know by design the ground truth, i.e., the originating class of each time series; thus, exact validation is possible, making comparisons between ICA and S/D fully quantitative. We show accuracy with respect to varying the number of time series (reflecting variations in spatial resolution), the percentage that are ischemic (reflecting variations in coronary patency) and finally the amount of artificial noise introduced (reflecting errors in registration or different acquisition scenarios, e.g., acceleration factors).

We assume an input observation matrix originating from a single “synthetic” subject. We use the same setup (same models) as before, and sample the baseline and ischemic synthetic generators, varying the composition with respect to the number of baseline (N̂b) and ischemic (N̂i) time series, choosing ratios N̂i/N̂ that reflect realistic situations of single-vessel disease (where N̂ = N̂i + N̂b). We follow [4] to obtain a single independent component (of zero mean), which carries information on the class of each time series. Since we are not interested in inference, we use a fixed threshold of 0 on the independent component’s values to label each time series and compare the labels to the ground truth.
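A hedged sketch of this labeling step (FastICA from scikit-learn is our choice of solver here, an assumption rather than the exact implementation of [4]; the mapping of the two sign classes to baseline/ischemia is resolved by the ground truth in these synthetic experiments):

import numpy as np
from sklearn.decomposition import FastICA

def ica_labels(Y_hat):
    """Y_hat: n x N_hat observation matrix. Returns a 0/1 label per time series."""
    ica = FastICA(n_components=1, random_state=0)
    component = ica.fit_transform(Y_hat.T).ravel()   # one independent-component value per series
    return (component > 0).astype(int)               # fixed threshold at 0; sign is arbitrary in ICA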

Table III summarizes our findings after generating 100 “synthetic” subjects for each composition. To demonstrate the benefit of utilizing all cardiac phases, in Table III we also compare, for all experimental settings, the accuracy of ICA and S/D [5]. S/D uses the ratio of the time series values at end-systole and diastole, and labels a time series as baseline when this ratio is over 1 and as ischemic otherwise. In a synthetic dataset, we take the position of diastole as the last point in the time series and estimate ES as the location of the maximum intensity value in the kernel (or shifted kernel) of the sub-model that generated the data. ICA outperforms S/D in all cases and exhibits less variability.

TABLE III.

Comparing accuracy (%) of ICA and the S/D biomarker [5] for different synthetic compositions.

N̂i/N̂       N̂ = 300            N̂ = 150            N̂ = 24
            ICA       S/D       ICA       S/D       ICA       S/D
2/5         93±5*     73±6      91±5*     73±8      85±7*     68±12
1/3         88±5*     75±6      87±6*     74±7      82±7*     72±12
1/4         80±5*     78±7      79±6      76±7      77±8      75±12

* denotes a statistically significant difference (p–value < 0.05) between ICA and S/D for the same choice of N̂ and N̂i.

To evaluate robustness to additive noise, we varied σε of the noise model 𝒩(με, σε), considering N̂ = 600 and N̂i = 200. Increasing σε from 0 to 16 results in a 6% drop in the performance of ICA (92% to 86%) while the drop of S/D is significantly higher, 11% (81% to 70%).

To put the obtained accuracy of ICA in perspective, when N̂i = 100 and N̂ = 300, its accuracy is 88 ± 5%. This value is close to the one obtained with the supervised SVM on real data, which was used for validation (Table II). Although here we used synthetic data, it illustrates that unsupervised identification of ischemia on the basis of time series extracted from myocardial pixels with CP–BOLD MRI is possible, and that the fMRI literature and methods, particularly those based on decompositions, can provide fruitful inspiration.

E. Investigation of shifts with circulant only MC—DLA

We previously saw that a characteristic CP–BOLD curve exists and that shifted versions of it are utilized throughout the dataset (Fig. 4). Such shifts may arise due to variations in end-systole (ES) location as discussed previously. In addition, the presence of shifts across different territories of the same subject is likely attributed to physiological differences among different arteries [5], [30]. Relying on the baseline dataset and MC–DLA(28, 0, 1, 0) we can construct a statistical procedure to test for their existence. Based on the learned circulant kernel c ∈ ℝ^n and MC–DLA(28, 0, 1, 0) we can obtain the importance of each shift i for each subject k, denoted as Ik(i). If no shifts existed between time series of the same subject, Ik(i) would be dominated by a single high-valued peak; thus, the lower the collective importance of the remaining peaks, the fewer shifts exist. Using this simple observation, on average we find 72 ± 6% of the importance outside the main peak, and when tested against 25% the null hypothesis was rejected (p–value < 10^{−4}).

IV. Discussion and Conclusions

The overarching goals of this work are (a) to generate synthetic CP–BOLD time series with good resemblance to real data; and (b) to provide insight into the characteristics of the differential variations in BOLD signal intensities.

To achieve these goals, the proposed composite dictionary learning algorithm provides a proper, data-driven representation of the input time series. Each time series is represented by a linear combination between one circulant atom (the kernel or one of its shifts) and other dictionary atoms. We are able to identify a characteristic behavior under baseline conditions, which supports the notion that the BOLD signal is cardiac phase dependent and that it can vary across myocardial territories, possibly due to physiological/anatomical differences between subjects [5], [30]. Most importantly, our results suggest that all baseline CP–BOLD time series should follow this behavior, which we refer to as the characteristic CP–BOLD curve. These findings, demonstrated here for the first time, are remarkable and provide further insight into cardiac physiology.

The availability of the proposed synthetic generators allows us to build and test detection algorithms independently of accurate myocardial segmentation and registration and without requiring new experimental data. In this paper, as proof of concept, with synthetic data we show that classical unsupervised methods (based on ICA) can potentially be used to identify areas of ischemia in CP–BOLD MRI, outperforming previously considered methods [5]. Highlighting the importance of precision in myocardial segmentation and registration, we also show that the number of available time series and their quality (i.e., noise level) directly affect the accuracy of ischemia detection. In the future, with our generator, we could for example also test (and quantify) the impact of shift variability on the accuracy and design of ischemia detection algorithms. We anticipate a lower accuracy if the presence of physiological shifts is not accounted for when designing the algorithm.

Furthermore, insights on how the BOLD effect varies according to cardiac phase and disease status, may be integrated within image-based generators (e.g., [3]), to create synthetic but realistic imaging stacks. These stacks can be used to facilitate the development of segmentation and registration methods tailored to cardiac BOLD imaging. Similarly these insights can improve the design of multi-state hemoglobin relaxation models. These models can be used to examine via simulation how MR sequence design and imaging protocols affect detection methods down the line.

Being the first study to propose synthetic generation of CP–BOLD time series, this study could not be compared with any prior method in the literature, which can be seen as a limitation. Nevertheless, for a fair comparison we included a PCA-derived decomposition for modeling, which follows similar principles to the one proposed here. However, their comparison beyond validation accuracy must include the capability to extract the object of interest, i.e., the characteristic CP–BOLD curve. The proposed method cannot directly model spatial correlations (i.e., that time series extracted from neighboring territories are correlated), but post-generation the time series can be smoothed to introduce correlation. Clearly, we need a sufficient number of training data with ground truth to learn the two DBMA models. For baseline conditions this is much easier than for ischemic conditions, due to limitations in spatial resolution, potential partial volume effects and, in this study, the reliance on LGE images to identify the area at risk. Further improvements in MRI techniques that increase spatial resolution and fidelity could alleviate such issues in the future. In addition, adopting techniques that could register information of the area at risk with cine (BOLD) MRI [50], [51] could improve the identification of ischemic time series for training and validation purposes.

From a technical perspective, the newly proposed MC–DLA provides a unique decomposition of input time series as linear combinations of structured (circulant) and unstructured (generic) dictionary components, or atoms. The model is learned efficiently by combining a newly proposed circulant dictionary learning mechanism [40] with a well-known generic one. Based on this decomposition, we statistically model all low-dimensional subspaces found. Outside the context of cardiac MRI, MC–DLA could also decompose fMRI data, particularly when a shift-invariant decomposition is sought [52]–[54], and it can be used more generally for modeling and clustering data [48] or in imaging applications [55]. We also provide a fitness measure to identify the internal dimensions of the model such that representation capability and model complexity are well balanced. This is necessary because the traditional dictionary learning framework does not naturally accommodate the compact dictionary and subspace structure imposed in our models. The proposed fitness measure can be applied whenever dictionary and subspace models with specific structure are used [48].
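As a rough illustration of how such a fitness-based model selection can be carried out, the BIC-like score below trades reconstruction error against the number of free parameters; the functional form and the penalty weight are stand-ins chosen for the sketch, not the measure proposed in this work.

```python
import numpy as np

def fitness(residuals, n_params, n_samples, penalty=1.0):
    """Sketch: score a candidate model size (smaller is better).

    residuals: stacked reconstruction errors over all training series;
    n_params: free dictionary/subspace parameters of the candidate model.
    A BIC-like form used only to show how internal dimensions (e.g., the
    number of generic atoms) could be scanned and compared.
    """
    rss = float(np.sum(np.asarray(residuals) ** 2))
    return (n_samples * np.log(rss / n_samples + 1e-12)
            + penalty * n_params * np.log(n_samples))

# usage: evaluate candidate dictionary sizes and keep the lowest score,
# e.g. scores = {K: fitness(residuals_for(K), params_for(K), n) for K in Ks}
```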

In conclusion, we show that the proposed method can generate realistic CP–BOLD time series. It learns and represents the data efficiently, taking advantage of a unique sparse decomposition framework to produce parsimonious models that are easy to use. Its application to the development of better ischemia detection algorithms, and to the simulation of imaging scenarios when designing CP–BOLD acquisitions, can accelerate the clinical translation of this truly non-invasive and repeatable method for the assessment of ischemia.

Acknowledgments

This work was supported in part by the US National Institutes of Health (2R01HL091989-05).

Footnotes

Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

Contributor Information

Cristian Rusu, Email: cristian.rusu@imtlucca.it, IMT Institute for Advanced Studies Lucca, Italy.

Rita Morisi, Email: rita.morisi@imtlucca.it, IMT Institute for Advanced Studies Lucca, Italy.

Davide Boschetto, Email: davide.boschetto@imtlucca.it, IMT Institute for Advanced Studies Lucca, Italy.

Rohan Dharmakumar, Email: rohan.dharmakumar@cshs.org, Biomedical Imaging Research Institute, Cedars-Sinai Medical Center, CA, USA.

Sotirios A. Tsaftaris, Email: s.tsaftaris@imtlucca.it, IMT Institute for Advanced Studies Lucca, Italy. Departments of Radiology, Electrical Engineering and Computer Science, Northwestern University, IL, USA

References

  • 1. Segars WP, Lalush DS, Tsui BMW. A realistic spline-based dynamic heart phantom. IEEE Trans Nuc Sci. 1999;46(3):503–506.
  • 2. Montagnat J, Delingette H. 4D deformable models with temporal constraints: application to 4D cardiac image segmentation. Medical Image Analysis. 2005;9:87–100. doi: 10.1016/j.media.2004.06.025.
  • 3. Prakosa A, Sermesant M, Delingette H, Marchesseau S, Saloux E, Allain P, Villain N, Ayache N. Generation of synthetic but visually realistic time series of cardiac images combining a biophysical model and clinical images. IEEE Trans Med Imag. 2013;32(1):99–109. doi: 10.1109/TMI.2012.2220375.
  • 4. Beckmann CF, Smith SM. Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Trans Med Imag. 2004;23(2):137–152. doi: 10.1109/TMI.2003.822821.
  • 5. Tsaftaris SA, Zhou X, Tang R, Li D, Dharmakumar R. Detecting myocardial ischemia at rest with Cardiac Phase-resolved Blood Oxygen Level-Dependent Cardiovascular Magnetic Resonance. Circ Cardiovasc Imaging. 2013;6(2):311–319. doi: 10.1161/CIRCIMAGING.112.976076.
  • 6. Dharmakumar R, Arumana J, Tang R, Harris K, Zhang Z, Li D. Assessment of regional myocardial oxygenation changes in the presence of coronary artery stenosis with balanced SSFP imaging at 3.0T: Theory and experimental evaluation in canines. J Magn Reson Imaging. 2008;27(5):1037–1045. doi: 10.1002/jmri.21345.
  • 7. Zhou X, Tsaftaris SA, Liu Y, Tang R, Klein R, Zuehlsdorff S, Li D, Dharmakumar R. Artifact-reduced two-dimensional cine steady state free precession for myocardial blood-oxygen-level-dependent imaging. J Magn Reson Imaging. 2010;31(4):863–871. doi: 10.1002/jmri.22116.
  • 8. Marwick T, Willemart B, D'Hondt AM, Baudhuin T, Wijns W, Detry JM, Melin J. Selection of the optimal nonexercise stress for the evaluation of ischemic regional myocardial dysfunction and malperfusion. Comparison of dobutamine and adenosine using echocardiography and 99mTc-MIBI single photon emission computed tomography. Circulation. 1993;87(2):345–354. doi: 10.1161/01.cir.87.2.345.
  • 9. U.S. Food and Drug Administration. FDA warns of rare but serious risk of heart attack and death with cardiac nuclear stress test drugs Lexiscan (regadenoson) and Adenoscan (adenosine). 2013. [Online]. Available: http://www.fda.gov/Drugs/DrugSafety/ucm375654.htm.
  • 10. Tsaftaris SA, Tang R, Zhou X, Li D, Dharmakumar R. Ischemic extent as a biomarker for characterizing severity of coronary artery stenosis with blood oxygen-sensitive MRI. J Magn Reson Imaging. 2012;35(6):1338–1348. doi: 10.1002/jmri.23577.
  • 11. Jolly MP. Automatic segmentation of the left ventricle in cardiac MR and CT images. International Journal of Computer Vision. 2006;70(2):151–163.
  • 12. Feng W, Nagaraj H, Gupta H, Lloyd SG, Aban I, Perry GJ, Calhoun DA, Dell'Italia LJ, Denney TS. A dual propagation contours technique for semi-automated assessment of systolic and diastolic cardiac function by CMR. Journal of Cardiovascular Magnetic Resonance. 2009;11:30–43. doi: 10.1186/1532-429X-11-30.
  • 13. Petitjean C, Dacher JN. A review of segmentation methods in short axis cardiac MR images. Medical Image Analysis. 2011;15(2):169–184. doi: 10.1016/j.media.2010.12.004.
  • 14. Suinesiaputra A, Cowan BR, Al-Agamy AO, AlAttar MA, Ayache N, Fahmy AS, Khalifa AM, Medrano-Gracia P, Jolly M-P, Kadish AH, Lee DC, Margeta J, Warfield SK, Young AA. A collaborative resource to build consensus for automated left ventricular segmentation of cardiac MR images. Medical Image Analysis. 2014;18(1):50–62. doi: 10.1016/j.media.2013.09.001.
  • 15. Feng C, Li C, Zhao D, Davatzikos C, Litt H. Segmentation of the left ventricle using distance regularized two-layer level set approach. MICCAI. 2013:477–484. doi: 10.1007/978-3-642-40811-3_60.
  • 16. Bai W, Shi W, O'Regan DP, Tong T, Wang H, Jamil-Copley S, Peters NS, Rueckert D. A probabilistic patch-based label fusion model for multi-atlas segmentation with registration refinement: Application to cardiac MR images. IEEE Trans Med Imaging. 2013;32(7):1302–1315. doi: 10.1109/TMI.2013.2256922.
  • 17. Makela T, Clarysse P, Sipila O, Pauna N, Pham Q, Katila T, Magnin IE. A review of cardiac image registration methods. IEEE Trans Med Imag. 2002;21(9):1011–1021. doi: 10.1109/TMI.2002.804441.
  • 18. Perperidis D, Mohiaddin R, Rueckert D. Spatio-temporal free-form registration of cardiac MR image sequences. Medical Image Analysis. 2005;9(5):441–456. doi: 10.1016/j.media.2005.05.004.
  • 19. Ou Y, Sotiras A, Paragios N, Davatzikos C. DRAMMS: Deformable registration via attribute matching and mutual-saliency weighting. Medical Image Analysis. 2011;15(4):622–639. doi: 10.1016/j.media.2010.07.002.
  • 20. Ou Y, Ye DH, Pohl KM, Davatzikos C. Validation of DRAMMS among 12 popular methods in cross-subject cardiac MRI registration. WBIR. 2012:209–219. doi: 10.1007/978-3-642-31340-0_22.
  • 21. Frangi AF, Niessen WJ, Viergever MA. Three-dimensional modeling for functional analysis of cardiac images, a review. IEEE Trans Med Imag. 2001;20(1):2–5. doi: 10.1109/42.906421.
  • 22. Li F, White JA, Rajchl M, Goela A, Peters TM. Generation of synthetic 4D cardiac CT images by deformation from cardiac ultrasound. In: Augmented Environments for Computer-Assisted Interventions. LNCS, vol. 7815. Springer; 2013. pp. 132–141.
  • 23. Maitra R, Riddles J. Synthetic magnetic resonance imaging revisited. IEEE Trans Med Imag. 2010;29(3):895–902. doi: 10.1109/TMI.2009.2039487.
  • 24. Aja-Fernandez S, Cordero-Grande L, Alberola-Lopez C. A MRI phantom for cardiac perfusion simulation. Biomedical Imaging (ISBI), 2012 9th IEEE International Symposium on; 2012. pp. 638–641.
  • 25. Chiribiri A, Schuster A, Ishida M, Hautvast G, Zarinabad N, Morton G, Otton J, Plein S, Breeuwer M, Batchelor P, Schaeffter T, Nagel E. Perfusion phantom: An efficient and reproducible method to simulate myocardial first-pass perfusion measurements with cardiovascular magnetic resonance. Magn Res Med. 2013;69(3):698–707. doi: 10.1002/mrm.24299.
  • 26. Tobon-Gomez C, Sukno FM, Bijnens BH, Huguet M, Frangi AF. Realistic simulation of cardiac magnetic resonance studies modeling anatomical variability, trabeculae, and papillary muscles. Magn Res Med. 2011;65(1):280–288. doi: 10.1002/mrm.22621.
  • 27. Dharmakumar R, Hong J, Brittain JH, Plewes DB, Wright GA. Oxygen-sensitive contrast in blood for steady-state free precession imaging. Magn Reson Med. 2005;53(3):574–583. doi: 10.1002/mrm.20393.
  • 28. Karamitsos TD, Leccisotti L, Arnold JR, Recio-Mayoral A, Bhamra-Ariza P, Howells RK, Searle N, Robson MD, Rimoldi OE, Camici PG, Neubauer S, Selvanayagam JB. Relationship between regional myocardial oxygenation and perfusion in patients with coronary artery disease: Insights from cardiovascular magnetic resonance and positron emission tomography. Circ Cardiovasc Imaging. 2010;3(1):32–40. doi: 10.1161/CIRCIMAGING.109.860148.
  • 29. McCommis KS, Goldstein TA, Abendschein DR, Herrero P, Misselwitz B, Gropler RJ, Zheng J. Quantification of regional myocardial oxygenation by magnetic resonance imaging: Validation with positron emission tomography. Circ Cardiovasc Imaging. 2010;3(1):41–46. doi: 10.1161/CIRCIMAGING.109.897546.
  • 30. Ootaki Y, Ootaki C, Kamohara K, Akiyama M, Zahr F, Kopcak J, Dessoffy MWR, Fukamachi K. Phasic coronary blood flow patterns in dogs vs. pigs: an acute ischemic heart study. Med Sci Monit. 2008;14(10):193–197.
  • 31. Kellman P, Arai AE, McVeigh ER, Aletras AH. Phase-sensitive inversion recovery for detecting myocardial infarction using gadolinium-delayed hyperenhancement. Magn Res Med. 2002;47(2):372–383. doi: 10.1002/mrm.10051.
  • 32. Tsaftaris SA, Andermatt V, Schlegel A, Katsaggelos AK, Li D, Dharmakumar R. A dynamic programming solution to tracking and elastically matching left ventricular walls in cardiac cine MRI. Proc. Int. Conf. Image Proc; 2008. pp. 2980–2983.
  • 33. Donoho DL. High-dimensional data analysis: The curses and blessings of dimensionality. AMS Math Challenges Lecture. 2000.
  • 34. Criminisi A, Shotton J, Konukoglu E. Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Found Trends Comput Graph Vis. 2012;7(2–3):81–227.
  • 35. Jenatton R, Gribonval R, Bach F. Local stability and robustness of sparse dictionary learning in the presence of noise. 2012. arXiv:1210.0685.
  • 36. Xiang ZJ, Xu H, Ramadge PJ. Learning sparse representations of high dimensional data on large scale dictionaries. Adv Neural Inf Process Syst. 2011:900–908.
  • 37. Silva J, Chen M, Eldar YC, Sapiro G, Carin L. Blind compressed sensing over a structured union of subspaces. 2011. arXiv:1103.2469.
  • 38. Baraniuk R, Cevher V, Duarte M, Hegde C. Model-based compressive sensing. IEEE Trans Inf Theory. 2010;56(4):1982–2001.
  • 39. Pope G, Aubel C, Studer C. Learning phase-invariant dictionaries. IEEE ICASSP. 2013:5979–5983.
  • 40. Rusu C, Dumitrescu B, Tsaftaris SA. Explicit shift-invariant dictionary learning. IEEE Sig Proc Let. 2014;21(1):6–9.
  • 41. Aharon M, Elad M, Bruckstein A. K–SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Sig Proc. 2006;54(11):4311–4322.
  • 42. Rusu C, Dumitrescu B. Stagewise K–SVD to design efficient dictionaries for sparse representations. IEEE Sig Proc Lett. 2012;19(10):631–634.
  • 43. Pati Y, Rezaiifar R, Krishnaprasad P. Orthogonal matching pursuit: recursive function approximation with application to wavelet decomposition. Asilomar Conf on Signals, Systems and Comput. 1993;1:40–44.
  • 44. Redner R, Walker H. Mixture densities, maximum likelihood and the EM algorithm. SIAM Review. 1984;26(2):195–239.
  • 45. Ruan L, Yuan M, Zou H. Regularized parameter estimation in high-dimensional Gaussian mixture models. Neural Comput. 2011;23(6):1605–1622. doi: 10.1162/NECO_a_00128.
  • 46. Gribonval R, Vandergheynst P. On the exponential convergence of matching pursuits in quasi-incoherent dictionaries. IEEE Trans Inf Theory. 2006;52(1):255–261.
  • 47. Mazhar R, Gader PD. EK–SVD: Optimized dictionary design for sparse representations. 19th International Conference on Pattern Recognition; 2008. pp. 1–4.
  • 48. Ma Y, Yang AY, Derksen H, Fossum R. Estimation of subspace arrangements with applications in modeling and segmenting mixed data. SIAM Review. 2008;50(3):413–458.
  • 49. Burges CJC. A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov. 1998;2:121–167.
  • 50. Liu Y, Xue H, Guetter C, Jolly MP, Chrisochoides N, Guehring J. Moving propagation of suspicious myocardial infarction from delayed enhanced cardiac imaging to cine MRI using hybrid image registration. IEEE International Symposium on Biomedical Imaging: From Nano to Macro; 2011. pp. 1284–1288.
  • 51. Cordero-Grande L, Merino-Caviedes S, Alba X, Figueras i Ventura R, Frangi AF, Alberola-Lopez C. 3D fusion of cine and late-enhanced cardiac magnetic resonance images. Biomedical Imaging (ISBI), 2012 9th IEEE International Symposium on; 2012. pp. 286–289.
  • 52. Mørup M, Hansen LK, Arnfred SM, Lim LH, Madsen KH. Shift-invariant multilinear decomposition of neuroimaging data. NeuroImage. 2008;42(4):1439–1450. doi: 10.1016/j.neuroimage.2008.05.062.
  • 53. Eavani H, Filipovych R, Davatzikos C, Satterthwaite TD, Gur RE, Gur RC. Sparse dictionary learning of resting state fMRI networks. Pattern Recognition in NeuroImaging (PRNI). 2012:73–76. doi: 10.1109/PRNI.2012.25.
  • 54. Eavani H, Satterthwaite TD, Gur RE, Gur RC, Davatzikos C. Unsupervised learning of functional network dynamics in resting state fMRI. IPMI. 2013:426–437. doi: 10.1007/978-3-642-38868-2_36.
  • 55. Rao N, Recht B, Nowak RD. Signal recovery in unions of subspaces with applications to compressive imaging. 2012. arXiv:1209.3079v1.
