. 2013 Jun 26;23(2):023132. doi: 10.1063/1.4811544

Electrocardiogram classification using delay differential equations

Claudia Lainscsek 1, Terrence J Sejnowski 1
PMCID: PMC3710263  NIHMSID: NIHMS499976  PMID: 23822497

Abstract

Time series analysis with nonlinear delay differential equations (DDEs) reveals nonlinear as well as spectral properties of the underlying dynamical system. Here, global DDE models were used to analyze 5 min data segments of electrocardiographic (ECG) recordings in order to capture distinguishing features for different heart conditions such as normal heart beat, congestive heart failure, and atrial fibrillation. The number of terms, the delays, and the order of nonlinearity of the model have to be selected so that the model is most discriminative. The DDE model form that best separates the three classes of data was chosen by exhaustive search up to third order polynomials. Such an approach can provide deep insight into the nature of the data, since linear terms of a DDE correspond to the main time-scales in the signal and the nonlinear terms in the DDE are related to nonlinear couplings between the harmonic signal parts. The DDEs were able to detect atrial fibrillation with an accuracy of 72%, congestive heart failure with an accuracy of 88%, and normal heart beat with an accuracy of 97% from 5 min of ECG, a much shorter time interval than is required to achieve comparable performance with other methods.


Cardiovascular diseases are the main cause of death worldwide and a major cost in health care. Better diagnostic methods are much needed. Here, delay differential equations (DDEs) are used to discriminate short (5 min) electrocardiography (ECG) data segments. The method does not require any preprocessing of the data, is performed on the time series themselves, is computationally fast, and could be the basis for a real-time diagnostic system. DDEs reveal nonlinear as well as spectral properties of the data. A DDE relates a differential and a delay embedding in a complex manner to extract distinctive dynamical properties of the underlying dynamical system. DDE analysis is a time domain tool that combines aspects of nonlinear dynamics, Fourier analysis, and higher-order statistics.

INTRODUCTION

In global vector field reconstruction,1, 2, 3, 4 the recorded data are used to generate a model whose dynamical behavior is equivalent to that of the original system. Equivalence is not required for our data analysis method. Nonetheless, the identification technique provides a global model that captures some essential features of the underlying dynamics.

The techniques introduced here are based on delay differential equations (DDEs). DDEs are a generalization of ordinary differential equations (ODEs) that include time delays. DDEs are needed to describe the underlying dynamics of particular physical and biological processes, which are typically characterized by a delayed reaction (see Driver5 for a list of examples). Delays also play an important role in the analysis of ECG data.6

Solving even the simplest linear DDE ẋ(t) = α x(t − τ) is complicated (see, e.g., Ref. 7) and not within the scope of this paper. We do not seek DDEs that predict time series but rather global DDE models that capture distinguishing features of data for different heart conditions such as normal heart beat, congestive heart failure, and atrial fibrillation. The question of whether an ECG is best modeled by a linear or non-linear process is directly related to the structure selection of the DDE: Is a linear DDE sufficient or are non-linear terms needed? How many terms, how many delays, and what kind of non-linearity should be used?

DDEs can be seen as a flavor of an autoregressive (AR) model or an autoregressive moving average (ARMA) model8, 9, 10, 11 in which the time series on the left side of the equation is replaced by its derivative. Lately, delay systems have been used in the context of reservoir computing (RC).12, 13, 14, 15 RC is a recently introduced, bio-inspired, machine-learning paradigm for processing empirical data that mimics neuronal networks. A DDE in this context is the simplest nonlinear delay system, built around a single node.

A motivation for DDE analysis of non-linear data comes from embedding theory in non-linear time series analysis. An embedding converts a single time series into a multidimensional object in an embedding space (Whitney,16 Packard et al.,17 Takens,18 and Sauer et al.2). The reconstructed attractor reveals basic properties (dimension, Lyapunov spectrum, and entropy) of the true attractor of the system. It allows valuable information to be obtained about the dynamics of the system without having direct access to all of the system's variables.

There are two basic embeddings: delay and derivative embeddings. For a delay embedding, the time series itself and its delayed versions are used to construct the embedding; for the derivative embedding the time series and its successive derivatives are used. Judd and Mees19 introduced the idea of non-uniform embeddings for time series with components of multiple time-scales. DDE analysis then relates aspects of the different embeddings: the derivative of the time-series is related to functions of non-uniformly delayed versions of that time series.
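To make the two embeddings concrete, the following minimal sketch (not from the paper; the delays and the test signal are arbitrary choices) builds a non-uniform delay embedding and a derivative embedding of a scalar time series with NumPy:

```python
# Minimal sketch (not from the paper) of the two basic embeddings of a scalar
# time series x: a (possibly non-uniform) delay embedding and a derivative embedding.
import numpy as np

def delay_embedding(x, delays):
    """Rows are [x(t - d1), x(t - d2), ...] for integer sample delays d1, d2, ..."""
    m = max(delays)
    return np.column_stack([x[m - d:len(x) - d] for d in delays])

def derivative_embedding(x, dt, order=2):
    """Rows are [x, x', x'', ...] built by repeated numerical differentiation."""
    cols, cur = [x], x
    for _ in range(order):
        cur = np.gradient(cur, dt)
        cols.append(cur)
    return np.column_stack(cols)

t = np.arange(0, 10, 1e-3)
x = np.sin(2 * np.pi * 1.3 * t)
E_delay = delay_embedding(x, delays=[5, 11, 23])   # non-uniform delays (in samples)
E_deriv = derivative_embedding(x, dt=1e-3)
```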

DDE data analysis can also be seen as a novel way of combining Fourier analysis and higher-order statistics in a time domain framework. The relationship between frequency analysis and the analysis of frequency and/or phase couplings in the time domain is poorly understood (see, e.g., Refs. 20, 21, 22, 23, 24). The linear terms of a DDE correspond to the main frequencies in the signal. For n independent frequencies in a signal, a linear DDE with 2n − 1 terms is needed to describe such data. The nonlinear terms in the DDE are related to nonlinear couplings between the harmonics. DDEs can also be expanded in a Yule-Walker-like way25, 26 and the DDE coefficients can then be rewritten as functions of dynamical higher-order data correlations. These dynamical higher-order data correlations are generalizations of Nth order data moment functions such as, e.g., the auto-correlation (2nd order moment) and the bi-correlation (3rd order moment). The paper is organized as follows: Sec. 2 shows the connection between DDEs and classical Fourier analysis and higher order statistics (HOS). In Sec. 3, good classifiers for ECG data are found via DDE analysis. Section 4 is the discussion.

DELAY DIFFERENTIAL EQUATIONS AND TIME-DOMAIN FREQUENCY ANALYSIS

DDE analysis is done in the time domain on the time series themselves and not in the frequency domain. The DDE framework combines linear and non-linear information from the data in a complex and not easily interpretable way. To gain some insight into the meaning of the different terms of a DDE, we show the correspondence of the linear terms to the main time-scales or frequencies in the signal and how the non-linear terms contain information about non-linear couplings.

Linear DDEs

The simplest linear DDE is

$\dot{x} = a\,x_\tau$, (1)

where xτ = x(t − τ). Solving this equation in general is nontrivial (see, e.g., Ref. 7) and beyond the scope of this paper. However, looking at special solutions can lead to an understanding of the terms in a DDE as they are used here for detection/classification purposes. A special solution of Eq. 1 is (see Ref. 7)

$x(t) = \cos(\omega t); \quad a = (-1)^n\,\omega; \quad \tau = \frac{\pi(2n-1)}{2\omega}$, (2)

where n ∈ ℕ. The coefficient a is proportional to the frequency and the time delay τ is inversely proportional to the frequency. For a signal with frequency f and ω = 2πf, the delay is then

$\tau = \frac{2n-1}{4f}$. (3)

The delay is inversely proportional to the frequency and the coefficient a is directly proportional to the frequency.
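As a quick numerical sanity check of Eqs. 2 and 3, the following sketch (our illustration, not the authors' code; the frequency and the integer n are arbitrary choices) verifies that x(t) = cos(ωt) satisfies ẋ(t) = a x(t − τ) for a = (−1)^n ω and τ = π(2n − 1)/(2ω):

```python
# Numerical check that x(t) = cos(w t) solves x'(t) = a x(t - tau)
# with a = (-1)^n w and tau = pi (2n - 1) / (2 w).
import numpy as np

f = 31.0                      # example frequency in Hz (arbitrary choice)
w = 2 * np.pi * f
n = 2                         # any positive integer
a = (-1) ** n * w
tau = np.pi * (2 * n - 1) / (2 * w)

t = np.linspace(0, 1, 100_000)
lhs = -w * np.sin(w * t)      # exact derivative of cos(w t)
rhs = a * np.cos(w * (t - tau))

print("max |x'(t) - a x(t - tau)| =", np.max(np.abs(lhs - rhs)))  # ~1e-12
```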

A special solution of the linear DDE

$\dot{x} = \sum_{i=1}^{N} a_i\,x_{\tau_i}$ (4)

is

$x(t) = \sum_{k=1}^{\frac{N+1}{2}} \cos(\omega_k t); \quad \tau_i = \frac{\pi(2n-1)}{2\omega_j}$, (5)

where n ∈ ℕ are arbitrary integers and all delays τi are related to one of the frequencies. The expressions for the coefficients ai are more complicated than in Eq. 2 and each depends on all the frequencies in the signal.

Equations 4, 5 imply that we need a DDE with 2N − 1 linear terms to describe a harmonic signal with N frequencies. If we consider Eq. 1 and a sum of three harmonics, x(t) = cos(ω1t) + cos(ω2t) + cos(ω3t), Eq. 1 cannot be solved analytically. To estimate the value of the coefficient a, we expand Eq. 1 as a Yule-Walker-like equation:25, 26 we multiply both sides of Eq. 1 with xτ, apply the expectation operator ⟨F(t)⟩ ≡ lim_{T→∞} (1/T) ∫_0^T F(t) dt, and get

$a = \frac{\langle \dot{x}\,x_\tau \rangle}{\langle x_\tau^2 \rangle}$. (6)

The numerator in Eq. 6 looks like a “dynamical” version of the autocorrelation function ⟨x xτ⟩ and it can be rewritten as delay derivatives of the autocorrelation function in the case of a bounded stationary signal,

$\langle \dot{x}\,x_\tau \rangle = \lim_{T\to\infty}\frac{1}{T}\int_0^T \dot{x}\,x_\tau\,dt = \lim_{T\to\infty}\frac{1}{T}\big[x\,x_\tau\big]_0^T - \lim_{T\to\infty}\frac{1}{T}\int_0^T x\,\frac{dx_\tau}{dt}\,dt = \lim_{T\to\infty}\frac{1}{T}\int_0^T x\,\frac{dx_\tau}{d\tau}\,dt = \frac{d}{d\tau}\langle x\,x_\tau \rangle$. (7)

For x(t) = cos(ω1t) + cos(ω2t) + cos(ω3t), the expressions in Eq. 6 are

$\begin{aligned}
\langle \dot{x}\,x_\tau \rangle\big|_{\omega_3 \neq \omega_j} &= -\tfrac{1}{2}\sum_{i=1}^{3} \omega_i \sin(\omega_i\tau), \\
\langle \dot{x}\,x_\tau \rangle\big|_{\omega_3 = \omega_1} &= -\tfrac{1}{2}\,\omega_2\sin(\tau\omega_2) - 2\,\omega_1\sin(\tau\omega_1), \\
\langle \dot{x}\,x_\tau \rangle\big|_{\omega_3 = \omega_2} &= -\tfrac{1}{2}\,\omega_1\sin(\tau\omega_1) - 2\,\omega_2\sin(\tau\omega_2), \\
\langle x_\tau^2 \rangle\big|_{\omega_3 \neq \omega_j} &= \tfrac{3}{2}, \qquad
\langle x_\tau^2 \rangle\big|_{\omega_3 = \omega_j} = \tfrac{5}{2}; \qquad j = 1, 2.
\end{aligned}$ (8)

The coefficient a in Eq. 6 is a smooth function with singularities at ω3 = ωi, i = 1, 2. Therefore, estimating a numerically can be used as a time-domain frequency detection tool: to detect the two frequencies in the signal D = cos(ω1t) + cos(ω2t) (f1 = 31 Hz, f2 = 69 Hz, ω = 2πf), the term cos(ω3t) was added for a range of frequencies f3 = ω3/(2π). In Fig. 1, we estimated the coefficient a numerically with a singular value decomposition (SVD) algorithm27 and then computed the least square error (ẋ − a xτ)² for Eq. 6 for D + cos(ω3t) with f3 varying from 0 to 150 Hz. The delay was 10 δt, where δt = 1/fs with a sampling rate fs = 1000 Hz. The two singularities at the two frequencies f1 = 31 Hz and f2 = 69 Hz are clearly visible in both plots. The choice of a different delay would change the shape of the curve (see Eqs. 7, 8). The error ρ = |ẋ − (⟨ẋ xτ⟩/⟨xτ²⟩) xτ| is nonzero since Eq. 6 is not an exact solution.

Figure 1. Error ρ and coefficient a for the linear DDE ẋ = a xτ with τ = 10 δt vs. frequency f3 for the signal x(t) = cos(ω1t) + cos(ω2t) + cos(ω3t) with f1 = 31 Hz and f2 = 69 Hz (ω = 2πf).
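The frequency-sweep experiment of Fig. 1 can be sketched in a few lines. The code below is our illustration (not the authors' implementation); the signal length is chosen arbitrarily and an SVD-based least-squares fit stands in for the SVD step described in the text:

```python
# Sketch of the Fig. 1 experiment: sweep f3, fit the single coefficient of
# x' = a x_tau by (SVD-based) least squares, and record the error rho.
import numpy as np

fs, dt = 1000.0, 1e-3
t = np.arange(0, 10, dt)          # 10 s of synthetic signal (length is an assumption)
f1, f2 = 31.0, 69.0
tau = 10                          # delay of 10 samples, i.e., 10*dt as in the text

def fit_linear_dde(x):
    dx = np.gradient(x, dt)[tau:]             # derivative, aligned with the delayed signal
    x_tau = x[:-tau]                          # x(t - tau)
    a = np.linalg.lstsq(x_tau[:, None], dx, rcond=None)[0][0]
    rho = np.sqrt(np.mean((dx - a * x_tau) ** 2))
    return a, rho

f3_grid = np.arange(0.0, 150.0, 1.0)
results = [fit_linear_dde(np.cos(2*np.pi*f1*t) + np.cos(2*np.pi*f2*t) + np.cos(2*np.pi*f3*t))
           for f3 in f3_grid]
# Both a and rho spike near f3 = 31 Hz and f3 = 69 Hz, as in Fig. 1.
```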

The possible advantages of this time-domain frequency analysis tool are that it can be applied to short time series and to sparse data: a can be estimated from the set of all points for which the derivative can be computed and the delayed point exists. Missing points can simply be left out.

The method is also fairly insensitive to noise. In Fig. 2, we added white noise η to the signal with a signal-to-noise ratio of SNR = −10 dB, which is more noise than signal. We then repeated the numerical experiment of Fig. 1: we estimated the coefficient a numerically with a singular value decomposition (SVD) algorithm and then computed the least square error for Eq. 6 for D + cos(ω3t) with f3 varying from 0 to 150 Hz. Again, the two singularities at the two frequencies f1 = 31 Hz and f2 = 69 Hz are clearly visible in both plots (Fig. 2).

Figure 2. Error ρ and coefficient a for the linear DDE ẋ = a xτ with τ = 10 δt vs. frequency f3. White noise η was added to the signal: D = cos(ω1t) + cos(ω2t) + η, where the signal-to-noise ratio is SNR = −10 dB, f1 = 31 Hz, and f2 = 69 Hz (ω = 2πf). The coefficient a and error ρ were then plotted for x(t) = D + cos(ω3t) with f3 varying between 0 and 150 Hz.

Nonlinear DDEs

In real world data, the various frequency components do not always appear completely independently of one another. Such non-linear interactions of frequencies and their phases (e.g., quadratic phase coupling) cannot be detected by a power spectrum, the Fourier transform of the autocorrelation function (second-order cumulant), since phase relationships and frequency couplings of signals are lost. Such couplings are usually detected via higher order spectra or bispectral analysis.28, 29, 30, 31, 32, 33, 34, 35 The bispectrum or bispectral density is the Fourier transform of the third-order cumulant (the bicorrelation function). Consider the signal x(t) = A1 cos(ω1t + φ1) + A2 cos(ω2t + φ2), which is passed through a quadratic nonlinear system h(t) = b x²(t), where b is a non-zero constant. At the output of the system, the signal will include the harmonic components (2ω1, 2φ1), (2ω2, 2φ2), (ω1 + ω2, φ1 + φ2), and (ω1 − ω2, φ1 − φ2). These phase relations are called quadratic phase coupling (QPC). Since we are interested here in the couplings of frequencies, we will consider the special case of quadratic frequency coupling (QFC), when the phases are zero (φ1 = φ2 = 0): for a signal x(t) = cos(ω1t) + cos(ω2t) + cos(ω3t), frequency coupling occurs when ω3 is a multiple of one of the frequencies or of the sum or difference of the two frequencies.
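The effect of a quadratic nonlinearity on a two-tone signal is easy to reproduce numerically; the sketch below (an illustration, not from the paper; amplitudes, duration, and threshold are arbitrary choices) shows the new components at 2f1, 2f2, f1 + f2, and f1 − f2 appearing in the spectrum of x²(t):

```python
# Squaring a two-tone signal creates components at 2 f1, 2 f2, f1 + f2, and f1 - f2.
import numpy as np

fs = 1000.0
t = np.arange(0, 4, 1 / fs)
f1, f2 = 31.0, 69.0
x = np.cos(2*np.pi*f1*t) + np.cos(2*np.pi*f2*t)
y = x ** 2                                   # quadratic nonlinearity h(x) = b x^2 with b = 1

spec = np.abs(np.fft.rfft(y)) / len(y)
freqs = np.fft.rfftfreq(len(y), 1 / fs)
peaks = freqs[spec > 0.1]
print(peaks)                                 # 0 (DC), 38, 62, 100, and 138 Hz
```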

A simple DDE with one non-linear term,

$\dot{x} = a\,x_{\tau_1} x_{\tau_2}$, (9)

cannot be solved analytically, but, as shown for the linear case above, it can be expanded as a Yule-Walker-like equation:25, 26 we multiply both sides of Eq. 9 with xτ1xτ2, apply the expectation operator, and get

$a = \frac{\langle \dot{x}\,x_{\tau_1} x_{\tau_2} \rangle}{\langle x_{\tau_1}^2 x_{\tau_2}^2 \rangle}$. (10)

The numerator ⟨ẋ xτ1xτ2⟩ in Eq. 10 looks like a dynamical version of the bicorrelation36 ⟨x xτ1xτ2⟩. It can be rewritten as delay derivatives of the moments in the case of a bounded stationary signal,

$\langle \dot{x}\,x_{\tau_1} x_{\tau_2} \rangle = \lim_{T\to\infty}\frac{1}{T}\int_0^T \dot{x}\,x_{\tau_1} x_{\tau_2}\,dt = \lim_{T\to\infty}\frac{1}{T}\big[x\,x_{\tau_1} x_{\tau_2}\big]_0^T - \lim_{T\to\infty}\frac{1}{T}\int_0^T x\,\frac{d}{dt}\big(x_{\tau_1} x_{\tau_2}\big)\,dt = \lim_{T\to\infty}\frac{1}{T}\int_0^T x\,\frac{dx_{\tau_1}}{d\tau_1}\,x_{\tau_2}\,dt + \lim_{T\to\infty}\frac{1}{T}\int_0^T x\,x_{\tau_1}\frac{dx_{\tau_2}}{d\tau_2}\,dt = \frac{d}{d\tau_1}\langle x\,x_{\tau_1} x_{\tau_2}\rangle + \frac{d}{d\tau_2}\langle x\,x_{\tau_1} x_{\tau_2}\rangle$. (11)

For x(t)=cos(ω1t)+cos(ω2t)+cos(ω3t), the bicorrelation is only non-zero when ω3 is a multiple of one of the frequencies or of the sum or difference of the two frequencies. In these cases, the expressions for the dynamic bicorrelation are

$\begin{aligned}
\langle \dot{x}\,x_{\tau_1} x_{\tau_2} \rangle\big|_{\omega_3=\omega_1\pm\omega_2} ={}& \tfrac{1}{4}\Big(\omega_1\big(-\sin(\tau_2\omega_1 \pm \tau_1\omega_2) - \sin(\tau_1\omega_1 \pm \tau_2\omega_2) + \sin(\pm\tau_2\omega_2 - \tau_1(\omega_1 \pm \omega_2)) + \sin(\pm\tau_1\omega_2 - \tau_2(\omega_1 \pm \omega_2))\big) \\
&\quad + \omega_2\big(-\sin(\tau_2\omega_1 \pm \tau_1\omega_2) - \sin(\tau_1\omega_1 \pm \tau_2\omega_2) + \sin(\tau_2\omega_1 - \tau_1(\omega_1 \pm \omega_2)) + \sin(\tau_1\omega_1 - \tau_2(\omega_1 \pm \omega_2))\big)\Big), \\
\langle \dot{x}\,x_{\tau_1} x_{\tau_2} \rangle\big|_{\omega_3=2\omega_i} ={}& \tfrac{1}{4}\Big(\omega_i\sin\big((\tau_1-2\tau_2)\omega_i\big) - \omega_i\sin\big((2\tau_1-\tau_2)\omega_i\big) - 2\,\omega_i\sin\big((\tau_1+\tau_2)\omega_i\big)\Big), \\
\langle \dot{x}\,x_{\tau_1} x_{\tau_2} \rangle\big|_{\omega_3=\frac{\omega_i}{2}} ={}& \tfrac{1}{8}\,\omega_i\Big(\sin\big(\tfrac{1}{2}(\tau_1-2\tau_2)\omega_i\big) - \sin\big(\tfrac{1}{2}(2\tau_1-\tau_2)\omega_i\big) - 2\sin\big(\tfrac{1}{2}(\tau_1+\tau_2)\omega_i\big)\Big); \quad i = 1, 2,
\end{aligned}$ (12)

and for the denominator of Eq. 10

$\begin{aligned}
\langle x_{\tau_1}^2 x_{\tau_2}^2 \rangle\big|_{\omega_3 \neq \omega_i} ={}& \tfrac{1}{8}\big(\cos(2(\tau_1-\tau_2)\omega_1) + 4\cos((\tau_1-\tau_2)(\omega_1-\omega_2)) + \cos(2(\tau_1-\tau_2)\omega_2) + 4\cos((\tau_1-\tau_2)(\omega_1+\omega_2)) \\
&\quad + 4\cos((\tau_1-\tau_2)(\omega_1-\omega_3)) + 4\cos((\tau_1-\tau_2)(\omega_2-\omega_3)) + \cos(2(\tau_1-\tau_2)\omega_3) \\
&\quad + 4\cos((\tau_1-\tau_2)(\omega_1+\omega_3)) + 4\cos((\tau_1-\tau_2)(\omega_2+\omega_3)) + 18\big), \\
\langle x_{\tau_1}^2 x_{\tau_2}^2 \rangle\big|_{\omega_3 = \omega_1} ={}& 2\cos(2(\tau_1-\tau_2)\omega_1) + 2\cos((\tau_1-\tau_2)(\omega_1-\omega_2)) + \tfrac{1}{8}\cos(2(\tau_1-\tau_2)\omega_2) + 2\cos((\tau_1-\tau_2)(\omega_1+\omega_2)) + \tfrac{25}{4}, \\
\langle x_{\tau_1}^2 x_{\tau_2}^2 \rangle\big|_{\omega_3 = \omega_2} ={}& \tfrac{1}{8}\cos(2(\tau_1-\tau_2)\omega_1) + 2\cos((\tau_1-\tau_2)(\omega_1-\omega_2)) + 2\cos(2(\tau_1-\tau_2)\omega_2) + 2\cos((\tau_1-\tau_2)(\omega_1+\omega_2)) + \tfrac{25}{4}.
\end{aligned}$ (13)

The coefficient a in Eq. 10 is only non-zero for QFC frequencies. We can therefore use this equation to detect non-linear couplings in the time domain in the same way as we used the linear DDE estimate of Eq. 6 to detect frequencies.

Figure 3 shows how non-linear terms can detect frequency couplings for x(t) = cos(ω1t) + cos(ω2t) + cos(ω3t) (ω = 2πf), where the frequencies f1 and f2 were 31 Hz and 69 Hz, respectively, and f3 was varied from 0 to 150 Hz. For each f3 we estimated the coefficient a numerically with a singular value decomposition (SVD) algorithm27 and then computed the least square error (ẋ − a xτ1xτ2)² for Eq. 9. The nonlinear coefficient a (top plot in Fig. 3) is only non-zero when there is QFC and is zero otherwise. The least square error (bottom plot in Fig. 3) shows a sloping curve with spikes where there is QFC as well as at the two frequencies f1 and f2.

Figure 3. Error ρ and coefficient a for the nonlinear DDE ẋ = a xτ1xτ2 with τ1 = 5 δt and τ2 = 11 δt vs. frequency f3 for the signal x(t) = cos(ω1t) + cos(ω2t) + cos(ω3t) with f1 = 31 Hz and f2 = 69 Hz (ω = 2πf).

Equation 9 can therefore be used to detect QFC. Since the coefficient a in Eq. 10 is not an exact solution of Eq. 9, the error (lower plot in Fig. 3) is ρ = |ẋ − a xτ1xτ2|. For all values of f3 that differ from any coupling case (sums or multiples of the two other frequencies) it should be ρ = |ẋ|, since a = 0 for all non-coupling values of f3. Numerically, a will never be exactly zero, but a small value. Therefore, the error is ρ = |ẋ − (⟨ẋ xτ1xτ2⟩/⟨xτ1²xτ2²⟩) xτ1xτ2|. ⟨xτ1²xτ2²⟩ has a spike when f3 is equal to f1 or f2, and ⟨ẋ xτ1xτ2⟩ has spikes for the QFC cases. Therefore, the error (lower plot in Fig. 3) shows bumps for the QFC cases and for the frequencies. The delays τ1,2 only change the shape of the error function.
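A sketch of the Fig. 3 experiment is given below (our illustration, not the authors' code; the signal length and the short list of test frequencies are arbitrary choices). It fits ẋ = a xτ1xτ2 by least squares and shows that |a| is clearly larger for quadratically coupled values of f3:

```python
# Fit x' = a x_tau1 x_tau2 and check that a is (numerically) non-zero only
# at quadratically coupled values of f3.
import numpy as np

fs, dt = 1000.0, 1e-3
t = np.arange(0, 10, dt)
f1, f2 = 31.0, 69.0
tau1, tau2 = 5, 11                           # delays in samples, as in Fig. 3

def fit_nonlinear_a(x):
    dx = np.gradient(x, dt)
    m = max(tau1, tau2)
    reg = x[m - tau1:len(x) - tau1] * x[m - tau2:len(x) - tau2]   # x_tau1 * x_tau2
    dxa = dx[m:]
    a = np.linalg.lstsq(reg[:, None], dxa, rcond=None)[0][0]
    rho = np.sqrt(np.mean((dxa - a * reg) ** 2))
    return a, rho

for f3 in (38.0, 62.0, 100.0, 50.0):         # f2-f1, 2*f1, f1+f2, and an uncoupled value
    x = np.cos(2*np.pi*f1*t) + np.cos(2*np.pi*f2*t) + np.cos(2*np.pi*f3*t)
    a, rho = fit_nonlinear_a(x)
    print(f"f3 = {f3:5.1f} Hz  a = {a: .4f}")  # |a| is noticeably larger for the coupled cases
```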

In this section, we wanted to show the connection of DDEs to spectral analysis: Delays connected to linear terms in the DDE relate to frequencies (see Eqs. 6, 7, 8 and Fig. 1) and delays connected to non-linear terms in the DDE relate to couplings between those frequencies (see Eqs. 10, 12, 13, and Fig. 3). Here, we are not aiming to interpret these plots quantitatively. Different delays would only change the shape of the curves, but not the fact that there are spikes for the frequencies in the linear case and spikes for the frequency couplings in the non-linear case. Therefore, any arbitrary choice of delays will give the same qualitative behavior.

DDE ANALYSIS OF ECG DATA

In Sec. 2, we showed how linear terms of a DDE relate to the time scales or frequencies in the signal and the non-linear terms relate to non-linear couplings of those time scales or frequencies. Any DDE can be expanded in a Yule-Walker-like way and the coefficients will then be combinations of dynamical data correlations.

A DDE can also be interpreted as a generic non-uniform embedding. Such a generic non-uniform embedding (the DDE model) unfolds time-scales for the linear terms and couplings between frequencies for the non-linear terms, just as a spectrogram unfolds frequencies and higher order statistics (HOS) unfold the couplings between frequencies. DDEs and spectral analysis are connected, as we showed in the previous section. They are just operating in two different domains, the time domain and the spectral domain.

A DDE is a nonlinear extension of a non-uniform embedding with linear and/or nonlinear functions of the time series. Therefore, it can be tailored to the dynamics. HOS would have to be extended to a network of all possible HOS moments combined with linear spectral analysis to do the same. Therefore, a DDE is the simpler approach.

In this section, the structure (model form) of the DDE and the delays for the classification of heart data were selected by an exhaustive search.

Data

We analyzed 24 h data from 15 young healthy persons in normal sinus rhythm (NSR) (ECG sample frequency: 128 Hz), from 15 congestive heart failure (CHF) patients (ECG sample frequency: 250 Hz), as well as from 15 subjects suffering from atrial fibrillation (AF) (ECG sample frequency: 128 Hz), selected from the PhysioNet database.37 Table I lists the files. The first five subjects of each group were used for the CHAOS Controversial Topics in Nonlinear Dynamics challenge “Is the Normal Heart Rate Chaotic?” (http://physionet.org/challenge/chaos). The other ten subjects from each group are randomly selected records from the same databases.

TABLE I.

ECG data used. The three conditions are normal sinus rhythm (NSR), congestive heart failure (CHF), and atrial fibrillation (AF). The data were downloaded from the physionet database.37

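For readers who want to reproduce the data selection, records can be pulled directly from PhysioNet; the sketch below assumes the third-party wfdb Python package and uses one example record name from the MIT-BIH Normal Sinus Rhythm Database (not necessarily one of the records in Table I):

```python
# Example of fetching one ECG record from PhysioNet with the wfdb package.
# The record name "16265" is an example from the nsrdb database, used here for illustration.
import wfdb

record = wfdb.rdrecord("16265", pn_dir="nsrdb")   # downloads the record from PhysioNet
ecg = record.p_signal[:, 0]                       # first ECG channel
fs = record.fs                                    # sampling frequency (128 Hz for nsrdb)
print(record.sig_name, fs, ecg.shape)
```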

Supervised structure selection

Typically, a nonlinear delay differential equation has the form

$\dot{x} = f(a_i, x_{\tau_j}) = a_1 x_{\tau_1} + a_2 x_{\tau_2} + a_3 x_{\tau_3} + \cdots + a_{i-1} x_{\tau_n} + a_i x_{\tau_1}^2 + a_{i+1} x_{\tau_1} x_{\tau_2} + a_{i+2} x_{\tau_1} x_{\tau_3} + \cdots + a_{j-1} x_{\tau_n}^2 + a_j x_{\tau_1}^3 + a_{j+1} x_{\tau_1}^2 x_{\tau_2} + \cdots + a_l x_{\tau_n}^m$, (14)

where x = x(t) and xτj = x(t − τj). The DDE Eq. 14 has n delays, l monomials with coefficients a1, a2, ..., al, and a degree m of nonlinearity. By a k-term DDE, we mean a DDE with k monomials selected from the right-hand side of Eq. 14. Although this form is quite flexible, as for any global modeling technique there is a significant gain in accuracy from carefully selecting the structure of the model.38, 39, 40 By structure selection or model learning, we mean retaining only those monomials that make the most significant contribution to the data dynamics. An equally important task is to select the right time-delays, since they are directly related to the primary time-scales of the dynamics under study and to the non-linear couplings between them.
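A possible way to set up such a model is sketched below (our illustration, not the authors' code): for given delays and degree of nonlinearity, it builds the library of monomials of Eq. 14 as a design matrix and fits all coefficients ai at once by SVD-based least squares. The helper name dde_design_matrix and the placeholder signal are ours:

```python
# Build the monomial library of Eq. (14) for given delays and fit all coefficients at once.
import numpy as np
from itertools import combinations_with_replacement

def dde_design_matrix(x, dt, delays, degree):
    """Columns are monomials x_tau1^p1 * x_tau2^p2 * ... up to the given degree."""
    m = max(delays)
    dx = np.gradient(x, dt)[m:]                       # target: derivative of x
    lagged = [x[m - d:len(x) - d] for d in delays]    # delayed copies x(t - tau_j)
    cols, names = [], []
    for deg in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(delays)), deg):
            col = np.ones(len(x) - m)
            for j in combo:
                col = col * lagged[j]
            cols.append(col)
            names.append("*".join(f"x_tau{j+1}" for j in combo))
    return np.column_stack(cols), dx, names

# Example: two delays, cubic nonlinearity -> the nine monomials of Eq. (15).
x = np.random.randn(5000)                             # placeholder signal
A, dx, names = dde_design_matrix(x, 1e-3, delays=[5, 11], degree=3)
coeffs, *_ = np.linalg.lstsq(A, dx, rcond=None)       # SVD-based least squares
print(names)   # ['x_tau1', 'x_tau2', 'x_tau1*x_tau1', ..., 'x_tau2*x_tau2*x_tau2']
```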

Lainscsek et al.40 used a genetic algorithm to find a single DDE model for the classification of Parkinson movement data. Here, we want to do an exhaustive search of models and delays and find the models and delays that best separate classes of data. To do so we look at all possible polynomial DDE models

$\dot{x} = a_1 x_{\tau_1} + a_2 x_{\tau_2} + a_3 x_{\tau_1}^2 + a_4 x_{\tau_1} x_{\tau_2} + a_5 x_{\tau_2}^2 + a_6 x_{\tau_1}^3 + a_7 x_{\tau_1}^2 x_{\tau_2} + a_8 x_{\tau_1} x_{\tau_2}^2 + a_9 x_{\tau_2}^3$, (15)

with some of the ai equal to zero. Only models with up to three terms were considered. If the analysis did not give satisfactory results, we added additional delays, increased the order of non-linearity, and/or used DDEs with more than three terms. There were 5 one-term models, 18 two-term models, and 32 three-term models.

Tables II and III list all these models. Note that, e.g., the DDE models ẋ = a1xτ1 + a2xτ1xτ2 and ẋ = a1xτ2 + a2xτ1xτ2 are the same with the delays τ1 and τ2 exchanged. Therefore, only the first of these two models was used; all such redundant DDE models were omitted. There were only two linear DDEs (models 1 and 5), while all others are non-linear. Seven of the DDEs had only one delay (models 1, 2, 4, 7, 9, 17, and 30) and nine models were symmetric (models 3, 6, 16, 22, 23, 25, 43, 52, 53) with two interchangeable delays.

TABLE II.

One- and two-term models. An “x” denotes that the corresponding coefficient ai is nonzero. The different types of models are: “L”—linear, “S”—symmetric, “1”—single delay DDE. All other DDEs are non-linear and have two non-interchangeable delays. To save space in the table, xτi is written in the short form xi.


TABLE III.

Three-term models. An “x” denotes that the corresponding coefficient ai is nonzero. The different types of models are: “L”—linear, “S”—symmetric, “1”—single delay DDE. All other DDEs are non-linear and have two non-interchangeable delays. To save space in the table, xτi is written in the short form xi.


Data analysis

The data were analyzed without filtering and no artifacts were removed from the data. The downloaded NSR and AF data were sampled at 128 Hz, but the CHF data were sampled at 250 Hz. To use the same DDE with the same delays for all data, the NSR and AF data were up-sampled using the matlab function resample41 with the default options. Throughout this paper, we use 5 min non-overlapping data windows for our analysis. Each window was re-normalized to zero mean and unit variance to be able to compare data of different origin.
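The preprocessing described above can be sketched as follows (an illustration, not the authors' MATLAB code; SciPy's resample_poly stands in for the MATLAB resample function):

```python
# Up-sample a 128 Hz record to 250 Hz, cut non-overlapping 5 min windows,
# and normalize each window to zero mean and unit variance.
import numpy as np
from scipy.signal import resample_poly

def preprocess(ecg_128hz):
    x = resample_poly(ecg_128hz, up=125, down=64)      # 128 Hz -> 250 Hz (250/128 = 125/64)
    fs = 250
    win = 5 * 60 * fs                                  # 5 min of samples
    n_win = len(x) // win
    windows = x[:n_win * win].reshape(n_win, win)
    windows = (windows - windows.mean(axis=1, keepdims=True)) / windows.std(axis=1, keepdims=True)
    return windows

# windows = preprocess(raw_ecg)   # raw_ecg: 1-D array sampled at 128 Hz
```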

For the model selection task, we have to choose a classifier, select training data, select a classification tool, and do some cross-validation to take the small number of subjects into account. In this manuscript, we chose seven different classifiers and tested the performance of each separately. Those classifiers were: (1) NSR vs. AF vs. CHF, (2) NSR vs. AF, (3) NSR vs. CHF, (4) AF vs. CHF, (5) NSR vs. (AF and CHF), (6) AF vs. (NSR and CHF), and (7) CHF vs. (NSR and AF). As training data we selected one 5 min data window every 20 min (e.g., for a 20 h recording of one subject, sixty 5 min data windows were used). We used a repeated random sub-sampling validation42 in which we trained on 10 subjects of each group and tested on the remaining 5 subjects of each group. This was repeated 300 times, with each subject used equally often for training and testing. As classification tool we used singular value decomposition (SVD).27 As a measure of performance we used Cohen's kappa κ,43, 44, 45, 46 which can be computed directly from the confusion matrix.47 A confusion matrix (also known as a matching matrix, contingency table, or error matrix) is a specific table layout that allows visualization of classification performance. Each column of the matrix represents the instances in a predicted class, while each row represents the instances in an actual class.
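Cohen's kappa follows directly from the confusion matrix; the sketch below (the three-class confusion matrix shown is made up purely for illustration) shows the computation:

```python
# Cohen's kappa computed directly from a confusion matrix
# (rows = actual class, columns = predicted class).
import numpy as np

def cohens_kappa(confusion):
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    p_observed = np.trace(confusion) / n
    p_expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n**2
    return (p_observed - p_expected) / (1.0 - p_expected)

# Example with three classes; the numbers are made up for illustration only.
cm = [[50, 3, 2],
      [4, 40, 6],
      [1, 5, 44]]
print(round(cohens_kappa(cm), 3))
```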

The random sub-sampling validation gives 300 values of Cohen's kappa for each model and each set of delays. To choose the best model, we searched for the highest minimum of the mean of the 300 values for each classifier. The best models and delays for the seven classification tasks are listed in Table IV.

TABLE IV.

Best DDE models selected for the seven classification tasks. The kappa values κ reported here were computed directly from the confusion matrices.47 The area under the ROC curve, A, can only be computed for the 6 binary classifiers; A was computed from the ROC curves in Fig. 5. The units of the delays are time steps δt = 1/fs, where fs is the sampling frequency.


To test the performance of the classifiers in Table IV, we computed for each of the 7 models the 300 sets of SVD weights from the training data and then took the mean of those weights. These mean weights were then applied to the whole data set. Fig. 4 shows the computed features for all data for all seven classification tasks. The corresponding Cohen kappa values as well as the areas under the ROC curves are listed in Table IV. The separating hyperplanes between the conditions were selected by SVD. In Fig. 4, the distances d from these hyperplanes are shown. NSR is best separated from the two diseases (κ = 0.96–0.99). All other classifiers were also quite good.

Figure 4. Distances d from the separating hyperplanes for all 5 min data windows for all 15 subjects for the three conditions (left plots) for the best DDE models reported in Table IV. The mean value for each subject is shown as a black line. The histograms of these plots are shown in the right column. Blue refers to NSR, red to AF, and green to CHF. The black horizontal lines indicate the separating lines between the conditions selected by SVD. The vertical lines separate the subjects.
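The classification step itself is simple; the sketch below (our illustration, with an assumed feature matrix, labels in {−1, +1}, and an added constant column) shows a least-squares/SVD fit of the separating hyperplane and the signed distance d used in Fig. 4:

```python
# Linear SVD/least-squares classifier: fit weights on training features and
# compute the signed distance d of each window from the separating hyperplane.
import numpy as np

def train_svd_classifier(features, labels):
    # append a constant column so the hyperplane need not pass through the origin
    A = np.hstack([features, np.ones((len(features), 1))])
    w, *_ = np.linalg.lstsq(A, labels, rcond=None)   # SVD-based least squares
    return w

def distance(features, w):
    A = np.hstack([features, np.ones((len(features), 1))])
    return A @ w / np.linalg.norm(w[:-1])            # signed distance d

# rng = np.random.default_rng(0)
# X = rng.normal(size=(200, 4)); y = np.sign(X[:, 0] + 0.5 * X[:, 1])
# w = train_svd_classifier(X, y); d = distance(X, w)
```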

The model quality was also assessed by computing the area under the receiver operating characteristic (ROC) curves A (see Fig. 5).

Figure 5. Receiver operating characteristic (ROC) curves and the area A under the ROC curve for the classifiers NSR vs. AF, NSR vs. CHF, AF vs. CHF, NSR vs. (AF, CHF), AF vs. (NSR, CHF), and CHF vs. (NSR, AF). The abbreviations N, A, and C in the legend correspond to NSR, AF, and CHF.

A receiver operating characteristic (ROC), or simply ROC curve,48, 49, 50, 51 is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. It is created by plotting the fraction of true positives out of the positives vs. the fraction of false positives out of the negatives at various threshold settings. An A above 0.5 indicates classification performance above chance. In Fig. 5, the curves for the classifiers NSR vs. AF, NSR vs. CHF, AF vs. CHF, NSR vs. (AF, CHF), AF vs. (NSR, CHF), and CHF vs. (NSR, AF) are shown, and A is above 0.92 for all binary classifiers, which is excellent. Normal heart rate is more easily distinguished from the two diseases: A is above 0.97 for NSR vs. AF, NSR vs. CHF, and NSR vs. (AF, CHF).
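The ROC curve and the area A under it can be computed directly from the per-window classifier outputs; the sketch below (our illustration with synthetic scores) shows one way to do it:

```python
# ROC curve and area A from binary labels and continuous classifier outputs
# (e.g., the distances d from the separating hyperplane).
import numpy as np

def roc_curve_and_auc(scores, labels):
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)        # 1 = positive class, 0 = negative class
    order = np.argsort(-scores)                   # sweep the threshold from high to low
    labels = labels[order]
    tpr = np.concatenate([[0.0], np.cumsum(labels) / labels.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(1 - labels) / (1 - labels).sum()])
    return fpr, tpr, np.trapz(tpr, fpr)

# Example with synthetic scores: positives tend to score higher than negatives.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(1.0, 1.0, 500), rng.normal(-1.0, 1.0, 500)])
labels = np.concatenate([np.ones(500, int), np.zeros(500, int)])
fpr, tpr, auc = roc_curve_and_auc(scores, labels)
print(round(auc, 3))                              # well above 0.5 (chance level)
```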

SVD is a simple classification tool. Performance could easily be improved by using a more sophisticated classifier such as a support vector machine. Here, however, we wanted to emphasize the results of the DDE analysis rather than to maximize performance, which is already excellent.

DISCUSSION

We analyzed 24 h ECG data from healthy subjects and patients with either atrial fibrillation or congestive heart failure downloaded from the physionet database.37 These data were analyzed using delay differential equations (DDEs). First, we made a connection between DDEs in the time domain and spectral analysis tools such as Fourier analysis and higher-order statistics. We then used the outputs of DDE models with a maximum of three terms to build good classifiers for the three conditions. For 5 min data windows of the ECG data from 2 electrodes we were able to separate the three heart conditions with high accuracy.

In other studies using the same dataset, separation of the three heart conditions was only achieved using all of the data for each subject;52, 53, 54 in comparison, we only needed 5 min of data from each condition. In our analysis, we needed non-linear terms in the DDE models to separate the three heart conditions. Non-linear methods were also better at distinguishing the data classes in Refs. 52, 53. We found that a purely non-linear model with three non-linear terms was needed for the classification of NSR vs. AF vs. CHF, but that a model with two linear terms and only one nonlinear term was needed for distinguishing CHF from NSR and AF. This is consistent with other studies showing that CHF is a more regular heart condition.54, 55

We have thus shown that the three heart conditions are dynamically different (different DDE models are selected for the different classifiers), that the three heart conditions have different characteristic time-scales (different delays are selected for the different classifiers), and that non-linear models are needed for classification. In other studies, the same dataset was used to determine whether the heart rate is more chaotic in normal subjects, but this remains an open question.

Our analysis shows that a DDE can be considered a generic nonuniform embedding: “non-uniform” because the delays should reflect the dominant time-scales of the dynamical system and “generic” because it is a combination of a delay and a derivative embedding. The models combine functions of delayed versions of the signal with the derivative of the signal. A DDE model can unfold dynamical structures that are relevant for a single time series and the underlying dynamical system, which may be unknown as in the ECG.

ACKNOWLEDGMENTS

This work was supported by the Howard Hughes Medical Institute, NIH (Grant No. NS040522), and the Swartz Foundation. C.L. would also like to thank Manuel Hernandez and Jonathan Weyhenmeyer for valuable discussions.

References

  1. Gouesbet G., Phys. Rev. A 43, 5321 (1991). 10.1103/PhysRevA.43.5321
  2. Sauer T., Yorke J. A., and Casdagli M., J. Stat. Phys. 65, 579 (1991). 10.1007/BF01053745
  3. Bezruchko B. and Smirnov D., Phys. Rev. E 63, 016207 (2000). 10.1103/PhysRevE.63.016207
  4. Lainscsek C., Phys. Rev. E 84, 046205 (2011). 10.1103/PhysRevE.84.046205
  5. Driver R., Ordinary and Delay Differential Equations, Applied Mathematical Sciences Vol. 20 (Springer-Verlag, 1977).
  6. Clifford G., Azuaje F., and McSharry P., Advanced Methods and Tools for ECG Data Analysis (Artech House, 2006).
  7. Falbo C. E., in Joint Meeting of the Northern and Southern California Sections of the MAA (San Luis Obispo, CA, 1995).
  8. Whittle P., Hypothesis Testing in Time Series Analysis (Almquist and Wicksell, 1951).
  9. Whittle P., Prediction and Regulation (English Universities Press, 1963).
  10. Whittle P., Prediction and Regulation by Linear Least-Square Methods (University of Minnesota Press, 1983).
  11. Box G. and Jenkins G., Time Series Analysis: Forecasting and Control (Holden-Day, 1971).
  12. Appeltant L., Soriano M., Van der Sande G., Danckaert J., Massar S., Dambre J., Schrauwen B., Mirasso C. R., and Fischer I., Nature Commun. 2, 1 (2011). 10.1038/ncomms1476
  13. Larger L., Soriano M., Brunner D., Appeltant L., Gutierrez J., Pesquera L., Mirasso C. R., and Fischer I., Opt. Express 20(3), 3241 (2012). 10.1364/OE.20.003241
  14. Paquot Y., Duport F., Smerieri A., Dambre J., Schrauwen B., Haelterman M., and Massar S., Sci. Rep. 2, 287 (2012). 10.1038/srep00287
  15. Martinenghi R., Rybalko S., Jacquot M., Chembo Y., and Larger L., Phys. Rev. Lett. 108, 244101 (2012). 10.1103/PhysRevLett.108.244101
  16. Whitney H., Ann. Math. 37, 645 (1936). 10.2307/1968482
  17. Packard N. H., Crutchfield J. P., Farmer J. D., and Shaw R. S., Phys. Rev. Lett. 45, 712 (1980). 10.1103/PhysRevLett.45.712
  18. Takens F., in Dynamical Systems and Turbulence, Warwick 1980, Lecture Notes in Mathematics Vol. 898, edited by Rand D. A. and Young L.-S. (Springer, Berlin/Heidelberg, 1981), pp. 366–381.
  19. Judd K. and Mees A., Physica D 120, 273 (1998). 10.1016/S0167-2789(98)00089-X
  20. Hjorth B., Electroencephalogr. Clin. Neurophysiol. 29, 306 (1970). 10.1016/0013-4694(70)90143-4
  21. Chan Y. and Langford R., IEEE Trans. Acoust., Speech, Signal Process. 30, 689 (1982). 10.1109/TASSP.1982.1163946
  22. Raghuveer M. and Nikias C., IEEE Trans. Acoust., Speech, Signal Process. 33, 1213 (1985). 10.1109/TASSP.1985.1164679
  23. Raghuveer M. R. and Nikias C. L., Signal Process. 10, 35 (1986). 10.1016/0165-1684(86)90063-0
  24. Stankovic L., IEEE Trans. Signal Process. 42, 225 (1994). 10.1109/78.258146
  25. Boashash B., Higher-Order Statistical Signal Processing (John Wiley & Sons, 1995).
  26. Kadtke J. and Kremliovsky M., Phys. Lett. A 260(3–4), 203 (1999). 10.1016/S0375-9601(99)00527-7
  27. Press W., Flannery B., Teukolsky S., and Vetterling W., Numerical Recipes in C (Cambridge University Press, 1990).
  28. Tukey J., in The Collected Works of John W. Tukey, edited by Brillinger D. (Wadsworth, Belmont, 1953), Vol. 1, pp. 165–184.
  29. Kolmogorov A. and Rozanov Y., Theor. Probab. Appl. 5, 204 (1960). 10.1137/1105018
  30. Leonov V. and Shiryaev A., Theor. Probab. Appl. 4, 319 (1959). 10.1137/1104031
  31. Rosenblatt M. and Van Ness J., Ann. Math. Stat. 36, 1120 (1965). 10.1214/aoms/1177699987
  32. Brillinger D. and Rosenblatt M., in Spectral Analysis of Time Series, edited by Harris B. (Wiley, New York, 1967), pp. 153–188.
  33. Swami A., HOSA—Higher Order Spectral Analysis Toolbox, http://www.mathworks.com/matlabcentral/fileexchange/3013 (2003).
  34. Mendel J., Proc. IEEE 79, 278–305 (1991).
  35. Fackrell J. and McLaughlin S., in Proceedings of the IEE Colloquium on Higher Order Statistics (1995), p. 9.
  36. Nikias C. and Raghuver M., Proc. IEEE 75, 869–891 (1987).
  37. Goldberger A., Amaral L., Glass L., Hausdorff J., Ivanov P., Mark R., Mietus J., Moody G., Peng C., and Stanley H., Circulation 101, E215 (2000). 10.1161/01.CIR.101.23.e215
  38. Aguirre L. A. and Billings S. A., Int. J. Control 62(3), 569 (1995). 10.1080/00207179508921557
  39. Lainscsek C., Letellier C., and Gorodnitsky I., Phys. Lett. A 314, 409 (2003). 10.1016/S0375-9601(03)00912-5
  40. Lainscsek C., Rowat P., Schettino L., Lee D., Song D., Letellier C., and Poizner H., Chaos 22, 013119 (2012). 10.1063/1.3683444
  41. See http://www.mathworks.com/help/toolbox/ident/ref/resample.html for resampling of time-domain data by decimation or interpolation.
  42. Kohavi R., in Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI'95) (Morgan Kaufmann, 1995), Vol. 2, pp. 1137–1143.
  43. Cardillo G., see http://www.mathworks.com/matlabcentral/fileexchange/15365-cohens-kappa/content/kappa.m (2009) for computation of the Cohen's kappa coefficient.
  44. Scott W., Public Opin. Q. 19, 321 (1955). 10.1086/266577
  45. Cohen J., Educ. Psychol. Meas. 20, 37 (1960). 10.1177/001316446002000104
  46. Fleiss J. and Cohen J., Educ. Psychol. Meas. 33, 613 (1973). 10.1177/001316447303300309
  47. Kohavi R. and Provost F., Mach. Learn. 30, 271 (1998). 10.1023/A:1017181826899
  48. van Meter D. and Middleton D., Trans. IRE Prof. Group Inf. Theory 4, 119 (1954). 10.1109/TIT.1954.1057471
  49. Peterson W. W., Birdsall T., and Fox W., Trans. IRE Prof. Group Inf. Theory 4, 171 (1954). 10.1109/TIT.1954.1057460
  50. Tanner W. J. and Swets J., Psychol. Rev. 61, 401 (1954). 10.1037/h0058700
  51. Metz C., Radiol. Phys. Technol. 1, 2 (2008). 10.1007/s12194-007-0002-1
  52. Wessel N., Riedl M., and Kurths J., Chaos 19, 028508 (2009). 10.1063/1.3133128
  53. Freitas U., Roulin E., Muir J.-F., and Letellier C., Chaos 19, 028505 (2009). 10.1063/1.3139116
  54. Alvarez-Ramirez J., Rodriguez E., and Echeverria J. C., Chaos 19, 028502 (2009). 10.1063/1.3152005
  55. Hu J., Gao J., and Tung W.-W., Chaos 19, 028506 (2009). 10.1063/1.3152007
