Author manuscript; available in PMC: 2020 Jul 16.
Published in final edited form as: Conf Proc IEEE Eng Med Biol Soc. 2019 Jul;2019:1488–1491. doi: 10.1109/EMBC.2019.8856504

Markov Models for Detection of Ventricular Arrhythmia

Zhi Li 1, Harm Derksen 2, Jonathan Gryak 1, Mohsen Hooshmand 1, Alexander Wood 1, Hamid Ghanbari 4, Pujitha Gunaratne 6, Kayvan Najarian 1,3,5
PMCID: PMC7364610  NIHMSID: NIHMS1604644  PMID: 31946175

Abstract

The advent of portable cardiac monitoring devices has enabled real-time analysis of cardiac signals. These devices can be used to develop algorithms for real-time detection of dangerous heart rhythms such as ventricular arrhythmias. This paper presents a Markov model based algorithm for real-time detection of ventricular tachycardia, ventricular flutter, and ventricular fibrillation episodes. The algorithm does not rely on any noise removal pre-processing or peak annotation of the original signal. When evaluated using ECG signals from three publicly available databases, the model resulted in an AUC of 0.96 and F1-score of 0.91 for 5-second long signals and an AUC of 0.97 and F1-score of 0.93 for 2-second long signals.

Keywords: Ventricular Tachycardia, Ventricular Fibrillation, Machine Learning, Markov model, ECG, signal processing

I. Introduction

Ventricular arrhythmia (VA) encompasses a spectrum of abnormal heart rhythms originating from the ventricles, the heart’s lower chambers. These arrhythmias have rates of over 100 beats per minute [1]. Types of VA include ventricular tachycardia (VT), ventricular flutter (VFlutter), and ventricular fibrillation (VF). Serious ventricular arrhythmia is associated with ischemic heart disease and can contribute to sudden cardiac death (SCD) events. These events constitute approximately 230,000 to 350,000 deaths annually in the United States and 50% of all cardiovascular deaths [2], [3]. Approximately half of SCD events can be attributed to VT or VF [4]. Therefore, monitoring and detecting VT and VF is critical for the prevention of SCD events.

Portable cardiac monitoring devices now exist that are capable of producing continuous, real-time cardiac signals [5]. Algorithms implemented on these devices could enable real-time detection of ventricular arrhythmia. Development of these algorithms calls for the classification of heart rhythms and arrhythmia detection based on these classifications.

Past algorithms developed for classification and detection of VA have utilized time domain techniques [6], information theory [7], [8], the Hilbert transform [9], [10], spectral parameters [11], and machine learning [12], [13], [14], [15], [16]. Machine learning techniques include VF filter “leakage” combined with a support vector machine (SVM) [12], [13], one-dimensional convolutional neural networks (CNN) [14], and an 11-layer CNN with 10-fold cross validation [15]. Features used during machine learning include threshold crossing sample count, sample entropy, and features extracted via variational mode decomposition [17]. Most of these algorithms are based on traditional machine learning methods that implement pre-processing, feature extraction, and feature selection, followed by training a classifier. Their performance often depends on error-prone techniques that remove noisy signals through peak detection and pre-processing. A previous version of the proposed algorithm has shown strong results for the detection and prediction of atrial fibrillation episodes [16]. Our proposed algorithm can also be adapted for real-time detection within an in-vehicle setting.

This paper first presents the Markov Chain Automatically Generated States (MCGENS) algorithm for the detection of ventricular tachycardia, ventricular flutter, and ventricular fibrillation. The algorithm does not depend on R peak detection algorithms or any pre-processing steps for noise removal, which are handled instead during signal encoding. Next, the algorithm is tested on patients from three data sets using 5-fold cross validation over cohorts partitioned at the patient level, resulting in an AUC of 0.96 and F1-score of 0.91 for 5-second long signals and an AUC of 0.97 and F1-score of 0.93 for 2-second long signals. Finally, the results and methods are discussed and compared with existing methods.

II. Data

Three publicly available data sets with ventricular arrhythmia annotations were used in the evaluation of the proposed method. As all three databases have been examined by other algorithms, the proposed algorithm can be directly compared with the performance of existing algorithms.

The MIT-BIH Arrhythmia Database (mitdb) contains 48 half-hour excerpts of two-channel ambulatory ECG recordings obtained from 47 human subjects [18]. The second database, the MIT-BIH Malignant Ventricular Arrhythmia Database (vfdb), contains 22 half-hour ECG recordings of subjects who experienced episodes of sustained VT, VFlutter, and VF [18]. Lastly, the Creighton University Ventricular Tachycardia Database (cudb) includes 35 eight-minute ECG recordings of human subjects who experienced episodes of sustained VT, VFlutter, or VF [19]. The recordings from mitdb are sampled at 360 Hz, while those from vfdb and cudb are sampled at 250 Hz.

III. Methods

A Markov chain algorithm was developed for the classification and detection of the VA intervals of interest. The first part of this section provides details on the MCGENS algorithm. The second part describes the setup of the experiments and the data partitioning into training set, validation set, and testing set.

A. MCGENS Algorithm

Unlike more traditional Markov chain based models, the proposed MCGENS model performs computation via frequency analysis. The transition probabilities of the Markov chains and their underlying network structure, including the state space, were computed using this frequency analysis method. Consequently, this model was more adaptive and more faithfully reflected patterns within the signals.

Figure 1 provides an overview of the MCGENS algorithm. Essentially, two Markov models, $M_{\text{VA}}$ and $M_{\text{Non-VA}}$, were learned from training data sets for VA and non-VA signals, respectively. ECGs from the training set were encoded and used to create a Markov chain, as shown in Figure 2. A new ECG signal, encoded as a discrete signal $Q$, was then assigned to the class ‘VA’ or ‘Non-VA’ by applying the two trained Markov models and then comparing the resulting conditional probabilities $P(Q \mid M_{\text{VA}})$ and $P(Q \mid M_{\text{Non-VA}})$. The algorithms for encoding the data and creating a Markov chain proceed as follows:

Fig. 1: Overview of MCGENS

Fig. 2: Training Scheme

1). Encoding the ECG as a word distribution:

Raw ECG signals were encoded into ternary word distributions through six steps. A ternary alphabet was chosen to represent three signal components: R-peak like dominant waves, minor peaks like T and P waves, and non-peak portions of the signal.

  1. Subtract moving average: The average of the signal over (t − 0.15, t + 0.15) time intervals was computed as
    $f_{av}(t) = \frac{1}{0.3} \int_{t-0.15}^{t+0.15} f_0(x)\,dx$ (1)
    and then subtracted from the original signal $f_0$
    $f(t) = f_0(t) - f_{av}(t)$. (2)
  2. Peak filter: The only peaks retained were those with heights that were positive relative to the end points of the intervals (t − 0.1, t + 0.1), computed as
    $f(t) = \max\{0,\ f(t) - \max\{f(t-0.1),\ f(t+0.1)\}\}$. (3)
  3. Discretization: The potential state space of the relevant Markov chains was reduced to a finite space via
    $x_k = \max\{f(t) \mid 0.05 \times (k-1) \le t \le 0.05 \times k\}$. (4)
  4. Normalization: The signal was normalized by dividing by the local absolute maximum.

  5. Soft-thresholding: A soft-thresholding procedure was applied to systematically convert the signal into a sequence of probability vectors. The output of the soft-thresholding step was a sequence of 3-dimensional probability vectors, each of which is of the form
    $\begin{bmatrix} P(\text{R-peak}) \\ P(\text{TP-peak}) \\ P(\text{Non-peak}) \end{bmatrix}$.
    In this matrix, ‘R-peak’ represents dominant R-wave like peaks and ‘TP-peak’ represents smaller waves more likely to be T or P waves. The two soft-thresholding functions, $\phi_R(x) = P(\text{R-peak})$ and $\phi_{TP}(x) = P(\text{TP-peak})$, were defined as follows:
    $\phi_R(x) = \begin{cases} 1 & \text{if } x > 0.8 \\ 5x - 3 & \text{if } 0.6 \le x \le 0.8 \\ 0 & \text{if } x < 0.6 \end{cases}$ (5)
    and
    $\phi_{TP}(x) = \left(1 - \max_{t\text{-local}}(\phi_R(x))\right) \times \phi_{TP}^{0}(x)$, (6)
    where $\max_{t\text{-local}}$ denotes the maximum in the relevant (i.e. $k$th) window of the signal where $x_k$ is defined and
    $\phi_{TP}^{0}(x) = \begin{cases} 1 & \text{if } x > 0.05 \\ 40x - 1 & \text{if } 0.025 \le x \le 0.05 \\ 0 & \text{if } 0 < x < 0.025 \end{cases}$ (7)
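The encoding steps listed above can be sketched in a few lines of plain Python. This is an illustrative reading of Eqs. (1) to (7), not the authors' implementation. The function names, the truncation of the averaging window at the signal edges, normalization by the maximum of the whole segment (the text only says "local"), and taking the Non-peak entry as the remainder so that each vector sums to one are all assumptions. Because $\phi_R$ is non-decreasing and $x_k$ is already a window maximum, the local maximum in Eq. (6) is taken here to be $\phi_R(x_k)$.

```python
def subtract_moving_average(sig, fs, half_width=0.15):
    """Eqs. (1)-(2): remove the mean over (t - 0.15, t + 0.15) seconds."""
    w = int(round(half_width * fs))
    out = []
    for i in range(len(sig)):
        # Window is truncated at the signal edges (assumption).
        lo, hi = max(0, i - w), min(len(sig), i + w + 1)
        out.append(sig[i] - sum(sig[lo:hi]) / (hi - lo))
    return out


def peak_filter(sig, fs, offset=0.1):
    """Eq. (3): keep only the height above both end points at t -/+ 0.1 s."""
    d = int(round(offset * fs))
    last = len(sig) - 1
    return [max(0.0, sig[i] - max(sig[max(0, i - d)], sig[min(last, i + d)]))
            for i in range(len(sig))]


def discretize(sig, fs, bins_per_second=20):
    """Eq. (4): x_k is the maximum of the signal in the k-th 0.05 s bin."""
    n_bins = len(sig) * bins_per_second // fs
    edges = [k * fs // bins_per_second for k in range(n_bins + 1)]
    return [max(sig[a:b]) for a, b in zip(edges, edges[1:])]


def normalize(xs):
    """Step 4, using the maximum of the whole segment (assumption)."""
    peak = max((abs(x) for x in xs), default=0.0)
    return [x / peak if peak > 0 else 0.0 for x in xs]


def phi_r(x):
    """Eq. (5): soft threshold for dominant, R-wave-like peaks."""
    if x > 0.8:
        return 1.0
    return 5.0 * x - 3.0 if x >= 0.6 else 0.0


def phi_tp0(x):
    """Eq. (7): soft threshold for minor (T or P like) peaks."""
    if x > 0.05:
        return 1.0
    return 40.0 * x - 1.0 if x >= 0.025 else 0.0


def soft_threshold(x):
    """Eq. (6) plus the remainder: [P(R-peak), P(TP-peak), P(Non-peak)]."""
    r = phi_r(x)
    tp = (1.0 - r) * phi_tp0(x)
    return [r, tp, 1.0 - r - tp]


def encode(sig, fs):
    """Steps 1-5 applied in order to a sampled signal (fs must be an int)."""
    x = discretize(peak_filter(subtract_moving_average(sig, fs), fs), fs)
    return [soft_threshold(v) for v in normalize(x)]


# Toy signal: 2 s at 100 Hz with one dominant and one minor spike.
fs = 100
ecg = [0.0] * 200
ecg[50], ecg[120] = 1.0, 0.2
vectors = encode(ecg, fs)
```

In this toy example the bin containing the dominant spike is assigned entirely to the R-peak class, the bin containing the minor spike entirely to the TP-peak class, and all other bins to the Non-peak class.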

2). Create Markov Chain:

The states of the Markov chain consisted of all the words generated by the encoding steps with frequency of occurrence above a fixed threshold. Each state could transition into three possible states depending on the next letter to appear in the sequence. The new state was the largest suffix of the current word extended by that letter that was itself a state. The frequency thresholds and time interval for encoding were parameters that could be tuned on the training data sets.
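The construction described above, together with the two-model comparison used for classification, can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the text: each encoded signal is reduced to a string of hard ternary letters (`R`, `T`, `N`), the empty word is always kept as a fallback state, transition probabilities are estimated by counting with add-one smoothing (`alpha`), and the parameters `max_len` and `min_count` stand in for the tunable word length and frequency threshold.

```python
import math
from collections import Counter

ALPHABET = "RTN"  # hypothetical letters: R-peak, TP-peak, non-peak


def frequent_words(seqs, max_len, min_count):
    """States: all words up to max_len that occur at least min_count times."""
    counts = Counter(s[i:i + n] for s in seqs
                     for n in range(1, max_len + 1)
                     for i in range(len(s) - n + 1))
    states = {w for w, c in counts.items() if c >= min_count}
    states.add("")  # empty word as a guaranteed fallback (assumption)
    return states


def next_state(states, state, letter):
    """New state: the largest suffix of state + letter that is a state."""
    word = state + letter
    for i in range(len(word) + 1):
        if word[i:] in states:
            return word[i:]


def train(seqs, max_len=3, min_count=2):
    """Count, for every state, how often each letter follows it."""
    states = frequent_words(seqs, max_len, min_count)
    counts = {}
    for s in seqs:
        state = ""
        for letter in s:
            counts.setdefault(state, Counter())[letter] += 1
            state = next_state(states, state, letter)
    return states, counts


def log_likelihood(model, seq, alpha=1.0):
    """log P(Q | M) with add-alpha smoothing of the transition counts."""
    states, counts = model
    state, total = "", 0.0
    for letter in seq:
        c = counts.get(state, Counter())
        total += math.log((c[letter] + alpha)
                          / (sum(c.values()) + alpha * len(ALPHABET)))
        state = next_state(states, state, letter)
    return total


def classify(m_va, m_non_va, seq):
    """Assign the class whose model gives the larger likelihood."""
    if log_likelihood(m_va, seq) > log_likelihood(m_non_va, seq):
        return "VA"
    return "Non-VA"


# Toy training words: rapid regular peaks versus a slower R ... T pattern.
m_va = train(["RNRNRNRNRNRN", "NRNRNRNRNRNR"])
m_non_va = train(["RNNTNNNNRNNTNNNN", "NNRNNTNNNNRNNTNN"])
label_fast = classify(m_va, m_non_va, "RNRNRNRN")
label_slow = classify(m_va, m_non_va, "RNNTNNNN")
```

Working in log-probabilities avoids numerical underflow for long sequences; the comparison itself is the same as comparing $P(Q \mid M_{\text{VA}})$ with $P(Q \mid M_{\text{Non-VA}})$.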

B. Data Partitioning

The vfdb and cudb databases are sampled at 250 Hz, while the mitdb database is sampled at 360 Hz. Therefore, signals from mitdb were first re-sampled to 250 Hz. VA episodes including VT, ventricular flutter, and VF were extracted from the re-sampled signals according to the ground-truth annotations. VA episodes from a total of 28 patients were extracted. The signals from 22 patients (approximately 80%) were included in the training data set and the remaining 6 patients formed the testing data set.
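The text does not say which resampling method was used. As a minimal sketch of the 360 Hz to 250 Hz conversion, the hypothetical helper below uses plain linear interpolation, which applies no anti-aliasing filter; a polyphase resampler such as `scipy.signal.resample_poly(x, 25, 36)` would be a more common choice in practice.

```python
def resample_linear(sig, fs_in, fs_out):
    """Resample by linear interpolation (integer sampling rates assumed)."""
    if len(sig) < 2:
        return [float(v) for v in sig]
    # Number of output samples covering the same time span as the input.
    n_out = (len(sig) - 1) * fs_out // fs_in + 1
    out = []
    for j in range(n_out):
        pos = j * fs_in / fs_out  # position in input-sample units
        i = int(pos)
        if i >= len(sig) - 1:
            out.append(float(sig[-1]))
        else:
            frac = pos - i
            out.append(sig[i] * (1.0 - frac) + sig[i + 1] * frac)
    return out


# One second of a linear ramp sampled at 360 Hz, converted to 250 Hz.
ramp_360 = [i / 360 for i in range(361)]
ramp_250 = resample_linear(ramp_360, 360, 250)
```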

Signals were segmented into both 5-second and 2-second long episodes. A total of 1409 5-second episodes were in the training data set and 261 episodes were in the testing data set. For the 2-second analysis, a total of 3667 episodes were in the training data set and 662 episodes were in the testing dataset. Non-VA data was partitioned in a similar way to ensure that the testing data set had a patient population disjoint from the training data set.
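As an illustration of the segmentation step, the sketch below cuts a 250 Hz recording into fixed-length episodes. The windows are assumed to be non-overlapping and any trailing remainder is dropped; the text does not specify either choice, and the function name is hypothetical.

```python
def segment(sig, fs, seconds):
    """Split a signal into consecutive, non-overlapping fixed-length episodes."""
    n = int(fs * seconds)
    return [sig[i:i + n] for i in range(0, len(sig) - n + 1, n)]


# A 12-second recording at 250 Hz (3000 samples of placeholder data).
recording = [0.0] * 3000
episodes_5s = segment(recording, 250, 5)  # 1250 samples per episode
episodes_2s = segment(recording, 250, 2)  # 500 samples per episode
```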

Five-fold cross validation was performed at the patient level for parameter tuning and to prevent over-fitting. The entire training data set was partitioned into 5 approximately equal parts at the patient level. In each round, 4 parts were used as training data for generating the Markov models and the remaining part served as the validation data set. This process was repeated five times. Average results for classification from all five experiments were used to assess the performance. The sensitivity, specificity, F1-score, and area under the ROC curve (AUC) were computed based on the training data set. The Markov model with the highest AUC over the training data set was then applied to the testing data set to obtain the final results.
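A patient-level partition of this kind can be sketched as follows. The function names, the random shuffling, and the seed are assumptions made for illustration; the key property is that all episodes from one patient fall into a single fold, so no patient appears in both the training and validation parts of a round.

```python
import random


def patient_folds(patient_ids, k=5, seed=0):
    """Shuffle the unique patient IDs and deal them into k folds."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    return [ids[i::k] for i in range(k)]


def cross_validation_splits(patient_ids, k=5, seed=0):
    """Yield (training patients, validation patients) for each of the k rounds."""
    folds = patient_folds(patient_ids, k, seed)
    for i, validation in enumerate(folds):
        training = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield training, validation


# 22 hypothetical training patients, as in the VA training cohort size.
ids = [f"p{i:02d}" for i in range(22)]
splits = list(cross_validation_splits(ids, k=5, seed=0))
```

Episodes would then be selected for each round by looking up the patient ID attached to every episode.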

IV. Results

Performance was evaluated using 5-fold cross validation. Within the training data set, the best result had an AUC of 0.92 ± 0.05 and F1 score of 0.89 ± 0.03 for 5-second long episodes and AUC of 0.93 ± 0.03 and F1 score of 0.88 ± 0.04 for 2-second long episodes.

The parameters in the model with highest AUC in the training data were then applied to the testing data set. This testing set had a patient cohort separate from the training data.

When evaluated over the testing data set, the proposed algorithm correctly identified 243 of 261 (0.93 sensitivity) VA episodes and 227 of 261 (0.87 specificity) non-VA episodes for 5-second long signals. The AUC was 0.96 and the F1-score was 0.91. For two-second long signals, the algorithm correctly identified 625 of 662 (0.94 sensitivity) VA episodes and 600 of 662 (0.91 specificity) non-VA episodes (Table I) with an AUC of 0.97 and an F1-score of 0.93 (Figure 3).

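TABLE I: Confusion Matrix for VA Detection Markov Model (rows: prediction; columns: annotation)

| Prediction | VA (5 s) | Non-VA (5 s) | Total (5 s) | VA (2 s) | Non-VA (2 s) | Total (2 s) |
|---|---|---|---|---|---|---|
| VA | 243 | 34 | 277 | 625 | 62 | 687 |
| Non-VA | 18 | 227 | 245 | 37 | 600 | 637 |
| Total | 261 | 261 | 522 | 662 | 662 | 1324 |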
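The sensitivity and specificity quoted above follow directly from the counts in Table I, as the short sketch below shows (the function name is illustrative). The F1-score computed from the 2-second counts rounds to the reported 0.93; the value computed from the 5-second counts comes out at about 0.90, marginally below the reported 0.91, and the text does not say how the reported figure was obtained.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and F1-score from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1


# Counts taken from Table I (VA is the positive class).
sens_5s, spec_5s, f1_5s = confusion_metrics(tp=243, fn=18, tn=227, fp=34)
sens_2s, spec_2s, f1_2s = confusion_metrics(tp=625, fn=37, tn=600, fp=62)
```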

Fig. 3: AUC-ROC, Testing Data (2 Seconds)

V. Discussion

The presented Markov model did not require any pre-processing or peak annotation of the signals. It was able to detect 5-second long VA episodes with a high AUC of 0.96 and F1-score of 0.91, and with an AUC of 0.97 and F1-score of 0.93 using 2-second long signals. Table II provides a performance comparison between our algorithm and other algorithms. Note that the precise conditions and set-up of these studies are not exactly the same.

TABLE II: Comparison to Other Methods

| Author (Year) | Data | Classification | Length (s) | Algorithm | Performance |
|---|---|---|---|---|---|
| Jekova, 2004 | AHAVF vfdb | Non-shockable vs. Shockable (VT >180 bpm + VF) | 10 | preprocess, criteria based, bandpass digital filtration | Sen=0.96, Spec=0.94 |
| Alonso, 2014 | mitbih, cudb, vfdb | VF vs. Non-VF | 8 | preprocess, feature extraction, SVM | Sen=0.92, Spec=0.97, AUC=0.987 |
| Tripathy, 2016 | mitbih, cudb, vfdb | Non-shockable vs. Shockable (VF/VT) | 5 | variational mode decomposition, feature extraction, random forest | Sen=0.96, Spec=0.98, AUC=0.97 |
| Acharya, 2018 | mitbih, cudb, vfdb | Non-shockable vs. Shockable (VFL, VT, VF) | 2 | CNN | Sen=0.95, Spec=0.91 |
| MCGENS | mitbih, cudb, vfdb | VFL/VT/VF vs. all others | 2 | Markov model (MCGENS) | Sen=0.94, Spec=0.91, AUC=0.97 |

The proposed method has multiple advantages over traditional approaches that combine feature extraction with machine learning. First, the proposed algorithm did not rely on the efficacy of any pre-processing algorithms to remove noise and baseline wandering. Instead, the encoding algorithm itself filters and normalizes the signals. Most pre-processing algorithms require prior knowledge of noisy signals to build the best thresholds for filtering purposes. Our algorithm, by utilizing a word distribution algorithm, is more adaptive to different types of noisy signals. Second, the proposed algorithm did not require an ECG peak annotation algorithm. Third, it did not need extensive prior knowledge of the signals in order to build and extract features. Furthermore, although recent algorithms using CNNs do not require pre-processing or feature extraction either, they still require longer training times and larger computational resources. Finally, the proposed model was flexible, robust, and adaptable to other types of arrhythmia such as atrial fibrillation [16] and supraventricular tachycardia. It has potential applications to portable devices that could perform detection in real-time.

One limitation of the model was that it required a large number of good quality annotated signals for training. However, for severe types of arrhythmia with low prevalence such as VF, the number of annotated signals is limited.

Future work will utilize the proposed method to predict the onset of VA events several minutes in advance with real-time data from portable ECG devices.

VI. Conclusion

Ventricular arrhythmias, which originate from the ventricles, are a dangerous form of abnormal heart rhythm. This study applied a Markov model based approach to the detection of VA (including VT, VF and VFlutter) in 5-second long ECG signals and in 2-second long ECG signals. The proposed approach did not require peak annotation algorithms, nor any noise removal pre-processing of the signals. The proposed algorithm yielded an AUC of 0.96 and F1 of 0.91 for 5-second long signals and 0.97 AUC and F1 of 0.93 for 2-second long signals.

Acknowledgement

The research work in this paper was funded by Toyota Motor North America.

References

[1] Al-Khatib Sana M, Stevenson William G, Ackerman Michael J, Bryant William J, Callans David J, Curtis Anne B, Deal Barbara J, Dickfeld Timm, Field Michael E, Fonarow Gregg C, et al. 2017 AHA/ACC/HRS guideline for management of patients with ventricular arrhythmias and the prevention of sudden cardiac death: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Rhythm Society. Journal of the American College of Cardiology, 72(14):e91–e220, 2018.
[2] Myerburg Robert J. Sudden cardiac death: exploring the limits of our knowledge. Journal of cardiovascular electrophysiology, 12(3):369–381, 2001.
[3] Myerburg Robert J and Junttila M Juhani. Sudden cardiac death caused by coronary heart disease. Circulation, 125(8):1043–1052, 2012.
[4] John Roy M, Tedrow Usha B, Koplan Bruce A, Albert Christine M, Epstein Laurence M, Sweeney Michael O, Miller Amy Leigh, Michaud Gregory F, and Stevenson William G. Ventricular arrhythmias and sudden cardiac death. The Lancet, 380(9852):1520–1529, 2012.
[5] Mukhopadhyay Subhas Chandra. Wearable sensors for human activity monitoring: A review. IEEE sensors journal, 15(3):1321–1330, 2015.
[6] Thakor Nitish V, Zhu Y-S, and Pan K-Y. Ventricular tachycardia and fibrillation detection by a sequential hypothesis testing algorithm. IEEE Transactions on Biomedical Engineering, 37(9):837–843, 1990.
[7] Zhang Xu-Sheng, Zhu Yi-Sheng, Thakor Nitish V, and Wang Zhi-Zhong. Detecting ventricular tachycardia and fibrillation by complexity measure. IEEE Transactions on biomedical engineering, 46(5):548–555, 1999.
[8] Li Haiyan, Han Wenguang, Hu Chao, and Meng Max Q-H. Detecting ventricular fibrillation by fast algorithm of dynamic sample entropy. In Robotics and Biomimetics (ROBIO), 2009 IEEE International Conference on, pages 1105–1110. IEEE, 2009.
[9] Amann Anton, Tratnig Robert, and Unterkofler Karl. A new ventricular fibrillation detection algorithm for automated external defibrillators. database, 1(2):3, 2005.
[10] Amann Anton, Tratnig Robert, and Unterkofler Karl. Detecting ventricular fibrillation by time-delay methods. IEEE Transactions on Biomedical Engineering, 54(1):174–177, 2007.
[11] Aramendi E, Irusta U, Pastor E, Bodegas A, and Benito F. ECG spectral and morphological parameters reviewed and updated to detect adult and paediatric life-threatening arrhythmia. Physiological measurement, 31(6):749, 2010.
[12] Alonso-Atienza Felipe, Morgado Eduardo, Fernandez-Martinez Lorena, García-Alberola Arcadi, and Rojo-Alvarez José Luis. Detection of life-threatening arrhythmias using feature selection and support vector machines. IEEE Trans. Biomed. Eng, 61(3):832–840, 2014.
[13] Li Qiao, Rajagopalan Cadathur, and Clifford Gari D. Ventricular fibrillation and tachycardia classification using a machine learning approach. IEEE Transactions on Biomedical Engineering, 61(6):1607–1613, 2014.
[14] Kiranyaz Serkan, Ince Turker, and Gabbouj Moncef. Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Transactions on Biomedical Engineering, 63(3):664–675, 2016.
[15] Acharya U Rajendra, Fujita Hamido, Oh Shu Lih, Raghavendra U, Tan Jen Hong, Adam Muhammad, Gertych Arkadiusz, and Hagiwara Yuki. Automated identification of shockable and non-shockable life-threatening ventricular arrhythmias using convolutional neural network. Future Generation Computer Systems, 79:952–959, 2018.
[16] Li Zhi, Derksen Harm, Gryak Jonathan, Ghanbari Hamid, Gunaratne Pujitha, and Najarian Kayvan. A novel atrial fibrillation prediction algorithm applicable to recordings from portable devices. In 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 4034–4037. IEEE, 2018.
[17] Tripathy RK, Sharma LN, and Dandapat Samarendra. Detection of shockable ventricular arrhythmia using variational mode decomposition. Journal of medical systems, 40(4):79, 2016.
[18] Goldberger Ary L, Amaral Luis AN, Glass Leon, Hausdorff Jeffrey M, Ivanov Plamen Ch, Mark Roger G, Mietus Joseph E, Moody George B, Peng Chung-Kang, and Stanley H Eugene. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation, 101(23):e215–e220, 2000.
[19] Nolle FM, Badura FK, Catlett JM, Bowser RW, and Sketch MH. CREI-GARD, a new concept in computerized arrhythmia monitoring systems. Computers in Cardiology, 13:515–518, 1986.
