Abstract
The development of induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) has been a critical in vitro advance in the study of patient-specific physiology, pathophysiology, and pharmacology. We designed a new deep learning multitask network approach intended to address the low throughput, high variability, and immature phenotype of the iPSC-CM platform. We combined the translation and classification tasks because the most likely application of the deep learning technology we describe here is to translate iPSC-CMs following application of a perturbation. The deep learning network was trained using simulated action potential (AP) data and applied to classify cells into the drug-free and drugged categories and to predict the impact of electrophysiological perturbation across the continuum of aging from the immature iPSC-CMs to the adult ventricular myocytes. The phase of the AP that is extremely sensitive to perturbation, owing to a steep rise in membrane resistance, was found to contain the key information required for successful network multitasking. We also demonstrated successful translation of both experimental and simulated iPSC-CM AP data, and validated our network by using the translated simulated data to predict experimental drug-induced effects on adult cardiomyocyte APs.
Research organism: Human
Introduction
The development of novel technologies has resulted in new ways to study cardiac function and rhythm disorders (Shaheen et al., 2018). One such technology is the induced pluripotent stem cell-derived cardiomyocyte (iPSC-CM) in vitro model system (Leyton-Mange et al., 2014). The iPSC-CM system constitutes a powerful in vitro tool for preclinical assessment of cardiac electrophysiological impact and drug safety liabilities in a human physiological context (Sun et al., 2012; Lan et al., 2013; Burridge et al., 2016; Doss and Sachinidis, 2019; Collins et al., 2020; Wu et al., 2019). Moreover, because iPSC-CMs can be cultured from patient-specific cells, they have been shown to be an ideal model system for patient-based medicine (Wu et al., 2019; Sayed et al., 2016; Matsa et al., 2016).
While utilization of in vitro iPSC-CMs allows for testing of responses to drugs and understanding physiological mechanisms (Tveito et al., 2018; Tveito et al., 2020; Sube and Ertel, 2017; Navarrete et al., 2013), there is still a major inherent limitation of the approach: the complex differentiation process to create iPSC-CMs results in a model of cardiac electrical behavior that resembles fetal cardiomyocytes. Hallmarks of the immature phenotype include spontaneous beating, immature calcium handling, presence of developmental currents, and significant differences in the relative contributions of repolarizing potassium currents compared to adult cardiomyocytes (adult-CMs) (Lieu et al., 2013; Veerman et al., 2015; Tu et al., 2018). The profound differences between the immature iPSC-CMs and the adult-CMs have led to persistent questions about the utility and applicability of the iPSC-CM action potential (AP) to predict relevant drug impacts on adult human electrophysiology (Blinova et al., 2018; Sala et al., 2017).
Several recent studies have proposed computational frameworks to address the primary limitation in using iPSC-CMs and animal cardiomyocytes for drug screening (Tveito et al., 2018; Tveito et al., 2020; Gong and Sobie, 2018; de Korte et al., 2020). The innovative studies described by Tveito and colleagues (Sayed et al., 2016; Matsa et al., 2016) presented a translation algorithm that identified a mapping function describing the relationships between the parameters that are defined by key ion channel conductances in the iPSC-CM APs and the adult-CM APs. In another study by Gong and Sobie, additional insights were revealed through application of an efficient partial least squares regression (PLSR) methodology to translate key physiological features between iPSC-CMs and adult-CMs. They also demonstrated the potential to translate between species, between drug-free and simple drugged models, as well as between healthy and diseased phenotypes (Gong and Sobie, 2018). Koivumäki et al. also tried to address the problem of iPSC-CM immaturity by establishing a novel in silico mathematical model for iPSC-CMs, which can estimate adult-CM behavior (Koivumäki et al., 2018).
The efficacy of the linear translation algorithms used in the earlier studies relies on a collection of underlying assumptions (Gong and Sobie, 2018). One described by Tveito et al. is that cardiac protein expression levels would differ but their functional properties remain invariant during maturation and that a drug will modify protein function in the same way for iPSC-CMs and the adult-CMs (Tveito et al., 2018). Tveito et al. also acknowledged the difficulty in minimizing the cost function that measures the differences between the initial and target parameters, which therefore required a brute force search algorithm for minimization. One possible explanation for the difficulty in cost function minimization is that linear translation may not capture the nonlinearities comprising the actual underlying physiological differences (Gong and Sobie, 2018). Another underlying assumption with linear translation is the required representation of drug effects as a simple pore block, modeled as a reduction in the maximal conductance of the channel (Tveito et al., 2018; Gong and Sobie, 2018). The earlier studies employed a biased method in that they rely on a priori parameter identification and extraction from voltage and calcium traces to allow feature mapping from immature to mature conditions (Tveito et al., 2018; Gong and Sobie, 2018). Earlier translators must also consider drug-free and drugged conditions independently.
In this study, we describe a deep learning multitask network that simultaneously performs translation and classification of signals from simulated cardiac myocytes for both drug-free and drugged conditions and demonstrate its utility for translating and predicting experimental data as well. The multitask network is an unbiased approach in that the user does not predefine the important parameters of the system. Rather, the network learns from the data to define important parameter regimes and data ranges. The new approach is indifferent to the underlying form of the models and can translate time-series data from any source. Moreover, the deep learning approach accommodates the nonlinearity of the system, makes no assumptions about changes in cardiac protein expression and function during maturation, and can successfully translate both simple pore block and complex conformation state-dependent channel–drug interactions. The network learns from all of these data sources, suggesting its broad applicability, but it requires multiple quality datasets for robust and successful translation. In addition, the multitask behavior of the network provides a single process for translating any cardiac AP that has been subject to perturbation.
There are multiple reasons why cell classification was considered in the study. Importantly, iPSC-CMs are generally used to understand how perturbation to the cells will result in a change to cardiac electrophysiology. Genetic and drug-induced perturbations have been commonly studied using iPSC-CM lines. An important aspect of iPSC-CMs is the inherent variability reported in measurements. Indeed, wide-ranging behavior has been reported to occur spontaneously even in cells cultured from the same genetic line. Thus, it can be difficult at times to determine whether a perturbation indeed has an effect compared to a control cell. Therefore, one purpose of the classification task as described in this study is to allow sorting of cells into those without perturbation and those that have undergone perturbation. The classification task allows us to then address the question of whether translation is effective in the setting of perturbation. We demonstrate here the efficacy of a deep learning network to perform classification in the example setting of a drug-induced perturbation.
Artificial neural networks (ANNs) are increasingly used to advance personalized medicine (Alhusseini et al., 2020; Rogers, 2020; Sevakula et al., 2020; Jin et al., 2009; Trayanova et al., 2021). Long-short-term-memory (LSTM)-based networks, which are capable of learning order dependence in sequence prediction problems (Hochreiter and Schmidhuber, 1997), have been widely used for cardiac monitoring purposes (Guo et al., 2021; Shi et al., 2021; Picon et al., 2019). They have been used to extract important biomarkers from raw ECG signals (Ballinger, 2018; He et al., 2019; Hou et al., 2019) and help clinicians to accurately detect common heart failure biomarkers in ECG screenings (Ballinger, 2018; Warrick and Homsi, 2017; Oh et al., 2018; Chen et al., 2020; Wang and Zhou, 2019; Bian, 2019). LSTM networks, which can catch existing temporal information in the electronic health records (EHRs), have been highlighted as the best predictive models using real-time data (Maragatham and Devi, 2019). LSTM-based classifiers have also empowered early arrhythmia detection by automatically classifying arrhythmias using ECG features (Yildirim et al., 2019; Wang et al., 2019; Martis et al., 2013; Liu et al., 2019; Yildirim, 2018). In addition, deep learning algorithms have been employed to predict drug-induced arrhythmogenicity associated with blockade of the delayed rectifier K+ channel current (IKr) in the CMs encoded by human ether-à-go-go-related gene (hERG) (Yang et al., 2020) for sets of small molecules in drug discovery and screening process (Yang et al., 2020; Cai et al., 2019; Zhang et al., 2019; Dickson et al., 2020; Ryu et al., 2020; Li et al., 2020).
Here, we implemented a deep learning LSTM-based multitask network to classify iPSC-CM AP traces into drug-free and drugged categories and translate them into adult-CM AP waveforms. To collect robust realistic simulated data for training the multitask network, we paced simulated cardiac myocytes with the addition of a physiological noise current at matching cycle lengths for Kernik in silico iPSC-CMs (Kernik et al., 2019) and O’Hara–Rudy in silico human adult-CMs (O'Hara et al., 2011) to generate a population of drug-free simulated cardiac myocyte data. To ensure that our model could perform for both drug-free and drugged iPSC-CM and adult-CM APs simultaneously, we simulated drugged samples via both a simple model of drug-induced IKr block, in which the hERG channel conductance, GKr, was reduced by 1–50%, and a complex Markov model of conformation-state dependent IKr block in the presence of a clinical concentration, 2.72 ng/mL, of the potent hERG blocking drug dofetilide from our recent study (Yang et al., 2020). We evaluated the multitask network performance on a test dataset and showed excellent performance in translating and classifying signals in the form of time-resolved AP traces. We performed ablation studies, removing iPSC-CM AP values during various time frames (feature ablation), to reveal the iPSC-CM AP information most important for classifying AP traces into drug-free and drugged categories and for network translation into adult-CM APs. We also explored the importance of individual LSTM network building blocks and how decoupling of the translation and classification tasks affected overall network performance. We then showed how the proposed multitask network can be applied even to scarce experimental data, which were also used to validate the model.
In this study, we show that developments in iPSC-CM experimental technology and cardiac electrophysiological modeling and simulation of iPSC-CMs can be leveraged for the application of ANNs as a universal approximator (Goodfellow et al., 2016) to find the most accurate mapping function that is capable of learning nonlinear relationships to predict disease phenotype and drug response in cardiac myocytes from immaturity to maturation.
Results
In this study, we set out to build a multitask network that would perform two distinct tasks: the first task is to classify iPSC-CM APs into drug-free and drugged categories. The second task is to translate iPSC-CM APs into corresponding adult-CM AP waveforms. To collect the data for training the multitask network, we simulated a population of 208 AP waveforms for both Kernik in silico human iPSC-CMs (Kernik et al., 2019; Figure 1E, blue) and O’Hara–Rudy in silico human adult-CMs (O'Hara et al., 2011; Figure 1F, blue). We ensured consistency across a population of simulated myocytes by applying physiological noise at matching cycle lengths to the iPSC-CMs and adult-CMs. The cell variability in each population is intended to represent the individual variability that is observed in a drug-free human population (Kernik et al., 2019; O'Hara et al., 2011; Tanskanen and Alvarez, 2007). An average AP trace from the population is shown in Figure 1A for iPSC-CMs and Figure 1B for adult-CMs. In Figure 1C, D, the ionic currents underlying the in silico iPSC-CM APs and adult-CM APs show marked differences, one reason for the broadly expressed concerns about the applicability of utilizing immature iPSC-CM APs in the study of human disease and pharmacology. The substantial current differences illustrate the necessity of a generalized approach to perform translation from immature myocytes into mature myocytes. To ensure that our multitask network could perform over a range of conditions and model forms, we simulated drugged iPSC-CM and adult-CM APs via both a simple IKr drug block model of GKr reduction by 1–50% (250 samples in Figure 1E, F, green) and a complex model of conformation-state dependent IKr block in the presence of 2.72 ng/mL dofetilide (300 samples in Figure 1E, F, purple). We combined the drug-free and drugged models with simple and complex IKr block model schemes (758 samples) for training the multitask network.
The differences in key parameters, upstroke velocity (Vmax), maximum diastolic potential (MDP), and action potential durations (APD) across the three conditions are tabulated and shown in Figure 1G.
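As an illustration of how the key parameters named above could be read off a single AP trace, the following is a minimal sketch. The definitions used here (MDP as the most negative voltage, Vmax as the largest finite-difference slope, APD90 measured from the steepest upstroke to 90% repolarization) are common conventions assumed for this example rather than the exact procedure used in the study, and `ap_features` and `toy_trace` are hypothetical names.

```python
def ap_features(t_ms, v_mv):
    """Return (mdp, vmax, apd90) for a single AP trace.

    Assumed definitions (not taken from the paper):
    - MDP: most negative membrane potential in the trace (mV)
    - Vmax: largest finite-difference slope dV/dt (mV/ms)
    - APD90: time from the steepest upstroke to 90% repolarization (ms)
    """
    mdp = min(v_mv)
    peak = max(v_mv)
    slopes = [
        (v_mv[i + 1] - v_mv[i]) / (t_ms[i + 1] - t_ms[i])
        for i in range(len(v_mv) - 1)
    ]
    i_up = max(range(len(slopes)), key=lambda i: slopes[i])
    vmax = slopes[i_up]
    v90 = peak - 0.9 * (peak - mdp)  # voltage at 90% repolarization
    i_peak = v_mv.index(peak)
    for i in range(i_peak, len(v_mv)):
        if v_mv[i] <= v90:
            return mdp, vmax, t_ms[i] - t_ms[i_up]
    return mdp, vmax, None  # trace never repolarized to the 90% level


def toy_trace():
    """Hypothetical AP-like trace: rest, upstroke, plateau, linear repolarization."""
    t_ms = [float(i) for i in range(500)]
    v_mv = []
    for t in t_ms:
        if t < 10:
            v_mv.append(-80.0)
        elif t < 300:
            v_mv.append(20.0)
        elif t < 400:
            v_mv.append(20.0 - (t - 300.0))
        else:
            v_mv.append(-80.0)
    return t_ms, v_mv


t_ms, v_mv = toy_trace()
mdp, vmax, apd90 = ap_features(t_ms, v_mv)
```

The toy trace is deliberately simple so that the three features can be checked by hand; real simulated or recorded traces would typically be noise-filtered first.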
Next, we applied a digital forward and backward data filtering technique (Gustafsson, 1996) to the simulated iPSC-CM and adult-CM AP traces (Figure 2, left panels). Since we applied physiological noise to introduce a source of variability (as observed in human populations) in our model simulations, we assessed the possible phase distortion of AP waveforms following noise filtering. In Figure 2 (right panels), the distribution of iPSC-CM and adult-CM AP duration at 90% repolarization (APD90) values is shown. The near superimposition of the histogram distributions indicates that noise filtering does not change the AP waveform morphology or time course and primarily removes vertical noise. Figure 2A, B shows simulated drug-free iPSC-CM and adult-CM APs and the corresponding APD90 distribution with physiological noise in blue and after applying the noise filtering technique in black for iPSC-CM APs and red for adult-CM APs. The same plots are shown for drugged AP traces with simple 1–50% IKr block (Figure 2C, D) and with the complex IKr block model in the presence of 2.72 ng/mL dofetilide (Figure 2E, F). Next, we normalized drug-free and drugged noise-filtered iPSC-CM APs and adult-CM APs to use them as input and output, respectively, for training the multitask network.
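The idea behind forward and backward filtering can be sketched with a very simple filter. The example below is an assumption-laden illustration, not the filter used in the study: it applies a first-order exponential smoother (a hypothetical choice, with a hypothetical `alpha`) once forward and once backward so that the phase lag of the two passes cancels, and it does not implement the initial-condition handling of Gustafsson (1996). The min-max normalization is likewise an assumed form, since the text does not specify how traces were normalized.

```python
def smooth_forward(x, alpha):
    """First-order low-pass filter (exponential smoothing), left to right."""
    y = [x[0]]
    for value in x[1:]:
        y.append(alpha * value + (1.0 - alpha) * y[-1])
    return y


def smooth_zero_phase(x, alpha=0.3):
    """Apply the same filter forward and then backward.

    The backward pass undoes the time lag introduced by the forward pass,
    so features such as the AP peak are not shifted in time.
    """
    forward = smooth_forward(x, alpha)
    backward = smooth_forward(forward[::-1], alpha)
    return backward[::-1]


def min_max_normalize(x):
    """Rescale a trace to the range [0, 1] (assumed normalization)."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]


# Symmetric triangular pulse centred on sample 50 (toy signal).
pulse = [max(0.0, 10.0 - abs(i - 50)) for i in range(101)]
one_pass = smooth_forward(pulse, 0.3)        # peak is delayed
two_pass = smooth_zero_phase(pulse, 0.3)     # peak stays at the centre sample
normalized = min_max_normalize(two_pass)
```

Comparing the peak position of `one_pass` and `two_pass` shows why a forward-backward scheme is attractive when the timing of the waveform (for example APD90) must be preserved.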
The building blocks of the multitask network are illustrated in Figure 3A. The multitask network receives preprocessed simulation-generated iPSC-CM AP waveforms (noise-filtered and normalized) as input and scans whole AP time-series values through two stacked LSTM layers (Figure 3A, D). The LSTM layers remember the most important iPSC-CM AP values (features) they need to perform the translation and classification tasks and pass the information to two fully connected layers (Figure 3A, E), one for the translation task to predict the corresponding adult-CM AP waveform (Figure 3B) and one for the classification task to classify iPSC-CM APs into drug-free and drugged categories (Figure 3C).
The workflow for training and evaluating the multitask network is depicted in Figure 4. As described above, we generated simulated drug-free and drugged iPSC-CM and adult-CM APs and applied a noise filtering technique to the AP waveforms. The waveforms were then normalized in a data preprocessing step for more efficient training of the multitask network. We used preprocessed iPSC-CM APs as the network input and adult-CM APs along with corresponding drug-free and drugged labels as network outputs, respectively. Next, we randomly split input and output data in 70:10:20 ratio into three subcategories: training, validation, and test datasets. We used the training dataset for training the multitask network to simultaneously perform translation and classification. The mean squared error, R2 score (Devore, 2011), and error in adult-CM APD90 prediction were used as evaluation metrics for the translation task. For the classification task, area under the receiver operating characteristic (AUROC) curve (Fawcett, 2006), network prediction accuracy, precision, and recall (Powers, 2011) were used to evaluate the network performance. To prevent overfitting, we calculated the evaluation metrics for both tasks using validation data during each iteration of training and compared those with values from the training dataset. When the model performance on the training dataset exhibited degradation relative to the validation dataset, we ceased training and tuning of the network hyperparameters. We evaluated the underlying mechanisms that inform the network performance by using a holdout test dataset to perform an ablation study. The ablation study allowed us to identify the most important information for network performance and is an indicator of the data that the network deems most important to remember to classify AP traces into drug-free and drugged categories and allow accurate translation into adult-CM APs (feature ablation). 
Finally, we performed a type of network component dissection by sequentially eliminating individual LSTM layers or the classification task to determine if all elements of the network are important to the overall performance.
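The random 70:10:20 split described above can be sketched as follows. This is a minimal illustration using only the sample counts reported in the text; the rounding rule, the seed, and the helper name `split_indices` are assumptions, so the exact subset sizes used in the study may differ.

```python
import random


def split_indices(n_samples, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle sample indices and split them into train/validation/test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n_samples)
    n_val = int(fractions[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Drug-free + simple IKr block + dofetilide model samples, as reported.
n_total = 208 + 250 + 300
train_idx, val_idx, test_idx = split_indices(n_total)
```

Because input and output traces are paired, the same index lists would be applied to the iPSC-CM inputs, the adult-CM targets, and the drug-free/drugged labels.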
Figure 5 and Table 1 illustrate the overall multitask network performance for translation and classification tasks for the training and test datasets. Figure 5A, D represents iPSC-CM APs (black), which were used for training and testing the multitask network, respectively. Figure 5B, E depicts the comparison between simulated (red) and translated (cyan) adult-CM APs used for training and testing the network. The comparison between histogram distributions of APD90 values for simulated and translated adult-CM APs in Figure 5C, F shows good agreement in terms of the frequency of virtual cells with similar APD.
Table 1. Statistical measures for evaluating the performance of the multitask network for both iPSC-CM AP trace classification into drug-free and drugged categories and their translation into adult-CM APs for training and test datasets as well as the effect of removing LSTM layers and classification task on the network performance.
Translation

| Performance metrics | MSE | R2 score | Error in APD90 prediction |
|---|---|---|---|
| Training dataset | 0.0027 | 0.992 | 3.41% |
| Test dataset | 0.0029 | 0.991 | 3.60% |
| Remove LSTM layers test dataset | 0.0031 | 0.991 | 3.78% |
| Remove classification task test dataset | 0.0034 | 0.990 | 4.33% |

Classification

| Performance metrics | AUROC | Accuracy | Recall | Precision |
|---|---|---|---|---|
| Training dataset | 0.93 | 92% | 0.92 | 0.93 |
| Test dataset | 0.91 | 92% | 0.92 | 0.92 |
| Remove LSTM layers test dataset | 0.90 | 92% | 0.90 | 0.91 |

iPSC-CM: induced pluripotent stem cell-derived cardiomyocyte; AP: action potential; adult-CM: adult cardiomyocyte; AUROC: area under the receiver operating characteristic curve; LSTM: long-short-term-memory.
The performance evaluation metrics for both the translation and classification tasks are listed in Table 1. The multitask network exhibits high accuracy in performing translation, despite large variability in APDs and regardless of the underlying model form. The network is able to translate iPSC-CM APs into adult-CM APs with less than 0.003 mean-squared error (MSE), 0.99 R2 score, and <4% error in APD90 prediction for both training and test datasets. To evaluate the network performance for the classification task, we compared the AUROC, prediction accuracy, recall, and precision for both training and test datasets. The multitask network performed well in categorizing iPSC-CM APs into drug-free and drugged waveforms, with 92% accuracy on both datasets (Table 1). Finally, we performed a type of network component dissection by sequentially eliminating individual LSTM layers or the classification task to determine if all elements of the network are important to the overall performance. The impact of removing these elements of the network on the network performance is shown in Table 1.
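For readers less familiar with the metrics in Table 1, the sketch below shows their standard definitions in plain Python. The function names and the toy inputs are hypothetical, labels are assumed to be coded 0 (drug-free) and 1 (drugged), and AUROC is omitted because it requires ranked prediction scores rather than hard labels.

```python
def mse(y_true, y_pred):
    """Mean squared error between a target trace and a predicted trace."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def apd90_percent_error(apd_true, apd_pred):
    """Absolute error in APD90 as a percentage of the true value."""
    return abs(apd_pred - apd_true) / apd_true * 100.0


def classification_metrics(labels, predictions):
    """Accuracy, precision, and recall for binary labels (1 = drugged)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for l, p in pairs if l == 1 and p == 1)
    tn = sum(1 for l, p in pairs if l == 0 and p == 0)
    fp = sum(1 for l, p in pairs if l == 0 and p == 1)
    fn = sum(1 for l, p in pairs if l == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Toy example: 8 cells, 2 of them misclassified.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_predictions = [1, 1, 0, 0, 0, 1, 1, 0]
toy_metrics = classification_metrics(toy_labels, toy_predictions)
```

In the setting of the study, the translation metrics would be computed between simulated and translated adult-CM APs, and the classification metrics between true and predicted drug-free/drugged labels.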
Next, we performed a ‘computational’ ablation study as a correlate to the types of physiological ablations that are used to examine the roles and functions of a physiological system (LeCun et al., 1989; Reale et al., 1987). We tested how the performance of the multitask network would change when the information contained within specified time frames was removed, as shown in Figure 6A, B. To reveal the most important iPSC-CM AP information for classifying iPSC-CM APs into drug-free and drugged traces and translation into adult-CM APs, we did not allow the network to process data from within designated time frames from the iPSC-CM APs (feature ablation). We then retrained the multitask network by setting the missing information equal to zero and compared the calculated AUROC for the classification task and the MSE in adult-CM AP translation (red bars) with the corresponding values recorded for the multitask network (green line) when it was provided full access to the complete iPSC-CM AP data. We observed that the network is extremely sensitive to information contained within the 400–500 ms time frame (black dashed bar in Figure 6A, B).
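The masking step of the feature ablation can be sketched as below. This is only an illustration of setting a time window of the input trace to zero; the 100 ms window width and the 1000 ms trace length are assumptions chosen to be consistent with the 400–500 ms window reported above, and `ablate_window` and `ablation_windows` are hypothetical helper names. Retraining and re-scoring the network for each window is not shown.

```python
def ablate_window(trace, t_ms, start_ms, end_ms):
    """Return a copy of the trace with samples in [start_ms, end_ms) set to zero."""
    return [0.0 if start_ms <= t < end_ms else v for t, v in zip(t_ms, trace)]


def ablation_windows(total_ms, width_ms):
    """Consecutive, non-overlapping time windows covering the trace."""
    return [(start, start + width_ms) for start in range(0, total_ms, width_ms)]


# Toy normalized trace sampled every 1 ms (all ones, so masking is easy to see).
t_axis = list(range(1000))
toy_ap = [1.0] * 1000

windows = ablation_windows(total_ms=1000, width_ms=100)
masked = ablate_window(toy_ap, t_axis, 400, 500)
```

In a full ablation study, each window in `windows` would be masked in turn across the whole dataset, the network retrained, and the resulting AUROC and MSE compared with the values obtained using the complete traces.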
This result suggests that the most important information needed to classify iPSC-CM APs into drug-free and drugged traces and distinguish adult-CM AP signals from iPSC-CM AP signals is contained in a particular region of the AP plateau. The time frame of the AP between 400 and 500 ms (Figure 6A, B) corresponds to a phase of exquisite sensitivity to perturbation. We have identified this particular AP range in an earlier study as the phase when the membrane resistance of the myocyte increases markedly (Figure 6C; Yang et al., 2015). This occurs as the inward and outward currents balance each other, leading to a net whole cell current that is nearly constant so that dI → 0, dV/dI → ∞ (Figure 6D), followed by a rapid reduction in outward current. Figure 6E demonstrates that individual current densities have a period of inward and outward current balance followed by rapid changes in IKr and other repolarizing currents at 400–500 ms time interval.
We next set out to demonstrate the real-world utility of the multitask classification and translation network by applying the network to experimental data. We used experimental iPSC-CM APs from the Kurokawa lab (Figure 7A) as the input data into the multitask network and translated to predicted adult-CM APs as shown in Figure 7B. The translation notably resulted in a reduction in variability in APD in the adult translated cells, consistent with our simulated results and with previous experimental observations (Blinova et al., 2018; Fabbri et al., 2019). In an additional validation of the multitask network, we undertook a test of the network to accurately translate drug block in iPSC-CMs to adult AP effects and then compared the predicted results with measured experimental data (O'Hara et al., 2011). We first simulated iPSC-CM APs with 50% block of IKr. We then used these simulated APs as an input for the multitask network and used the output from the translation task to predict 50% block on adult-CMs. In Figure 7C, the translated drugged APD90 values are shown as turquoise asterisks from spontaneously beating (~1000 ms cycle length) simulated iPSC-CMs plotted against simulations from O’Hara–Rudy adult-CM APs with 50% IKr block (red curve) and experimental 50% block of IKr by 1 μM E-4031 (blue squares) (O'Hara et al., 2011). These data validate that the effects of drug block in iPSC-CMs can be successfully translated to predict drug effect on adult human cardiomyocyte APs.
Discussion
In this study, we developed a data-driven deep learning approach to address the well-known shortcomings in the iPSC-CM platform. A concern with iPSC-CM is that the data collection results in measurements from immature APs, and it is unclear if these data reliably indicate impact in the adult cardiac environment (Navarrete et al., 2013; Casini et al., 2017; Goversen et al., 2018; Knollmann, 2013; Sinnecker et al., 2013; Blinova et al., 2017). Here, we set out to demonstrate a new way to allow translation of results from the iPSC-CM to a mature adult cardiac response. The deep learning network also revealed new mechanisms that are critical to convert iPSC-CM APs to mature adult cardiac APs.
Application of a deep learning ANN to simultaneously translate and classify signals from simulated iPSC-CMs for both drug-free and drugged conditions has several key advantages. Because there is no need for the multitask network user to a priori define the important system parameters, the approach is by definition an unbiased model. A key part of the ‘artificial intelligence’ is learning from the data to make decisions about which elements of the data are the most important. Another benefit is the model-agnostic approach in that the learning network is indifferent to the underlying form of the models and can readily translate time-series data from any source. The nonlinearity of the system is accepted by the deep learning approach, and there are no assumptions made about cardiac protein expression levels and changes in their function during cardiomyocyte maturation. The deep learning ANN can successfully translate simple pore block and complex conformation state-dependent channel–drug interaction models. The network can learn from multiple sources of data even when they are generated from different models and learns from all the data sources concurrently for robust and successful translation. All of these aspects of the technology presented here suggest broad applicability for use across ages, species, and conditions, and we demonstrate its utility for translating and predicting experimental data.
The multitask network presented here performed well in the setting of the noted variability in measurements from iPSC-CM APs. As described in Figure 1, we utilized a modeling and simulation approach from our recent studies (Kernik et al., 2019; Kernik et al., 2020) to generate a population of iPSC-CM APs that incorporate variability comparable to that in experimental measurements. Utilizing simulated data presented a unique opportunity: we were able to generate large amounts of data that were used both to train and optimize the network and then to test the network with specifically designated distinct simulated datasets. Utilizing simulated data to train a deep learning network may constitute a widely applicable approach that could be used to train a variety of networks to perform multiple functions where access to comparable experimental data is not feasible.
The multitask network exhibits high accuracy in performing translation, despite large variability in APDs and regardless of the underlying model form (Figure 5 and Table 1). The network was able to translate iPSC-CM APs into adult-CM APs with less than 0.003 MSE, 0.99 R2 score, and less than 4% error in APD90 prediction for both the training and test datasets. To evaluate the network performance for the classification task, we compared the AUROC, prediction accuracy, recall, and precision for both training and test datasets. The multitask network performed well in categorizing iPSC-CM APs into drug-free and drugged waveforms, with 92% accuracy on both datasets (Table 1). Finally, we performed a type of network component dissection by sequentially eliminating individual LSTM layers or the classification task to determine if all elements of the network are important to the overall performance. The impact of removing these elements of the network on its performance is shown in Table 1. The studies show that the multitask network conferred additional benefit over considering the translation task alone. For example, we noted that adding the classification task to distinguish drug-free and drugged APs improved the performance of the translation task (Table 1).
When we performed an ablation study to prevent the deep learning network from using information within prespecified time windows, the results revealed that the most important information needed to classify iPSC-CM APs into drug-free and drugged traces and predict adult-CM APs from iPSC-CM AP signals is contained in the phase of the AP between 400 and 500 ms (Figure 6). This result suggests that the most important information needed to classify and distinguish iPSC-CM AP signals from adult-CM AP signals is contained in the range of the AP that corresponds to a phase of exquisite sensitivity to perturbation. We have identified this particular AP range in an earlier study as the phase when the membrane resistance of the myocyte increases markedly (Figure 6C; Yang et al., 2015). This occurs as the inward and outward currents balance each other, leading to a net whole cell current that is unchanging (dI → 0, dV/dI → ∞), followed by a rapid reduction in the outward current (Figure 6D, E). It is not surprising that this time frame is shown to contain the most important information to perform the classification task as the effect of IKr block is critical during the high resistance phase of the membrane potential. It is possible that other types of perturbations (e.g., Na channel blocker, ischemia) may lead to a different outcome, and we will pursue those questions in future studies.
Following the optimization and demonstration of the network as an accurate tool for both translating and classifying data, we then used the same network to translate experimentally obtained data. We showed that the proposed network can effectively take experimental data as an input from immature iPSC-CM APs and translate those data to produce adult AP waveforms. It is notable that the variation observed in the adult-CM AP duration is smaller than that in iPSC-CM APDs (Figure 7A, B). This has been observed both experimentally (Blinova et al., 2018; Fabbri et al., 2019) and in our simulated cell environment (Kernik et al., 2019; Kernik et al., 2020). Although the simulated iPSC-CM has a large initial calcium current (Figure 1C) compared to the simulated adult-CM (Figure 1D), the amplitude of the currents flowing during the adult-CM AP plateau is notably larger. The immature iPSC-CMs have low conductance during the AP plateau, which gives them a comparatively higher membrane resistance. For this reason, small perturbations to the iPSC-CM APs have a larger impact on the resulting AP duration than observed in adult cells (Yang et al., 2015). We also used simulated iPSC-CM APs subject to 50% block of IKr. We translated those data to adult-CM APs and then compared them with the previously reported impact of 50% IKr block on adult human cell APs from experiments (O'Hara et al., 2011) and noted excellent agreement, thereby providing validation of our network.
The deep learning algorithm presented here has the benefit of automating feature extraction without any predetermination of the feature. It also allows for the translation of time-course data from simulated or experimental datasets. However, there are some limitations to the approach. One limitation is the requirement for multiple datasets that are of sufficient quality for training – the more robust the training set, the higher the accuracy of the task. It is possible that this limitation can be addressed in future studies by developing new methods for data extraction and data interpolation in sparse datasets. We addressed this limitation by utilizing simulated data to train the network, and this approach might be applicable for a variety of physiological problems. Simulated data can be generated to constitute a robust dataset that can be used to train the multitask model and allow extraction of the relevant features from any time-course dataset.
In this study, we show that a deep learning network can be applied to classify cells into the drug-free and drugged categories and can be used to predict the impact of electrophysiological perturbation across the continuum of aging from the immature iPSC-CM AP to the adult ventricular myocyte AP. By extension, the classification task might even be applied to distinguish cellular-level signals derived from cells cultured using different protocols. We translated experimental immature APs into mature APs using the proposed network and validated the output of some key model simulations with experimental data. The multitask network in this study was used for translation of iPSC-CM APs to adult-CM APs but could be readily extended and applied to translate data across species and classify data from a variety of systems. Another extension of the technology presented here is to predict the impact of naturally occurring mutations and other genetic variations (Yoshinaga et al., 2019).
Materials and methods
Simulated data for training and testing the multitask network
The drug-free iPSC-CM and adult-CM APs
The Kernik in silico iPSC-CM baseline cells were paced from resting steady state. The O’Hara–Rudy in silico endocardial cell model was used for the baseline adult-CMs (O'Hara et al., 2011). The control adult-CMs were paced at a cycle length of 982 ms to match the cycle length of the last beat of the spontaneously depolarizing iPSC-CM AP. The iPSC-CM AP populations (n = 208) were generated by incorporating physiological noise (see Simulated physiological noise currents section below). The adult-CMs were paced with noise for 100 beats after reaching steady state at the matching cycle length of the last beat of iPSC-CM AP populations. The voltage was updated numerically using the forward Euler method (Atkinson, 2008).
A simple drug-induced 1–50% IKr block model through GKr reduction
The iPSC-CM and adult-CM populations were paced with 1–50% IKr block in 1% increments. This was accomplished by scaling down the hERG channel (IKr) conductance, GKr, by a factor, GKrscale, equal to one minus the fractional block, in the 0.50–0.99 range with 0.01 decrements (see central rows in Figure 1G). The adult-CM model was simulated at five varying beating rates for each percentage of block, matching the cycle length of the last beat of iPSC-CMs with 1–50% IKr block (n = 250). For example, one drugged adult-CM (50% IKr inhibition) was paced at a cycle length of 1047 ms to match the cycle length of the last beat of the iPSC-CM AP with 50% IKr block.
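The conductance scaling described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the baseline conductance value used in the example call are hypothetical and are not taken from the published model code.

```python
def gkr_scale_factors(max_block_pct=50):
    """Map each percentage of IKr block (1..max_block_pct) to its GKr scale factor."""
    return {pct: round(1.0 - pct / 100.0, 2) for pct in range(1, max_block_pct + 1)}


def blocked_gkr(g_kr, block_pct):
    """Return the reduced conductance for a given percentage of block."""
    return g_kr * (1.0 - block_pct / 100.0)


scales = gkr_scale_factors()
print(scales[1], scales[25], scales[50])  # 0.99 0.75 0.5

# Hypothetical baseline conductance, used only to show the call.
print(blocked_gkr(0.04, 50))
```

The dictionary has one entry per percentage of block, so a simulation loop could iterate over it to generate one drugged cell per scale factor.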
Complex model of conformation-state dependent IKr block in the presence of 2.72 ng/mL dofetilide
The IKr channel Hodgkin–Huxley model in both iPSC-CM and adult-CM AP models was replaced with a drug–hERG channel interaction Markov model (see bottom rows in Figure 1G) that we have previously published (Yang et al., 2020). iPSC-CM (n = 300) and adult-CM AP populations (n = 300) were generated with physiological noise in the presence of 2.72 ng/mL dofetilide, a potent hERG channel blocker. The adult-CM populations were paced with dofetilide for 100 beats after reaching steady state at the matching cycle length of the last beat of iPSC-CM AP populations with dofetilide as described above. The simulated drugged and drug-free iPSC-CM and adult-CM AP data used for training and testing the multitask network have been made publicly available at Clancy lab GitHub (https://github.com/ClancyLabUCD/Multitask_network/tree/master/data, copy archived at swh:1:rev:7f2b653a91f552d66ae2d9b70b720f8706b36da3, Aghasafari, 2021).
Simulated physiological noise currents
Simulated noise current was added to the last 100 paced beats in the simulated AP models, and simulated APs were recorded at the 2000th paced beat in single cells. This noise current was modeled using the equation from Tanskanen and Alvarez, 2007:
(1) $V_{t+\Delta t} = V_t - \frac{I(V_t)\,\Delta t}{C_m} + \xi\, n \sqrt{\Delta t}$
where $n \in N(0,1)$ is a random number drawn from a Gaussian distribution, $\Delta t$ is the time step, $I(V_t)$ is the total membrane current, and $C_m$ is the membrane capacitance. $\xi = 0.3$ is the diffusion coefficient, which sets the amplitude of the noise. The noise current was generated and applied to the membrane potential, $V_t$, throughout the last 100 beats of the simulated time course.
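A minimal sketch of a forward Euler voltage update with an additive Gaussian noise term of amplitude ξ scaled by the square root of the time step is given below. The leak current standing in for the total membrane current, the unit membrane capacitance, and the time step are all hypothetical placeholders chosen so the example runs on its own; they are not the Kernik or O'Hara–Rudy formulations.

```python
import math
import random

XI = 0.3  # noise amplitude (diffusion coefficient) stated in the text


def toy_leak_current(v, g_leak=0.1, e_leak=-80.0):
    """Hypothetical stand-in for the total membrane current I(V)."""
    return g_leak * (v - e_leak)


def noisy_euler_step(v, i_total, dt, c_m=1.0, xi=XI, rng=random):
    """One forward Euler step of the membrane potential with additive noise."""
    n = rng.gauss(0.0, 1.0)  # n ~ N(0, 1)
    return v - i_total(v) * dt / c_m + xi * n * math.sqrt(dt)


def simulate(v0=-80.0, dt=0.01, n_steps=1000, xi=XI, seed=0):
    """Integrate the toy cell for n_steps and return the voltage trace."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    v = v0
    trace = [v]
    for _ in range(n_steps):
        v = noisy_euler_step(v, toy_leak_current, dt, xi=xi, rng=rng)
        trace.append(v)
    return trace


noisy_trace = simulate()
quiet_trace = simulate(xi=0.0)  # without noise the toy cell stays at rest
```

Setting `xi=0.0` recovers the deterministic update, which is a convenient check that the noise term is the only source of beat-to-beat variability in such a sketch.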
Experimental iPSC-CMs
Human iPSC-CMs (201B7, RIKEN BRC, Tsukuba, Japan) were cultured and subcultured on SNL76/7 feeder cells as described in detail previously (Li et al., 2017). Cardiomyocyte differentiation was performed as described (Li et al., 2017). Commercially available iCell-cardiomyocytes (FUJIFILM Cellular Dynamics, Inc, Tokyo, Japan) were cultured according to the manufacturer's manual. APs were recorded with the perforated configuration of the patch-clamp technique as described in detail previously (Li et al., 2017). Measurements were performed at 36 ± 1°C with the external solution composed of (in mM) NaCl (135), NaH2PO4 (0.33), KCl (5.4), CaCl2 (1.8), MgCl2 (0.53), glucose (5.5), and HEPES, pH 7.4. To achieve patch perforation (series resistance of 10–20 MΩ), amphotericin B (0.3–0.6 µg/mL) was added to the internal solution composed of (in mM) aspartic acid (110), KCl (30), CaCl2 (1), adenosine-5′-triphosphate magnesium salt (5), creatine phosphate disodium salt (5), HEPES (5), and EGTA (11), pH 7.25. In quiescent cardiomyocytes, APs were elicited by passing depolarizing current pulses (2 ms in duration) of suprathreshold intensity (120% of the minimum input to elicit APs) at a frequency of 1 Hz unless noted otherwise. The experimental data used for the model validation have been made publicly available at Clancy lab GitHub (https://github.com/ClancyLabUCD/Multitask_network/blob/master/data/clean_data/experiments.csv).
The multitask network architecture
The multitask network comprised two stacked LSTM layers followed by independent fully connected layers (Figure 3A) for the classification and translation tasks. The LSTM layers retained the information the network needed to perform the two tasks and then passed the extracted information (features) to the subsequent fully connected layers, which translated iPSC-CM APs into adult-CM AP waveforms (Figure 3B) and classified iPSC-CM APs into drug-free and drugged categories (Figure 3C).
LSTM layers (Figure 3D)
We used LSTM layers as the first two layers of the multitask network so that the network could learn temporal information, that is, which data in a sequence were important to keep and which to discard. At each time step, the LSTM cell took in three pieces of information: the current input data ($x_t$), the incoming short-term memory (hidden state, $h_{t-1}$), and the incoming long-term memory (cell state, $C_{t-1}$). The LSTM layers were responsible for extracting the most important information while scanning the AP traces using the short- and long-term memory components. The short-term memory weighted the importance of AP values at subsequent time steps, and the long-term memory used the short-term memory to decide the overall importance of all AP values from the beginning (t = 0 ms) to the end (t = 701 ms) for performing the classification and translation tasks. The LSTM cells contained internal mechanisms called gates. The gates were neural networks with weights (w) and bias terms (b) that regulated the flow of information at each time step before passing the long-term and short-term information on to the next cell (Cheng et al., 2016). These gates are called the input gate, forget gate, and output gate (Figure 3D).
The forget gate, as the name implies, determined which information from the long-term memory should be kept or discarded. This was done by multiplying the incoming long-term memory by a forget vector generated from the current input ($x_t$) and the incoming short-term memory ($h_{t-1}$). To obtain the forget vector, the incoming short-term memory and current input were passed through a sigmoid function ($\sigma$) (Olah, 2017). The output vector of the sigmoid function, $F_t$ (Equation 2), comprised values between 0 (discard) and 1 (keep) and was then multiplied by the incoming long-term memory ($C_{t-1}$) to choose which parts of the long-term memory were retained.
(2) $F_t = \sigma\left(w_f \cdot [h_{t-1}, x_t] + b_f\right)$
The input gate decided what new information was stored in the current long-term memory ($C_t$). It considered the current input ($x_t$) and the incoming short-term memory ($h_{t-1}$) and transformed the values to be between 0 (unimportant) and 1 (important) using a sigmoid activation function ($\sigma$) (Equation 3). The second layer in the input gate took the incoming short-term memory ($h_{t-1}$) and current input ($x_t$) and passed them through a hyperbolic tangent activation function ($\tanh$) to regulate the network computation (Equation 4).
(3) $I_t = \sigma\left(w_i \cdot [h_{t-1}, x_t] + b_i\right)$
(4) $\tilde{C}_t = \tanh\left(w_c \cdot [h_{t-1}, x_t] + b_c\right)$
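The outputs from the forget and input gates then underwent a pointwise addition to give the current long-term memory ($C_t$) (Equation 5), which was then passed on to the next cell.
(5) $C_t = F_t \odot C_{t-1} + I_t \odot \tilde{C}_t$
where $F_t$ is the forget vector, $I_t$ is the input-gate vector, $\tilde{C}_t$ is the output of the hyperbolic tangent layer, and $\odot$ denotes pointwise multiplication.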
Finally, the output gate took the current input ($x_t$) and the incoming short-term memory ($h_{t-1}$) and passed them through a sigmoid function ($\sigma$) (Equation 6). The current long-term memory ($C_t$) was then passed through a tanh activation function, and the outputs from these two processes were multiplied to produce the current short-term memory ($h_t$) (Equation 7).
(6) $O_t = \sigma\left(w_o \cdot [h_{t-1}, x_t] + b_o\right)$
(7) $h_t = O_t \odot \tanh\left(C_t\right)$
The short-term and long-term memory produced by these gates were carried over to the next cell for the process to be repeated. The output of the LSTM layers for each time step ($h_t$) was obtained from the short-term memory, also known as the hidden state, and was subsequently passed into the fully connected layers to perform the translation and classification tasks as described below.
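The gate computations described in this section follow the standard LSTM cell, and one time step can be sketched without any deep learning library. The code below is a dependency-free illustration only: the hidden size, the random weight initialization, and all function names are assumptions made for the example, and the published network was implemented in PyTorch rather than in this form.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(w, b, v):
    """Return w @ v + b for a list-of-rows matrix w."""
    return [sum(w_k * v_k for w_k, v_k in zip(row, v)) + b_j
            for row, b_j in zip(w, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step: forget, input, and output gates."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["w_f"], p["b_f"], z)]        # forget vector
    i_t = [sigmoid(a) for a in affine(p["w_i"], p["b_i"], z)]        # input gate
    c_tilde = [math.tanh(a) for a in affine(p["w_c"], p["b_c"], z)]  # candidate values
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = [sigmoid(a) for a in affine(p["w_o"], p["b_o"], z)]        # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]               # new hidden state
    return h_t, c_t


def init_params(n_in, n_hidden, seed=0):
    """Random weights and zero biases for the four gate layers (illustrative)."""
    rng = random.Random(seed)
    p = {}
    for gate in "fico":
        p["w_" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + n_in)]
                          for _ in range(n_hidden)]
        p["b_" + gate] = [0.0] * n_hidden
    return p


def run_lstm(trace, n_hidden=4, seed=0):
    """Scan a scalar time course and return the final hidden and cell states."""
    p = init_params(1, n_hidden, seed)
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    for value in trace:
        h, c = lstm_step([value], h, c, p)
    return h, c


h_final, c_final = run_lstm([0.0, 0.5, 1.0, 0.2])
```

Because the hidden state is the product of a sigmoid output and a tanh output, every element of `h_final` lies strictly between -1 and 1, whereas the cell state is not bounded in this way.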
Fully connected layers (Figure 3E)
The fully connected neural network layers contained input, hidden, and output layers (Figure 3E) with various numbers of neurons. Every neuron in a layer was connected to the neurons in the next layer (Krogh, 2008). The fully connected layers received the output of the LSTM layers as input, calculated a weighted sum of these inputs, and added a bias term. The result was then passed to an activation function (f) to define the output of each neuron (Equations 8 and 9; Carugo and Eisenhaber, 2010).
(8) $a_j^{l} = f\left(\sum_i w_{ij}^{l}\, a_i^{l-1} + b_j^{l}\right)$
(9) $a^{0} = h_t, \quad \hat{y} = a^{L} = (\hat{y}_T, \hat{y}_C)$
where $l$ and $(i, j)$ index the hidden layers and the neurons in each pair of subsequent layers ($l-1$, $l$), respectively; the optimized numbers of layers and neurons were found via hyperparameter tuning. $a_j^{l}$ is each neuron output, $h_t$ is the LSTM layer output and the input to the fully connected layers, and $\hat{y}$ is the network output, where $\hat{y}_T$ and $\hat{y}_C$ are the outputs for the translation and classification tasks, respectively. We first assigned random values to all network parameters ($\theta$), that is, each neuron weight $w_{ij}$ (Figure 3E) and bias term $b_j$, which is a constant added when calculating the neuron output, and chose initial network hyperparameters (the number of hidden layers, the number of neurons in each hidden layer, and the activation function for each hidden layer) to start the optimization process for finding the best network architecture. Next, we estimated the network errors using the MSE (Equation 10) and cross-entropy (Equation 11) loss functions for the translation and classification tasks, respectively (Goodfellow et al., 2016; Murphy, 2012).
(10) $\mathrm{MSE} = \frac{1}{m}\sum_{k=1}^{m}\left(y_{T,k} - \hat{y}_{T,k}\right)^2$
(11) $\mathrm{CE} = -\left[y_C \log\left(\hat{y}_C\right) + \left(1 - y_C\right)\log\left(1 - \hat{y}_C\right)\right]$
where m is the total number of LSTM layer outputs ($h_t$), and $y_T$ and $\hat{y}_T$ are the simulated and translated adult-CM APs (the network output for the translation task). $y_C$ is the binary indicator of the class label for iPSC-CM APs (0 for drug-free or 1 for drugged), and $\hat{y}_C$ is the predicted probability of the AP belonging to the drugged class. We used the sum of both loss functions (Equation 12) to calculate the overall network error ($L$) for both translation and classification tasks during the network training process. We updated the network parameters using the adaptive moment estimation (ADAM) optimization algorithm (Kingma and Ba, 2014) based on the average gradient of the overall loss function with respect to the network parameters for 64 randomly selected simulated AP traces (mini-batch = 64) at each training iteration (Equations 13–15).
(12) $L = \mathrm{MSE} + \mathrm{CE}$
(13) $\theta_{\tau+1} = \theta_{\tau} - \eta\,\frac{\mu_{\tau}}{\sqrt{\nu_{\tau}} + \epsilon}$
(14) $\mu_{\tau} = \beta_1\,\mu_{\tau-1} + \left(1 - \beta_1\right) g_{\tau}$
(15) $\nu_{\tau} = \beta_2\,\nu_{\tau-1} + \left(1 - \beta_2\right) g_{\tau}^2$
We used a rectified linear unit (ReLU) (Glorot et al., 2011) as the activation function in Equation 8 to calculate the output of each hidden layer neuron at each training iteration. We used dropout regularization (Zaremba et al., 2014) to randomly drop neurons, along with their connections, from the LSTM and fully connected layers with a probability of 0.2 during training to reduce overfitting. We kept updating the network parameters using the ADAM optimization algorithm (Equation 13) to search for the minimum of the loss function (Equation 12). We computed the exponential average of the gradient (Equation 14) as well as of the square of the gradient (Equation 15) for each parameter ($\theta$), where $g_{\tau}$ is the mini-batch average gradient of $L$ with respect to $\theta$ at training iteration $\tau$, $\eta$ is the learning rate equal to 0.001, $\beta_1$ and $\beta_2$ are the first- and second-moment coefficients equal to 0.9 and 0.999, and $\epsilon$ is a small term equal to 1e-8 preventing division by 0.
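The combined loss and the parameter update can be illustrated for a single scalar parameter. The sketch below uses the hyperparameter values stated in the text (learning rate 0.001, coefficients 0.9 and 0.999, and 1e-8); the class and function names are hypothetical, and the bias-correction step described by Kingma and Ba (2014) is included behind a flag, which is an assumption about the implementation rather than something stated here.

```python
import math


def mse_loss(y_true, y_pred):
    """Mean squared error over one AP trace (translation task)."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def cross_entropy_loss(label, prob, tiny=1e-12):
    """Binary cross-entropy for one AP; label is 0 (drug-free) or 1 (drugged)."""
    prob = min(max(prob, tiny), 1.0 - tiny)  # keep log() finite
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def total_loss(y_true, y_pred, label, prob):
    """Overall error: translation loss plus classification loss."""
    return mse_loss(y_true, y_pred) + cross_entropy_loss(label, prob)


class AdamScalar:
    """Adam update for one scalar parameter (illustrative only)."""

    def __init__(self, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8,
                 bias_correction=True):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.bias_correction = bias_correction
        self.m = 0.0  # exponential average of the gradient
        self.v = 0.0  # exponential average of the squared gradient
        self.t = 0    # number of updates so far

    def step(self, theta, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grad * grad
        m_hat, v_hat = self.m, self.v
        if self.bias_correction:
            m_hat = self.m / (1.0 - self.beta1 ** self.t)
            v_hat = self.v / (1.0 - self.beta2 ** self.t)
        return theta - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


optimizer = AdamScalar()
theta_next = optimizer.step(theta=1.0, grad=2.0)
```

With bias correction enabled, the first update moves the parameter by almost exactly the learning rate in the direction opposite to the gradient, regardless of the gradient magnitude.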
Computational workflow (Figure 4)
We first preprocessed iPSC-CM and adult-CM APs by applying a digital forward and backward data filtering technique (Gustafsson, 1996) and calculated the mean values for iPSC-CM and adult-CM AP traces. We removed the calculated mean values from the corresponding AP traces to center the values on zero. Next, the iPSC-CM and adult-CM AP traces were divided by the maximum AP values to normalize them for a more efficient training process. We then split the preprocessed data in a 70:10:20 ratio into training, validation, and test datasets, respectively, and implemented the network architecture using PyTorch (Ketkar, 2017). During the training process, the multitask network received iPSC-CM AP time-course data as inputs and predicted adult-CM AP time courses. The network also received the category (drug-free or drugged) of the iPSC-CM AP data. The network next calculated the MSE (Equation 10) between the predicted AP waveforms and the expected waveforms for adult-CM APs. It also calculated the cross-entropy (Equation 11) between the predicted category for the iPSC-CM AP and the expected value. The cross-entropy was added to the calculated MSE to determine the total loss for training. The ADAM optimization algorithm was then used to update the network weights and bias terms.
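The centering, normalization, and splitting steps can be sketched as follows. The forward-backward filtering step is omitted, and the sketch assumes that the mean and maximum are computed per trace and that the maximum is taken after centering; the text does not specify either detail. The function names and the shuffling seed are likewise illustrative.

```python
import random


def center_and_scale(trace):
    """Subtract the trace mean, then divide by the maximum of the centered trace."""
    mean = sum(trace) / len(trace)
    centered = [v - mean for v in trace]
    peak = max(centered)
    return [v / peak for v in centered]


def split_indices(n, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle trace indices and split them into train, validation, and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


# Toy four-point "AP" in mV, used only to show the calls.
scaled = center_and_scale([-80.0, 20.0, 0.0, -80.0])
train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))  # 70 10 20
```

Splitting by index rather than by value keeps each iPSC-CM trace paired with its adult-CM target and class label when the same indices are applied to all three arrays.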
We continued updating the network parameters (Equation 13) and monitored the network performance for the training and validation datasets until the point at which the network performance on the training dataset began to degrade compared to the validation dataset. This process was used to identify the optimal number of iterations (epochs = 300) for the training process. The last trained network was designated as the best model to perform both translation and classification tasks. We then used a holdout test dataset and calculated the MSE (Equation 10), R2 score (Equations 16 and 17), and the error in prediction for adult-CM APD90 as evaluation metrics to assess the performance of the network for the translation task, and the AUROC, accuracy, recall, and precision to measure the capability of the network for the classification task, as described below. The network codes have been made publicly available at Clancy lab GitHub (https://github.com/ClancyLabUCD/Multitask_network).
Evaluation metrics for the translation and classification tasks
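As discussed above, we used the MSE and cross-entropy loss functions to evaluate performance on the translation and classification tasks. In addition to MSE, we computed the R2 score (Devore, 2011; Equations 16 and 17) to measure how close the translated adult-CM AP ($\hat{y}_T$) was to the expected simulated adult-CM AP ($y_T$). We compared the histogram distribution of simulated and translated adult-CM APD90 values and the error in APD90 prediction to assess the accuracy of network prediction.
(16) $R^2 = 1 - \frac{\sum_{k=1}^{m}\left(y_{T,k} - \hat{y}_{T,k}\right)^2}{\sum_{k=1}^{m}\left(y_{T,k} - \bar{y}_T\right)^2}$
(17) $\bar{y}_T = \frac{1}{m}\sum_{k=1}^{m} y_{T,k}$
where $m$ is the number of points in the AP trace and $\bar{y}_T$ is the mean of the simulated adult-CM AP values.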
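The R2 score used for the translation task is the usual coefficient of determination, which can be computed directly from a simulated and a translated trace. The function name and the toy values below are illustrative only.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination between expected and translated values."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot


print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # 0.5
```

A perfect translation gives a score of 1, and a translation that only reproduces the mean of the expected trace gives a score of 0.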
We used AUROC to measure the capability of the model to distinguish between drug-free and drugged iPSC-CM APs (Fawcett, 2006). AUROC is the area under the receiver operating characteristic (ROC) curve, which plots the true-positive rate (TPR) or recall, the probability that the network correctly classified drugged iPSC-CM APs into the drugged category (TP) (Equation 19), against the false-positive rate (FPR), the probability that the network classified drug-free iPSC-CM APs into the drugged category (FP) (Equation 18). An AUROC close to 1 indicated a model with a desirable measure of separability, while an AUROC near 0.5 indicated separability no better than chance.
In addition, we used recall, accuracy, and precision to describe the performance of the network for the classification task (Sube and Ertel, 2017). Accuracy indicated the proportion of all predictions that were correct, that is, TP plus true negatives (TN; drug-free APs correctly predicted as drug-free) (Equation 20), and precision indicated the proportion of positive identifications that were correct (Equation 21). False negatives (FN) in Equations 19 and 20 were the total number of drugged iPSC-CM APs classified as drug-free.
(18) $\mathrm{FPR} = \frac{FP}{FP + TN}$
(19) $\mathrm{TPR} = \mathrm{recall} = \frac{TP}{TP + FN}$
(20) $\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
(21) $\mathrm{precision} = \frac{TP}{TP + FP}$
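The classification metrics can be computed from the four confusion counts, and AUROC can be obtained from its rank interpretation: the probability that a randomly chosen drugged AP receives a higher score than a randomly chosen drug-free AP (Fawcett, 2006). The sketch below uses hypothetical function names and toy labels; the quadratic pairwise AUROC is chosen for readability, not efficiency.

```python
def confusion_counts(labels, predictions):
    """Count TP, FP, TN, FN with 1 = drugged (positive) and 0 = drug-free."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """False-positive rate, recall, accuracy, and precision from the counts."""
    return {
        "fpr": fp / (fp + tn),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
    }


def auroc(labels, scores):
    """AUROC as the fraction of (drugged, drug-free) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


counts = confusion_counts([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
metrics = classification_metrics(*counts)
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The metrics take hard class predictions, whereas AUROC takes the predicted probabilities, so it does not depend on the threshold used to assign a class.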
Acknowledgements
This study was supported by the NIH Common Fund OT2OD026580, SPARC OT2OD025308‐01S2, American Heart Association Career Development Award 19CDA34770101, NIH NHLBI grants R01HL152681, R01HL128170 and U01HL126273, Department of Physiology and Membrane Biology Research Partnership Fund, Extreme Science and Engineering Discovery Environment (XSEDE) Grant MCB170095, National Center for Supercomputing Applications (NCSA) Blue Waters Broadening Participation Allocation, Texas Advanced Computing Center (TACC) Leadership Resource Allocation MCB20010, and an Oracle cloud for research allocation.
Funding Statement
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Contributor Information
Colleen E Clancy, Email: ceclancy@ucdavis.edu.
Thomas Hund, The Ohio State University, United States.
José D Faraldo-Gómez, National Heart, Lung and Blood Institute, National Institutes of Health, United States.
Funding Information
This paper was supported by the following grants:
NIH OT2OD026580 to Igor Vorobyov, Colleen E Clancy.
NIH OT2OD025308‐01S2 to Parya Aghasafari.
American Heart Association 19CDA34770101 to Igor Vorobyov.
National Heart, Lung, and Blood Institute R01HL152681 to Igor Vorobyov, Colleen E Clancy.
National Heart, Lung, and Blood Institute R01HL128170 to Colleen E Clancy.
National Heart, Lung, and Blood Institute U01HL126273 to Colleen E Clancy.
UC Davis Department of Physiology and Membrane Biology Research Partnership Fund to Igor Vorobyov, Colleen E Clancy.
National Science Foundation MCB170095 to Igor Vorobyov, Colleen E Clancy.
National Center for Supercomputing Applications to Igor Vorobyov, Colleen E Clancy.
Texas Advanced Computing Center MCB20010 to Igor Vorobyov, Colleen E Clancy.
Oracle to Igor Vorobyov, Colleen E Clancy.
Additional information
Competing interests
No competing interests declared.
Author contributions
Conceptualization, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing.
Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing.
Data curation, Methodology, Writing - original draft.
Data curation, Validation.
Data curation, Validation.
Data curation, Validation.
Investigation, Writing - review and editing.
Conceptualization, Resources, Data curation, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing - original draft, Project administration, Writing - review and editing.
Additional files
Data availability
Since we used simulated data, we have made all drugged and drug-free iPSC-CM and adult-CM AP data used for training and testing the multitask network publicly available at Clancy lab GitHub (https://github.com/ClancyLabUCD/Multitask_network/tree/master/data, copy archived at https://archive.softwareheritage.org/swh:1:rev:7f2b653a91f552d66ae2d9b70b720f8706b36da3). In addition, we have illustrated the training and test datasets in Figure 1 and Figure 5. We have also shared the Jupyter notebook for preparing clean and organized data for training the network at Clancy lab GitHub (https://github.com/ClancyLabUCD/Multitask_network/tree/master/jupyter). We also made the experimental data used for the model validation publicly available at Clancy lab GitHub (https://github.com/ClancyLabUCD/Multitask_network/blob/master/data/clean_data/experiments.csv). Figure 7 illustrates the experimental data we used to validate the network.
References
- Aghasafari P. Multitask-network. 7f2b653. GitHub. 2021. https://github.com/ClancyLabUCD/Multitask_network
- Alhusseini MI, Abuzaid F, Rogers AJ, Zaman JAB, Baykaner T, Clopton P, Bailis P, Zaharia M, Wang PJ, Rappel WJ, Narayan SM. Machine learning to classify intracardiac electrical patterns during atrial fibrillation: machine learning of atrial fibrillation. Circulation: Arrhythmia and Electrophysiology. 2020;13:e008160. doi: 10.1161/CIRCEP.119.008160.
- Atkinson KE. An Introduction to Numerical Analysis. John Wiley & Sons; 2008.
- Ballinger B. DeepHeart: semi-supervised sequence learning for cardiovascular risk prediction. In Thirty-Second AAAI Conference On Artificial Intelligence. 2018.
- Bian M. An accurate lstm based video heart rate estimation method. In Chinese Conference on Pattern Recognition and Computer Vision (PRCV). 2019.
- Blinova K, Stohlman J, Vicente J, Chan D, Johannesen L, Hortigon-Vinagre MP, Zamora V, Smith G, Crumb WJ, Pang L, Lyn-Cook B, Ross J, Brock M, Chvatal S, Millard D, Galeotti L, Stockbridge N, Strauss DG. Comprehensive translational assessment of Human-Induced pluripotent stem cell derived cardiomyocytes for evaluating Drug-Induced arrhythmias. Toxicological Sciences. 2017;155:234–247. doi: 10.1093/toxsci/kfw200.
- Blinova K, Dang Q, Millard D, Smith G, Pierson J, Guo L, Brock M, Lu HR, Kraushaar U, Zeng H, Shi H, Zhang X, Sawada K, Osada T, Kanda Y, Sekino Y, Pang L, Feaster TK, Kettenhofen R, Stockbridge N, Strauss DG, Gintant G. International multisite study of Human-Induced pluripotent stem Cell-Derived cardiomyocytes for drug proarrhythmic potential assessment. Cell Reports. 2018;24:3582–3592. doi: 10.1016/j.celrep.2018.08.079.
- Burridge PW, Li YF, Matsa E, Wu H, Ong SG, Sharma A, Holmström A, Chang AC, Coronado MJ, Ebert AD, Knowles JW, Telli ML, Witteles RM, Blau HM, Bernstein D, Altman RB, Wu JC. Human induced pluripotent stem cell-derived cardiomyocytes recapitulate the predilection of breast Cancer patients to doxorubicin-induced cardiotoxicity. Nature Medicine. 2016;22:547–556. doi: 10.1038/nm.4087.
- Cai C, Guo P, Zhou Y, Zhou J, Wang Q, Zhang F, Fang J, Cheng F. Deep Learning-Based prediction of Drug-Induced cardiotoxicity. Journal of Chemical Information and Modeling. 2019;59:1073–1084. doi: 10.1021/acs.jcim.8b00769.
- Carugo O, Eisenhaber F. Data Mining Techniques for the Life Sciences. Springer; 2010.
- Casini S, Verkerk AO, Remme CA. Human iPSC-Derived cardiomyocytes for investigation of disease mechanisms and therapeutic strategies in inherited arrhythmia syndromes: strengths and limitations. Cardiovascular Drugs and Therapy. 2017;31:325–344. doi: 10.1007/s10557-017-6735-0.
- Chen C, Hua Z, Zhang R, Liu G, Wen W. Automated arrhythmia classification based on a combination network of CNN and LSTM. Biomedical Signal Processing and Control. 2020;57:101819. doi: 10.1016/j.bspc.2019.101819.
- Cheng J, Dong L, Lapata M. Long short-term memory-networks for machine reading. arXiv. 2016. https://arxiv.org/abs/1601.06733
- Collins TA, Rolf MG, Pointon A. Current and future approaches to nonclinical cardiovascular safety assessment. Drug Discovery Today. 2020;25:1129–1134. doi: 10.1016/j.drudis.2020.03.011.
- de Korte T, Katili PA, Mohd Yusof NAN, van Meer BJ, Saleem U, Burton FL, Smith GL, Clements P, Mummery CL, Eschenhagen T, Hansen A, Denning C. Unlocking personalized biomedicine and drug discovery with human induced pluripotent stem Cell-Derived cardiomyocytes: fit for purpose or forever elusive? Annual Review of Pharmacology and Toxicology. 2020;60:529–551. doi: 10.1146/annurev-pharmtox-010919-023309.
- Devore JL. Probability and Statistics for Engineering and the Sciences. Cengage Learning; 2011.
- Dickson CJ, Velez-Vega C, Duca JS. Revealing molecular determinants of hERG blocker and activator binding. Journal of Chemical Information and Modeling. 2020;60:192–203. doi: 10.1021/acs.jcim.9b00773.
- Doss MX, Sachinidis A. Current challenges of iPSC-Based disease modeling and therapeutic implications. Cells. 2019;8:403. doi: 10.3390/cells8050403.
- Fabbri A, Goversen B, Vos MA, van Veen TAB, de Boer TP. Required GK1 to suppress automaticity of iPSC-CMs depends strongly on IK1 Model Structure. Biophysical Journal. 2019;117:2303–2315. doi: 10.1016/j.bpj.2019.08.040.
- Fawcett T. An introduction to ROC analysis. Pattern Recognition Letters. 2006;27:861–874. doi: 10.1016/j.patrec.2005.10.010.
- Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. In Proceedings of The Fourteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings; 2011.
- Gong JQX, Sobie EA. Population-based mechanistic modeling allows for quantitative predictions of drug responses across cell types. Npj Systems Biology and Applications. 2018;4:1–11. doi: 10.1038/s41540-018-0047-2.
- Goodfellow I, Bengio Y, Courville A. Deep Learning. MIT Press; 2016.
- Goversen B, van der Heyden MAG, van Veen TAB, de Boer TP. The immature electrophysiological phenotype of iPSC-CMs still hampers in vitro drug screening: special focus on IK1. Pharmacology & Therapeutics. 2018;183:127–136. doi: 10.1016/j.pharmthera.2017.10.001.
- Guo A, Beheshti R, Khan YM, Langabeer JR, Foraker RE. Predicting cardiovascular health trajectories in time-series electronic health records with LSTM models. BMC Medical Informatics and Decision Making. 2021;21:1–10. doi: 10.1186/s12911-020-01345-1.
- Gustafsson F. Determining the initial states in forward-backward filtering. IEEE Transactions on Signal Processing. 1996;44:988–992. doi: 10.1109/78.492552.
- He R, Liu Y, Wang K, Zhao N, Yuan Y, Li Q, Zhang H. Automatic cardiac arrhythmia classification using combination of deep residual network and bidirectional LSTM. IEEE Access. 2019;7:102119–102135. doi: 10.1109/ACCESS.2019.2931500.
- Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;9:1735–1780. doi: 10.1162/neco.1997.9.8.1735.
- Hou B, Yang J, Wang P, Yan R. LSTM-Based Auto-Encoder model for ECG arrhythmias classification. IEEE Transactions on Instrumentation and Measurement. 2019;69:1232–1240. doi: 10.1109/TIM.2019.2910342.
- Jin Z, Oresko J, Huang S, Cheng AC. HeartToGo: a personalized medicine technology for cardiovascular disease prevention and detection. IEEE/NIH Life Science Systems and Applications Workshop; 2009.
- Kernik DC, Morotti S, Wu H, Garg P, Duff HJ, Kurokawa J, Jalife J, Wu JC, Grandi E, Clancy CE. A computational model of induced pluripotent stem-cell derived cardiomyocytes incorporating experimental variability from multiple data sources. The Journal of Physiology. 2019;597:4533–4564. doi: 10.1113/JP277724.
- Kernik DC, Yang PC, Kurokawa J, Wu JC, Clancy CE. A computational model of induced pluripotent stem-cell derived cardiomyocytes for high throughput risk stratification of KCNQ1 genetic variants. PLOS Computational Biology. 2020;16:e1008109. doi: 10.1371/journal.pcbi.1008109.
- Ketkar N. Introduction to pytorch. In: Chollet François., editor. Deep Learning with Python. Springer; 2017. pp. 195–208.
- Kingma DP, Ba J. Adam: a method for stochastic optimization. arXiv. 2014. https://arxiv.org/abs/1412.6980
- Knollmann BC. Induced pluripotent stem cell-derived cardiomyocytes: boutique science or valuable arrhythmia model? Circulation Research. 2013;112:969–976. doi: 10.1161/CIRCRESAHA.112.300567.
- Koivumäki JT, Naumenko N, Tuomainen T, Takalo J, Oksanen M, Puttonen KA, Lehtonen Š, Kuusisto J, Laakso M, Koistinaho J, Tavi P. Structural immaturity of human iPSC-Derived cardiomyocytes: In Silico Investigation of Effects on Function and Disease Modeling. Frontiers in Physiology. 2018;9:80. doi: 10.3389/fphys.2018.00080.
- Krogh A. What are artificial neural networks? Nature Biotechnology. 2008;26:195–197. doi: 10.1038/nbt1386.
- Lan F, Lee AS, Liang P, Sanchez-Freire V, Nguyen PK, Wang L, Han L, Yen M, Wang Y, Sun N, Abilez OJ, Hu S, Ebert AD, Navarrete EG, Simmons CS, Wheeler M, Pruitt B, Lewis R, Yamaguchi Y, Ashley EA, Bers DM, Robbins RC, Longaker MT, Wu JC. Abnormal calcium handling properties underlie familial hypertrophic cardiomyopathy pathology in patient-specific induced pluripotent stem cells. Cell Stem Cell. 2013;12:101–113. doi: 10.1016/j.stem.2012.10.010.
- LeCun Y, Denker J, Solla S. Optimal brain damage. Advances in Neural Information Processing Systems; 1989. pp. 598–605.
- Leyton-Mange JS, Mills RW, Macri VS, Jang MY, Butte FN, Ellinor PT, Milan DJ. Rapid cellular phenotyping of human pluripotent stem cell-derived cardiomyocytes using a genetically encoded fluorescent voltage sensor. Stem Cell Reports. 2014;2:163–170. doi: 10.1016/j.stemcr.2014.01.003.
- Li M, Kanda Y, Ashihara T, Sasano T, Nakai Y, Kodama M, Hayashi E, Sekino Y, Furukawa T, Kurokawa J. Overexpression of KCNJ2 in induced pluripotent stem cell-derived cardiomyocytes for the assessment of QT-prolonging drugs. Journal of Pharmacological Sciences. 2017;134:75–85. doi: 10.1016/j.jphs.2017.05.004. [DOI] [PubMed] [Google Scholar]
- Li Z, Mirams GR, Yoshinaga T, Ridder BJ, Han X, Chen JE, Stockbridge NL, Wisialowski TA, Damiano B, Severi S, Morissette P, Kowey PR, Holbrook M, Smith G, Rasmusson RL, Liu M, Song Z, Qu Z, Leishman DJ, Steidl-Nichols J, Rodriguez B, Bueno-Orovio A, Zhou X, Passini E, Edwards AG, Morotti S, Ni H, Grandi E, Clancy CE, Vandenberg J, Hill A, Nakamura M, Singer T, Polonchuk L, Greiter-Wilke A, Wang K, Nave S, Fullerton A, Sobie EA, Paci M, Musuamba Tshinanu F, Strauss DG. General principles for the validation of proarrhythmia risk prediction models: an extension of the CiPA in silico strategy. Clinical Pharmacology & Therapeutics. 2020;107:102–111. doi: 10.1002/cpt.1647. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lieu DK, Fu JD, Chiamvimonvat N, Tung KC, McNerney GP, Huser T, Keller G, Kong CW, Li RA. Mechanism-based facilitated maturation of human pluripotent stem cell-derived cardiomyocytes. Circulation: Arrhythmia and Electrophysiology. 2013;6:191–201. doi: 10.1161/CIRCEP.111.973420. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Liu F, Zhou X, Cao J, Wang Z, Wang H, Zhang Y. A LSTM and CNN based assemble neural network framework for arrhythmias classification. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2019. [DOI] [Google Scholar]
- Maragatham G, Devi S. LSTM model for prediction of heart failure in big data. Journal of Medical Systems. 2019;43:1–13. doi: 10.1007/s10916-019-1243-3. [DOI] [PubMed] [Google Scholar]
- Martis RJ, Acharya UR, Lim CM, Mandana KM, Ray AK, Chakraborty C. Application of higher order cumulant features for cardiac health diagnosis using ECG signals. International Journal of Neural Systems. 2013;23:1350014. doi: 10.1142/S0129065713500147. [DOI] [PubMed] [Google Scholar]
- Matsa E, Ahrens JH, Wu JC. Human induced pluripotent stem cells as a platform for personalized and precision cardiovascular medicine. Physiological Reviews. 2016;96:1093–1126. doi: 10.1152/physrev.00036.2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Murphy KP. Machine Learning: A Probabilistic Perspective. MIT Press; 2012.
- Navarrete EG, Liang P, Lan F, Sanchez-Freire V, Simmons C, Gong T, Sharma A, Burridge PW, Patlolla B, Lee AS, Wu H, Beygui RE, Wu SM, Robbins RC, Bers DM, Wu JC. Screening drug-induced arrhythmia using human induced pluripotent stem cell-derived cardiomyocytes and low-impedance microelectrode arrays. Circulation. 2013;128:S3–S13. doi: 10.1161/CIRCULATIONAHA.112.000570.
- O'Hara T, Virág L, Varró A, Rudy Y. Simulation of the undiseased human cardiac ventricular action potential: model formulation and experimental validation. PLOS Computational Biology. 2011;7:e1002061. doi: 10.1371/journal.pcbi.1002061.
- Oh SL, Ng EYK, Tan RS, Acharya UR. Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats. Computers in Biology and Medicine. 2018;102:278–287. doi: 10.1016/j.compbiomed.2018.06.002.
- Olah C. Understanding LSTM Networks. GitHub. 2015 https://colah.github.io/posts/2015-08-Understanding-LSTMs
- Picon A, Irusta U, Álvarez-Gila A, Aramendi E, Alonso-Atienza F, Figuera C, Ayala U, Garrote E, Wik L, Kramer-Johansen J, Eftestøl T. Mixed convolutional and long short-term memory network for the detection of lethal ventricular arrhythmia. PLOS ONE. 2019;14:e0216756. doi: 10.1371/journal.pone.0216756.
- Powers DM. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv. 2011 https://arxiv.org/abs/2010.16061
- Reale RA, Brugge JF, Chan JCK. Maps of auditory cortex in cats reared after unilateral cochlear ablation in the neonatal period. Developmental Brain Research. 1987;34:281–290. doi: 10.1016/0165-3806(87)90215-X.
- Rogers AJ. Machine learned cellular phenotypes predict outcome in ischemic cardiomyopathy. Circulation Research. 2020;128:172–184. doi: 10.1161/CIRCRESAHA.120.317345.
- Ryu JY, Lee MY, Lee JH, Lee BH, Oh KS. DeepHIT: a deep learning framework for prediction of hERG-induced cardiotoxicity. Bioinformatics. 2020;36:3049–3055. doi: 10.1093/bioinformatics/btaa075.
- Sala L, Bellin M, Mummery CL. Integrating cardiomyocytes from human pluripotent stem cells in safety pharmacology: has the time come? British Journal of Pharmacology. 2017;174:3749–3765. doi: 10.1111/bph.13577.
- Sayed N, Liu C, Wu JC. Translation of human-induced pluripotent stem cells: from clinical trial in a dish to precision medicine. Journal of the American College of Cardiology. 2016;67:2161–2176. doi: 10.1016/j.jacc.2016.01.083.
- Sevakula RK, Au-Yeung WM, Singh JP, Heist EK, Isselbacher EM, Armoundas AA. State-of-the-art machine learning techniques aiming to improve patient outcomes pertaining to the cardiovascular system. Journal of the American Heart Association. 2020;9:e013924. doi: 10.1161/JAHA.119.013924.
- Shaheen N, Shiti A, Huber I, Shinnawi R, Arbel G, Gepstein A, Setter N, Goldfracht I, Gruber A, Chorna SV, Gepstein L. Human induced pluripotent stem cell-derived cardiac cell sheets expressing genetically encoded voltage indicator for pharmacological and arrhythmia studies. Stem Cell Reports. 2018;10:1879–1894. doi: 10.1016/j.stemcr.2018.04.006.
- Shi K, Steigleder T, Schellenberger S, Michler F, Malessa A, Lurz F, Rohleder N, Ostgathe C, Weigel R, Koelpin A. Contactless analysis of heart rate variability during cold pressor test using radar interferometry and bidirectional LSTM networks. Scientific Reports. 2021;11:1–13. doi: 10.1038/s41598-021-81101-1.
- Sinnecker D, Goedel A, Laugwitz KL, Moretti A. Induced pluripotent stem cell-derived cardiomyocytes: a versatile tool for arrhythmia research. Circulation Research. 2013;112:961–968. doi: 10.1161/CIRCRESAHA.112.268623.
- Sube R, Ertel EA. Cardiomyocytes derived from human induced pluripotent stem cells: an in-vitro model to predict cardiac effects of drugs. Journal of Biomedical Science and Engineering. 2017;10:527–549. doi: 10.4236/jbise.2017.1011040.
- Sun N, Yazawa M, Liu J, Han L, Sanchez-Freire V, Abilez OJ, Navarrete EG, Hu S, Wang L, Lee A, Pavlovic A, Lin S, Chen R, Hajjar RJ, Snyder MP, Dolmetsch RE, Butte MJ, Ashley EA, Longaker MT, Robbins RC, Wu JC. Patient-specific induced pluripotent stem cells as a model for familial dilated cardiomyopathy. Science Translational Medicine. 2012;4:130ra47. doi: 10.1126/scitranslmed.3003552.
- Tanskanen AJ, Alvarez LH. Voltage noise influences action potential duration in cardiac myocytes. Mathematical Biosciences. 2007;208:125–146. doi: 10.1016/j.mbs.2006.09.023.
- Trayanova NA, Popescu DM, Shade JK. Machine learning in arrhythmia and electrophysiology. Circulation Research. 2021;128:544–566. doi: 10.1161/CIRCRESAHA.120.317872.
- Tu C, Chao BS, Wu JC. Strategies for improving the maturity of human induced pluripotent stem cell-derived cardiomyocytes. Circulation Research. 2018;123:512–514. doi: 10.1161/CIRCRESAHA.118.313472.
- Tveito A, Jæger KH, Huebsch N, Charrez B, Edwards AG, Wall S, Healy KE. Inversion and computational maturation of drug response using human stem cell derived cardiomyocytes in microphysiological systems. Scientific Reports. 2018;8:1–14. doi: 10.1038/s41598-018-35858-7.
- Tveito A, Jæger KH, Maleckar MM, Giles WR, Wall S. Computational translation of drug effects from animal experiments to human ventricular myocytes. Scientific Reports. 2020;10:1–11. doi: 10.1038/s41598-020-66910-0.
- Veerman CC, Kosmidis G, Mummery CL, Casini S, Verkerk AO, Bellin M. Immaturity of human stem-cell-derived cardiomyocytes in culture: fatal flaw or soluble problem? Stem Cells and Development. 2015;24:1035–1052. doi: 10.1089/scd.2014.0533.
- Wang EK, Zhang X, Pan L. Automatic classification of CAD ECG signals with SDAE and bidirectional long short-term network. IEEE Access. 2019;7:182873–182880. doi: 10.1109/ACCESS.2019.2936525.
- Wang L, Zhou X. Detection of congestive heart failure based on LSTM-based deep network via short-term RR intervals. Sensors. 2019;19:1502. doi: 10.3390/s19071502.
- Warrick P, Homsi MN. Cardiac arrhythmia detection from ECG combining convolutional and long short-term memory networks. In 2017 Computing in Cardiology (CinC); 2017.
- Wu JC, Garg P, Yoshida Y, Yamanaka S, Gepstein L, Hulot JS, Knollmann BC, Schwartz PJ. Towards precision medicine with human iPSCs for cardiac channelopathies. Circulation Research. 2019;125:653–658. doi: 10.1161/CIRCRESAHA.119.315209.
- Yang PC, Song Y, Giles WR, Horvath B, Chen-Izu Y, Belardinelli L, Rajamani S, Clancy CE. A computational modelling approach combined with cellular electrophysiology data provides insights into the therapeutic benefit of targeting the late Na+ current. The Journal of Physiology. 2015;593:1429–1442. doi: 10.1113/jphysiol.2014.279554.
- Yang PC, DeMarco KR, Aghasafari P, Jeng MT, Dawson JRD, Bekker S, Noskov SY, Yarov-Yarovoy V, Vorobyov I, Clancy CE. A computational pipeline to predict cardiotoxicity: from the atom to the rhythm. Circulation Research. 2020;126:947–964. doi: 10.1161/CIRCRESAHA.119.316404.
- Yildirim Ö. A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification. Computers in Biology and Medicine. 2018;96:189–202. doi: 10.1016/j.compbiomed.2018.03.016.
- Yildirim O, Baloglu UB, Tan RS, Ciaccio EJ, Acharya UR. A new approach for arrhythmia classification using deep coded features and LSTM networks. Computer Methods and Programs in Biomedicine. 2019;176:121–133. doi: 10.1016/j.cmpb.2019.05.004.
- Yoshinaga D, Baba S, Makiyama T, Shibata H, Hirata T, Akagi K, Matsuda K, Kohjitani H, Wuriyanghai Y, Umeda K, Yamamoto Y, Conklin BR, Horie M, Takita J, Heike T. Phenotype-based high-throughput classification of long QT syndrome subtypes using human induced pluripotent stem cells. Stem Cell Reports. 2019;13:394–404. doi: 10.1016/j.stemcr.2019.06.007.
- Zaremba W, Sutskever I, Vinyals O. Recurrent neural network regularization. arXiv. 2014 https://arxiv.org/abs/1409.2329
- Zhang Y, Zhao J, Wang Y, Fan Y, Zhu L, Yang Y, Chen X, Lu T, Chen Y, Liu H. Prediction of hERG K+ channel blockage using deep neural networks. Chemical Biology & Drug Design. 2019;94:1973–1985. doi: 10.1111/cbdd.13600.