Abstract
The rapid delayed rectifier current carried by the human Ether-à-go-go-Related Gene (hERG) channel is susceptible to drug-induced reduction, which can lead to an increased risk of cardiac arrhythmia. Establishing the mechanism by which a specific drug compound binds to hERG can help reduce uncertainty when quantifying pro-arrhythmic risk. In this study, we introduce a methodology for optimizing experimental voltage protocols to produce data that enable different proposed models for the drug-binding mechanism to be distinguished. We demonstrate the performance of this methodology via a synthetic data study. If the underlying model of hERG current is known exactly, then the optimized protocols generated show noticeable improvements in our ability to select the true model when compared with a simple protocol used in previous studies. However, if the model is not known exactly, and we assume a discrepancy between the data-generating hERG model and the hERG model used in fitting the models, then the optimized protocols become less effective in determining the ‘true’ binding dynamics. While the introduced methodology shows promise, we must be careful to ensure that, if applied to a real data study, we have a well-calibrated model of hERG current gating.
This article is part of the theme issue ‘Uncertainty quantification for healthcare and biological systems (Part 1)’.
Keywords: hERG, mathematical model, safety pharmacology, binding mechanism, model discrepancy, experimental design
1. Introduction
Ion channels are proteins in the cell membrane that form pores through which ions can flow in and out of the cell. The resulting ion currents play an important role in several biological functions including coordinating the contraction of muscle cells. A healthy heart relies on regular, coordinated contractions of cardiomyocytes (heart muscle cells) to pump blood from the heart around the body [1]. The Kv11.1 ion channel encoded by the human Ether-à-go-go-Related Gene (hERG) is responsible for conducting the rapid delayed rectifier potassium current, $I_{\mathrm{Kr}}$, and plays a crucial role in cardiomyocytes recovering from excitation [2]. However, the hERG channel is susceptible to unintended block by pharmaceutical small molecules (referred to as ‘compounds’ throughout); this can lead to a reduction in $I_{\mathrm{Kr}}$, lengthening the cardiac action potential, and, in some cases, increasing the risk of cardiac arrhythmia [3–5].
Markov-style computational models of ion channels define transition rates between several channel states (e.g. open, inactive and closed) and can be used to simulate channel current in response to a membrane potential. To model the interactions of drug compounds with an ion channel, additional states and rates can be introduced to an existing ion channel model to simulate various binding mechanisms [6]. When it comes to the hERG channel, it has been observed that the binding mechanism can be compound-specific [7–9]. One such example of this is the propensity of a compound to become ‘trapped’ (unable to unbind) when the channel closes. Some compounds, such as bepridil and dofetilide, are known to become trapped inside the central hERG cavity, remaining bound, while others, such as cisapride and verapamil, unbind when the channel closes [10–12]. It has also been theorized that some drugs bind preferentially to certain channel states over others, or in the extreme case, bind to a particular state only [13–17]. These compound-specific binding mechanisms suggest that a one-size-fits-all approach to modelling hERG-drug interactions is perhaps limiting [6]. To accurately model how a certain compound binds to the hERG channel, it is, therefore, important to determine the specific mechanisms at play.
The transmembrane current of a cell can be measured in response to the membrane potential via a voltage-clamp experiment, where a piecewise function defining the transmembrane voltage, $V$, dependent on time, $t$, is applied. We refer to this function, $V(t)$, as a voltage protocol. In a recent study, Lei et al. [18] considered fitting a set of 15 pharmacological models representing possible drug-binding mechanisms (trapping/non-trapping, state binding preference, etc.) to previously collected voltage-clamp data under a relatively simple protocol. After fitting the set of models, it was suggested that more information-rich experiments may be needed to distinguish between model outputs and assist in determining compound-specific binding mechanisms. Previous studies on drug-binding dynamics have considered ‘manual’ experimental design techniques to increase the information extracted from voltage-clamp experiments [19–23]. This often involves some degree of expert knowledge to design protocols that are expected to emphasize particular compound-specific behaviours. In this work, we instead consider ‘automated’ optimal experimental design (OED) techniques.
OED methods consider how the design of a data-collecting experiment can be optimized with respect to some statistical criterion, effectively maximizing the information provided by the experiment (subject to constraints). These methods have been used recently in the field of cardiac modelling with some success [24]. In this paper, we consider OED methods to design voltage protocols that can be used in voltage-clamp experiments to better distinguish between different models of drug-binding mechanism. We detail a synthetic data methodology for generating an optimized protocol and fitting models to data collected under this protocol. Our results demonstrate how the optimized designs can assist in establishing the true binding dynamics at play across a range of simulated drug compounds exhibiting differing dynamics. However, we find that introducing a discrepancy between the hidden ‘true’ data-generating model and the proposed model we work with to fit the data reduces the effectiveness of this method and suggests that further work may be needed to account for these inevitable model discrepancies when working with real data.
2. Mathematical models
(a). hERG physiological models
In this paper, we consider two physiological models of hERG: one simple four-state model that is used throughout, and a slightly more complex five-state model exhibiting differing behaviour that is used when we introduce hERG model discrepancy. These models describe the voltage-dependent gating behaviour of $I_{\mathrm{Kr}}$ at physiological temperature when no drug compound is present. Figure 1 shows Markov diagrams of the two hERG models we consider. Figure 1a shows the four-state model, with transition rates $k_1$–$k_4$, which is equivalent to physiological model B in Lei et al. [18]. The four states in this model are IC (inactive closed), C (closed), I (inactive) and O (open), where each variable represents the proportion of channels in that state. Figure 1b shows the five-state model from Lu et al. [25], which has ten transition rates. This model has three closed states (C1, C2 and C3), and no states that are both inactive and closed. All transition rates in both models (apart from two voltage-independent rates) are dependent on transmembrane voltage, $V$; they are defined by the general equation $k = A \exp(BV)$, where $A$ and $B$ are physiological model parameters taken from the literature. In both models, the rate of change of each state over time, $t$, is defined by a differential equation. For example, for the model in figure 1a, the rate of change of open proportion is defined by
Figure 1.
Markov diagrams of physiological hERG models (a,b) and a pharmacological binding model (c). The model in (a) is a four-state symmetric hERG model as used by Lei et al. [18]. The four states are IC (inactive closed), C (closed), I (inactive) and O (open). The model in (b) is a five-state model equivalent to that derived by Lu et al. [25]. This model has three closed states (C1, C2 and C3) as well as O (open) and I (inactive) states. The pharmacological drug-binding in (c) is Model 7 from Lei et al. [18]. The I and O states represent the inactive and open states in the underlying physiological hERG models (this model could be attached to the right-hand side of either physiological model shown in (a) or (b)). The binding model additionally has two drug-bound states ID (inactive drug-bound) and OD (open drug-bound), in which no current flows.
$$\frac{\mathrm{d}O}{\mathrm{d}t} = k_1 C + k_4 I - (k_2 + k_3)\,O. \tag{2.1}$$
The measured current, $I_{\mathrm{Kr}}$, is then calculated for both models via the following equation
$$I_{\mathrm{Kr}} = g\,O\,(V - E_{\mathrm{K}}), \tag{2.2}$$
where $g$ is conductance, $O$ is the open state proportion and $E_{\mathrm{K}}$ is the Nernst potential (membrane potential at which there is no net flux).
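As a rough illustration of how a four-state Markov model and the current equation fit together, the following Python sketch integrates the state proportions with a fixed-step forward-Euler scheme. It is not the Myokit implementation used in this study. The wiring of the rates (k1/k2 for opening and closing, k3/k4 for inactivation and recovery), the exponential rate form, the values in `EXAMPLE_PARAMS` and the reversal potential are all assumptions chosen purely for illustration; only the 33.3 nS conductance is taken from the text.

```python
import math

# Hypothetical (A, B) pairs for each rate k = A * exp(B * V); not fitted values.
EXAMPLE_PARAMS = {
    "k1": (2e-4, 0.07),
    "k2": (3e-5, -0.05),
    "k3": (0.1, 0.02),
    "k4": (5e-3, -0.03),
}


def rate(a, b, v):
    """Voltage-dependent rate (1/ms) at voltage v (mV), assumed exponential form."""
    return a * math.exp(b * v)


def simulate_current(protocol, params, g=33.3, e_k=-88.0, dt=0.1):
    """Forward-Euler simulation of a four-state (C, O, I, IC) model.

    protocol: list of (voltage_mV, duration_ms) steps.
    Returns (trace, final_state), where trace is a list of (time_ms, current)
    samples and final_state is the tuple (C, O, I, IC).
    e_k is a hypothetical reversal potential in mV.
    """
    c, o, i, ic = 1.0, 0.0, 0.0, 0.0  # start with every channel closed
    t, trace = 0.0, []
    for v, duration in protocol:
        k1, k2, k3, k4 = (rate(*params[n], v) for n in ("k1", "k2", "k3", "k4"))
        for _ in range(int(round(duration / dt))):
            # Assumed wiring: C <-> O and IC <-> I via k1/k2,
            # O <-> I and C <-> IC via k3/k4.
            d_c = k2 * o + k4 * ic - (k1 + k3) * c
            d_o = k1 * c + k4 * i - (k2 + k3) * o
            d_i = k3 * o + k1 * ic - (k4 + k2) * i
            d_ic = k3 * c + k2 * i - (k4 + k1) * ic
            c, o, i, ic = c + dt * d_c, o + dt * d_o, i + dt * d_i, ic + dt * d_ic
            t += dt
            trace.append((t, g * o * (v - e_k)))  # current = g * O * (V - E_K)
    return trace, (c, o, i, ic)
```

Forward Euler is used here only to keep the sketch dependency-free; a stiff ODE solver would be the more robust choice for realistic rate parameters.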
(b). Pharmacological binding models for hERG
We can extend the Markov models of hERG described in the previous section to model drug-binding dynamics in hERG. We consider a set of 15 models that characterize different proposed candidate mechanisms for drug-binding, as illustrated in Lei et al. ([18]; figure 2) and included in our electronic supplementary material, fig. S1. Figure 1c shows a Markov diagram of drug-binding Model 7 as an example of one of these 15 models. In this example model, we have two additional states: ID (inactive drug-bound) and OD (open drug-bound). Binding rates are described by four parameters, and the rate of on-binding is dependent on the drug concentration and the Hill coefficient. This example model represents non-trapping binding behaviour as the channel cannot enter a closed state without the drug first unbinding (returning from an ID or OD state to an I or O state). In contrast, some of the 15 binding models represent trapping dynamics. The trapping component either involves a ‘mirror image’ of the physiological hERG model, which allows for a channel to close and prevent unbinding from ICD or CD states (corresponding to IC and C in the drug-bound channel), or is represented by additional trapped states. In electronic supplementary material, fig. S1, Models 4, 5, 5i, 6, 9 and 10 illustrate the ‘mirror image’ trapping, while Models 11, 12 and 13 include additional trapped states.
Figure 2.
Synthetic verapamil data under drug-binding Model 7 generated under a Milnes protocol with the Lei 37°C hERG model. On the left, we plot a single sweep of the Milnes protocol. In the top middle, we plot the control (in black) and drug currents (blue, orange, green and red corresponding to four different drug concentrations) for 10 sweeps of the Milnes protocol. We only plot the currents that occur during the 10 s pulse at 0 mV for each sweep. In the bottom middle, we plot the corresponding proportion open for each drug concentration, which is calculated by dividing each drug sweep by the control current. On the right-hand side, we include a zoomed-in look at the first 2 s of the first pulse for both currents and proportion open. This illustrates the currents starting at approximately 0 pA and the open proportion starting at approximately 1. The noise in the initial low currents contributes to the increased noise at the beginning of the open proportion sweeps. This motivates using a noise model for the ratio of two normally distributed variables when fitting drug-binding models to these data.
3. Introducing the OED methodology
(a). Initial synthetic data
To measure the effect of drug block on the hERG channel, we can collect two sets of voltage-clamp time-series data: the control current, recorded before the drug compound is introduced, and the current in the presence of the drug compound (often at several different concentrations). We begin by generating synthetic ion channel drug-binding data in a form resembling what we would expect to collect in a voltage-clamp experiment under a simple modified Milnes voltage protocol as used in the Comprehensive in vitro Proarrhythmia Assay initiative [19,26]. With these synthetic data, we fit the set of 15 drug-binding models from electronic supplementary material, fig. S1 and illustrate a need for a more complex protocol to differentiate between these models. The process by which these synthetic data are generated is described as follows. All electrophysiology simulations are performed in Myokit [27].
In figure 2, we plot the Milnes protocol, i.e. controlled voltage time-series (left), synthetic current data (top middle) and synthetic open proportion data (bottom middle). In the synthetic current plot, the control current in black, , is generated under the Lei Markov model described in figure 1a with parameters for physiological temperatures taken from Lei et al. [28]. We run 10 sweeps (repeats in series) of the Milnes protocol and plot the 10 s of each sweep corresponding to the 0 mV pulse in the protocol. We generate equivalent currents in the presence of four concentrations of verapamil (), and these are plotted in blue (30 nM), orange (100 nM), green (300 nM) and red (1000 nM). Gaussian random noise is added to all sweeps with a standard deviation of 10 pA. We use a step size of 0.5 ms between data points, and we can define as the set of all times for the plotted traces. The plotted proportion of channels open, , in the bottom middle of figure 2 is calculated by dividing the drug current sweeps, , by the control sweep, , and this is the normalized quantity we use to fit the drug-binding models, as in [26,29].
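The data-generation step described above (add Gaussian noise to each trace, then divide the drug trace by the control trace) can be sketched in a few lines of Python. The clean current values below are hypothetical placeholders standing in for simulated traces; only the 10 pA noise standard deviation comes from the text.

```python
import random


def add_noise(trace, sd, rng):
    """Return a copy of trace with iid Gaussian noise of standard deviation sd."""
    return [x + rng.gauss(0.0, sd) for x in trace]


def proportion_open(drug_trace, control_trace):
    """Point-wise ratio of the drug current to the control current."""
    return [d / c for d, c in zip(drug_trace, control_trace)]


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical noise-free currents (pA) at three time points.
control_clean = [50.0, 400.0, 800.0]
drug_clean = [50.0, 300.0, 400.0]

control = add_noise(control_clean, 10.0, rng)
drug = add_noise(drug_clean, 10.0, rng)
ratio = proportion_open(drug, control)
```

Because the control current is small early in the pulse, the same 10 pA of noise perturbs the ratio far more at the first time point than at the later ones.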
As an example for many of the figures in this paper, for our true data-generating model of drug-binding dynamics, we have used the drug-binding Model 7 described in Lei et al. [18] with parameters estimated from verapamil voltage-clamp data collected at physiological temperatures by Li et al. [26,29]. Lei et al. [18] found that this model gave ‘plausible’ fits to the Li et al. data and agrees with the literature that verapamil does not tend to become trapped. This non-trapping behaviour involves having no bound closed state in the binding model; the drug can only be bound and block the channel when the channel is in an OD or ID state. In §4, we go on to examine the results when each of the other binding models is taken as the true data-generating model, and when other drugs are considered.
(b). Initial model fitting
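Next, we fit each of the 15 drug-binding models to the synthetic ‘proportion open’ data. As a first pass, we assume that the underlying Lei hERG model is known exactly (i.e. there is no hERG model discrepancy and the model parameters are known) and we wish to fit only the parameters of the drug-binding models. The data are generated by dividing two current traces which both have normally distributed iid noise, i.e. the data are a ratio of two normally distributed random variables. If $X \sim \mathcal{N}(\mu_x, \sigma_x^2)$ and $Y \sim \mathcal{N}(\mu_y, \sigma_y^2)$ are two independent normal random variables, then the ratio $Z = X/Y$ has probability density function (PDF) given by [30]
$$p_Z(z) = \frac{b(z)\,d(z)}{a^3(z)}\,\frac{1}{\sqrt{2\pi}\,\sigma_x \sigma_y}\left[\Phi\!\left(\frac{b(z)}{a(z)}\right) - \Phi\!\left(-\frac{b(z)}{a(z)}\right)\right] + \frac{1}{a^2(z)\,\pi\,\sigma_x \sigma_y}\exp\!\left(-\frac{c}{2}\right), \tag{3.1}$$
where $\Phi$ is the standard normal cumulative distribution function, $a(z) = \sqrt{z^2/\sigma_x^2 + 1/\sigma_y^2}$, $b(z) = \mu_x z/\sigma_x^2 + \mu_y/\sigma_y^2$, $c = \mu_x^2/\sigma_x^2 + \mu_y^2/\sigma_y^2$ and
$$d(z) = \exp\!\left(\frac{b^2(z) - c\,a^2(z)}{2a^2(z)}\right). \tag{3.2}$$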
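For our synthetic data, $\mu_x$ (the mean of the numerator in equation (3.1)) is the modelled current at a given time in the presence of a drug compound of a given concentration under some drug-binding parameterization $\theta$, while $\mu_y$ (the mean of the denominator) is the control current at the same time and does not depend on the drug concentration or $\theta$. We simplify things by assuming that $\sigma_x$ and $\sigma_y$, the standard deviations of the measurement error on the drug and control currents, respectively, are equal ($\sigma_x = \sigma_y = \sigma$). We can then use the PDF in equation (3.1) to derive a log-likelihood function for some drug-binding model parameterization $\theta$ and standard deviation $\sigma$ given a dataset of proportion-open observations $z_1, \dots, z_n$ (one for every time point at every drug concentration)
$$\ell(\theta, \sigma) = \sum_{k=1}^{n} \log p_Z\big(z_k;\, \mu_{x,k}(\theta),\, \mu_{y,k},\, \sigma\big). \tag{3.3}$$
Each of our 15 drug-binding models can then be fitted to the data by maximizing this log-likelihood function with respect to $\theta$ and $\sigma$.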
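To make the noise model concrete, the Python sketch below evaluates a density for the ratio of two independent normal variables and sums its logarithm over a dataset. It assumes the standard closed form for an uncorrelated normal ratio; the function names, argument order and the small floor used to avoid `log(0)` are illustrative choices rather than details taken from this study.

```python
import math


def ratio_normal_pdf(z, mu_x, mu_y, sigma_x, sigma_y):
    """Density at z of X / Y for independent X ~ N(mu_x, sigma_x^2), Y ~ N(mu_y, sigma_y^2)."""
    a = math.sqrt(z ** 2 / sigma_x ** 2 + 1.0 / sigma_y ** 2)
    b = mu_x * z / sigma_x ** 2 + mu_y / sigma_y ** 2
    c = mu_x ** 2 / sigma_x ** 2 + mu_y ** 2 / sigma_y ** 2
    d = math.exp((b ** 2 - c * a ** 2) / (2.0 * a ** 2))
    # Phi(b/a) - Phi(-b/a) equals erf(b / (a * sqrt(2))).
    term1 = (b * d / a ** 3) * math.erf(b / (a * math.sqrt(2.0)))
    term1 /= math.sqrt(2.0 * math.pi) * sigma_x * sigma_y
    term2 = math.exp(-c / 2.0) / (a ** 2 * math.pi * sigma_x * sigma_y)
    return term1 + term2


def log_likelihood(data, drug_model, control_model, sigma):
    """Sum of log densities, assuming equal noise sd on drug and control currents.

    data: observed proportion-open values.
    drug_model / control_model: noise-free modelled currents at the same points.
    """
    total = 0.0
    for z, mu_x, mu_y in zip(data, drug_model, control_model):
        density = ratio_normal_pdf(z, mu_x, mu_y, sigma, sigma)
        total += math.log(max(density, 1e-300))  # floor guards against underflow
    return total
```

A quick sanity check on this form is that with both means set to zero and unit standard deviations it reduces to the standard Cauchy density, $1/(\pi(1 + z^2))$.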
We use the covariance matrix adaptation evolution strategy (CMA-ES) optimization algorithm [31] via the Probabilistic Inference on Noisy Time-Series (PINTS) framework [32] to perform this maximization. We repeat the CMA-ES optimization 10 times, with each repeat starting from a different parameter initialization sampled from wide boundaries as described in [18], and take the largest obtained log-likelihood. The choice of 10 repeats is motivated by a desire to balance computation time with accuracy; on average, 8 of the 10 repeats give very similar maximized likelihoods and corresponding parameter estimates. In figure 3, we plot the fits obtained via this method for each of the 15 drug-binding models. It is difficult to visually distinguish between the quality of these fits and, at a glance, it appears that all models fit the data relatively well.
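The repeated-start strategy can be sketched independently of the optimizer. In the Python example below, a toy stochastic hill climb stands in for CMA-ES so that the sketch has no dependencies; it is not the algorithm used in this study, and the step size and iteration count are arbitrary assumptions. The part that mirrors the text is `multi_start`, which samples each initial point uniformly within the bounds and keeps the run with the largest objective value.

```python
import random


def local_search(objective, x0, step, iters, rng):
    """Toy stochastic hill climb (a stand-in for CMA-ES in this sketch)."""
    best_x, best_f = list(x0), objective(x0)
    for _ in range(iters):
        candidate = [x + rng.gauss(0.0, step) for x in best_x]
        f = objective(candidate)
        if f > best_f:  # keep the candidate only if it improves the objective
            best_x, best_f = candidate, f
    return best_x, best_f


def multi_start(objective, bounds, repeats=10, seed=0):
    """Run the local search from several random starts and keep the best run.

    bounds: list of (low, high) pairs used only to sample initial points;
    the toy search itself does not enforce them.
    """
    rng = random.Random(seed)
    runs = []
    for _ in range(repeats):
        x0 = [rng.uniform(low, high) for low, high in bounds]
        runs.append(local_search(objective, x0, step=0.1, iters=2000, rng=rng))
    return max(runs, key=lambda run: run[1])
```

Comparing the objective values across the repeats, rather than only keeping the maximum, gives a cheap indication of whether the optimizer is converging reliably.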
Figure 3.
Fits of the 15 binding models to the Milnes protocol proportion-open synthetic data shown in figure 2. Note that it is difficult to visually distinguish between the quality of the model fits for this protocol. The fitted models are shown with solid lines, and we plot the data shaded behind these fits.
In the left-hand plot of figure 4, we plot the maximized log-likelihood values obtained when fitting each model under the Milnes protocol. The true data-generating model (Model 7) and one with a very similar structure (Model 11) have the largest maximized log-likelihoods, while all other models fall within a range of below this value. We note that we are fitting each model to very high-resolution data ( data points), which gives rise to large log-likelihood values. A traditional model selection method such as the likelihood ratio test or the Akaike information criterion (AIC) may suggest, based on the differences in log-likelihoods, that the data-generating model is chosen over the other models. However, in a real data scenario, we generally have less confidence in the veracity of the model of observed hERG current owing to experimental artefacts [33], so we avoid using likelihood ratio testing or AIC in this context. Ultimately, we arrive at the same conclusion drawn by Lei et al. [18]; this protocol is not information-rich enough to distinguish between drug-binding models, which motivates our use of OED methods.
Figure 4.
Maximized log-likelihoods for model fits to synthetic data generated under the modified Milnes protocol (left) and an optimized protocol (right). In each plot, we include a green dashed line at below the largest maximized log-likelihood and a red dashed line at below the largest maximized log-likelihood. We note that these lines act simply as visual aids to emphasize the increased spread in the quality of model fits under the optimized protocol. We also shade, in grey, around the true data-generating binding model.
(c). Optimizing the experimental design
Our next step is to develop an improved experimental design that emphasizes differences between drug-binding models. We do this by employing OED techniques. Let us assume that we have some experimental design, $d$, that is a function of some parameter set $\phi$. We then need to establish some optimality criterion, $U(d)$, that is a function of the design, and determine a $\phi$ that maximizes $U$. We begin by considering $U$: what do we want to optimize? Practically speaking, we want a protocol that accentuates the differences between the models for a specific drug compound. This motivates the following optimality criterion.
(i). Optimality criterion
Assume we have our fitted models from figure 3 and these are represented by $f_i(\hat{\theta}_i, d)$ for $i \in M$, where $M$ is the set of 15 drug-binding models and $\hat{\theta}_i$ is the maximum likelihood estimate (MLE) parameter set for model $i$. Note we have included $d$ as an input to $f_i$ here, as $f_i$ is dependent on the experimental design. We can then calculate the pairwise squared difference between each of the model outputs, which we label $\Delta_{ij}$, where
$$\Delta_{ij}(d) = \big\lVert f_i(\hat{\theta}_i, d) - f_j(\hat{\theta}_j, d) \big\rVert^2. \tag{3.4}$$
We can then propose that our optimality criterion seeks to maximize the median value of $\Delta_{ij}$ across all $i$, $j$, which we call the max-med criterion
$$U(d) = \operatorname*{median}_{i, j \in M,\; i \neq j} \Delta_{ij}(d). \tag{3.5}$$
There are many choices of design criteria to optimize. A conventional option is the T-optimality criterion introduced by Atkinson & Fedorov [34], which involves maximizing the minimum pairwise difference. However, in our case with some pairs of very similar models, initial experimentation with a T-optimality approach often resulted in the two most similar models being separated slightly, while the optimizer would ignore the pairwise differences between the other 13 models. Here we favour the median to ensure that the objective will seek to increase the pairwise differences between at least half of the 15 models.
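The contrast between a median-based and a minimum-based objective is easy to see in code. The Python sketch below computes the pairwise summed squared differences between model outputs and then takes either their median or their minimum. The dictionary-of-lists data layout and the function names are assumptions made for illustration, and `min_pairwise_criterion` is only a simplified stand-in for the T-optimality idea of maximizing the smallest pairwise difference.

```python
from itertools import combinations
from statistics import median


def squared_difference(out_i, out_j):
    """Summed squared difference between two model output traces."""
    return sum((a - b) ** 2 for a, b in zip(out_i, out_j))


def pairwise_differences(outputs):
    """Squared differences for every unordered pair of models.

    outputs: dict mapping a model name to its simulated trace (list of floats).
    """
    return [
        squared_difference(outputs[i], outputs[j])
        for i, j in combinations(sorted(outputs), 2)
    ]


def max_med_criterion(outputs):
    """Median pairwise difference: the quantity to be maximized over designs."""
    return median(pairwise_differences(outputs))


def min_pairwise_criterion(outputs):
    """Smallest pairwise difference, as targeted by a minimum-based objective."""
    return min(pairwise_differences(outputs))
```

With two nearly identical models in the set, the minimum is dominated by that one pair, whereas the median still reflects how well the remaining pairs are separated.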
(ii). Optimizing the protocol
We now turn our attention to . With consideration of experimental practicality, we can establish a design space that we want to optimize the max-med criterion over. We begin by splitting the 10 s 0 mV pulse used in the Milnes protocol into three separate steps (of length 3340, 3330 and 3330 ms) each of which can be set to a voltage in the range of to mV. To increase the design space, we include two 10 s pulses per sweep and allow both pulses to each have three different voltage steps. Further to this, we allow the times between pulses at the holding potential of mV to vary; the time following the first pulse and second pulse can be set to be anywhere between 1050 and 21 000 ms. In total, we then have eight degrees of freedom for optimizing the max-med criterion; six voltage step values and two interpulse durations. Let us then define where is the th voltage step (in mV) in the th pulse in the protocol and is the time (in ms) following the th pulse in the protocol. We can then optimize the max-med criterion with respect to via CMA-ES to determine an optimal protocol . To initialize the CMA-ES optimization, we take 100 random samples from within the voltage and time-step bounds defined above ([, ] and [1050, 21 000], respectively), and then use the that gives the largest value of the max-med criterion as our initialization for the optimizer. We note here that finding the global optimum does not necessarily matter, our goal here is simply to find an improved design.
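The eight-parameter design can be represented as a flat list and expanded into voltage segments, as in the Python sketch below. The three step durations are taken from the text; the voltage bounds, interpulse bounds and holding potential are passed in as arguments rather than hard-coded, and the names `build_protocol`, `sample_design` and `best_random_design` are hypothetical. The last function mirrors the initialization step: draw random designs and keep the one with the largest criterion value as the optimizer's starting point.

```python
import random

STEP_DURATIONS_MS = (3340, 3330, 3330)  # three steps making up each 10 s pulse


def build_protocol(design, holding_mv):
    """Expand [V11, V12, V13, V21, V22, V23, t1, t2] into (mV, ms) segments."""
    voltages, gaps = design[:6], design[6:]
    segments = []
    for pulse in range(2):
        for step, duration in enumerate(STEP_DURATIONS_MS):
            segments.append((voltages[3 * pulse + step], duration))
        # Interval at the holding potential following this pulse.
        segments.append((holding_mv, gaps[pulse]))
    return segments


def sample_design(v_bounds, t_bounds, rng):
    """Draw six step voltages and two interpulse durations uniformly within bounds."""
    voltages = [rng.uniform(*v_bounds) for _ in range(6)]
    gaps = [rng.uniform(*t_bounds) for _ in range(2)]
    return voltages + gaps


def best_random_design(criterion, v_bounds, t_bounds, n_samples=100, seed=0):
    """Return the sampled design with the largest criterion value."""
    rng = random.Random(seed)
    samples = [sample_design(v_bounds, t_bounds, rng) for _ in range(n_samples)]
    return max(samples, key=criterion)
```

In practice `criterion` would simulate every fitted model under `build_protocol(design, holding_mv)` and return the median pairwise difference, which makes each evaluation comparatively expensive.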
(d). Fitting models to synthetic data generated with an optimized protocol
With an optimized protocol, we can then generate a new set of synthetic data using the methodology described in §3a. Our new protocol is comprised of two multistep 10 s pulses of interest, whereas the Milnes protocol has just one single-step 10 s pulse. We, therefore, generate data for only five sweeps (cf. 10) of our new protocol to ensure that the synthetic data under the optimized protocol have the same number of data points as the Milnes synthetic data. In figure 5, we plot this new synthetic current and proportion open data (right top and bottom, respectively) alongside the optimized protocol (left). We can then fit the 15 drug-binding models to this new synthetic data using the same method described in §3b; the fits are shown in figure 6. We see that, when compared with figure 3, we now have a number of models that do not appear to fit the data very well. This is backed up by the right-hand plot in figure 4, where we see a greater spread in maximized log-likelihood values in the optimized protocol case compared with the Milnes case, making it easier to distinguish between models. Model 7, the true data-generating model, has the largest maximized log-likelihood; while no other models have maximized log-likelihoods within of Model 7. In the electronic supplementary material, fig. S2, we include a plot comparing the model parameters fitted to the Milnes data and the model parameters fitted to the optimized protocol data.
Figure 5.
Synthetic data for drug-binding Model 7 generated under an optimized protocol with the Lei 37°C hERG model. On the left-hand side, we plot our optimized protocol with two 10 s pulses, each with three optimized voltage steps. Top right shows the control and drug currents for five sweeps of the optimized protocol. We only plot the currents that occur during the two 10 s pulses per protocol sweep. The bottom right-hand plot shows the corresponding proportion open for each drug concentration which is calculated by dividing each drug sweep by the control current.
Figure 6.
Fits of the binding models to the optimized protocol proportion open synthetic data shown in figure 5, in the same style as figure 3. Note how many fits are now visually distinguishable; we can immediately see that many models provide a worse fit to these data than others.
(e). A discrepant hERG model
So far, we have been working under the assumption that the structure and parameterization of the underlying data-generating hERG model are known exactly. Bernardo & Smith [35] describe this as an M-closed model space, where the true data-generating model is included in the set of models considered for model selection. We can never guarantee this practically, so we now introduce an example where the assumed hERG model used in the model fitting and protocol optimization steps is different from the hERG model used during the data-generating process. We are now operating in an M-open model space where the true data-generating model is not within our set of candidate models.
To generate synthetic data, we now use the Lu model as illustrated in figure 1b. In figure 7, we plot a comparison between control currents under the Lu model and the 37°C Lei model previously used to generate synthetic data. We set the conductances for both models to be equal. We note that the difference in dynamics between these two models appears to be substantial; for example, there is an approximately 350 ms difference in the time constants of activation (measured during the Milnes protocol step to 0 mV). Sanguinetti & Jurkiewicz [36] approximated the hERG activation time constant as 50 ms based on experimental data collected under approximately similar experimental conditions (a voltage step to 0 mV at 35°C). Comparing this experimental estimate with the approximated activation time constants for the Lei and Lu models, we get differences of 400 and 50 ms, respectively. The difference between our model dynamics (at least regarding activation times) falls within this range of differences between models and real data. While this model difference is potentially on the larger end of what we would want from a well-calibrated model of hERG when compared with real data, we consider this a stress test of how our OED methodology performs when there is a relatively large model discrepancy.
Figure 7.
A comparison of control currents between the Lei model (figure 1a) parameterized for 37°C and the Lu model (figure 1b). To the left, we plot the currents in response to one sweep of the Milnes protocol, and to the right in response to one sweep of the optimized protocol from figure 5. Conductances have been set to 33.3 nS for both models.
We can then repeat the procedure described in §3a–d, but this time using the Lu hERG model when generating any synthetic data (Model 7 is still used as the data-generating drug-binding model). In figure 8, we plot the log-likelihoods obtained using the Lu model as the true data-generating hERG model but (incorrectly) assuming that the 37°C Lei model is the correct hERG model when fitting the drug-binding models to the data.
Figure 8.
Maximized log-likelihoods for model fits to synthetic data generated under the modified Milnes protocol (left) and an optimized protocol (right). This is for the discrepant hERG model case; the data-generating hERG model is the Lu model, while the Lei 37°C is used for model fitting. In each plot, we include a green dashed line at below the largest maximized log-likelihood and a red dashed line at below the largest maximized log-likelihood. We also shade, in grey, around the data-generating binding model.
We appear to get quite similar results to the non-discrepant case, with the optimized protocol once again spreading out the values of the log-likelihoods and pointing towards Model 7 as the data-generating model. However, as we see in §4, these methods are less effective for other data-generating drug-binding models when discrepancies are introduced. Equivalent plots to figures 2, 3, 5 and 6 for this discrepant case are included in electronic supplementary material, figs. S3–S6, respectively. In electronic supplementary material, fig. S7, we also include a plot comparing the model parameters fitted to the Milnes data and the model parameters fitted to the optimized protocol data.
4. Applying the OED methodology
In §3, we saw how our methodology performed for synthetic data generated under one drug-binding model (Model 7) parameterized for one drug compound. We now repeat these methods across a range of models and drugs to demonstrate how the procedure performs in differing circumstances.
(a). Verapamil: non-trapping dynamics
We begin by considering verapamil again, but this time we generate synthetic data for each of the 15 drug-binding models (initially using the Lei model in figure 1a as our underlying hERG model). Starting with the previous fits to real verapamil data from [18] to generate the synthetic data, in the top half of figure 9, we plot heatmaps of the log-likelihoods obtained via the §3 methodology. This is for the case with no hERG model discrepancy. The y-axis represents which drug-binding model is used to generate the synthetic data, and we then get a corresponding log-likelihood for each fitted drug-binding model on the x-axis. We plot both the log-likelihoods obtained for the Milnes protocol data fits (top left), and for the optimized protocol data fits (top right). For each row, we highlight (with green squares) the fitted models within of the largest log-likelihood in that row. We then also highlight (with red squares) the fitted models that are within of the largest log-likelihood (that are not already highlighted in green). Black dots are plotted down the diagonal indicating where the fitted model corresponds to the data-generating model. The first notable observation from this plot is that the number of models within the and thresholds is significantly reduced in the optimized protocol case compared with the Milnes case. We also note that in both cases, the full diagonal is within the threshold. Note that there is a large amount of model nesting between drug-binding models, which we illustrate in figure 10. We consider some model, model A, to be nested within another model, model B, if by fixing one or more of the parameters in model B, we can reduce model B to model A. 
For this reason, we expect that (in this non-discrepant hERG case) if model A is nested within model B, and model A is the true data-generating model, then model B should also be able to fit the data at least equally well (perhaps even slightly better given higher complexity and more parameters), so the maximized log-likelihoods for Model A and B should be approximately equivalent. At the top of the tree diagram are the models that are not nested within any others; models 7, 10, 11 and 12. If we consider the heatmap in the top right-hand side of figure 9 corresponding to the optimized protocol, we see that for synthetic data generated under models 7, 10 and 11, the only fitted model with a log-likelihood within the threshold is the true data-generating model. Model 12, the other model with no nesting, only has one other model within this threshold (model 13). Considering the nested models in figure 10, we can explain some of the cases where multiple models sit within the threshold. In the optimized protocol case in the top right of figure 9, the threshold highlighted squares in the heatmap rows corresponding to models 2, 2i, 8 and 13 can all be explained by model nesting. On the other hand, data generated under models 1, 3, 4, 5, 5i, 6 and 9 all have at least one model within the threshold that cannot be explained by nesting.
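The reasoning above amounts to two checks per heatmap row: which fitted models lie within a threshold of the best log-likelihood, and which of those can be explained because the data-generating model is nested within them. The Python sketch below expresses both. The threshold `delta`, the log-likelihood values and the nesting graph in any example use are hypothetical; the `children` mapping follows the arrow convention of the nesting diagram, where an arrow from one model to another means the second is nested within the first.

```python
def within_threshold(log_likelihoods, delta):
    """Models whose maximized log-likelihood is within delta of the best one."""
    best = max(log_likelihoods.values())
    return {m for m, ll in log_likelihoods.items() if best - ll <= delta}


def contains(big, small, children):
    """True if `small` is nested within `big`, directly or through a chain.

    children: dict mapping a model to the models nested directly within it.
    """
    stack, seen = list(children.get(big, [])), set()
    while stack:
        model = stack.pop()
        if model == small:
            return True
        if model not in seen:
            seen.add(model)
            stack.extend(children.get(model, []))
    return False


def unexplained_competitors(true_model, log_likelihoods, delta, children):
    """Close-fitting models that nesting of the true model cannot account for."""
    close = within_threshold(log_likelihoods, delta) - {true_model}
    return {m for m in close if not contains(m, true_model, children)}
```

An empty result from `unexplained_competitors` corresponds to a row in which every highlighted square other than the diagonal is accounted for by model nesting.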
Figure 9.
Top: heatmaps of maximized log-likelihoods for fitted models to Milnes (left) and optimized protocol (right) synthetic verapamil data. The fitted models within of the largest log-likelihood (very good fits) in each row are highlighted in green squares. The fitted models that are between and below the largest log-likelihood (reasonable fits) in each row are highlighted in red squares. Black dots are plotted down the diagonal of each heatmap where the fitted model corresponds to the data-generating model. Note that our optimal protocol results in a far bigger spread of maximized log-likelihoods. Bottom: equivalent heatmaps of maximized log-likelihoods with discrepancy between the data-generating hERG model and the hERG model used to fit the data. The introduced hERG model discrepancy makes identifying the true data-generating drug-binding model difficult.
Figure 10.
Graph illustrating nesting between drug-binding models. An arrow pointing from Model A to Model B indicates that Model B is nested within Model A.
We now switch to the discrepant hERG model case where synthetic data were generated using the Lu physiological hERG model, but we use the Lei model for fitting. In the bottom half of figure 9, we plot equivalent heatmaps to those at the top of the plot, but for the discrepant case. The purpose of considering this case is to address the more realistic scenario where we do not have a perfect model of hERG channel dynamics. We notice that the number of highlighted and threshold squares is once again significantly reduced in the optimized protocol case. However, in most cases, the best-fitting binding model is no longer the one that generated the data. As a result, the diagonal is highlighted significantly less than in the non-discrepant case, with only eight models (2i, 4, 5, 5i, 7, 9, 12 and 13) sitting within the threshold. Clearly, the hERG model discrepancy is causing issues with identifying the true binding mechanisms.
(b). Bepridil: trapping dynamics
We can repeat the process described in the previous section for a different drug, this time with observed trapping behaviour [10,11]. In the electronic supplementary material, fig. S8, we include an example of model output for a drug with trapping behaviour, compared with one with no trapping behaviour. In the top half of figure 11, we plot the heatmaps of maximized log-likelihoods for the non-discrepant case as previously. The optimized protocol heatmap is very similar to that obtained with verapamil; the highlighted squares in the rows corresponding to models 1, 2, 2i, 3 and 8 are identical, while there are only a couple of differences for each of the other models. We can also again consider the discrepant hERG case, and we plot the heatmaps for this in the bottom half of figure 11. Again, this shows relatively similar results to those seen with verapamil; when we have discrepancy in the hERG model, it becomes difficult to determine the true data-generating drug-binding model. In the electronic supplementary material, fig. S9, we also include heatmaps for results obtained for chlorpromazine, a fast-binding drug with a suspected open-binding preference (over inactive-binding). These results appear similar to those obtained for verapamil and bepridil.
Figure 11.
Top: heatmaps of maximized log-likelihoods for fitted models to Milnes and optimized protocol synthetic bepridil data. Bottom: equivalent heatmaps of maximized log-likelihoods with discrepancy between the data-generating hERG model and the hERG model used to fit the data. For discussion of highlighting see the caption of figure 9.
(c). Dofetilide: slow-binding dynamics
Bepridil and verapamil both have relatively fast binding dynamics, so we also consider the slow-binding drug dofetilide [37]. In the electronic supplementary material, fig. S8, we include an example of model output for a drug with slow dynamics, compared with one with fast dynamics. Once again, we plot the heatmaps of maximized log-likelihoods as shown in figure 12, and we see that nearly all models can fit the Milnes data well. Unlike the fast dynamic drugs, with dofetilide, we get much less of a reduction in and threshold squares with the optimized protocol (in both the non-discrepant and discrepant cases). It is not obvious what causes this difference between fast and slow-binding drugs, and it may indicate that our median optimization objective function is less effective in the slow-binding case. We also include, in electronic supplementary material figs. S10–S13, heatmaps for each of the four considered drugs for the case where all model fitting is done assuming the Lu model is the true model of hERG dynamics (as opposed to the Lei model). These plots show similar overall results and indicate that the results obtained in the main text are not exclusive to the Lei hERG model.
Figure 12.
Top: heatmaps of maximized log-likelihoods for fitted models to Milnes and optimized protocol synthetic dofetilide data. Bottom: equivalent heatmaps of maximized log-likelihoods with discrepancy between the data-generating hERG model and the hERG model used to fit the data. For discussion of highlighting see the caption of figure 9.
5. Discussion
In this study, we have outlined a methodology for generating optimized voltage protocol designs to assist in distinguishing between models of drug-binding dynamics. By undertaking a synthetic data study, we have seen the potential benefits of this methodology when the true physiological model of hERG is known. Log-likelihoods of models fitted to data generated under our optimized protocols indicate a divergence in the quality of fits when compared with fits to data generated under a simple Milnes protocol. This suggests that this OED procedure could assist in establishing the true binding dynamics of a compound. The method was less effective when considering synthetic data emulating a slow-binding drug, dofetilide, when compared with drugs with faster dynamics such as verapamil and bepridil.
We also considered how discrepancy in the hERG model used to fit the data (compared with the data-generating hERG model) influenced the outcomes of our methodology. We found that when we used the Lu hERG model to generate synthetic data and the Lei 37°C hERG model to fit models to these data, log-likelihoods of model fits indicated that establishing the true data-generating drug-binding model became more difficult. This suggests that the underlying hERG model does play an important role when fitting drug-binding models to data and stresses the importance of continuing to improve basic models of physiological channel behaviour. A well-calibrated model of hERG current that approximately matches the observed dynamics in real experimental data would reduce the influence of model discrepancy on the outcomes of the OED procedure. In their 2019 paper, Clerx et al. [38] detail the benefits and limitations of several methods for calibrating models of ion channels. Our proposed methodology could perhaps be improved by fitting a hERG model to the recorded control currents before fitting the drug-binding models (or by fitting the hERG and drug-binding models simultaneously).
We note that our results depend on the initial drug-binding model parameterizations for each of the three drug compounds, which come from model fits obtained by Lei et al. [18]. The quality of these model fits was quite variable from model to model and from compound to compound. This represents a limitation of a synthetic data study and motivates trialling the methodology on real data.
The methods used to fit the drug-binding models in this paper were developed to improve on those used by Lei et al. [18] and others [12,18,26]. While we have similarly fitted our models to the proportion of hERG in the open state, the log-likelihoods derived from the normal ratio PDF described above differ from the simple weighted sum-of-squares method used previously. This ratio likelihood fitting method provides a more realistic noise model for the data-generating process (in this synthetic scenario), avoiding problems associated with small control currents leading to large noise on the proportion open trace. In Lei et al. [18], the weighted sum-of-squares method required low currents at the start of each 10 s pulse to be cropped out to prevent noisy open proportion data from biasing the fit; in contrast, our method allows us to fit the binding models to the full data sweep.
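To make the idea of a ratio likelihood concrete, the following Python sketch evaluates the standard density of the ratio of two independent normal random variables and sums its logarithm over time points. It is an illustration only: the function names, the assumption of a single shared noise level `sigma` for the drug and control currents, and the toy numbers are all hypothetical, and the exact parameterization used in this study is the one given in the methods.

```python
import math


def normal_ratio_pdf(z, mu_x, sigma_x, mu_y, sigma_y):
    """Density of Z = X / Y for independent X ~ N(mu_x, sigma_x^2), Y ~ N(mu_y, sigma_y^2)."""
    a = math.sqrt(z * z / sigma_x**2 + 1.0 / sigma_y**2)
    b = mu_x * z / sigma_x**2 + mu_y / sigma_y**2
    c = mu_x**2 / sigma_x**2 + mu_y**2 / sigma_y**2
    d = math.exp((b * b - c * a * a) / (2.0 * a * a))
    # Phi(b/a) - Phi(-b/a) equals erf(b / (a * sqrt(2))).
    normal_like = (b * d / a**3) * math.erf(b / (a * math.sqrt(2.0)))
    normal_like /= math.sqrt(2.0 * math.pi) * sigma_x * sigma_y
    # Heavy-tailed term; it dominates when the denominator mean is near zero.
    cauchy_like = math.exp(-c / 2.0) / (a * a * math.pi * sigma_x * sigma_y)
    return normal_like + cauchy_like


def ratio_log_likelihood(observed, drug_means, control_means, sigma):
    """Log-likelihood of observed ratios, assuming equal noise `sigma` on both currents."""
    return sum(
        math.log(normal_ratio_pdf(z, mx, sigma, my, sigma))
        for z, mx, my in zip(observed, drug_means, control_means)
    )


# Toy example: observed open proportions against hypothetical model currents.
observed = [0.52, 0.48, 0.41]
drug_means = [1.0, 0.9, 0.8]      # hypothetical simulated current with drug
control_means = [2.0, 2.0, 2.0]   # hypothetical simulated control current
print(ratio_log_likelihood(observed, drug_means, control_means, sigma=0.1))
```

Because the density accounts for noise in the denominator, time points with small control currents are naturally down-weighted rather than needing to be cropped.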
After some consideration and testing, the max-med optimality criterion was chosen over other possible alternatives such as maximizing the mean or the minimum of the pairwise sum-of-squares difference between model current traces. Using the mean or minimum, as opposed to the median, tended to result in one or two model pairwise differences biasing the objective function score and leaving many of the other current traces indistinguishable from each other. It would be useful to perform more rigorous comparisons between optimality criteria to see if there are scenarios where alternatives are more effective.
In the results above, we noted the presence of nesting between the binding models. Since all of the nested models are simplified versions of the non-nested models (7, 10, 11 and 12), an alternative starting point could be to reduce the optimization problem to consider only pairwise differences between these four models. Another avenue to explore would be the use of multiple different protocols, each perhaps optimized to emphasize particular drug-binding properties. Examining the way different models need to compromise to fit data from each protocol has proved instructive in providing a lower bound on model discrepancy [39].
To conclude, the proposed OED methodology shows promise in determining the true binding dynamics, but care must be taken to ensure that we have a well-calibrated model of control hERG current if applied to a real data study.
Acknowledgements
Model fitting and experimental design optimizations were performed using the University of Nottingham’s on-premise high-performance computing (HPC) service.
Contributor Information
Frankie Patten-Elliott, Email: pmxfp1@nottingham.ac.uk.
Chon Lok Lei, Email: chonloklei@um.edu.mo.
Simon P. Preston, Email: simon.preston@nottingham.ac.uk.
Richard D. Wilkinson, Email: r.d.wilkinson@nottingham.ac.uk.
Gary R. Mirams, Email: gary.mirams@nottingham.ac.uk.
Data accessibility
Code is freely available at: [40] under a BSD-3-clause open source licence, and a permanently archived version is available on Zenodo at [41].
Supplementary material is available online [42].
Declaration of AI use
We have not used AI-assisted technologies in creating this article.
Authors’ contributions
F.P.-E.: conceptualization, data curation, formal analysis, investigation, methodology, software, validation, visualization, writing—original draft, writing—review and editing; C.L.L.: conceptualization, funding acquisition, software, supervision, writing—review and editing; S.P.P.: conceptualization, supervision, writing—review and editing; R.D.W.: conceptualization, supervision, writing—review and editing; G.R.M.: conceptualization, funding acquisition, project administration, supervision, writing—review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
This work was supported by the Wellcome Trust (grant no. 212203/Z/18/Z); the Science and Technology Development Fund, Macao SAR (FDCT) [reference no. 0155/2023/RIA3 and 0048/2022/A]; the University of Macau [reference no. SRG2024-00014-FHS and FHS Startup Grant]; the EPSRC [grant no. EP/R014604/1]. GRM acknowledges support from the Wellcome Trust via a Wellcome Trust Senior Research Fellowship to GRM. CLL acknowledges support from the FDCT and the University of Macau to CLL. This research was funded in whole, or in part, by the Wellcome Trust [212203/Z/18/Z]. For the purpose of open access, the authors have applied a CC-BY public copyright licence to any Author Accepted Manuscript version arising from this submission.
References
- 1. Priest BT, McDermott JS. 2015. Cardiac ion channels. Channels 9, 352–359. (doi:10.1080/19336950.2015.1076597)
- 2. Vandenberg JI, Perry MD, Perrin MJ, Mann SA, Ke Y, Hill AP. 2012. hERG K+ Channels: Structure, Function, and Clinical Significance. Physiol. Rev. 92, 1393–1478. (doi:10.1152/physrev.00036.2011)
- 3. Mirams GR, Davies MR, Cui Y, Kohl P, Noble D. 2012. Application of cardiac electrophysiology simulations to pro-arrhythmic safety testing. Br. J. Pharmacol. 167, 932–945. (doi:10.1111/j.1476-5381.2012.02020.x)
- 4. Hill AP, et al. 2016. Computational cardiology and risk stratification for sudden cardiac death: one of the grand challenges for cardiology in the 21st century. J. Physiol. 594, 6893–6908. (doi:10.1113/jp272015)
- 5. Heist EK, Ruskin JN. 2010. Drug-Induced Arrhythmia. Circulation 122, 1426–1435. (doi:10.1161/circulationaha.109.894725)
- 6. Mirams GR. 2023. Computational cardiac safety testing. In Drug discovery and evaluation: safety and pharmacokinetic assays, pp. 1–33. Cham, Switzerland: Springer International Publishing. (doi:10.1007/978-3-030-73317-9_137-1)
- 7. Redfern WS, et al. 2003. Relationships between preclinical cardiac electrophysiology, clinical QT interval prolongation and torsade de pointes for a broad range of drugs: evidence for a provisional safety margin in drug development. Cardiovasc. Res. 58, 32–45. (doi:10.1016/s0008-6363(02)00846-5)
- 8. Hancox JC, McPate MJ, El Harchi A, Zhang Y. 2008. The hERG potassium channel and hERG screening for drug-induced torsades de pointes. Pharmacol. Ther. 119, 118–132. (doi:10.1016/j.pharmthera.2008.05.009)
- 9. Perry M, Sanguinetti M, Mitcheson J. 2010. Symposium review: Revealing the structural basis of action of hERG potassium channel activators and blockers. J. Physiol. 588, 3157–3167. (doi:10.1113/jphysiol.2010.194670)
- 10. Kamiya K, Niwa R, Mitcheson JS, Sanguinetti MC. 2006. Molecular Determinants of hERG Channel Block. Mol. Pharmacol. 69, 1709–1716. (doi:10.1124/mol.105.020990)
- 11. Stork D, Timin EN, Berjukow S, Huber C, Hohaus A, Auer M, Hering S. 2007. State dependent dissociation of HERG channel inhibitors. Br. J. Pharmacol. 151, 1368–1376. (doi:10.1038/sj.bjp.0707356)
- 12. Windley MJ, Abi-Gerges N, Fermini B, Hancox JC, Vandenberg JI, Hill AP. 2017. Measuring kinetics and potency of hERG block for CiPA. J. Pharmacol. Toxicol. Methods 87, 99–107. (doi:10.1016/j.vascn.2017.02.017)
- 13. Thouta S, Lo G, Grajauskas L, Claydon T. 2018. Investigating the state dependence of drug binding in hERG channels using a trapped-open channel phenotype. Sci. Rep. 8, 4962. (doi:10.1038/s41598-018-23346-x)
- 14. Carmeliet E, Mubagwa K. 1998. Antiarrhythmic drugs and cardiac ion channels: mechanisms of action. Prog. Biophys. Mol. Biol. 70, 1–72. (doi:10.1016/s0079-6107(98)00002-9)
- 15. Hondeghem LM. 1987. Antiarrhythmic agents: modulated receptor applications. Circulation 75, 514–520. (doi:10.1161/01.cir.75.3.514)
- 16. Starmer CF, Nesterenko VV, Gilliam FR, Grant AO. 1990. Use of ionic currents to identify and estimate parameters in models of channel blockade. Am. J. Physiol. Heart Circ. Physiol. 259, H626–H634. (doi:10.1152/ajpheart.1990.259.2.h626)
- 17. Starmer CF, Courtney KR. 1986. Modeling ion channel blockade at guarded binding sites: application to tertiary drugs. Am. J. Physiol. Heart Circ. Physiol. 251, H848–H856. (doi:10.1152/ajpheart.1986.251.4.h848)
- 18. Lei CL, Whittaker DG, Mirams GR. 2024. The impact of uncertainty in hERG binding mechanism on in silico predictions of drug-induced proarrhythmic risk. Br. J. Pharmacol. 181, 987–1004. (doi:10.1111/bph.16250)
- 19. Milnes JT, Witchel HJ, Leaney JL, Leishman DJ, Hancox JC. 2010. Investigating dynamic protocol-dependence of hERG potassium channel inhibition at 37°C: Cisapride versus dofetilide. J. Pharmacol. Toxicol. Methods 61, 178–191. (doi:10.1016/j.vascn.2010.02.007)
- 20. Yao JA, Du X, Lu D, Baker RL, Daharsh E, Atterson P. 2005. Estimation of potency of HERG channel blockers: Impact of voltage protocol and temperature. J. Pharmacol. Toxicol. Methods 52, 146–153. (doi:10.1016/j.vascn.2005.04.008)
- 21. Di Veroli GY, Davies MR, Zhang H, Abi-Gerges N, Boyett MR. 2013. High-throughput screening of drug-binding dynamics to HERG improves early drug safety assessment. Am. J. Physiol. Heart Circ. Physiol. 304, H104–H117. (doi:10.1152/ajpheart.00511.2012)
- 22. Gomis-Tena J, Brown BM, Cano J, Trenor B, Yang PC, Saiz J, Clancy CE, Romero L. 2020. When Does the IC50 Accurately Assess the Blocking Potency of a Drug? J. Chem. Inf. Model. 60, 1779–1790. (doi:10.1021/acs.jcim.9b01085)
- 23. Escobar F, Gomis-Tena J, Saiz J, Romero L. 2022. Automatic modeling of dynamic drug-hERG channel interactions using three voltage protocols and machine learning techniques: A simulation study. Comput. Methods Programs Biomed. 226, 107148. (doi:10.1016/j.cmpb.2022.107148)
- 24. Lei CL, Clerx M, Gavaghan DJ, Mirams GR. 2023. Model-driven optimal experimental design for calibrating cardiac electrophysiology models. Comput. Methods Programs Biomed. 240, 107690. (doi:10.1016/j.cmpb.2023.107690)
- 25. Lu Y, Mahaut-Smith MP, Varghese A, Huang C-H, Kemp PR, Vandenberg JI. 2001. Effects of premature stimulation on HERG K+ channels. J. Physiol. 537, 843–851. (doi:10.1111/j.1469-7793.2001.00843.x)
- 26. Li Z, et al. 2019. Assessment of an In Silico Mechanistic Model for Proarrhythmia Risk Prediction Under the CiPA Initiative. Clin. Pharmacol. Ther. 105, 466–475. (doi:10.1002/cpt.1184)
- 27. Clerx M, Collins P, de Lange E, Volders PGA. 2016. Myokit: A simple interface to cardiac cellular electrophysiology. Prog. Biophys. Mol. Biol. 120, 100–114. (doi:10.1016/j.pbiomolbio.2015.12.008)
- 28. Lei CL, Clerx M, Beattie KA, Melgari D, Hancox JC, Gavaghan DJ, Polonchuk L, Wang K, Mirams GR. 2019. Rapid Characterization of hERG Channel Kinetics II: Temperature Dependence. Biophys. J. 117, 2455–2470. (doi:10.1016/j.bpj.2019.07.030)
- 29. Li Z, Dutta S, Sheng J, Tran PN, Wu W, Chang K, Mdluli T, Strauss DG, Colatsky T. 2017. Improving the In Silico Assessment of Proarrhythmia Risk by Combining hERG (Human Ether-à-go-go-Related Gene) Channel–Drug Binding Kinetics and Multichannel Pharmacology. Circulation 10. (doi:10.1161/circep.116.004628)
- 30. Díaz-Francés E, Rubio FJ. 2013. On the existence of a normal approximation to the distribution of the ratio of two independent normal random variables. Stat. Pap. 54, 309–323. (doi:10.1007/s00362-012-0429-2)
- 31. Hansen N. 2006. The CMA evolution strategy: A comparing review. In Towards a new evolutionary computation: advances in the estimation of distribution algorithms (eds Lozano JA, Larranaga P, Inza I, Bengoetxea E), pp. 75–102. Berlin, Heidelberg: Springer. (doi:10.1007/11007937_4)
- 32. Clerx M, Robinson M, Lambert B, Lei CL, Ghosh S, Mirams GR, Gavaghan DJ. 2019. Probabilistic Inference on Noisy Time Series (PINTS). J. Open Res. Softw. 7, 23. (doi:10.5334/jors.252)
- 33. Marty A, Neher E. 1995. Tight-Seal Whole-Cell Recording. In Single-channel recording, pp. 31–52. Boston, MA: Springer US. (doi:10.1007/978-1-4615-7858-1_7)
- 34. Atkinson AC, Fedorov VV. 1975. The Design of Experiments for Discriminating Between two Rival Models. Biometrika 62, 57. (doi:10.2307/2334487)
- 35. Bernardo JM, Smith AF. 2009. Bayesian theory. vol. 405. Chichester, UK: John Wiley & Sons. (doi:10.1002/9780470316870)
- 36. Sanguinetti MC, Jurkiewicz NK. 1990. Two components of cardiac delayed rectifier K+ current. Differential sensitivity to block by class III antiarrhythmic agents. J. Gen. Physiol. 96, 195–215. (doi:10.1085/jgp.96.1.195)
- 37. Kiehn J, Lacerda AE, Wible B, Brown AM. 1996. Molecular Physiology and Pharmacology of hERG. Circulation 94, 2572–2579. (doi:10.1161/01.cir.94.10.2572)
- 38. Clerx M, Beattie KA, Gavaghan DJ, Mirams GR. 2019. Four Ways to Fit an Ion Channel Model. Biophys. J. 117, 2420–2437. (doi:10.1016/j.bpj.2019.08.001)
- 39. Shuttleworth JG, Lei CL, Whittaker DG, Windley MJ, Hill AP, Preston SP, Mirams GR. 2024. Empirical Quantification of Predictive Uncertainty Due to Model Discrepancy by Training with an Ensemble of Experimental Designs: An Application to Ion Channel Kinetics. Bull. Math. Biol. 86, 2. (doi:10.1007/s11538-023-01224-6)
- 40. Patten-Elliott F, Mirams GR, Lei CL. 2024. CardiacModelling/binding_model_OED. GitHub. https://github.com/CardiacModelling/binding_model_OED
- 41. Patten-Elliott F, Mirams GR, Lei CL. 2024. CardiacModelling/binding_model_OED: Second release. Zenodo. (doi:10.5281/zenodo.14751331)
- 42. Patten-Elliott F, Lei CL, Preston S, Wilkinson R, Mirams GR. 2025. Supplementary Material from: Optimising experimental designs for model selection of ion channel drug binding mechanisms. Figshare. (doi:10.6084/m9.figshare.c.7611338)