Nucleic Acid Therapeutics. 2013 Oct;23(5):302–310. doi: 10.1089/nat.2013.0436

Hepatotoxic Potential of Therapeutic Oligonucleotides Can Be Predicted from Their Sequence and Modification Pattern

Peter H Hagedorn 1, Victor Yakimov 2, Søren Ottosen 3, Susanne Kammler 3, Niels F Nielsen 3, Anja M Høg 3, Maj Hedtjärn 3, Michael Meldgaard 4, Marianne R Møller 5, Henrik Ørum 6, Troels Koch 6, Morten Lindow 1,2
PMCID: PMC3760025  PMID: 23952551

Abstract

Antisense oligonucleotides that recruit RNase H and thereby cleave complementary messenger RNAs are being developed as therapeutics. Dose-dependent hepatic changes associated with hepatocyte necrosis and increases in serum alanine-aminotransferase levels have been observed after treatment with certain oligonucleotides. Although general mechanisms for drug-induced hepatic injury are known, the characteristics of oligonucleotides that determine their hepatotoxic potential are not well understood. Here, we present a comprehensive analysis of the hepatotoxic potential of locked nucleic acid-modified oligonucleotides in mice. We developed a random forests classifier, in which oligonucleotides are regarded as being composed of dinucleotide units, which distinguished between 206 oligonucleotides with high and low hepatotoxic potential with 80% accuracy as estimated by out-of-bag validation. In a validation set, 17 out of 23 oligonucleotides were correctly predicted (74% accuracy). In isolation, some dinucleotide units increase, and others decrease, the hepatotoxic potential of the oligonucleotides within which they are found. However, a complex interplay between all parts of an oligonucleotide can influence the hepatotoxic potential. Using the classifier, we demonstrate how an oligonucleotide with otherwise high hepatotoxic potential can be efficiently redesigned to abate hepatotoxic potential. These insights establish analysis of sequence and modification patterns as a powerful tool in the preclinical discovery process for oligonucleotide-based medicines.

Introduction

An essential property when developing oligonucleotides for therapeutics is that their main interactions with RNA follow Watson and Crick's base pairing rules for nucleic acids (Bennet and Swayze, 2010). Given these rules, and the sequence of an RNA molecule, designing perfectly matching oligonucleotides is straightforward. When using modern nucleic acid modification chemistries such as high-affinity locked nucleic acids (LNAs) (Obika et al., 1997; Koshkin et al., 1998) or 2′-O-methoxyethyl (2′MOE) (Bennet and Swayze, 2010), in combination with a phosphorothioate backbone (Stein et al., 1998), a large fraction of such designs are able to bind and inhibit the targeted RNA. In contrast, for small molecules targeting proteins, screening libraries often need to contain hundreds of thousands of compounds in order to identify hits against a protein target (Hert et al., 2009). This makes oligonucleotides targeting RNA very attractive when it comes to fast and cost-effective discovery of efficacious and potent drug candidates.

Currently, the requirements for regulatory approval of oligonucleotides and small molecule drugs are similar (Schubert et al., 2012). Therefore, when it comes to effects not related to Watson-Crick guided hybridization, such as toxic liabilities, oligonucleotides and small molecule drugs are screened in a similar manner. As yet, very few oligonucleotides are on the market, but clinical and preclinical adverse effects reported for high-affinity oligonucleotides developed in recent years include injury to the liver and kidneys, two primary organs of oligonucleotide accumulation, as well as injection site reactions (Levin, 1999; Henry et al., 2007; Bennet and Swayze, 2010; Lindow et al., 2012). It seems that oligonucleotides as a chemical class are particularly associated with these types of toxicities. However, for any single oligonucleotide, irrespective of its modification chemistry, the degree to which it manifests any of these liabilities, if at all, varies widely. In the case of hepatotoxicity, specific oligonucleotides with LNA modifications (Swayze et al., 2007; Seth et al., 2009; Stanton et al., 2012) and 2′MOE modifications (Swayze and Siwkowski, 2009; Burel and Henry, 2010) have been reported to induce elevations in alanine-aminotransferase (ALT), a serum biomarker of hepatocellular injury, when administered to mice even at relatively low doses. On the other hand, many well-tolerated LNA-modified as well as 2′MOE-modified oligonucleotides have also been reported where no dose-limiting hepatotoxicity was seen during preclinical and clinical testing (Elmén et al., 2008; Bennet and Swayze, 2010; Gupta et al., 2010; Straarup et al., 2010; Janssen et al., 2011; Hildebrandt-Eriksen et al., 2012; Lindholm et al., 2012). These examples illustrate the marked differences in the hepatotoxic potential of different oligonucleotide compounds.

Discovering the characteristics of compounds that are more likely to yield safe, potent, and efficacious drugs is central for the development of drug discovery into a knowledge-based predictive science (Lipinski and Hopkins, 2004). Attempts to quantify such structure-activity relationships for small molecule compounds have shown good predictivity for specific endpoints such as solubility and permeability (Lipinski et al., 2001) or binding affinity to proteins with a known 3-dimensional structure (Wang et al., 2002). For complex endpoints such as hepatotoxicity, however, the predictivity remains poor (Low et al., 2011). Deriving descriptors from the chemical structure of small molecules that can be related to their toxic potential is not trivial (Benigni and Giuliana, 2003).

In this work, we set out to investigate whether structural descriptors of LNA-modified oligonucleotides can explain a complex endpoint, such as their hepatotoxic potential. Decomposing the chemical structure of an oligonucleotide into its nucleobase sequence and modification pattern, we report that machine learning techniques can produce a classification scheme that captures a large part of the variation in the hepatotoxic potential of these oligonucleotides.

Materials and Methods

Oligonucleotides

LNA-modified antisense oligonucleotides were synthesized with complete phosphorothioate backbones using standard phosphoramidite protocols on an ÄKTA Oligopilot (GE Healthcare). After synthesis, the oligonucleotides were deprotected and cleaved from the solid support using aqueous ammonia at 65°C overnight. The oligonucleotides were purified by ion exchange high-performance liquid chromatography by applying a gradient of buffer B: 0%–80% in 38 column volumes (CV) followed by a washout of the column with 100% buffer B (4 CV) and 100% buffer C (4 CV) before reequilibration with 100% buffer A (4 CV). The applied buffers were A: 10 mM NaOH; B: 10 mM NaOH, 2 M NaCl; and C: B/EtOH 4:1. The collected fractions from the purification were analyzed by ultra-performance liquid chromatography, and fractions with a purity above 85% were pooled and the pH of the pool adjusted with HCl (1 M) to pH 7–8. Subsequently, the pooled fractions were desalted on an Äkta CrossFlow using a Millipore membrane with a 1-kDa cutoff (Pellicon 2 Mini Ultrafiltration Module PLAC C 0.1 m²) and lyophilized. Finally, liquid chromatography–mass spectrometry (reverse phase and electrospray ionization–mass spectrometry) was used to verify compound identity and purity.

Animal experiments

Inbred C57BL/6J and outbred NMRI female mice were obtained from Taconic (Denmark) and fed standard diet ad libitum. At study start, C57BL/6J mice weighed 20±2 g (arithmetic mean±standard deviation) and NMRI mice 24±2 g. The animal facility was maintained on a 12-hour light–dark cycle throughout each study. In 25 individual studies, mice were divided into treatment groups (n=5) and dosed on days 0, 3, 7, 10, and 14. The treatment groups received either 0.9% saline or saline-formulated antisense oligonucleotide administered by intravenous injection. On day 16, mice were weighed, anesthetized (70% CO2/30% O2) before retroorbital blood sampling, and sacrificed by cervical dislocation. Livers were inspected visually and weighed. For real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR), liver samples were immersed in RNAlater (Sigma-Aldrich). For histopathology, liver samples were fixed in formalin and embedded in paraffin (FFPE). All mouse protocols were approved by the Danish National Committee for Ethics in Animal Experiments.

Serum alanine-aminotransferase analysis and defining the upper limit of normal

Serum was analyzed for alanine-aminotransferase (ALT) activity using an enzymatic assay (Horiba ABX Diagnostics) according to the manufacturer's instructions adjusted to 96-well format. Measurements were correlated to a 2-fold diluted standard curve generated from an ABX Pentra MultiCal solution (Horiba ABX Diagnostics). Within each study, the mean ALT activity in the saline treated group was calculated, and all ALT levels presented as fold changes relative to that mean value. From the distribution of ALT levels in all saline-treated mice across all studies, robust measures of central tendency and variability were calculated as median, m=1.0, and median absolute deviation, σ=0.24. From this, the ALT upper limit of normal was defined as ULN=m+3σ=1.7.
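
The ULN definition above can be sketched in a few lines. This is a minimal illustration only: the function name and the saline values are hypothetical, and it assumes the median absolute deviation is used unscaled (the text does not say whether a consistency constant was applied).

```python
from statistics import median

def upper_limit_of_normal(saline_alt, k=3.0):
    """Return (ULN, median, MAD) for saline-group ALT fold changes.

    Assumption: MAD is the plain median of absolute deviations from the
    median, without any scaling factor.
    """
    m = median(saline_alt)
    mad = median([abs(x - m) for x in saline_alt])
    return m + k * mad, m, mad

# Hypothetical saline-group fold changes, for illustration only.
example_saline = [0.8, 0.9, 1.0, 1.1, 1.4]
example_uln, example_m, example_mad = upper_limit_of_normal(example_saline)

# Arithmetic with the values reported in the text: m = 1.0, sigma = 0.24.
reported_uln = 1.0 + 3 * 0.24   # 1.72, reported as 1.7 after rounding
```

With the reported values, the class boundaries used later (2× ULN and 5× ULN) correspond to ALT fold changes of about 3.4 and 8.5 relative to saline.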

RNA isolation and real-time quantitative RT-PCR

Total RNA from liver was isolated using the RNeasy kit (Qiagen) and quantification of messenger RNA (mRNA) was done using TaqMan assays (Applied Biosystems). The reverse transcription reaction was carried out with random decamers, 0.5mg total RNA and the M-MLV reverse transcriptase enzyme (Ambion) according to protocol for first strand complementary DNA (cDNA) synthesis. Depending on expression levels, cDNA was subsequently diluted 5 times in nuclease-free water before addition to the RT-PCR reaction mixture. The Applied Biosystems 7500/7900/ViiA real-time PCR instruments were used for amplification. Within each study, mRNA levels were normalized to actin, beta (Actb) or glyceraldehyde-3-phosphate dehydrogenase (Gapdh) and presented as fold-changes relative to average levels in saline controls.

Histopathology

Mouse FFPE liver samples were sectioned and stained with hematoxylin and eosin. Each section was examined by a single histopathologist and scored on 25 categories between 0 (absence of that change) and 3 (moderate to extensive) (Supplementary Table S1; Supplementary Data are available online at www.liebertpub.com/nat).

Statistical modeling and sequence analysis

The sequence composition and modification pattern of each oligonucleotide were encoded as an ordered feature vector of dinucleotide counts. With 4 DNA nucleotides (a, c, t, g) and 4 LNA variants (A, C, T, G), the dimension of the feature vector is (4+4)² = 64. Each feature vector was assigned a class label as either high-tox or low-tox depending on the level of ALT measured when administering the corresponding oligonucleotide in mice (ALT >5× ULN or ALT <2× ULN, respectively). Feature vectors with associated class labels were used to train a random forests algorithm by growing unpruned classification trees from 5000 bootstrap samples of the training set (Breiman, 2001). For each tree, at each node, 8 out of the 64 dinucleotides were randomly selected, and the dinucleotide resulting in the best split chosen. The best split is defined as the split that maximally reduces the Gini impurity in the descendant nodes (Duda et al., 2001). For an ensemble of trees that have been trained in this manner, oligonucleotides to be classified are presented to each tree, and assigned the class voted for by the majority of trees (majority rule). An estimate of the accuracy of the prediction was calculated by averaging the prediction accuracies of each tree on the samples not included in training that tree (denoted out-of-bag samples). The accuracy of such an out-of-bag estimate has been demonstrated to be comparable to using a test set of the same size as the training set (Breiman, 2001). An estimate of the importance of each dinucleotide in the classification was calculated as the average decrease in Gini impurity over all splits where that dinucleotide was used (Breiman, 2001). Distance between sequences was calculated as the minimum number of substitutions, insertions, and deletions needed to transform one sequence into the other (generalized Levenshtein edit distance) (Levenshtein, 1966). Distance between a sequence and a group of sequences was calculated as the distance between the sequence and the closest member in the group of sequences (single linkage distance).
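
As a rough illustration of the encoding and labeling described above, the following sketch counts overlapping dinucleotides over the 8-letter alphabet and assigns class labels from an ALT fold change. The function names, the example sequence, and the use of overlapping windows are assumptions made for illustration; the default ULN of 1.7 is the value defined earlier.

```python
from itertools import product

# Lower case = DNA, upper case = LNA, following the notation in the text.
ALPHABET = "acgtACGT"
DINUCLEOTIDES = ["".join(pair) for pair in product(ALPHABET, repeat=2)]  # 64 features

def dinucleotide_counts(oligo):
    """Ordered vector of 64 dinucleotide counts (overlapping windows assumed)."""
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(oligo) - 1):
        counts[oligo[i:i + 2]] += 1
    return [counts[d] for d in DINUCLEOTIDES]

def tox_class(alt_fold_change, uln=1.7):
    """Class label from ALT relative to saline; None marks the intermediate range."""
    if alt_fold_change > 5 * uln:
        return "high-tox"
    if alt_fold_change < 2 * uln:
        return "low-tox"
    return None  # between 2x and 5x ULN: excluded from training

# Hypothetical 13-mer with 2 LNA + 8 DNA + 3 LNA nucleotides, illustration only.
example_oligo = "GCattggtatTCA"
example_vector = dinucleotide_counts(example_oligo)
```

A sequence of length n contributes n − 1 counts in total, spread over DNA-only, LNA-only, and mixed DNA/LNA dinucleotides.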

Results

Characterizing and defining hepatotoxicity

We systemically administered 236 different saline-formulated LNA-modified phosphorothioate antisense oligonucleotides by five intravenous injections of 15 mg/kg in C57BL/6J or NMRI mice over a 16-day period. Following standard guidelines for hepatotoxicity screening in rodents (Peters, 2005), the total dose administered over approximately two weeks in this setup is more than 2-fold higher than the expected therapeutically relevant dose of a lead LNA-modified oligonucleotide, which is usually at 5–15 mg/kg a week in mice (Gupta et al., 2010; Lindholm et al., 2012; Straarup et al., 2010). About half of the oligonucleotides tested were found to induce liver injury to varying degrees as observed from highly elevated average levels of ALT in serum compared to saline-treated control mice 16 days after first dose (Fig. 1a). Histopathological evaluation of the livers from a subset of the mice confirmed that increases in ALT were significantly associated with single-cell necrosis (Fig. 1b). In addition, periportal clumping of cytoplasm, scattered eosinophilic droplets, and formation of microgranuloma were also significantly associated with elevated levels of ALT (Fig. 1b). Variation of the dose level for a subset of the oligonucleotides resulted in dose-dependent changes of the ALT levels as measured 16 days after first dose (Supplementary Fig. S1a). Similarly, variation of the sacrifice time revealed that measurements of ALT levels 3 days after first dose, keeping the total dose administered at 75 mg/kg, were always lower than measurements of ALT levels 16 days after first dose (Supplementary Fig. S1b). Finally, the degree of hepatotoxicity observed generally correlated with increased liver/body weight ratio (Fig. 1c and Supplementary Fig. S1c, d), as has also been observed previously (Swayze et al., 2007; Henry et al., 2000). Taken together, these observations qualify the utility of ALT as a biomarker for oligonucleotide-induced liver injury in mice.
The 236 oligonucleotides in the data set were part of screening campaigns to find therapeutic candidates against 13 different mRNAs. For the targets where knock-down of the target mRNA was measured in the liver, no association between ALT levels and reduction in target mRNA levels was found (Fig. 1d). This suggests that the observed hepatotoxicity was not an effect of exaggerated on-target-pharmacology of the antisense oligonucleotides.

FIG. 1.

Evaluation of the hepatotoxic potential of locked nucleic acid (LNA)-modified oligonucleotides in mice. (a) Distribution of average alanine-aminotransferase (ALT) levels (n=5) relative to intrastudy average saline control mice (n=5) for 236 LNA-modified oligonucleotides. Two- and five-times upper limit of normal (ULN) ALT levels are indicated. (b) Association between histopathological observables graded between 0 (nothing observed) and 3 (moderate to extensive) and ALT levels relative to saline in individual mice [n=162 mice treated with 29 oligonucleotides or saline; boxes cover interquartile range (IQR) and median, whiskers extend to 1.5×IQR; P values calculated by one-way analysis of variance]. (c) Correlation between average liver/body weight ratio for mice treated with the same oligonucleotide and the average ALT level in these mice (n=203 oligonucleotides; Pearson's product-moment correlation). (d) For oligonucleotides targeting liver transcripts (n=67 oligonucleotides against 8 target transcripts), average ALT level in mice treated with the same oligonucleotide plotted against average liver target transcript level in these mice.

Associating hepatotoxic potential with sequence

To address whether sequence-based descriptors can be associated with the hepatotoxic potential, we grouped the 236 oligonucleotides based on the presence or absence of specific dinucleotides in their sequence, and examined the distribution of ALT levels in each group. Each oligonucleotide consists of a central region, the gap, of at least 7 DNA nucleotides, flanked by 2–3 LNA nucleotides at each end. Of the 16 possible dinucleotides appearing in the DNA-only gap-region of the oligonucleotides, 9 were found to significantly (P<0.05) associate with either increased or decreased ALT levels (Fig. 2a). In contrast, of the 16 possible dinucleotides appearing in the LNA-only flank-regions of the oligonucleotides, only one, GT, was found to significantly associate with ALT (Fig. 2b).

FIG. 2.

Association between oligonucleotide sequence and hepatotoxic potential. (a) For each of the 16 possible DNA-only dinucleotides present in the gap-region of the oligonucleotides, the distributions of ALT levels for oligonucleotides where the dinucleotide is either present (gray) or absent (white) are shown (*P<0.05, **P<0.01, ***P<0.001, Wilcoxon rank-sum test). Note that y-axis is log-scale. (b) Same as Fig. 2a, but for LNA-only dinucleotides present in the flanks of the oligonucleotides.

Predicting hepatotoxic potential from sequence

The distribution of average ALT levels for each oligonucleotide in Fig. 1a displayed a peak around 1 (equal to average ALT level in intrastudy saline-treated control mice), representing oligonucleotides with low hepatotoxic potential, and a heavy tail at higher levels of ALT, representing progressively higher potential for hepatotoxicity. From this, oligonucleotides displaying ALT levels below two times the upper limit of normal (ULN) clearly included all oligonucleotides with no or very low hepatotoxic potential (designated low-tox oligonucleotides), and oligonucleotides displaying ALT levels above five times ULN were defined as having a clear high hepatotoxic potential (designated high-tox oligonucleotides); see Fig. 1a. We observed no difference in the prevalence of high-tox and low-tox oligonucleotides between NMRI and C57BL/6J mice (Supplementary Tables S2 and S3). Based on these observations, from the initial set of 236 oligonucleotides we generated reduced subsets of 97 and 109 oligonucleotides, respectively, by removing oligonucleotides with intermediate ALT levels (between 2× ULN and 5× ULN). For each sequence, an ordered feature vector of all 64 possible dinucleotide counts was computed and used to train a random forests classification algorithm (Breiman, 2001). Classification performance was estimated from out-of-bag samples at 80% accuracy (76% specificity and 83% sensitivity; P<0.001 by Fisher's exact test; Fig. 3a). When evaluating the importance of each dinucleotide in the classification based on the average decrease in Gini impurity (Duda et al., 2001), 11 dinucleotides identified as significantly associated with hepatotoxicity when testing one feature at a time (as in Fig. 2a, b) were also identified as important in the random forests classification (Fig. 3b). Moreover, when several features are combined in the random forests, an additional set of at least 6 dinucleotides was revealed as being highly important (Fig. 3b) for the classification accuracy. We systematically tested other encoding schemes, as well as Markov chains (Durbin et al., 1998) and support vector machines (Platt, 1999) as alternative classification algorithms, without seeing improved performance compared to the random forests classifier trained on dinucleotide counts presented here (Supplementary Table S4).
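
The Gini-based importance measure mentioned above rests on the impurity decrease of a single split, which can be sketched as follows. This is a simplified two-class illustration with hypothetical function names and toy data, not the random forests implementation used in the study; in a full forest, such decreases are averaged over all splits that use a given dinucleotide.

```python
def gini(labels, positive="high-tox"):
    """Gini impurity of a two-class node: 1 - p^2 - (1 - p)^2 = 2p(1 - p)."""
    if not labels:
        return 0.0
    p = sum(1 for label in labels if label == positive) / len(labels)
    return 2.0 * p * (1.0 - p)

def gini_decrease(counts, labels, threshold):
    """Impurity decrease when splitting on 'dinucleotide count <= threshold'."""
    left = [lab for c, lab in zip(counts, labels) if c <= threshold]
    right = [lab for c, lab in zip(counts, labels) if c > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children

# Toy data: counts of one dinucleotide in four oligonucleotides (hypothetical).
toy_counts = [0, 0, 2, 3]
toy_labels = ["low-tox", "low-tox", "high-tox", "high-tox"]
toy_decrease = gini_decrease(toy_counts, toy_labels, threshold=1)
```

In this toy case the split separates the two classes perfectly, so the decrease equals the parent impurity of 0.5.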

FIG. 3.

Sequence-based prediction of hepatotoxic potential of oligonucleotides using random forests. (a) Out-of-bag estimate of classification performance. Results are shown in a modified 2×2 contingency table that was used to calculate the percentage of predictions that agreed with experimentally established classes of high and low hepatotoxic potential (correct predictions are colored dark gray and light gray; specificity, sensitivity, and overall accuracy were calculated from the table). P values were calculated using Fisher's exact test. (b) Top 17 most important dinucleotides for classification, as evaluated by having a mean decrease in Gini index larger than 2. (c) Stratified 10-fold cross validation estimate of classification performance for oligonucleotides that are close, medium, or far away, respectively, as measured by the number of edits (Levenshtein distance), from the closest oligonucleotide in the training set. (d) Prediction of hepatotoxic potential in an independent validation set of 13 oligonucleotides inducing ALT >5× ULN at lower total doses than 75 mg/kg (between 25 mg/kg and 60 mg/kg), and 10 oligonucleotides where ALT <2× ULN at higher total doses than 75 mg/kg (between 100 mg/kg and 125 mg/kg). (e) Distribution of Levenshtein distances for each of the 23 oligonucleotides in the validation set to the closest oligonucleotides in the training set.

Sequence distance from training set affects predictive performance

To evaluate how the predictive performance of the random forests classifier is affected by the degree of similarity between the sequence to be predicted and the sequences used to train the classifier, we first calculated Levenshtein edit-distances (Levenshtein, 1966) between all oligonucleotides in the training set. Next, we randomly divided the training set into 10 equally sized subsets, keeping the ratio of high-tox to low-tox oligonucleotides constant. The classifier was then trained 10 times, leaving a different subset out of the training in each round (the test set), and evaluating the performance of that round's trained classifier by predicting the class of the oligonucleotides in the test set. In addition, the distance between each of the oligonucleotides in the test set, and the closest oligonucleotide in that round's training set, was recorded. The predictive performance of the classifier as estimated in this manner (Fig. 3c) is highest for sequences that are very similar to one or more sequences in the training set (when only 1 or 2 substitutions, insertions, or deletions are needed), with an accuracy of 87% (83% specificity and 90% sensitivity; P<0.001 by Fisher's exact test). For sequences that are further away (between 3 and 5 edits), the accuracy is estimated at 80% (82% specificity and 77% sensitivity; P<0.001 by Fisher's exact test). Finally, for sequences that are far away (between 6 and 8 edits), the accuracy is down to 69% (63% specificity and 77% sensitivity; P<0.01 by Fisher's exact test). The slightly decreasing accuracy of the prediction as a function of distance suggests that the trained random forests algorithm does not generalize fully to all possible combinations of dinucleotide counts, and may improve further as more training data become available.
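
The distance calculation used for this stratification can be sketched as below: a standard dynamic-programming Levenshtein distance, the single linkage distance to a training set, and the three distance bins reported above. The function names and example sequences are hypothetical, and the sketch assumes sequences are compared exactly as written, so that an LNA and a DNA nucleotide with the same base count as different characters; the text does not specify this detail.

```python
def levenshtein(a, b):
    """Minimum number of substitutions, insertions, and deletions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]

def distance_to_set(sequence, training_set):
    """Single linkage distance: distance to the closest training sequence."""
    return min(levenshtein(sequence, other) for other in training_set)

def distance_bin(edits):
    """Bins as reported in the text: 1-2, 3-5, and 6-8 edits."""
    if edits <= 2:
        return "close"
    if edits <= 5:
        return "medium"
    return "far"

# Hypothetical sequences, for illustration only.
example_training = ["GCattggtatTCA", "TGctagcatgaCTA"]
example_distance = distance_to_set("GCattggtctTCA", example_training)
```

In a cross-validation round, each held-out oligonucleotide would be binned by its distance to that round's training subset before accuracy is tallied per bin.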

Classification of oligonucleotides dosed at different levels

In addition to the setup with 15 mg/kg dosed five times over 16 days used for screening (Fig. 1), we also tested other dose levels. Out of 51 oligonucleotides dosed five times over 16 days, but with a total dose below 75 mg/kg (between 25 mg/kg and 60 mg/kg), 13 oligonucleotides displayed ALT increases above 5× ULN. Because of the observed dose-relationship for hepatotoxicity (Supplementary Fig. S1a), these oligonucleotides would be expected to have at least as high ALT increases, had they been screened at 75 mg/kg total dose. Similarly, out of 33 oligonucleotides dosed five times over 16 days at higher than 75 mg/kg total dose (between 100 mg/kg and 125 mg/kg), 10 oligonucleotides did not result in increases in ALT above 2× ULN. Using the same argument, had they been screened at 75 mg/kg total dose, the ALT levels would be expected to be just as low or even lower. None of these 13+10=23 oligonucleotides were included in the set of oligonucleotides used to train the random forests classifier, and they target 10 different mRNAs, 8 of which are different from those used in the training set. Consequently, they can be used as a validation set to test the performance of the classifier. When the random forests classifier was applied to this independent set of oligonucleotides, an overall accuracy of 74% was achieved (69% specificity and 80% sensitivity; P<0.05 by Fisher's exact test; Fig. 3d). When calculating distances to the training set, most were between 5 and 7 edits away (Fig. 3e), and for these distances, a classification accuracy of 74% matches well the expected classification accuracy as estimated from 10-fold cross validation (Fig. 3c).
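
The reported validation figures can be checked with a small two-sided Fisher's exact test. The 2×2 table below is a reconstruction, not data given in the text: it assumes 9 of the 13 high-ALT and 8 of the 10 low-ALT oligonucleotides were classified correctly, which is the only split of 17 correct predictions that reproduces the reported 69% and 80%. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Reconstructed table (assumption): rows = experimental class, columns = prediction.
#                predicted high   predicted low
# ALT > 5x ULN         9                4
# ALT < 2x ULN         2                8
validation_accuracy = (9 + 8) / 23          # about 0.74
validation_p = fisher_exact_two_sided(9, 4, 2, 8)
```

Under this reconstruction the P value comes out at roughly 0.036, consistent with the reported P<0.05.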

Using the classifier in the efficient discovery of therapeutic oligonucleotides

For oligonucleotides where the dose levels needed to achieve sufficient target knockdown lie close to the dose levels that result in hepatic injury, we asked whether the trained random forests classifier could be used to identify alternative designs with reduced hepatotoxic potential. To evaluate this, we chose an LNA-modified oligonucleotide, designated seth, which has been reported as hepatotoxic by Seth et al. (2009). Under the constraints that redesigned oligonucleotides should have at least two and at most four LNAs in each flank, be between 13 nt and 16 nt in length, and target a region at most 1 nt up- or downstream of the original target region, 90 redesigns are possible (Supplementary Table S5). These redesign constraints were selected based on our experience with the types of designs that have resulted in potent and efficacious oligonucleotides. The distribution of hepatotoxicity prediction scores when applying the random forests classifier to the redesigns is shown in Fig. 4a. The hepatotoxicity prediction score is the fraction of trees in the random forests that predict an oligonucleotide as having a high hepatotoxic potential, and the overall ensemble prediction is determined by majority vote among all trees. All designs were 5 to 7 edits from the training set (Fig. 4b), so we expect around 70% of the predictions to be accurate (Fig. 3c). The oligonucleotide seth was predicted as having a high hepatotoxic potential (seth in Fig. 4a and 4c). For experimental verification, we chose the redesigned sequence with the lowest prediction score, and therefore most likely to have a low hepatotoxic potential (r3 in Fig. 4a and 4c), as well as two additional designs with intermediate predictions: one predicted as having a high hepatotoxic potential (r1 in Fig. 4a and 4c), the other as having a low hepatotoxic potential (r2 in Fig. 4a and 4c).
We synthesized these four oligonucleotides and characterized their hepatotoxic potential in the 16-day mouse model. Consistent with earlier reports (Seth et al., 2009), seth induced ALT elevations 6.7 times above saline (Fig. 4c), and the redesigned oligonucleotide, r1, predicted as having a high hepatotoxic potential, indeed induced ALT levels 29 times above saline. Importantly, the redesigned oligonucleotides r2 and r3, predicted as having a low hepatotoxic potential, were both confirmed to result in ALT levels below 2× ULN. These results exemplify that the random forests classifier can successfully guide lead-optimization efforts to identify subtly modified oligonucleotides (r2 and r3) that have much lower hepatotoxic potential than the original oligonucleotide (seth). Furthermore, inspection of these four sequences shows that in particular the addition of an AT in the 5′ gap-region of the r2 and r3 oligonucleotides is responsible for the reduced hepatotoxic potential (compare r1 with r2). Inspection of Fig. 2b shows that, indeed, this dinucleotide is associated with reduced hepatotoxic potential. However, the association is not statistically significant when evaluated across all oligonucleotides containing AT (Fig. 2b). Still, the random forests classifier, which also takes interactions between dinucleotides into account, is able to correctly predict that the inclusion of AT in these particular oligonucleotides will reduce the hepatotoxic potential significantly. For additional examples of oligonucleotides where the predicted hepatotoxic potential is confirmed experimentally, see Supplementary Table S6.
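
One way to read the redesign constraints is sketched below. It assumes the original oligonucleotide covers a 14-nt target site (an assumption, not stated in the text), so that extending by 1 nt on each side gives a 16-nt region; every window of 13 to 16 nt within that region is combined with 2 to 4 LNAs in each flank. Under this reading the enumeration yields 90 designs, matching the count in the text, although the exact enumeration is not spelled out there. The region sequence and function name are hypothetical placeholders.

```python
from itertools import product

def enumerate_redesigns(region, min_len=13, max_len=16, flank_sizes=(2, 3, 4)):
    """List gapmer designs: LNA flanks in upper case, DNA gap in lower case."""
    designs = []
    for length in range(min_len, max_len + 1):
        for start in range(len(region) - length + 1):
            window = region[start:start + length].lower()
            for five_prime, three_prime in product(flank_sizes, repeat=2):
                designs.append(window[:five_prime].upper()
                               + window[five_prime:length - three_prime]
                               + window[length - three_prime:].upper())
    return designs

# Placeholder 16-nt region (hypothetical 14-nt site plus 1 nt on each side).
example_region = "atcggatcctagcaat"
example_designs = enumerate_redesigns(example_region)
```

Each candidate could then be encoded as dinucleotide counts and scored by the fraction of trees voting for high hepatotoxic potential, with the lowest-scoring designs prioritized for synthesis.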

FIG. 4.

Model prediction and experimental evaluation of redesigned oligonucleotides. (a) Distribution of hepatotoxicity prediction scores for redesigned oligonucleotides targeting PTEN. Oligonucleotides with scores <0.5 are judged as having low hepatotoxic potential (dark gray region), and those with scores >0.5 as having high hepatotoxic potential (light gray region). (b) Distribution of Levenshtein distances to training set for redesigned oligonucleotides targeting PTEN. (c) Sequence for oligonucleotide seth, and redesigned variants (r1, r2, and r3), and ALT levels when dosed 5×15 mg/kg over 16 days in NMRI mice. LNA shown in upper-case letters, DNA in lower-case letters. Error bars, standard error of the mean.

Discussion

Here, we used machine learning to evaluate the extent to which the hepatotoxic potential of oligonucleotides can be predicted from sequence and modification pattern. Using the random forests algorithm for classification, and encoding oligonucleotides as dinucleotide counts (with 4 DNAs and 4 LNAs, 64 dinucleotides are possible), as many as 17 dinucleotides were identified which alone, or in combination, were highly informative with respect to distinguishing between oligonucleotides with low or high potential for hepatotoxicity. A Levenshtein-edit distance measure was furthermore established that allowed stratification of the predictive performance based on the distance between the oligonucleotide to be predicted, and the set of oligonucleotides used to train the classifier. Most oligonucleotides, specifically those less than 6 edits away from the nearest member of the training set, were found to be predicted with at least 80% accuracy as evaluated by 10-fold cross validation. When oligonucleotides were further than 6 edits away, the estimated predictive performance was at 69% as evaluated by 10-fold cross validation. In an independent set of 23 oligonucleotides, the majority of which were 6 or more edits away, 74% were predicted correctly. Moreover, when using the trained random forests classifier to guide the efficient redesign of an oligonucleotide demonstrated to have a high hepatotoxic potential, several new designs were identified, and a low hepatotoxic potential confirmed experimentally for two of them. Taken together, these findings indicate that descriptors of sequence composition and modification pattern, here dinucleotide counts, can capture and quantify characteristics of the oligonucleotides associated with hepatic injury, and thereby distinguish with high accuracy and robustness between oligonucleotides with low and high hepatotoxic potential. 
Including such a predictor in the drug discovery process is expected to reduce the attrition rate due to observed hepatotoxicity in rodents.

The results presented here do not directly elucidate the mechanisms of hepatotoxicity. We speculate, however, that the hepatotoxic potential is related to the propensity of an oligonucleotide to bind to certain proteins. That oligonucleotides can interact with specific proteins is well known (Bennet and Swayze, 2010), and these interactions can be sequence dependent, as is the case with Toll-like receptors binding oligonucleotides containing cg dinucleotides (Krieg, 2006). Another, endogenous, example is found with the genetic disorder myotonic dystrophy, a dominantly inherited disease that is the most common cause of muscular dystrophy in adults. Here, microsatellite expansions result in formation of a double-stranded hairpin structure of repeated cug trinucleotides in the 3′UTR of the dystrophia myotonica-protein kinase gene (DMPK). The protein muscleblind-like splicing regulator 1 (MBNL1) is sequestered to this hairpin structure, depleting it from the nucleoplasm. The downstream effects include disruption of alternative splicing, mRNA translation, and mRNA decay (Lee and Cooper, 2009). Similarly, our observations are in accord with a model where as yet unidentified proteins exist, which when interacting in a sequence- and modification-specific manner with an oligonucleotide, induce hepatotoxicity in mice. A possible next step is to employ pull-down methods (Rigo et al., 2012) to characterize the proteins binding preferentially to oligonucleotides with either low or high hepatotoxic potential.

Until now, a main differentiator between small molecule drugs and oligonucleotides has been that oligonucleotides can be designed efficiently to ensure a high proportion of hits when it comes to binding and inhibiting their intended molecular target. With this work, we show that the sequence-based nature of oligonucleotides allows a simple decomposition that, given sufficient amounts of data, also enables prediction of complex, nonhybridization-mediated properties such as hepatotoxicity. We expect that this approach is generalizable to other complex properties such as bioavailability and biodistribution, as well as to other types of toxicity.

Supplementary Material

Supplemental data
Supp_Table1.pdf (19.6KB, pdf)
Supplemental data
Supp_Fig1.pdf (93.3KB, pdf)
Supplemental data
Supp_Table2.pdf (27.2KB, pdf)
Supplemental data
Supp_Table3.pdf (25.3KB, pdf)
Supplemental data
Supp_Table4.pdf (25.5KB, pdf)
Supplemental data
Supp_Table5.pdf (27.6KB, pdf)
Supplemental data
Supp_Table6.pdf (25.4KB, pdf)

Acknowledgments

For their excellent technical assistance, we would like to thank Camilla T. Haugaard, Lisbeth Bang, and Rikke Sølberg for handling of animal facilities; Bo R. Pedersen and Marianne B. Mogensen for synthesis and formulation of LNA-modified oligonucleotides; Bettina Nordbo, Lene S. Jørgensen, Charlotte Øverup, and Jette D. Hedegaard for liver and serum measurements; Stine Møllerud for data collection; and Roger Burnett for histopathology evaluations. Additionally, we are grateful to Bo Hansen, Yann Tessier, and Mark Turner for valuable discussions and support in the preparation of the manuscript.

Author Disclosure Statement

P.H. Hagedorn, S. Ottosen, S. Kammler, N.F. Nielsen, A.M. Høg, M. Hedtjärn, M. Meldgaard, M.R. Møller, H. Ørum, T. Koch, and M. Lindow are employees of Santaris Pharma, a company that is developing LNA-modified oligonucleotides for therapeutic purposes. The work is partially funded by a grant from the Danish Strategic Research Council.

References

  1. BENNETT C.F. SWAYZE E.E. RNA targeting therapeutics: molecular mechanisms of antisense oligonucleotides as a therapeutic platform. Annu. Rev. Pharmacol. Toxicol. 2010;50:259–293. doi: 10.1146/annurev.pharmtox.010909.105654.
  2. BENIGNI R. GIULIANI A. Putting the predictive toxicology challenge into perspective: reflections on the results. Bioinformatics. 2003;19:1194–1200. doi: 10.1093/bioinformatics/btg099.
  3. BREIMAN L. Random forests. Mach. Learn. 2001;45:5–32.
  4. BUREL S. HENRY S. Compounds and methods for modulating toxic and proinflammatory effects. Patent. 2010. International publication number WO2010108035A1.
  5. DUDA R.O. HART P.E. STORK D.G. Pattern Classification. 2nd ed. John Wiley; New York: 2001. pp. 394–443.
  6. DURBIN R. EDDY S.R. KROGH A. MITCHISON G. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press; Cambridge, UK: 1998. pp. 46–79.
  7. ELMÉN J. LINDOW M. SILAHTAROGLU A. BAK M. CHRISTENSEN M. LIND-THOMSEN A. HEDTJÄRN M. HANSEN J.B. HANSEN H.F. STRAARUP E.M., et al. Antagonism of microRNA-122 in mice by systemically administered LNA-antimiR leads to up-regulation of a large set of predicted target mRNAs in the liver. Nucleic Acids Res. 2008;36:1153–1162. doi: 10.1093/nar/gkm1113.
  8. GUPTA N. FISKER N. ASSELIN M.-C. LINDHOLM M.W. ROSENBOHM C. ØRUM H. ELMÉN J. SEIDAH N.G. STRAARUP E.M. A locked nucleic acid antisense oligonucleotide (LNA) silences PCSK9 and enhances LDLR expression in vitro and in vivo. PLoS One. 2010;5:e10682. doi: 10.1371/journal.pone.0010682.
  9. HENRY S. STECKER K. BROOKS D. MONTEITH D. CONKLIN B. BENNETT C.F. Chemically modified oligonucleotides exhibit decreased immune stimulation in mice. J. Pharmacol. Exp. Ther. 2000;292:468–479.
  10. HENRY S.P. KIM T.-W. KRAMER-STICKLUND K. ZANARDI T.A. FEY R.A. LEVIN A.A. Toxicologic properties of 2′-O-methoxyethyl chimeric antisense inhibitors in animals and man. In: Crooke S.T., editor. Antisense Drug Technology. CRC Press; Boca Raton, FL: 2007. pp. 327–365.
  11. HERT J. IRWIN J.J. LAGGNER C. KEISER M.J. SHOICHET B.K. Quantifying biogenic bias in screening libraries. Nat. Chem. Biol. 2009;5:479–483. doi: 10.1038/nchembio.180.
  12. HILDEBRANDT-ERIKSEN E.S. AARUP V. PERSSON R. HANSEN H.F. MUNK M.E. ØRUM H. A locked nucleic acid oligonucleotide targeting microRNA 122 is well tolerated in cynomolgus monkeys. Nucleic Acid Ther. 2012;22:152–161. doi: 10.1089/nat.2011.0332.
  13. JANSSEN H.L. REESINK H.W. LAWITZ E.J. ZEUZEM S. RODRIGUEZ-TORRES M. PATEL K. VAN DER MEER A.J. PATICK A.K. CHEN A. ZHOU Y., et al. Treatment of HCV infection by targeting microRNA. N. Engl. J. Med. 2013;368:1685–1694. doi: 10.1056/NEJMoa1209026.
  14. KOSHKIN A.A. SINGH S.K. NIELSEN P. RAJWANSHI V.K. KUMAR R. MELDGAARD M. OLSEN C.E. WENGEL J. LNA (locked nucleic acids): synthesis of the adenine, cytosine, guanine, 5-methylcytosine, thymine and uracil bicyclonucleoside monomers, oligomerisation, and unprecedented nucleic acid recognition. Tetrahedron. 1998;54:3607–3630.
  15. KRIEG A.M. Therapeutic potential of Toll-like receptor 9 activation. Nat. Rev. Drug Discov. 2006;5:471–484. doi: 10.1038/nrd2059.
  16. LEE J.E. COOPER T.A. Pathogenic mechanisms of myotonic dystrophy. Biochem. Soc. Trans. 2009;37:1281–1286. doi: 10.1042/BST0371281.
  17. LEVENSHTEIN V.I. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Phys. Dokl. 1966;10:707–710.
  18. LEVIN A.A. A review of the issues in the pharmacokinetics and toxicology of phosphorothioate antisense oligonucleotides. Biochim. Biophys. Acta. 1999;1489:69–84. doi: 10.1016/s0167-4781(99)00140-2.
  19. LINDHOLM M.W. ELMÉN J. FISKER N. HANSEN H.F. PERSSON R. MØLLER M.R. ROSENBOHM C. ØRUM H. STRAARUP E.M. KOCH T. PCSK9 LNA antisense oligonucleotides induce sustained reduction of LDL cholesterol in nonhuman primates. Mol. Ther. 2012;20:376–381. doi: 10.1038/mt.2011.260.
  20. LINDOW M. VORNLOCHER H.-P. RILEY D. KORNBRUST D.J. BURCHARD J. WHITELEY L.O. KAMENS J. THOMPSON J.D. NOCHUR S. YOUNIS H., et al. Assessing unintended hybridization-induced biological effects of oligonucleotides. Nat. Biotechnol. 2012;30:920–923. doi: 10.1038/nbt.2376.
  21. LIPINSKI C.A. LOMBARDO F. DOMINY B.W. FEENEY P.J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 2001;46:3–26. doi: 10.1016/s0169-409x(00)00129-0.
  22. LIPINSKI C.A. HOPKINS A.L. Navigating chemical space for biology and medicine. Nature. 2004;432:855–861. doi: 10.1038/nature03193.
  23. LOW Y. UEHARA T. MINOWA Y. YAMADA H. OHNO Y. URUSHIDANI T. SEDYKH A. MURATOV E. KUZ'MIN V. FOURCHES D., et al. Predicting drug-induced hepatotoxicity using QSAR and toxicogenomics approaches. Chem. Res. Toxicol. 2011;24:1251–1262. doi: 10.1021/tx200148a.
  24. OBIKA S. NANBU D. HARI Y. MORIO K.-I. IN Y. ISHIDA T. IMANISHI T. Synthesis of 2′-O,4′-C-methyleneuridine and -cytidine. Novel bicyclic nucleosides having a fixed C3′-endo sugar puckering. Tetrahedron Lett. 1997;38:8735–8738.
  25. PETERS T.S. Do preclinical testing strategies help predict human hepatotoxic potentials? Toxicol. Pathol. 2005;33:146–154. doi: 10.1080/01926230590522121.
  26. PLATT J.C. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Smola A., Bartlett P., Schölkopf B., Schuurmans D., editors. Advances in Large Margin Classifiers. MIT Press; Cambridge, MA: 1999.
  27. RIGO F. HUA Y. CHUN S.J. PRAKASH T.P. KRAINER A.R. BENNETT C.F. Synthetic oligonucleotides recruit ILF2/3 to RNA transcripts to modulate splicing. Nat. Chem. Biol. 2012;8:555–561. doi: 10.1038/nchembio.939.
  28. SCHUBERT D. LEVIN A.A. KORNBRUST D. BERMAN C.L. CAVAGNARO J. HENRY S. SEGUIN R. FERRARI N. SHREWSBURY S.B. The Oligonucleotide Safety Working Group (OSWG). Nucleic Acid Ther. 2012;22:211–212. doi: 10.1089/nat.2012.0383.
  29. SETH P.P. SIWKOWSKI A. ALLERSON C.R. VASQUEZ G. LEE S. PRAKASH T.P. WANCEWICZ E.V. WITCHELL D. SWAYZE E.E. Short antisense oligonucleotides with novel 2′–4′ conformationaly restricted nucleoside analogues show improved potency without increased toxicity in animals. J. Med. Chem. 2009;52:10–13. doi: 10.1021/jm801294h.
  30. STANTON R. SCIABOLA S. SALATTO C. WENG Y. MOSHINSKY D. LITTLE J. WALTERS E. KREEGER J. DIMATTIA D. CHEN T., et al. Chemical Modification Study of Antisense Gapmers. Nucleic Acid Ther. 2012;22:344–359. doi: 10.1089/nat.2012.0366.
  31. STEIN C.A. SUBASINGHE C. SHINOZUKA K. COHEN J.S. Physicochemical properties of phosphorothioate oligodeoxynucleotides. Nucleic Acids Res. 1988;16:3209–3221. doi: 10.1093/nar/16.8.3209.
  32. STRAARUP E.M. FISKER N. HEDTJÄRN M. LINDHOLM M.W. ROSENBOHM C. AARUP V. HANSEN H.F. ØRUM H. HANSEN J.B.R. KOCH T. Short locked nucleic acid antisense oligonucleotides potently reduce apolipoprotein B mRNA and serum cholesterol in mice and non-human primates. Nucleic Acids Res. 2010;38:7100–7111. doi: 10.1093/nar/gkq457.
  33. SWAYZE E.E. SIWKOWSKI A.M. WANCEWICZ E.V. MIGAWA M.T. WYRZYKIEWICZ T.K. HUNG G. MONIA B.P. BENNETT C.F. Antisense oligonucleotides containing locked nucleic acid improve potency but cause significant hepatotoxicity in animals. Nucleic Acids Res. 2007;35:687–700. doi: 10.1093/nar/gkl1071.
  34. SWAYZE E.E. SIWKOWSKI A.M. Oligomeric compounds composing bicyclic nucleosides and having reduced toxicity. Patent. 2009. International publication number WO09124295A2.
  35. WANG R. LAI L. WANG S. Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J. Comput. Aided Mol. Des. 2002;16:11–26. doi: 10.1023/a:1016357811882.

Articles from Nucleic Acid Therapeutics are provided here courtesy of Mary Ann Liebert, Inc.
